Re: schema.org and proto-data, was Re: schema.org as reconstructed from the human-readable information at schema.org

Well, sure, getting more information into easy-to-consume form is a great
idea, and there are paths towards this goal.

However, my question was whether consuming the data that is already in
schema.org fields requires the resources of a major search company. I
would certainly hope not, but some posts here seemed to point that way.
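
To make the question concrete, here is a minimal sketch of what I mean by
consuming the fields that are already there. It assumes the page uses
microdata, relies only on the Python standard library, and ignores nested
itemscopes. The sample markup, the class name ItemPropCollector, and the
function extract_props are made up for illustration.

```python
from html.parser import HTMLParser

# Elements that never have an end tag; they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ItemPropCollector(HTMLParser):
    """Collect the text inside elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.props = {}   # property name -> list of text fragments
        self._stack = []  # one entry per open element: itemprop name or None

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._stack.append(dict(attrs).get("itemprop"))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attribute the text to the nearest enclosing itemprop, if any.
        for name in reversed(self._stack):
            if name:
                self.props.setdefault(name, []).append(text)
                break


def extract_props(html):
    """Return {property name: joined text} for all itemprops in the page."""
    parser = ItemPropCollector()
    parser.feed(html)
    parser.close()
    return {name: " ".join(parts) for name, parts in parser.props.items()}


# Hypothetical job posting using only the three commonly seen properties.
SAMPLE = """
<div itemscope itemtype="http://schema.org/JobPosting">
  <h1 itemprop="title">Data Engineer</h1>
  <span itemprop="jobLocation">Mannheim, Germany</span>
  <p itemprop="description">Build and maintain <b>ETL</b> pipelines.</p>
</div>
"""

if __name__ == "__main__":
    for name, value in extract_props(SAMPLE).items():
        print(name, "=", value)
```

Something of this size handles the simple case on one page. Whether it
holds up across real sites is exactly what I am asking about.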

peter



On Wed, Oct 30, 2013 at 1:57 AM, Christian Bizer <chris@bizer.de> wrote:

> Hi Peter,
>
> while I agree that better documentation and examples are always a plus, I
> think the problem lies elsewhere.
>
> Let’s take the example of JobPostings again. Schema.org defines lots of
> nice properties for describing job postings, including “skills”,
> “qualifications”, and “responsibilities”. But the data providers do not
> use these properties. Instead, they mostly describe job postings (50% of
> the sites that we examined) using the properties “title”, “jobLocation”,
> and “description”.
>
> I think that the reason for this is the schemata used by most of today’s
> HR databases. All of these databases are likely to have a job title and job
> description field, but many won’t have skills, qualifications, and
> responsibilities fields. In addition, the departments of the companies
> deliver job postings to the HR department as free text, not nicely split
> into different fields.
>
> So what do you do as a webmaster in charge of publishing your company’s
> job postings on the Web?
>
> You edit the PHP script or other script that produces the HTML pages and
> add Schema.org markup. This is a 10-minute job.
>
> Convincing all the departments of your company to deliver job postings to
> you in a different, more structured format would be a large project, and
> the departments are likely not to cooperate as they don’t see the benefits
> of the whole endeavor.
>
> So the problem is not missing documentation or that the webmaster is
> stupid, but that the webmaster currently cannot do anything about it.
>
> I think the adoption path of the more specialized properties will be as
> follows:
>
> 1. Many websites roughly mark up their content using a minimal
> set of schema.org terms. This is happening now.
>
> 2. The major search engines like Google extract “skills”,
> “qualifications”, and “responsibilities” from the free text of the
> description field using NLP techniques and start providing sophisticated
> job search features (similar to the features provided by specialized job
> portals today).
>
> 3. The departments of our example company recognize that the
> search engines make errors in guessing the features from the free text and
> that their job postings are thus harder to find than the job postings of a
> competitor.
>
> 4. Thus, they ask the HR or IT department what to do about this,
> and a process is started inside the company to capture job postings in a
> more structured way and to extend the current HR database with the required
> fields for this.
>
> So the major driver for getting more structured data onto the Web is
> mainstream applications consuming it. The rich snippets provided by search
> engines today are a nice start, but I honestly hope that the major search
> engines are already working on features such as improved job search and
> that such features will be deployed soon.
>
> Especially for the job market, this is beneficial for everybody. Job
> seekers get better market transparency as they don’t need to visit
> different job portals anymore, but can find all job postings in a single
> portal (the search engine). For companies offering jobs this is also better
> as their ad reaches more people and as they don’t need to pay portals like
> Monster or StepStone thousands of dollars for the ad anymore.
>
> Cheers,
>
> Chris

Received on Wednesday, 30 October 2013 18:50:45 UTC