Re: [schemaorg/schemaorg] Guidance or vocab needed regarding Real Estate (property purchase, rental etc.) (#241)

From: Martin Hepp <mfhepp@gmail.com>
Date: Sat, 24 Jun 2017 12:09:35 +0200
Cc: schemaorg/schemaorg <schemaorg@noreply.github.com>, Mention <mention@noreply.github.com>, "schema.org Mailing List" <public-schemaorg@w3.org>
Message-Id: <C1F3D2F2-8675-4B68-99C2-B1C9713B3D2A@gmail.com>
To: schemaorg/schemaorg <reply+000a3e062daf900fdda312b37429971050f61a9c74a2498192cf0000000115650ef992a16>
Dave:

> Anyways, glad my schema markup is coming along... but still frustrated the committee and team keeps back-burnering this as it can be really powerful. For a response to say look at this (used for hotels, or rentals, etc) just shouldn't be used for general real estate - it's too large a category to not have been done right up front is all I am sharing. So all in favor of seeing a final approved spec go live. The MLS contains hundreds of valuable data points and we'd love to help graph the most important ones! Happy to provide additional data points here too of those most used. Whatever I can do to help. Thanks everyone.
> 

I would like to clarify a few points here:

1. Extending a Web-scale vocabulary like schema.org, which must be applicable to all kinds of cultural contexts and thousands of applications, is quite different from designing a conceptual data model in industry-committee fashion, where you approximately know the usages of the data and the user community.

The most important differences are that 
a) every new element in schema.org adds complexity and cost to the users of schema.org, in terms of searching the documentation, maintaining examples and validators, etc.
b) even a simple addition can easily have side-effects across other parts of the vocabulary, like creating redundancies (e.g. we already have an element for the same meaning with just a different name), conflicts in the conceptual model (e.g. not reusing QuantitativeValue for quantitative properties or breaking the clear Agent-Promise-Object separation in the underlying GoodRelations model for offers), or blocking generic names for too narrow domain-specific purposes in the global name space.

It is non-trivial to develop domain extensions from existing standards, and it is very different from dumping in an existing data standard, however elaborate that standard is.

2. Developing a fully-fledged extension proposal is a lot of effort. Having led the accommodation and auto extensions and contributed to many others, I can tell you that a modest extension of ca. 20 types and properties is easily one person-year of effort, if you factor in all work on documentation and examples and the tedious (but important) process of discussing and polishing the submission. It also takes time: 1 to 1.5 years is a reasonable estimate.

Now, the resources for this must come from somewhere. Most work on schema.org is done by volunteers or external domain experts. Sometimes they can work on extensions as part of ongoing projects, but most of the time, they cannot, and the people who have the expertise in this are often not the ones who want an extension badly.

If the real estate industry wants this to progress, they could dedicate resources in terms of man-power or money to respective projects.

As a starting point and to get used to the process: There are hundreds of schema.org elements that lack examples in at least some syntax. It is fairly straightforward to create such examples, make sure they validate, and submit them as pull requests in Github.

3. Many people want to extend schema.org because they want Google and other major search engines to give some preferential treatment to their content.
This is a fallacy: an addition to schema.org is not directly linked to any effect in major search engines. You need to find stakeholders inside the big search engines who want to use your data structure, and you need a critical mass of data. The effort does not stop with adding a new type or property to schema.org; that is not even halfway, and it is not a precondition either (see below).

Also, a new property or type is only needed if Google or any other major search engine *needs the distinction for a purpose*, not just because it gives a nicer model.

If, for instance, Google had a rich snippet type for real estate, it would not need 20+ types for specific forms of apartments or houses if the rendering algorithm for all of those is the same and if the relevance for a query can be determined from textual content.

Extensions to schema.org are also not necessary if the information can be extracted in other ways, e.g. from HTTP protocol meta-data, media-object meta-data (e.g. in videos), or from tabular content in HTML.

4. You do not have to put everything into schema.org's core or hosted extensions:

a) Anybody can develop and host an external schema.org extension. If this creates a critical mass of data, it's likely that major search engines will make use of that.
See 
    http://schema.org/docs/extension.html
    http://blog.schema.org/2016/02/gs1-milestone-first-schemaorg-external.html

b) You can use unique identifiers for additional properties from external standards using the respective feature for schema:PropertyValue, which is
    http://schema.org/propertyID

Like so:

<!-- Product: Property ID for clarifying the meaning of a property: Code from eCl@ss Standard -->
<!-- The Property code 02-AAM226 is for "USB interface present" in eCl@ss 8.1 -->
<div itemscope itemtype="http://schema.org/Product">
  <img itemprop="image" src="camera123.jpg" alt="" />
  <span itemprop="name">Digital Camera 123</span>
  <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
      <span itemprop="name">USB Interface</span>:
      <meta itemprop="value" content="True">Yes
      <meta itemprop="propertyID" content="eclass81:02-AAM226">
  </div>
</div>

The only thing you need to do is define a unique prefix for each version of your standard, like mls<nn>:.
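
As a rough illustration of the propertyID mechanism applied to listing data, here is a minimal Python sketch that builds the equivalent structure in JSON-LD, another syntax schema.org documents. Everything specific in it is an assumption for illustration only: the prefix "mls17", the codes "LotSizeAcres" and "PoolYN", the helper names, and the choice of schema:Residence as the base type are not taken from any actual MLS standard.

```python
import json

# Hypothetical prefix for one version of the external standard (the "mls<nn>:" idea).
MLS_PREFIX = "mls17"


def property_value(name, value, code, prefix=MLS_PREFIX):
    """Build a schema.org PropertyValue whose propertyID carries an external code."""
    return {
        "@type": "PropertyValue",
        "name": name,
        "value": value,
        "propertyID": f"{prefix}:{code}",
    }


def listing(name, properties):
    """Wrap (name, value, code) tuples as additionalProperty entries of a Residence."""
    return {
        "@context": "http://schema.org",
        "@type": "Residence",  # assumed base type; pick whatever fits the entity
        "name": name,
        "additionalProperty": [property_value(*p) for p in properties],
    }


# The codes below are made up; a real deployment would use codes from the standard.
example = listing(
    "Beachfront condo",
    [("Lot size (acres)", 0.25, "LotSizeAcres"), ("Pool", True, "PoolYN")],
)
print(json.dumps(example, indent=2))
```

Because the prefix encodes the version of the standard, a consumer can tell codes from different releases apart without schema.org itself having to know about them.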

The same can be done for types with schema:additionalType: You can either define an HTTP or HTTPS namespace for your types, like

   http://mls.org/vocabulary/

then use the types to be more explicit about an entity, like so

<div itemscope itemtype="http://schema.org/ABC">
    <link itemprop="additionalType" href="http://mls.org/vocabulary/CondoWithBeachfrontView" />
</div>

Or, you can use URNs and define and register a URN namespace identifier (NID), see 

    https://tools.ietf.org/html/rfc6648#section-4

Then you can use

<div itemscope itemtype="http://schema.org/ABC">
    <link itemprop="additionalType" href="urn:mls:CondoWithBeachfrontView" />
</div>

without the need to actually host definitions of your standard's elements on a Web server (but then you also have no opportunity to add hosted definitions later).
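
The two namespace options can be compared side by side with a small Python sketch that attaches schema:additionalType to a JSON-LD entity. The helper name and the schema:Residence base type are assumptions made for illustration; the namespace http://mls.org/vocabulary/ and the URN form urn:mls: are simply the placeholders used in this message, not registered identifiers.

```python
import json


def with_additional_type(entity, local_name, namespace="http://mls.org/vocabulary/"):
    """Return a copy of the entity with an external type added via additionalType."""
    typed = dict(entity)  # shallow copy so the original entity stays unchanged
    typed["additionalType"] = namespace + local_name
    return typed


# Assumed base entity; "Residence" stands in for the generic ABC placeholder.
condo = {
    "@context": "http://schema.org",
    "@type": "Residence",
    "name": "Beachfront condo",
}

# Option 1: HTTP namespace, which can later serve definitions of the types.
http_typed = with_additional_type(condo, "CondoWithBeachfrontView")

# Option 2: URN namespace, which needs no Web server.
urn_typed = with_additional_type(
    condo, "CondoWithBeachfrontView", namespace="urn:mls:"
)

print(json.dumps(http_typed, indent=2))
print(json.dumps(urn_typed, indent=2))
```

In both cases the schema.org type stays in place, so generic consumers still understand the entity, while consumers that know the external vocabulary get the finer distinction.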

With these straightforward and fully implemented mechanisms in schema.org, you can flexibly integrate any external type and property into your markup without validation errors etc.

If there is enough data of that kind, search engines will care. If there isn't, they won't even if the types are in schema.org. No need to push things into schema.org in the hope of SEO effects. The amount of actual data is the bottleneck, and nothing else.

This email is not meant to be hostile in any way, and I agree schema.org could do better in explaining such things. But there is just so little time and so much work to do.

It really does not take much to contribute to schema.org by more examples, spotting mistakes, etc. AND submitting fixes as Github pull requests.

So if your statement "Happy to provide additional data points here too of those most used. Whatever I can do to help. " can be read as such, that would be a good starting point.

Best wishes

Martin
Received on Saturday, 24 June 2017 10:10:08 UTC
