
RE: schema.org and proto-data, was Re: schema.org as reconstructed from the human-readable information at schema.org

From: John Flynn <jflynn12@verizon.net>
Date: Wed, 30 Oct 2013 15:46:36 -0400
To: "'LeVan,Ralph'" <levan@oclc.org>, "'Peter Patel-Schneider'" <pfpschneider@gmail.com>, "'Christian Bizer'" <chris@bizer.de>
Cc: "'Guha'" <guha@google.com>, "'Martin Hepp'" <martin.hepp@unibw.de>, "'W3C Vocabularies'" <public-vocabs@w3.org>, <jflynn12@verizon.net>
Message-id: <01d201ced5a8$bee27fe0$3ca77fa0$@net>
This seems to indicate there is an opportunity for some entity to do the
necessary processing of schema.org markup and make the results available to
consumers as a product.
 
Several months ago I asked the following questions and the reply was that
the data was going to be released at an upcoming conference. Is this data
available yet?
 
Is data available for these questions related to schema.org?
- How many property instances have currently been marked up?
- What is the distribution of property instances across types?
- What is the rate of growth of property instances (by month)?
- What is the total number of web sites that have created property instance
markup?
- What is the rate of growth of the number of web sites that have created
property instance markup?
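
For a single page, the first two questions can be approximated with a short
script. The following is only a rough sketch, not how any search engine does
it: it assumes the markup is HTML microdata (itemscope/itemtype/itemprop) and
that tags are reasonably well balanced. The class name MicrodataCounter and
the "(untyped)" label are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MicrodataCounter(HTMLParser):
    """Tally itemprop occurrences by the nearest enclosing itemtype."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()  # (itemtype, itemprop) -> number of occurrences
        self._stack = []         # one entry per open element: itemtype or None

    def _current_type(self):
        # Walk outward to the closest open element that started an item.
        for itemtype in reversed(self._stack):
            if itemtype is not None:
                return itemtype
        return "(untyped)"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop:
            # A property belongs to the enclosing item, even when this
            # element also opens a nested item of its own.
            for name in prop.split():
                self.counts[(self._current_type(), name)] += 1
        if tag not in VOID_TAGS:
            if "itemscope" in attrs:
                self._stack.append(attrs.get("itemtype") or "(untyped)")
            else:
                self._stack.append(None)

    def handle_endtag(self, tag):
        # Assumption: balanced tags. Badly malformed HTML will skew the counts.
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()
```

Feeding a page to an instance with feed() and reading its counts attribute
gives the per-type distribution for that page. The growth and site-count
questions would still need a crawl repeated over time, which this does not
attempt.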
 
John
 
From: LeVan,Ralph [mailto:levan@oclc.org] 
Sent: Wednesday, October 30, 2013 2:58 PM
To: Peter Patel-Schneider; Christian Bizer
Cc: Guha; Martin Hepp; W3C Vocabularies
Subject: RE: schema.org and proto-data, was Re: schema.org as reconstructed
from the human-readable information at schema.org
 
"Consume" is such a slippery word.  If all you want to do is read a lot of
possibly malformed HTML, then you can do that with a lot of tools.  But the
more data you hope to intelligently extract from that HTML, the more
intellectual resources you're going to have to throw at interpreting the
vast amount of garbage you get.  So, the big boys are likely to do better
than you or me at the vast majority of the cruft out there.  But we can do
pretty well either with low expectations or by getting our data from domains
that map well to our own models/business needs.
 
Your mileage may vary.
 
Ralph
 
 
Received on Wednesday, 30 October 2013 19:47:22 UTC
