Re: Schema.org considered helpful from Leif Warner on 2011-06-17 (public-lod@w3.org from June 2011)

From: Leif Warner <abimelech@gmail.com>
Date: Fri, 17 Jun 2011 10:52:14 -0700
To: public-lod@w3.org
Message-ID: <BANLkTiknD4q8FnnQOyTzQam_T0EZPiTWWg@mail.gmail.com>
You've lost me there - the example they give on schema.org for RDFa is
less verbose than the microdata one, and could be made even less so.
http://schema.org/docs/datamodel.html
What costs are you talking about being incurred?  Microdata just looks like
RDFa with a couple of renames, explicit item scope, and support for prefixes
removed.
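
One rough way to check the verbosity point is to measure the byte size of two equivalent snippets. This is only a sketch under stated assumptions: the markup below is made up for illustration (it is not the example from the schema.org page), the RDFa version assumes the RDFa 1.1 `vocab` attribute, and `byte_size` is a hypothetical helper.

```python
# Hypothetical markup for illustration only; NOT the example from schema.org.
MICRODATA = (
    '<div itemscope itemtype="http://schema.org/Person">\n'
    '  <span itemprop="name">Jane Doe</span>\n'
    '</div>\n'
)

# The same data in RDFa, assuming the RDFa 1.1 `vocab` attribute is available.
RDFA = (
    '<div vocab="http://schema.org/" typeof="Person">\n'
    '  <span property="name">Jane Doe</span>\n'
    '</div>\n'
)


def byte_size(markup: str) -> int:
    """Return the size of the markup in bytes when encoded as UTF-8."""
    return len(markup.encode("utf-8"))


if __name__ == "__main__":
    for label, markup in (("microdata", MICRODATA), ("RDFa", RDFA)):
        print(f"{label}: {byte_size(markup)} bytes")
```

For a snippet this small the two sizes differ by only a few bytes; any real per-page cost would have to be measured on the actual markup being deployed.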
-Leif Warner

On Fri, Jun 17, 2011 at 12:52 AM, Steve Harris <steve.harris@garlik.com> wrote:

> I'm sure that some of these points were relevant at some level, but I
> suspect that's not the key reason.
>
> At some point, the team working on the internal project would have to go to
> the divisional CTO and/or CIO in charge of operations and ask permission to
> deploy the code on the production systems. They don't give a damn how
> interesting the technology is, just want to know how much it's going to cost
> in bps of bandwidth, bytes of storage, and microseconds of CPU per page. The
> answer for RDFa is probably an order of magnitude higher than the
> schema.org format, and could equate to tens of millions of dollars per
> year of extra cost, and will show little to no extra revenue (schema.org
> vs RDFa), even in the medium term. No chance.
>
> - Steve
>
> On 2011-06-17, at 01:02, Mischa Tuffield wrote:
>
> Hello,
>
> *excuse a little top-posting before comments coming inline ...
>
> Great email Harry, I agree with your sentiment that schema.org shouldn't
> be perceived as a massive threat to the SW community. If anything, I
> welcome the move; surely it will widen the audience of web-developers
> interested in creating and authoring structured data for the web? A lot of
> people write code, and work for companies who are heavily reliant on
> pleasing Search Engines - SEO is big business. Let users get on with
> building stuff with microdata/schema.org, and who knows they might even
> come round to using the various W3C SW specs when they find their needs
> change, when they find they want to interoperate with data whose primary
> focus isn't for human consumption or SEO.
>
> RDF satisfies more than one use-case; it is more than an SEO tool.
> Personally, I make daily use of RDF, HTTP, SPARQL (to name a few) within the
> software platform we have built at Garlik (note that I have been too lazy to
> use another email address) and it makes sense to us as a business, as we
> benefit from developing software without being constrained by a database
> schema in a relational database and we can pull in data arbitrarily. In
> summary, RDF via GoodRelations in RDFa has shown that the work has made an
> impact in the world of Search Engines, RDF/SPARQL is being used to power
> applications in a number of companies big and small, RDF is being output
> by major commercial sales houses, non-computer scientists are using it to
> represent their scientific data, governments are using it in the shape of
> linked data/SPARQL, this is all good stuff ... more than one use-case -
> fundamentally ingrained with the notion of interoperability and the
> standardised representation of data (awesome stuff!).
>
> I am not trying to have a dig here about microdata or schema.org, or the
> technology stack which builds on the aforementioned, I simply don't know
> enough about it to comment. I do know that the SW technology stack is
> growing strong though, and it is an open technology stack - being an
> optimist I feel that open stuff will prevail.
>
> <snip itemtype="http://example.com/Annotation"/>
> <!-- hehe -->
>
> On 16 Jun 2011, at 22:09, Harry Halpin wrote:
>
> I've been watching the community response to schema.org for the last
> bit of time. Overall, I think we should clarify why people are upset.
> First, there should be no reason to be upset that the major search
> engines went off and created their own vocabularies. According to the
> argument of decentralized extensibility, schema.org is *exactly* what
> Google/Yahoo!/Microsoft are supposed to be doing. It's a
> straightforward site that clearly shows how the average Web developer can
> use structured data in markup to solve real-world use-cases and
> provides examples.  That's the entire vision of the Semantic Web, let
> a thousand ontologies bloom with no central control.
>
>
> Indeed, I do feel that schema.org has been very explicit about how people
> with the given use-case can use their work to solve a real-world problem.
> Many people make work out of getting their employer some awesome search
> engine love. I went to a news related metadata talk (an rNews one -
> fantastic work by the way), and chatting to people from their industry I
> noticed how important it was to them. The use-case seemed to boil down to a
> standard way to annotate news stories/documents to please search engines to
> push eyeballs their way... this is great but I am convinced it is not the
> only contribution the SW tech stack has to give to the world. I recall
> someone had stats re: numbers of webpages vs numbers of rows in databases in
> the world...
>
>
> The reason people are upset is that schema.org didn't use RDFa, but instead
> used microdata. One *cannot* argue that Google is ignoring open
> standards. RDFa and microdata are *both* Last Call W3C Working Drafts
> now. RDFa 1.0 is a spec but only for XHTML 1.0, which is not what most
> of the Web uses. Microdata does have RDF parsing bugs, but again, most
> developers outside the Semantic Web probably don't care - they want
> JSON anyways.
>
> From what I understand from events where the Rich Snippets team has
> presented, it is that RDFa is simply too complicated for ordinary web
> developers to use. Google has been deploying Rich Snippets for two
> years, claims to have user studies, and has experience with a large
> user-base. This user-driven feedback should be taken on board by both
> relevant WGs obviously, HTML and RDFa. Designing technology without
> user-feedback leads to odd results (for proof, see many of the fun and
> exciting "httpRange-14" discussions). Which is also why many
> practical developers do not use the technology.
>
> But realistically, it's not the RDFa WG's job to do user-studies and
> build compelling user-experiences in products. They are only a few
> people. Why have the *hundreds* of people in the Semantic Web community
> not done such work?
>
>
> I think it is probably due to the fact that no one in the Semantic Web
> community runs a search engine!
>
>
> The fact of the matter is that the Semantic Web academic community has
> had their priorities skewed in the wrong direction. Had folks been
> spending time doing usability testing and focussing on user-feedback
> on common problems (such as the rather obvious "vocabulary hosting"
> problem) rather than focussing on things with little to no support
> with the world outside academia, then we probably would not be in the
> situation we are in today. Today, major companies such as Microsoft
> (OData) and Google (microdata) are jumping on the "open data"
> bandwagon but finding the RDF stack unacceptable. Some of it may be a
> "not invented here" syndrome, but as anyone who has actually looked at
> RDF/XML can tell you, some of it is hard-to-deny technical reasoning
> by companies that have decided that "open data" is a great market but
> do not agree with the technical choices made by the Semantic Web
> stack.
>
>
> Here is where I am not sure I 100% agree with you. Lots of good work has
> come out of academia, user-studies are one thing, and agreed UX hasn't been
> a forte in our community - but I don't think this was the problem. I
> personally don't imagine that schema.org was designed like it is due to
> the fact that they have noticed our community bang on about that number 14
> for so long. I think you hinted at what the real issue was above...
>
> A lot of the SW tech stack I follow has, both in the past and at present,
> enjoyed tremendous academic support. For one, Garlik (where I work) has a
> core technology team from Southampton Uni, mostly from AKT (when I was
> ickle [1] <-- lots of familiar faces in there), an EPSRC (UK funding thing)
> project which set out to build SW tech. It worked well, and there are
> plenty of others out there to see too, I am sure.
>
> So, my disagreement goes: yes, it could be seen that none of the search
> engines have found the RDF stack acceptable (RDFa GR seems to have struck a
> good chord), but lots of other people have, i.e. not everyone is trying to
> tackle the problem of web-search. And the big search engines all have their
> priorities and none of them boil down to sharing data. Academic output
> hasn't been focused on UI and UX in the SW field, but it has led to the
> solid, open set of standards which lots and lots of people are building on
> top of - let's not forget how much XMP there is in the world. I don't think
> it is the Search Engines using their vast usability expertise to design a
> standard for representing generic data, this is not their core business,
> they built something which would suit their use-case: making it easy for
> web-developers (probably with HTML/CSS/JS/UI skills) to add in metadata to
> their pages, so that the search engines can best serve their users.
>
>
> This is not to say good things can't come out of the academic
> community - the *internet* came out of the academic community. But
> seriously, at some point (think of the role of Netscape in getting the
> Web going with the magic of images) commercial companies enter the
> game. We should be happy now search engines are seeing value in
> structured data on the Web.
>
>
> Yes, and trust that our technology stack is built on solid foundations, has
> a great vision, and is being built by lots of lovely people, and companies
> have been involved for a while (he says...)
>
>
> I would suggest the Semantic Web community take on board the
> "microdata" challenge in several different ways. First of all, start
> focussing on user-studies and user experience (not just visual
> interfaces, the Semantic Web has more than its share of user-hostile
> visual interfaces). It's harder to publish academic papers on these
> topics but possible (see SIGCHI), and would help a lot with actual
> deployment. Second, we should start focussing more on actual empirical
> data-driven feedback, both on what parts of RDF are being used and
> common mistakes. With indexes such as the Billion Triple Challenge and
> Sindice's index, we can actually do that with the Semantic Web. Third,
> why not actually try to get RDF - or "open data" more broadly - into the
> browser in a usable manner? Tabulator may be a step in the right
> direction, but the user experience needs work. Fourth, why not start a
> company and try to deliver products to actual end-users and give that
> feedback to the wider community and W3C WGs (and if you already work
> for an actual SemWeb company, please send your feedback from user
> studies to the WG before Last Call)? I believe the Semantic Web
> research community - which still has tons of funding and lots of
> passion - can make the Web better.
>
>
> Use-cases are the key, and I am sure there are plenty of them kicking about
> as otherwise there wouldn't be so many people working so hard to ensure we
> have this open-technology stack in place.
>
> Indeed Harry, you are making the Web better, I know it, good on you! But so
> is the rest of the SW community; if anything I have enjoyed seeing how
> passionate people are about open standards.
>
> Good night all,
>
> Mischa
>
> P.S. All views posted here are of my own personal opinion.
>
> [1] http://www.aktors.org/people/students/
>
>
>
> Schema.org is not a threat. It's an opportunity to step up. Good luck
> everyone!
>
>           cheers,
>              harry
>
> P.S.: Note these opinions are purely personal and held as an individual.
>
>
>
> --
> Steve Harris, CTO, Garlik Limited
> 1-3 Halford Road, Richmond, TW10 6AW, UK
> +44 20 8439 8203  http://www.garlik.com/
> Registered in England and Wales 535 7233 VAT # 849 0517 11
> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
>
>
Received on Friday, 17 June 2011 17:52:46 UTC