- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Sun, 15 Feb 2015 14:48:32 -0500
- To: public-vocabs@w3.org
- Message-ID: <54E0F810.5010403@openlinksw.com>
On 2/15/15 12:19 PM, Dan Brickley wrote:
> On 15 February 2015 at 14:47, Bo Ferri <zazi@smiy.org> wrote:
>
>> (sorry, I can't resist ;) )
>>
>> interesting and neat idea. Nevertheless nothing new at all (you know this!
>> better than me ;) ). So how does this relate to the already existing, open
>> approach of "simply" publishing a(n) ontology/vocabulary with a PURL and
>> make use of it. Do we really need everything under the (cooperate (?))
>> schema.org umbrella? You know* that the "one vocabulary rule them all"
>> approach (even with extension mechanism) doesn't scale and couldn't make any
>> domain (and webmaster who should apply it) happy (this is the world out
>> there).
> It would be a great thing if there were 100s or 1000s of RDF-based
> vocabularies out there, with lots of publishers and consumers making
> use of them all.

There are, as exemplified by the likes of LOV [1] and lots of "deep web" content constructed using RDF (which includes all the documents constructed using terms from schema.org plus those from the massive Linked Open Data cloud).

Did you not really mean to say: vocabularies used *specifically* by Web Masters and HTML+Javascript developers?

> Schema.org is a means to an end rather than an end in
> itself - a practical project to help bootstrap this whole thing out of
> the slow motion progress we've been making these last ~17+ years.

It's an addition to the mix. A very good one at that, with very good results. As you know, we have to put things into perspective i.e., the Web isn't a "zero sum" affair.

Over the last 17 years a lot has been contributed, and the visibility of these contributions depends very much on the context-lenses through which they are viewed. For instance, from my vantage point, we have:

1. Massive Linked Open Data cloud -- oriented towards those that publish and consume so-called 5-Star Linked Open Data.
2. Massive Linked Data cloud -- which adds content from Web Masters and HTML+Javascript developers to the Linked Open Data Cloud.

Net effect, we have a massive Web of Linked Data comprised of relations that have varying degrees of semantic fidelity i.e., the degree to which different kinds of user agents (human and/or machines) are able to comprehend the nature of relations (associations, attributes, properties) connecting two things -- where each relationship participant, including the relationship type (relation) itself, is identified by an HTTP URI.

>
> What we in the RDF community have seen since work began on
> http://www.w3.org/TR/rdf-schema/ in 1997, is that while it is great to
> have the option of an entirely decentralized composition mechanism,
> there are also very practical costs for vocabularies being so weakly
> coordinated.

Yes!

>
> Triples/graphs are not the easiest things to work with at the best of
> times.

When they aren't understood, since "graphs" and "triples" are colloquialisms that should be at the backdoor of this narrative. We have sentences (content) and documents being enhanced via the use of hyperlinks. The fact that one can use a graph to represent the nature of a sentence and/or a set of sentences that share a common predicate is completely lost in this colloquial use of "graph" and the implicit triangulation to "graph theory".

Just as bad is the "strings for things" slogan that sounds nice but really makes little or no sense, since the real issue is all about moving from identifying entities using string identifiers (which can only be interpreted in some kind of silo) to reference identifiers (which by way of HTTP based hyperlinks can be interpreted globally via the ubiquitous World Wide Web).

The real minus of the last 17 years of RDF is the fact that the first 5-10 years were built around very poor narratives. On a good day, to most, you had draconian gobbledygook (sorry, but I have no other word choice here) thanks to RDF/XML.
This was exacerbated by the time it took to move from RDF/XML solely, to the notion of varied notations for creating RDF document content (approximately 13 years of self-inflicted wounds on the marketing and messaging fronts).

> The data model is so permissively flexible that creating
> applications against it is difficult.

"Data Model" is part of the problem. RDF is better understood as a Language [1]. The "Data Model" notion comes from a realm (i.e., SQL RDBMS) that has its own problems (conceptually and technically) which are now bubbling to the surface.

You can't do anything with something you don't understand. RDF was well designed but atrociously described and promoted by the W3C.

> These difficulties were in some
> situations (e.g. web search, schema.org's origins) made worse by the
> chaotic state of the vocabulary environment.

The chaos comes from the confusion that swirls around so-called "data model" and "syntaxes". If one speaks about RDF as a Language i.e., a system of signs, syntax, and role semantics, for encoding and decoding information [data in context], the artificial confusion dissipates [2][3].

>
> The schema.org extensions discussion I think makes clear that there is
> a spectrum here. At one extreme are vocabularies are developed without
> any communication or coordination whatsoever. Less extreme is for some
> weak coordination and linking between vocabularies, e.g. foaf:focus is
> defined in terms of skos:Concept
> (http://xmlns.com/foaf/spec/#term_focus) or linked data vocabularies
> that relate their terms to others with sub/supertype, equivalence etc
> relationships, even if the designs are essentially independent. At the
> other extreme would be a single vocabulary that attempted to model
> everything in a monolithic way. It is important to understand that
> schema.org is not so rigid.

Correct, it isn't rigid, but once again there's a meme related problem, just as there was in the early days of RDF.
Bo's concerns (I believe) have more to do with interpretation of the extensions narrative. I think it needs subtle tweaks along the lines of making its goals clearer, bearing in mind that Guha (and you) do actually support the notion of a generic and loosely coupled vocabulary with broad appeal. One that's of practical use to Web Masters and HTML+Javascript developers.

Bearing in mind my comments above, I think Bo's concerns can be addressed by way of intention clarification i.e., schema.org can be extended in a variety of ways rather than one way:

1. Using the approach in Guha's post -- oriented towards Web Masters and HTML+Javascript developers
2. Using relations -- an approach natural to the Web but predominantly practiced by developers and publishers of Linked Open Data, at the current time.

At OpenLink we practice #2; it doesn't require permission from anyone or consensus with everyone, we just get on with it [4][5][6].

> Schema.org is by practical necessity very
> pragmatic, and e.g. supports for example both library-oriented and
> bookshop-like ways of describing books.

And so are other endeavors. There is nothing uniquely pragmatic about Schema.org relative to RDF based Linked Open Data in general. What you have is the combined market might of Google, Microsoft, Yahoo!, and Yandex and a captive SEO community (comprised of Web Masters and HTML+Javascript developers). A good thing!

Seriously, I wouldn't claim "pragmatic" (in a generic sense) as the distinguishing characteristic (attribute, property etc.) here, relative to other RDF related endeavors. As stated above, I see an endeavor that targets a captive audience, which of course is a form of pragmatism, but not one that implies other efforts are purely theoretical etc.
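
As a rough illustration of approach #2 above (extending schema.org using relations), here is a minimal Python sketch. It is not OpenLink's tooling: the `http://example.org/...` identifiers and the `GraphicNovel` class are hypothetical placeholders, and the tiny subclass walk stands in for what a real RDF reasoner would do. Sentences are modelled as plain (subject, predicate, object) tuples of HTTP URIs.

```python
# Sketch only: extending schema.org "by relation", without asking anyone's permission.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

SCHEMA = "http://schema.org/"
EX = "http://example.org/vocab#"  # hypothetical, independently published vocabulary

# Each sentence is a (subject, predicate, object) tuple of HTTP URIs.
sentences = {
    # The extension: one relation links the new term to schema.org.
    (EX + "GraphicNovel", RDFS_SUBCLASS_OF, SCHEMA + "Book"),
    # Already stated by schema.org itself.
    (SCHEMA + "Book", RDFS_SUBCLASS_OF, SCHEMA + "CreativeWork"),
    # A publisher describes something using only the extension term.
    ("http://example.org/items/1#this", RDF_TYPE, EX + "GraphicNovel"),
}


def superclasses(cls, graph):
    """Follow rdfs:subClassOf relations transitively from cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(entity, graph):
    """Return the stated types of entity plus those implied by subclass relations."""
    direct = {o for s, p, o in graph if s == entity and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, graph)
    return inferred
```

A consumer that only understands schema.org terms can call `types_of("http://example.org/items/1#this", sentences)` and still find `http://schema.org/Book` among the results, even though the publisher never used that term directly.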
> The pieces of schema.org that
> came from Good Relations have some ways of talking about mass produced
> objects via prototypes (http://schema.org/ProductModel) which could
> also be applied to books, since books are mass produced; but we also
> have adopted a use of isPartOf as an alternate model for addressing
> FRBR-like use cases, without the rigidity of FRBR. The details don't
> matter here - my point is that even within a single large vocabulary
> you naturally have a kind of pluralism.

Yes, and that's what needs a little more emphasis in communications about schema.org usage and strategic goals.

> Any vocabulary at schema.org's
> scale will have situations where there are several ways of saying the
> same thing, depending on perspective and context.

That's even the case with small vocabularies. In short, always the case with Language.

>
> The gap that the extension model fills is between the relative chaos
> of very loosely coupled linked data vocabularies (independent designs,
> documentation, versioning, modeling styles) -vs- the relatively highly
> integrated approach of core schema.org. We want a bit more chaos than
> core schema.org but a lot less chaos than the total free-for-all of
> the classic Semantic Web.

Why not the classic World Wide Web? That broadly used public HTTP network isn't devoid of relations and semantics that aid understanding the nature of the relations that make up its tapestry.

I see the gap being addressed as one that goes beyond basic search engine discoverability, enabling Web Masters and HTML+Javascript developers to gradually encode and consume schema.org content associated with more specialist domains.

Schema.org addresses the needs of a community that wasn't optimally served by the generic Semantic Web meme. A lot of that (as already stated) has all to do with the incentives that arise naturally from the visible support of Google, Yandex, Yahoo!, and Microsoft (via Bing!).
That's massive, and it negates the prescriptive specification problem that's dogged RDF from the onset.

Ironically, if RDF was correctly pitched as a formalization of what was already in use, we would have reduced 17 years to something like 5, no kidding! For instance, imagine if <link/> and "Link:" had been incorporated into the RDF narrative as existing notations for representing entity relations? Basically, Web Masters, HTML+Javascript developers, and the Microformats (now IndieWeb) folks would have been far less confused and resistant to RDF -- especially as it would have prevented the massive RDF/XML blob of confusion that ultimately obscured everything.

> For those who would rather pick and choose
> from the entire range of diverse vocabularies, the LOV project (see
> lov.okfn.org) offers a very useful directory.

Exactly!

> For those who want a bit
> more consistency in terms of documentation / navigation / usage
> examples, a common underlying core, and richer domain-oriented
> extensions, we have schema.org and its approaches to extension as
> discussed here.

I don't really agree with that characterization. It's unnecessarily pejorative about alternatives to schema.org prescriptions. Again, all of these initiatives are pieces of a massive puzzle, so we have to be able to communicate about these pieces without knocking other complementary parts.

>
> Schema.org is an exploration of the idea that we'll get further,
> faster by sharing a substantially sized common vocabulary as well as
> underlying graph data model.

To me it's a demonstration of what happens when a specification is backed by key industry players. Basically, rather than saying "Hey! You over there, you MUST work this way, just because we say so .." we have an approach that targets a massive audience (Web Masters and HTML+Javascript Developers) with in-built incentives i.e., they all want to optimize their content for the search engine technologies from Google, Microsoft, Yahoo!, and Yandex.
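
Returning to the <link/> and "Link:" remark above: a minimal Python sketch of reading an HTTP "Link:" header value as subject-predicate-object sentences. Everything here is illustrative. The function name, the `example.org` URIs, and the simplified regular expression (which only handles a `rel` parameter that directly follows the target) are assumptions of the sketch, and mapping a registered relation name onto a URI by prepending the IANA relations prefix is a convention borrowed from Atom rather than anything the header itself mandates.

```python
import re

# Convention borrowed from Atom for turning registered names into URIs (an assumption here).
IANA_REL = "http://www.iana.org/assignments/relation/"

# Simplified pattern: <target>; rel="names" -- not a full Web Linking parser.
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def link_header_to_sentences(context_uri, header_value):
    """Read each link in a Link header value as (subject, predicate, object)."""
    sentences = []
    for target, rels in LINK_RE.findall(header_value):
        for rel in rels.split():
            # A rel value that is already a URI is used as-is; a bare name gets the prefix.
            predicate = rel if "://" in rel else IANA_REL + rel
            sentences.append((context_uri, predicate, target))
    return sentences
```

For example, calling `link_header_to_sentences("http://example.org/doc", '<http://example.org/people/alice#this>; rel="author"')` yields a single sentence whose subject is the document, whose predicate is the "author" relation, and whose object is the linked entity.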
> But it remains a part of that larger
> RDF-based framework and can be freely mixed with independently managed
> vocabularies.

Amen!!

Links:

[1] http://lov.okfn.org/dataset/lov/vocabs/schema -- Looking at schema.org via LOV's context-lenses
[2] http://www.slideshare.net/kidehen/understanding-29894555/55 -- Natural Language & Data
[3] http://www.jfsowa.com/pubs/fflogic.htm -- Fads and Fallacies about Logic
[4] http://www.openlinksw.com/data/turtle/ -- OpenLink Ontologies collection
[5] http://kidehen.blogspot.com/2015/01/social-networking-profiles-for-everyone.html -- Social Network Profile Publishing for Everyone
[6] http://kidehen.blogspot.com/2015/01/review-publishing-for-everyone.html -- Review Publishing for Everyone
[7] http://kidehen.blogspot.com/2014/02/class-equivalence-based-reasoning.html -- Class Equivalence Inference & Reasoning that leverages Schema.org.

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
Received on Sunday, 15 February 2015 19:49:01 UTC