- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Sat, 16 May 2009 13:04:12 -0500
2009/5/16 Laurens Holst <laurens.nospam at grauw.nl>:
> Tab Atkins Jr. wrote:
>> Once you remove discovery as a strong requirement, then you remove the
>> need for large urls, and that removes the need for CURIEs, or any
>> other form of prefixing. You still want to uniquify your identifiers
>> to avoid accidental clashes, but that's not that hard, nor is it
>> absolutely necessary. The system can be robust and usable even with a
>> bit of potential ambiguity if small authors design their private
>> vocabs badly. As a bonus, everything gets simpler. Essentially it
>> devolves into something relatively close to Ian's microdata proposal,
>> perhaps with datatype added in (though I do question how necessary
>> that is, given a half-intelligent parser can recognize things as
>> numbers or dates).
>
> Ho, ho, you're making a big leap there! By me explaining that dereferenceable
> URIs are not needed to make RDF work on a core level, which makes RDF
> robust, do not jump to the conclusion that it is of no benefit! URIs are
> there for the benefit of linking, and help discoverability a lot (just like
> HTML hyperlinks do). Spidering the semantic web in a follow-your-nose style
> is effective. Incidentally, if an ontology disappears from its original
> address, this kind of spidering will likely lead you to a copy thereof
> stored elsewhere. For example on a different spider which has the triples
> cached.

You had just stated in the previous email, however, that few (if any)
major consumers of RDFa *use* what is located on the far end of the
URI. If they're not even paying attention to it, where is the value in
it?

I don't really understand the 'discoverability' argument here, at
least in the context of it being similar to HTML hyperlinks.
Hyperlinks are useful for people because they make it simple to
navigate to a new page. You just click and it works, no need to
copypasta the address into a new browser window.

I'm also not sure how a rotted link helps you compare vocabularies
with other spiders, which in a hypothetical world you are
communicating with (at this point we're *far* into theory, not
practice). Any uniquifier would allow you to compare things in the
same way, no?

> You are now only considering the ontologies, that is, types and properties.
> You're forgetting (or ignoring) that in RDF, objects are also named with
> URIs so that data at other locations can refer to it. You know, that "web of
> linked data" people refer to, core principle of RDF. No "simple" scheme
> based on what Ian proposed can provide a sufficient level of uniqueness for
> that. URIs are the best and most natural fit for use as web-scale
> identifiers.

Define 'sufficient', as used here. I believe that this is an area
where absolute uniqueness is not a requirement. Worst case, you get a
little bit of data pollution with weird triples being produced by
badly-written pages. Perhaps your browser offers to add an event to
your calendar when no event shows up on the page, or a fraction of a
search engine's microdata collection is spurious. Neither of these is
a big deal.

That being said, I agree that URIs provide a very convenient source of
uniqueness. Ian's microdata allows them to be used either in normal
form or in reverse-domain form; either way provides the necessary
uniqueness.

> And then there is of course also the thing that there is already an existing
> framework, which has already been here for a long time, has had a lot of
> clever people work on it and is gaining in popularity, and here we have
> "HTML5" wanting to reinvent the wheel and making an entirely new framework
> "just for them". You'd think that of all places, in a standards body people
> would be compelled to adopt existing standards :).

There are compelling reasons to make any proposal *compatible* with
RDF at the least. Ian's microdata does this, though not
perfectly/completely.

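As an aside on the reverse-domain form mentioned above, here is a minimal Python sketch of how a URI plus a local name could be turned into a dotted identifier of the `org.w3.name` kind used in the examples later in this message. The function name `reverse_domain_name` and the rule of dropping a leading `www` label are assumptions made for illustration; they are not taken from Ian's proposal.

```python
from urllib.parse import urlsplit


def reverse_domain_name(base_uri, local_name):
    """Build a reverse-domain identifier from a URI's host and a local name.

    Hypothetical helper: the "drop a leading www" rule is an assumption.
    """
    host = urlsplit(base_uri).hostname or ""
    labels = [label for label in host.split(".") if label]
    if labels and labels[0] == "www":
        labels = labels[1:]
    # Reverse the host labels, then append the local name.
    return ".".join(labels[::-1] + [local_name])


print(reverse_domain_name("http://www.w3.org/", "name"))
```

Either spelling carries the same uniqueness, since both are derived from the same domain name.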
I've said in another thread that I dislike *all* of the inline
microdata proposals. RDFa sucks, Ian's microdata sucks, they all
suck. They force structure completely inline, which solves what I
feel is a minority issue (carrying microdata while copypasting source
code) while introducing several larger downsides (carrying possibly
*incorrect* microdata while copypasting source, duplication of meta
structure when there is a regular page structure that can obviate
this, etc.). These are the exact same problems that inline event
handlers or inline @style attributes have.

I think Ian is trying to limit the suckiness by at least making it as
simple as possible to write. It's probably half as difficult or less
to write properly, while solving 90% or more of the cases that RDFa
does. This is an effort that I'm in favor of.

I won't be using RDF in my pages at all unless I know that I can use
something like RDF-EASE or CRDF; they allow me to just write my page
as normal, then specify what the page's data means in a separate
file. Plus, honestly, CRDF's inline syntax seems just as expressive
as microdata and RDFa, while being easier to write than either of
them.

For example, taking an example from Ian's proposal (I know that some
of the names are slightly out of date now):

<div item="org.w3.spec">
  <h1 property="org.w3.name">HTML5</h1>
  <a property="org.w3.url" href="/TR/html5">Current Version</a>
  <div property="org.w3.status" item>
    <p>Level: <span property="org.w3.level">WD</span>
    <p>Date: <span property="org.w3.pubdate">03/02/2009</span>
    <p>Deadline: <span property="org.w3.deadline">02/03/2009</span>
  </div>
  <p>Working Group: <span property="org.w3.wg">HTMLWG</span>
</div>

This can be written using inline CRDF as:

<script type="text/crdf">
  @namespace w3 http://www.w3.org/;
</script>
<div crdf="@|subject; @|typeof: w3|item">
  <h1 crdf="w3|name">HTML5</h1>
  <a crdf="w3|url:attr(href)" href="/TR/html5">Current Version</a>
  <div crdf="@|subject; @|typeof: w3|status">
    <p>Level: <span crdf="w3|level">WD</span></p>
    <p>Date: <span crdf="w3|pubdate">03/02/2009</span></p>
    <p>Deadline: <span crdf="w3|deadline">02/03/2009</span></p>
  </div>
  <p>Working Group: <span crdf="w3|wg">HTMLWG</span></p>
</div>

I believe this communicates everything necessary for an RDF
serialization of the content, but in a somewhat more concise manner
than Ian's microdata and in a *much* more easily understandable
manner than RDFa.

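To make the comparison concrete, here is a rough Python sketch of how triples might be pulled out of inline `crdf` attributes like the ones above. It is a toy reading of this one example, not an implementation of the CRDF draft: the class `InlineCrdfSketch`, the helper `extract_triples`, the blank-node labels (`_:b1`, ...), the `rdf:type` shorthand, and the choice to leave a nested subject unlinked from its parent are all assumptions.

```python
from html.parser import HTMLParser


class InlineCrdfSketch(HTMLParser):
    """Toy reader for crdf="" attributes; not an implementation of the CRDF draft."""

    def __init__(self, namespaces):
        super().__init__()
        self.namespaces = namespaces  # prefix -> base URI (passed in, not parsed)
        self.triples = []             # collected (subject, predicate, object) tuples
        self.subjects = ["_:page"]    # stack of current subjects (assumed default)
        self.open = []                # open elements: [tag, pushed_subject, predicate, text]
        self.count = 0

    def expand(self, name):
        # "w3|name" -> "http://www.w3.org/name"
        prefix, _, local = name.strip().partition("|")
        return self.namespaces[prefix] + local

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        pushed, predicate = False, None
        for decl in (attrs.get("crdf") or "").split(";"):
            name, _, value = decl.partition(":")
            name, value = name.strip(), value.strip()
            if not name:
                continue
            if name == "@|subject":
                # Start a new blank-node subject for this element's subtree.
                self.count += 1
                self.subjects.append(f"_:b{self.count}")
                pushed = True
            elif name == "@|typeof":
                self.triples.append((self.subjects[-1], "rdf:type", self.expand(value)))
            elif value.startswith("attr(") and value.endswith(")"):
                # Property whose value comes from an attribute, e.g. attr(href).
                self.triples.append((self.subjects[-1], self.expand(name), attrs.get(value[5:-1].strip())))
            else:
                # Property whose value is the element's text content.
                predicate = self.expand(name)
        self.open.append([tag, pushed, predicate, []])

    def handle_data(self, data):
        for entry in self.open:
            if entry[2]:
                entry[3].append(data)

    def handle_endtag(self, tag):
        if tag not in [entry[0] for entry in self.open]:
            return  # stray end tag, ignore
        # Pop until the matching element; tolerates unclosed children.
        while self.open:
            open_tag, pushed, predicate, text = self.open.pop()
            if predicate:
                self.triples.append((self.subjects[-1], predicate, "".join(text).strip()))
            if pushed:
                self.subjects.pop()
            if open_tag == tag:
                break


def extract_triples(html, namespaces):
    parser = InlineCrdfSketch(namespaces)
    parser.feed(html)
    parser.close()
    return parser.triples
```

Called as `extract_triples(markup, {"w3": "http://www.w3.org/"})` on the inline CRDF markup above, this should give one type triple per `@|subject` element plus one triple per property. The namespace map is passed in by hand because the sketch does not parse the `@namespace` rule.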
And for fun, the same thing in standard CRDF:

<div class="item">
  <h1>HTML5</h1>
  <a href="/TR/html5">Current Version</a>
  <div class="status">
    <p class="level">Level: <span>WD</span></p>
    <p class="pubdate">Date: <span>03/02/2009</span></p>
    <p class="deadline">Deadline: <span>02/03/2009</span></p>
  </div>
  <p class="wg">Working Group: <span>HTMLWG</span></p>
</div>

<script type="text/crdf">
  @namespace w3 http://www.w3.org/;
  .item { @|subject; @|typeof: w3|item; }
  .item h1 { w3|name; }
  .item h1 + a { w3|url: attr(href); }
  .item .status { @|subject; @|typeof: w3|status; }
  .item .status .level span { w3|level; }
  .item .status .pubdate span { w3|pubdate; }
  .item .status .deadline span { w3|deadline; }
  .item .wg span { w3|wg; }
</script>

Obviously quite a bit longer in screen inches, but you see the same
thing when comparing a single instance of inline @style to the
equivalent CSS. The code *looks* clean, though - it looks like HTML
*should* look.

This especially shines when you realize that it allows you to extract
triples from multiple items on a single page and across an entire
site just by adding a few classes (possibly useful for styling
anyway) and then including this crdf file. (It would normally be
<link>ed in, rather than written in a <script> tag.) If we're sure
that we can rely on a very specific structure, we don't even need
most of the classes - we can just use positional selectors instead.

~TJ
Received on Saturday, 16 May 2009 11:04:12 UTC