- From: Laurens Holst <laurens.nospam@grauw.nl>
- Date: Sun, 17 May 2009 01:58:58 +0200
Tab Atkins Jr. wrote:
>> Ho, ho, you're making a big leap there! By me explaining that dereferencible
>> URIs are not needed to make RDF work on a core level, which makes RDF
>> robust, do not jump to the conclusion that it is of no benefit! URIs are
>> there for the benefit of linking, and help discoverability a lot (just like
>> HTML hyperlinks do). Spidering the semantic web in a follow-your-nose style
>> is effective. Incidentally, if an ontology disappears from its original
>> address, this kind of spidering will likely lead you to a copy thereof
>> stored elsewhere. For example on a different spider which has the triples
>> cached.
>>
>
> You had just stated in the previous email, however, that few (if any)
> major consumers of RDFa *use* what is located on the far end of the
> URI. If they're not even paying attention to it, where is the value
> in it?
>

I said that the ontologies were not used by many RDF consumers. This is
because they can be computationally expensive, especially for large data
sets, not because they are useless.

I think the clearest way I can put this is by comparison: your argument
is like arguing against XML or JSON Schemas, concluding that because they
are externally referenced and not used by most XML or JSON applications,
they are useless, and in fact that XML and JSON themselves are useless.
This is clearly false; removing a reference to a schema from a document,
or a document not having a schema, does not make the document itself
useless, nor the document format it is expressed in.

Although RDF Schema and OWL are definitely part of the "RDF ecosystem",
they are built on top of the base RDF framework and they are not in
themselves required for RDF to function. However, the schema does provide
a useful description of the document structures and has the ability to
express certain semantics, and is thus a worthy technology in its own
right.

> I don't really understand the 'discoverability' argument here, at
> least in the context of it being similar to HTML hyperlinks.
> Hyperlinks are useful for people because they make it simple to
> navigate to a new page. You just click and it works, no need to
> copypasta the address into a new browser window.
>

By what means the user dereferences the link is not relevant. What
matters is that a URI is there, identifying a unique location on the
World Wide Web, and thus contributing to the web of linked documents
that we call the World Wide Web. Without links and URIs, there would be
no "web". There would be a big set of networked yet isolated computers
that each live in their own walled garden.

Links provide discoverability of data provided elsewhere, by indicating
a location. Users can find other documents because of this. Search
engines like Google can spider the web based on this. The Web of Linked
Data is Tim Berners-Lee's vision of a WWW for data.

> I'm also not sure how a rotted link helps you compare vocabularies
> with other spiders, which in a hypothetical world you are
> communicating with (at this point we're *far* into theory, not
> practice). Any uniquifier would allow you to compare things in the
> same way, no?
>

Just a simple rdfs:seeAlso statement referencing it in one single place
will allow a spider to "follow its own nose" and find the triples of the
ontology in the republished location. This republication can be
anywhere: a new ontology location, or a copy cached by another spider
that republishes the triples it harvests on the web (such as
archive.org [1]).

I agree we're getting far into the theory-not-practice realm, which is
why Shelley is right in saying that in practice vocabularies are served
from a location that is well cared for, e.g. using services like purl to
provide permanent URLs, or having a solid organisational backing, and
Philip Taylor's list [2] does not do much to discredit this.

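As a rough illustration of the rdfs:seeAlso point, here is a minimal sketch of a follow-your-nose spider. Everything in it is an assumption made for the example: the FAKE_WEB dictionary stands in for real HTTP fetching, and the example.org and example.net URLs and the follow_your_nose function are hypothetical names, not part of any RDF library.

```python
from collections import deque

RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical stand-in for the web: document URL -> triples found there.
# "http://example.org/onto" is deliberately absent: the original has rotted.
FAKE_WEB = {
    "http://example.org/data": [
        ("http://example.org/data#me",
         "http://example.org/onto#plays",
         "http://example.org/data#game"),
        ("http://example.org/data#me",
         RDFS_SEE_ALSO,
         "http://mirror.example.net/onto-copy"),
    ],
    "http://mirror.example.net/onto-copy": [
        ("http://example.org/onto#plays",
         RDF_NS + "type",
         RDF_NS + "Property"),
    ],
}


def fetch(url):
    """Return the triples published at url, or nothing if it does not resolve."""
    return FAKE_WEB.get(url, [])


def follow_your_nose(start_url, fetch=fetch, limit=100):
    """Breadth-first harvest of triples, following rdfs:seeAlso links."""
    seen = set()
    queue = deque([start_url])
    harvested = []
    while queue and len(seen) < limit:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        for s, p, o in fetch(url):
            harvested.append((s, p, o))
            # A single seeAlso statement is enough to reach the republished copy.
            if p == RDFS_SEE_ALSO and o not in seen:
                queue.append(o)
    return harvested


triples = follow_your_nose("http://example.org/data")
```

In this toy setup the spider never touches the vanished ontology address, yet it still ends up with a statement about the onto#plays property, because the one seeAlso link leads it to the mirror.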
[Side note: To point out some flaws in Philip's list, many of the sites
in his "404" and "not responding" lists are experimental URLs.
Additionally, the list fails to list usage frequency. Finally, it does
not (and cannot, obviously) list whether there was any RDF Schema at
those locations in the first place. Because, as I explained before, I
can make up the following RDF triple right here on the spot, and there
would be nothing wrong with it:

    _:a rdf:type <http://grauw.nl/rdf#Game>

The type referenced as this triple's object has no ontology at that
location. The fact that it is a type is inferred from it being
referenced through rdf:type, and that is enough. There is no requirement
that this type resolves into a document containing RDF Schema triples. A
creative example of this on the list is "java:java.util.Date".]

>> You are now only considering the ontologies, that is, types and properties.
>> You're forgetting (or ignoring) that in RDF, objects are also named with
>> URIs so that data at other locations can refer to it. You know, that "web of
>> linked data" people refer to, core principle of RDF. No "simple" scheme
>> based on what Ian proposed can provide a sufficient level of uniqueness for
>> that. URIs are the best and most natural fit for use as web-scale
>> identifiers.
>>
>
> Define 'sufficient', as used here. I believe that this is an area
> where absolute uniqueness is not a requirement. Worst case, you get a
> little bit of data pollution with weird triples being produced by
> badly-written pages. Perhaps your browser offers to add an event to
> your calendar when no event shows up on the page, or a fraction of a
> search engine's microdata collection is spurious. Neither of these
> are big deals.
>
> That being said, I agree that URIs provide a very convenient source of
> uniqueness. Ian's microdata allows them to be used either in normal
> form or in reverse-domain form; either way provides the necessary
> uniqueness.
>

I am talking about individual triples for MANY pieces of data here. Take
for example the identifier of the band Coldplay on Zitgist:

    <http://zitgist.com/music/artist/cc197bad-dc9c-440d-a5b5-d52ba2e14234>

A reverse domain version of such an identifier would look like this:

    com.zitgist.music.artist.cc197bad-dc9c-440d-a5b5-d52ba2e14234

How exactly is this really shorter, or different in any way other than
"for the sake of being different", while failing to build upon the
well-known concept of URIs? Note that you can browse to the above URL in
your browser of choice and view the data. Also, creating a framework to
configure DNS servers to resolve to useful documents for these domains
will be pretty tedious.

If you ask me, Hixie using the "reverse DNS" notation in his Microdata
proposal is just a trick to pretend he is using something that is
different from what RDF uses. If the domain were not "reversed", people
would see the similarity with URIs too easily.

Note that in RDF, if you do not need this global identification, you can
easily create anonymous nodes called blank nodes ("bnodes"). Also, URIs
can be written in relative form, making a triple statement often as
simple as about="#laurens".

Example of some completely anonymous statements using bnodes (aside from
using basic RDF building blocks):

    _:a rdf:type _:Game
    _:Game rdf:type rdfs:Class
    _:Game rdfs:label "Game"

So as you can see, RDF also caters for the use cases you mentioned above
where uniqueness is not required. In RDFa, you achieve this by using a
"typeof" attribute without a corresponding "about" attribute.

If you reuse properties from widely-used vocabularies though (such as
FOAF, or Dublin Core), it seems obvious that they need to be identified
globally to avoid namespace conflicts. Instead of long
"org.foaf-project.Person" identifiers as Hixie proposes, RDF uses URIs,
and most RDF serialisations go for a (shorter) prefix-based
"foaf:Person" solution, which IMO is pretty user-friendly.

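To make the comparison concrete, here is a small sketch of both notations. The expand and reverse_dns helpers are hypothetical names written for this example; in particular, reverse_dns is only a mechanical rewrite that reproduces the identifier shown above, and is not claimed to be the algorithm of the Microdata proposal. The prefix table uses the commonly published FOAF, RDF and RDFS namespace URIs.

```python
from urllib.parse import urlsplit

# Prefix table of the kind most RDF serialisations let you declare once.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'foaf:Person' into a full URI."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name  # already a full URI, or an unknown prefix: leave untouched


def reverse_dns(uri):
    """Illustrative only: rewrite http://host/a/b as tld.host.a.b."""
    parts = urlsplit(uri)
    labels = parts.hostname.split(".")[::-1]
    segments = [seg for seg in parts.path.split("/") if seg]
    return ".".join(labels + segments)


coldplay = "http://zitgist.com/music/artist/cc197bad-dc9c-440d-a5b5-d52ba2e14234"
reversed_form = reverse_dns(coldplay)
person = expand("foaf:Person")
```

Under this rewrite the reversed form drops only the "http://" scheme, and loses the ability to be pasted into a browser, while the prefixed name stays short in the document and still expands to a globally unique URI.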
~Laurens

[1] http://web.archive.org/web/*/http://www.grauw.nl/foaf.rdf
[2] http://philip.html5.org/data/rdf-namespace-status.txt

--
Note: New email address! Please update your address book.
~~ Ushiko-san! Kimi wa doushite, Ushiko-san nan da!! ~~
Laurens Holst, student, Utrecht University, the Netherlands
Website: www.grauw.nl. Backbase employee; www.backbase.com
Received on Saturday, 16 May 2009 16:58:58 UTC