- From: <Patrick.Stickler@nokia.com>
- Date: Wed, 22 Sep 2004 10:19:00 +0300
- To: <jon@hackcraft.net>, <zednenem@psualum.com>, <daniel.oconnor@gmail.com>
- Cc: <www-rdf-interest@w3.org>
> -----Original Message-----
> From: ext Jon Hanna [mailto:jon@hackcraft.net]
> Sent: 21 September, 2004 17:41
> To: Stickler Patrick (Nokia-TP-MSW/Tampere); zednenem@psualum.com;
> daniel.oconnor@gmail.com
> Cc: www-rdf-interest@w3.org
> Subject: MGET Again [was: web proper names]
>
>
> > But that doesn't mean that the representation you
> > get back is a *description* of the thing denoted.
> >
> > The resource denoted could be, e.g., a particular
> > ontology, and the RDF/XML returned is an expression
> > of (representation of) that ontology, *not* a description
> > of that ontology.
> >
>
> Eh?
>
> The RDF/XML is a representation of that ontology, and as such a
> description of the terms it defines. There is no reason why
> the ontology should contain a description of the ontology itself
> in the RDF/XML, indeed this is both common practice and IMHO
> good practice.

It's true that there's no reason why the RDF/XML representation
of an ontology can't include statements about the ontology itself,
and in fact, we follow that practice in Nokia for most ontologies
and other RDF/XML instances.

However, there are many reasons why such a "best practice" cannot
be considered a component of the foundational architecture of the SW.

1. Size. It may be that the RDF/XML representation of that ontology
   is many megabytes in size (e.g. Wordnet, Cyc, etc.) and so if all
   one wants/needs is a description of the ontology itself, it's
   pretty impractical to ask for the whole enchilada. I.e., while one
   may consider the entire set of descriptions of all terms
   (resources) in a given ontology a valid representation of that
   ontology, it would not correspond to a concise bounded description
   of the ontology alone.

2. Ownership/Management: The publisher of a given representation may
   not be the owner, but merely have rights to publish via a
   particular URI, and thus it is impractical or even impossible to
   introduce or augment a description of the resource in question
   into the RDF/XML representation. Even if the publisher owns the
   representation, it may still be infeasible to modify the
   description insofar as publication is concerned via a particular
   URI due to a complex and distributed content management
   infrastructure.

3. A robust, efficient, and globally ubiquitous semantic web needs a
   more precise, well engineered foundation, providing concise
   bounded descriptions of resources with clear determination of
   success or failure; rather than a crap shoot with representations
   simply hoping for the best. Sifting through representations
   obtained via a URI for information about the resource denoted by
   that URI, hoping that folks have employed reasonably good
   practices and hoping that something useful can be gleaned is IMO
   a pretty sloppy way to go about things, from an engineering
   perspective. IMO, much better to be able to ask *exactly* for what
   is needed, and know *explicitly* from the response whether it has
   been provided.

Yes, the approach you outline *can* be made to work in certain
contexts where there is total ownership and control of all components
of the solution, but it breaks down in other critical application
areas, and thus is IMO unsuitable as a part of the foundational
architecture of the semantic web which we will have to live with for
many, many years to come.

For a real world example, compare the response to

   GET /schemas/nokia/MARS-3.1.rdf
   Host: sw.nokia.com
   Accept: application/rdf+xml

e.g.

   curl -H "Accept: application/rdf+xml" -L http://sw.nokia.com/schemas/nokia/MARS-3.1.rdf

with

   MGET /schemas/nokia/MARS-3.1.rdf
   Host: sw.nokia.com

e.g.

   curl -X MGET -L http://sw.nokia.com/schemas/nokia/MARS-3.1.rdf

which I think very well illustrates my points above.
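
For readers who would rather script the comparison than run curl, the
two requests above can be sketched in Python. This is only a minimal
illustration: the helper name build_request is hypothetical, the
"HTTP/1.1" version token is an assumption (the requests above omit it),
and the sketch only builds the request text without contacting any
server.

```python
from urllib.parse import urlsplit


def build_request(method, url, accept=None):
    """Return the request line and headers for `url` as HTTP/1.1 text.

    Hypothetical helper for illustration; it builds text and sends nothing.
    """
    parts = urlsplit(url)
    lines = [f"{method} {parts.path or '/'} HTTP/1.1", f"Host: {parts.netloc}"]
    if accept is not None:
        lines.append(f"Accept: {accept}")
    # A blank line ends the header section of an HTTP request.
    return "\r\n".join(lines) + "\r\n\r\n"


URL = "http://sw.nokia.com/schemas/nokia/MARS-3.1.rdf"

# Ordinary retrieval: ask for an RDF/XML representation of the resource.
get_request = build_request("GET", URL, accept="application/rdf+xml")

# URIQA retrieval: ask for a description of the resource itself.
# No Accept header is needed to say "description"; the method says it.
mget_request = build_request("MGET", URL)

print(get_request)
print(mget_request)
```

The only differences between the two requests are the method token and
the Accept header, which leaves content negotiation free to select an
encoding.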
Note that <http://sw.nokia.com/schemas/nokia/MARS-3.1.rdf> denotes
an RDF/XML instance (a representation, a document) and not an
ontology, even though its RDF/XML representation happens to describe
(partially) many resources comprising an ontology. The description of
the RDF/XML document is also included within the RDF/XML
representation, reflecting your suggested "good practice", but the
description of the RDF/XML document itself is a very, very small
fraction of the total content embodied by its RDF/XML representation.

Thus, clearly, a solution such as URIQA does not in any way lessen
the utility and "goodness" of the practice of describing resources
within RDF/XML representations of those resources. In fact, that is
one very good way for a server to obtain concise bounded descriptions
of those resources (without having to potentially force-feed the
client megabytes of data to be sifted through on the client end) and
how we do it on the Nokia Semantic Web Server in many cases.

> The only reason I can see for not including triples with the
> URI of the ontology itself in the ontology is that you don't
> care to describe it.

As I pointed out above, in large publication environments, where
there are complex legal agreements regarding content (which may very
well include RDF/XML representations) the publisher may simply not
have the ability to insert into the representation itself what it
considers essential information about the resource, insofar as
publication via their URI is concerned.

I also consider it a very "good practice" to keep metadata about
resources and the resources (or representations) clearly distinct,
and most large scale CM systems I've either built, used, or reviewed
also embrace such a practice, and in fact employ different kinds of
metadata about the same resources at different layers with differing
visibility/access.
Requiring inclusion of representation descriptions within RDF/XML
representations simply won't scale to a globally ubiquitous solution.

URIQA allows for either approach, and favors neither over the other.

> > Content negotiation *cannot* be used to reliably
> > accomplish what URIQA seeks to provide.
> >
> > And when you want a description in N-Triples, XTM,
> > TriX, N3, etc. how will you ask for it, if conneg
> > is already (improperly) busy doing something else?
>
> The parenthetical "(improperly)" is where we disagree I think. Still, I
> thank you for the CBD I think it's a very useful concept in deciding
> what goes into an RDF/XML document (put in the CBD first, then think
> about what else, if anything, is justified).

By "improperly" I mean that content negotiation is intended to
provide informationally equivalent representations in alternative
encodings. I do not see the distinction between an arbitrary RDF/XML
representation and a concise bounded representation as falling within
that scope.

I played around with using a distinct MIME type and conneg in a very
early incarnation of URIQA, where one could ask for something like
application/rdf+xml+cbd, as distinct from application/rdf+xml, to
request a concise bounded description rather than just some arbitrary
(to the client) RDF/XML instance, but concluded that it conflicted
with the intended purpose of MIME/conneg (and that the distinct HTTP
methods were much cleaner from an engineering perspective).

Thus, e.g.

   GET /foo/bar HTTP/1.1
   Host: www.example.com
   Accept: application/rdf+xml

would return whatever RDF/XML representation the publisher wanted to
provide, which may very well be huge, and describe a lot more
resources than just http://www.example.com/foo/bar; whereas

   GET /foo/bar HTTP/1.1
   Host: www.example.com
   Accept: application/rdf+xml+cbd

would be synonymous with

   MGET /foo/bar HTTP/1.1
   Host: www.example.com

I have my doubts, though, about proper/reliable failure of

   GET /foo/bar HTTP/1.1
   Host: www.example.com
   Accept: application/rdf+xml+cbd

by servers which do not implement conneg, or do so in too "helpful" a
manner, and while not recognizing the specialized content type,
nevertheless return a representation that is not a CBD.

I know at least that if a server has not implemented the explicit
URIQA methods, the request

   MGET /foo/bar HTTP/1.1
   Host: www.example.com

will result in a clear failure, and if it has implemented the URIQA
methods, a successful response should be reasonably trustworthy as a
CBD.

That's, to me, far more satisfying from a large scale systems
engineering perspective.

The semantic web is complicated enough without having to make our
agents guess about and sift through arbitrary representations.

Patrick
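
The failure-mode argument above can be sketched as a client-side
check. This is a rough illustration under stated assumptions, not part
of URIQA: the function name is hypothetical, and the idea that a
conneg client would inspect the returned Content-Type is an assumption
added here to show the extra guesswork the conneg variant pushes onto
the client.

```python
# Experimental MIME type mentioned in the message; it was never adopted.
CBD_TYPE = "application/rdf+xml+cbd"


def is_trustworthy_cbd(method, accept, status, content_type):
    """Decide whether a response can be treated as a concise bounded description.

    Hypothetical helper: `method` and `accept` describe the request that was
    sent, `status` and `content_type` describe the response that came back.
    """
    if not 200 <= status < 300:
        # Any failure is at least unambiguous: no description was provided.
        return False
    if method == "MGET":
        # A server without the URIQA methods should reject MGET outright,
        # so a successful answer implies the server understood the request.
        return True
    if method == "GET" and accept == CBD_TYPE:
        # A server that ignores Accept, or is too "helpful", may answer 200
        # with some other representation. Assumption: the client falls back
        # to checking the media type, ignoring parameters such as charset.
        media_type = (content_type or "").split(";")[0].strip().lower()
        return media_type == CBD_TYPE
    return False
```

With MGET the decision rests on the status code alone; with the conneg
variant a successful status proves nothing until the body's type has
been inspected as well.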
Received on Wednesday, 22 September 2004 07:27:55 UTC