- From: <Patrick.Stickler@nokia.com>
- Date: Fri, 8 Oct 2004 11:01:43 +0300
- To: <www-rdf-interest@w3.org>
Howdy folks,

I recently put together some thoughts on a few key issues that several recent discussions in this and other forums have touched upon, and about which I wanted to offer some pointed comments which I think are very important to consider.

I've touched upon all of these issues in various ways at various times, but I think that many of the points I aimed to make have been lost in the myriad threads and discussions, so I present them here in (hopefully) a clear and comprehensive form.

I'm not looking to see these comments turn into any number of long drawn out debates which merely generate a lot of "digital" heat and go nowhere. I ask that, before giving in to the impulse to point out how my views are the babbling of a deranged lunatic, you first make a reasonable effort to understand what I am trying to say, and give me the benefit of the doubt about being insane if you simply do not understand something.

I'm very happy to discuss any of these issues in a friendly and mutually respectful manner with anyone.

This posting is as much "for the record" as it is for the (presumed) benefit of others and for the sake of further discussion.

Here goes... ;-)

--

1. A Bootstrapping Mechanism for the Semantic Web is Essential

If the semantic web is to become truly globally ubiquitous, with arbitrary semantic web agents intercommunicating in a truly dynamic fashion, then we cannot presume that any given agent will have any pre-existing knowledge about the resource denoted by any particular URI it may encounter, including vocabulary terms.

And if we abandon that presumption, as we should, then there must be an efficient, standardized solution to obtaining an answer to the question "what does this URI mean?" from an authoritative source. Asking that question should not require of the agent any knowledge beyond the URI itself and the generic, application agnostic, standardized machinery of the web and semantic web.
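As a small illustration of that constraint, the sketch below derives whom to ask from nothing but the URI. It assumes a URI with an explicit scheme and authority (such as an http URI); the function name authority_to_ask and the example.com URIs are placeholders for illustration. It performs no request and implements no particular protocol.

```python
from urllib.parse import urlsplit


def authority_to_ask(uri: str) -> str:
    """Return the scheme and authority named in the URI itself.

    Hypothetical helper: it only shows that the URI alone identifies
    a web authority that could be asked about the denoted resource.
    """
    parts = urlsplit(uri)
    if not parts.scheme or not parts.netloc:
        raise ValueError("URI has no scheme or authority to ask: %r" % uri)
    return "%s://%s" % (parts.scheme, parts.netloc)


if __name__ == "__main__":
    # Any fragment or path plays no role in identifying the authority.
    print(authority_to_ask("http://example.com/foo/bar"))
    print(authority_to_ask("http://example.com/foo#bas"))
```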
The solution offered by URIQA is to utilize the proven, globally deployed web, and web resolvable URIs, to allow agents to ask that question of the web authority specified in the URI itself.

There may be other, third party, sources of information about the resource denoted by that URI, and various query interfaces with differing degrees of functionality providing access to such information. The agent may be aware of such sources, and even utilize some of them, but interaction with otherwise known third party sources should not be a necessary element of the bootstrapping solution, nor does it alleviate the need for a proper bootstrapping solution based solely on the possession of a particular URI.

--

2. Consistency in Interchange of Resource Descriptions is Essential

If the semantic web is to scale, as the web did, then there needs to be a high degree of consistency in the form and scope of typical responses to the semantic web request "tell me about this thing" (i.e. a Concise Bounded Description, or CBD, in RDF/XML), comparable to the high degree of consistency in the form and scope of typical responses to the web request "give me a representation of this thing" (i.e. a document presentable in a browser).

That doesn't mean that CBDs would be the only form of description, any more than all representations returned by a web server are primarily intended for presentation in a browser; CBDs simply constitute a standardized default form of description when no other more specialized form is either requested or required.

However, a completely unpredictable "free for all" or "lottery" of description types will severely hinder the semantic web in reaching a critical mass of applications which truly facilitate the free and dynamic interaction of arbitrary agents. Excessive variability in default forms of description will increase the complexity of our agents and/or limit the scope of effective interaction between agents.

--

3.
Primary vs Secondary Web Access to Representations and Descriptions is Critical

URIrefs with fragment identifiers pose significant practical problems to semantic web applications attempting to employ the existing, proven, standardized web machinery to access knowledge about resources.

For the sake of easier discussion (and typing), let me introduce three new terms (the last of which we won't actually concern ourselves with herein):

   "primary URI"     pURI = scheme ":" hier-part
   "secondary URI"   sURI = pURI "#" fragment
   "query URI"       qURI = pURI "?" query

where 'scheme', 'hier-part', 'fragment' and 'query' are defined in RFC 2396-bis.

A "primary URI" denotes a "primary resource", which is a resource that may potentially have directly accessible representations (by resolving the primary URI to one or more representations).

A "secondary URI" denotes a "secondary resource", as defined in AWWW, which is only accessible indirectly, through a particular representation of the resource denoted by the base pURI of that sURI.

This distinction is particularly crucial if semantic web agents are to have efficient web access to representations (or descriptions via URIQA).

For example, consider the following two URIs, each of which denotes a distinct vocabulary term; the first is a pURI and the second is a sURI:

   http://example.com/foo/bar
   http://example.com/foo#bas

Let us also presume that these are the only URIs which are known to denote these two particular terms.

The "secondary resource" denoted by the sURI

   http://example.com/foo#bas

is accessible via the web only indirectly, via the representation of some other resource, namely the resource denoted by the base pURI of the sURI.
Thus, in order to access a (kind of) representation of whatever term is denoted by

   http://example.com/foo#bas

one must first obtain a representation of the resource denoted by

   http://example.com/foo

and within the context of that representation, outside the scope of the web machinery proper, on the client side, interpret the fragment identifier "#bas" in order to obtain a (kind of) representation of the term. Note also that the (kind of) representation extracted is not officially considered a representation by the web architecture (hence the qualification 'kind of').

The problem for a requesting agent, one which may very well be running on an embedded or mobile device with limited capacities (and that is, by the way, the beautiful vision of the ubiquitous semantic web painted for us), is that the representation of the resource denoted by

   http://example.com/foo

may be several megabytes in size, constituting the complete definition of a complex ontology consisting of thousands of terms; yet downloading that mass of (mostly irrelevant) information is the only way to access the limited information required about the particular term

   http://example.com/foo#bas

Real world examples of such problems exist (Cyc, Wordnet) and more are likely to surface as the semantic web gains critical mass.

Arbitrary semantic web agents simply cannot be expected to be "force fed" huge masses of information to obtain the small bits of information needed to accomplish a given task. For agents running on mobile devices, which is obviously an application area of particular interest to Nokia, this is a critical issue.

In contrast, if the term is treated as a "primary resource" by being denoted by a primary URI

   http://example.com/foo/bar

then the semantic web agent can access representations of that term specifically, directly, and efficiently, independently of any representation of any other resource.
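The contrast between the two example URIs can be sketched with Python's standard urllib.parse module. This is only an illustration of the terms defined above, not part of any specification: the function names classify and retrieval_plan are hypothetical, and treating a URI that carries both a query and a fragment as a sURI is a simplifying assumption.

```python
from typing import Tuple
from urllib.parse import urldefrag, urlsplit


def classify(uri: str) -> str:
    """Label a URI as 'pURI', 'sURI' or 'qURI' using the terms above.

    Assumption: a non-empty fragment takes precedence over a query.
    """
    parts = urlsplit(uri)
    if parts.fragment:
        return "sURI"
    if parts.query:
        return "qURI"
    return "pURI"


def retrieval_plan(uri: str) -> Tuple[str, str]:
    """Return (URI the client must dereference, fragment left to the client).

    For a sURI, the web machinery only ever sees the base pURI; the
    fragment has to be interpreted client-side within whatever
    representation comes back.
    """
    base, fragment = urldefrag(uri)
    return base, fragment


if __name__ == "__main__":
    for uri in ("http://example.com/foo/bar", "http://example.com/foo#bas"):
        base, fragment = retrieval_plan(uri)
        print(classify(uri), "fetch:", base, "client-side fragment:", repr(fragment))
```

For the sURI, the plan is to fetch http://example.com/foo in its entirety and then look for "bas" on the client, whereas the pURI is itself the thing to fetch.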
Primary resources denoted by primary URIs are first class citizens of the web, and semantic web agents can directly interact with representations (and descriptions) of those resources, and can do so in an efficient manner, employing the full richness of the web machinery.

If the existing, proven web machinery is to be re-used and employed by the semantic web, and I think that is a widely held presumption and desire, then this issue of naming methodology, and of the continued use of secondary URIs for vocabulary terms which should be considered primary resources, is critical.

The use of secondary URIs as the official URIs denoting resources which a large number of semantic web agents are likely to refer to and inquire about constitutes an inefficient and non-scalable methodology.

Secondary resources denoted by secondary URIs are second class citizens of the web. Vocabulary terms should be considered first class citizens of the web, and therefore vocabulary terms should always be denoted by primary URIs.

It should be considered a "best practice" to avoid the use of secondary URIs, except for particular cases where the secondary resources in question constitute logical or functional subcomponents of the resource denoted by the base URI (e.g. a section of a web page, or a line segment in an SVG graphic, etc.) and access to such component secondary resources is not expected to happen independently from access to the encompassing resource.

Applying this "best practice" includes avoiding XML Namespaces ending in '#', and instead ending all XML Namespaces in '/' (or some other character which does not result in the creation of secondary URIs).

--

Patrick Stickler              (+358 40) 801 9690
Senior Architect              Hatanpäänkatu 1
Forum Nokia Web Services      33900 Tampere FINLAND
Nokia Technology Platforms    patrick.stickler@nokia.com

Forum Nokia is Nokia's online community for third party developers creating mobile applications.
Received on Friday, 8 October 2004 08:11:00 UTC