Database example; was: Why are relative NS identifiers used?

-----Original Message-----
From: Rick JELLIFFE <ricko@geotempo.com>
To: xml-uri@w3.org <xml-uri@w3.org>
Date: Thursday, May 18, 2000 4:43 AM
Subject: Why are relative NS identifiers used?


>We cannot expect Microsoft and others who use relative NS identifiers to
>stop using them unless the new solution allows them comparable
>functionality.


That makes sense.

>Relative NS identifiers are used, apparently, to locate schemas.


I would still say that they identify the namespace.  If on dereferencing
them you get a schema, then that is the sort of information which the
namespace publisher deemed appropriate to give the inquirer
about the namespace.

>I don't think it can work merely to say (with Tim Bray) that because a
>relative NS has neither persistence nor uniqueness it is no good as a NS
>identifier.
>Not because Tim is incorrect, but because the relative NS URIs are not
>being primarily used here for naming but as schema locators.


As an imaginary example, if we need one, let us imagine that someone
has created a database of employees and offices. They decide to
export this onto the company intranet, and they are using software
which understands XML and xml-schema and maybe RDF too.
This software, attached as a server applet to the web server, supports
the following virtual resources. (Virtual = not stored in files,
generated on the fly.)

- the database (say) has URI http://internal.example.com/2000/05/foo .
When asked to render that, the HTTP server will respond with an XML
document pointing users at the various tables and written using the
XHTML namespace. (It references it of course by absolute URI.)
It makes a link to the employees table as "foo/employees".

- the namespace for encoding data from the database is (relative to the
above) foo/ns.
When asked to render this by a client understanding XML, a document is
returned written in the xml-schema namespace. (Referenced by absolute
URI.) foo/ns defines the datatypes of the columns in the database and
the syntactic constraints for the serialization of database data
represented in foo/ns.  [It may also use terms from the rdf-schema
namespace to indicate that foo/ns#zipcode (a unique ID it generates
for the element used to represent the "zipcode" column in the employee
database) is a subproperty of
http://standards.uspo.com/2000/05/zipcodes/ns#zip  if it is that smart,
but that's another story]

- the employee table is foo/employees. This is rendered as an XML table
with the data encoded in the namespace foo/ns . When asked to render
employees, the server responds with a document using namespace foo/ns,
which it refers to using relative URI "ns".
The server *could* look up the original URI it had been invoked with,
and use http://internal.example.com/2000/05/foo/ns  but this would be
the first time it has to be aware of its own hostname. Further, as the
namespace is a custom one for this particular database, it is as
persistent or transient as the data itself. So the "persistence"
argument does not apply to using a relative URI here.  There is a data
management argument for using a relative URI.  (This also applies in
the case of the same thing being written by a human.)
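
To make the resolution of those relative references concrete, here is a
minimal sketch in Python using the standard urllib.parse.urljoin. The
function name resolve_resources is hypothetical, and the sketch assumes
each reference is resolved against the URI of the document it appears
in, as described in the list above. It only illustrates that the same
relative strings yield different absolute URIs when the database is
published under a different base.

```python
from urllib.parse import urljoin


def resolve_resources(database_uri):
    """Resolve the relative references used by the database's documents.

    Hypothetical helper for illustration only; it is not part of any
    real server described in this message.
    """
    # The index document at database_uri links to "foo/employees".
    employees = urljoin(database_uri, "foo/employees")
    # The employees document refers to its namespace as "ns".
    namespace = urljoin(employees, "ns")
    return {"employees": employees, "namespace": namespace}


original = resolve_resources("http://internal.example.com/2000/05/foo")
print(original["employees"])
# http://internal.example.com/2000/05/foo/employees
print(original["namespace"])
# http://internal.example.com/2000/05/foo/ns

# Resolving "foo/ns" against the database URI gives the same namespace.
print(urljoin("http://internal.example.com/2000/05/foo", "foo/ns"))
# http://internal.example.com/2000/05/foo/ns
```

Because the server only ever writes "foo/employees" and "ns", it never
needs to know its own hostname; the client supplies the base when it
resolves the references.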

So here we have a one-off namespace made quite automatically by the
database application. Until and unless the user has made some connection
to other information [such as the zipcode link above], it stands alone
with the data. It is intimately connected with the data; they are
published together just as an HTML file and an embedded image are.
Relative URIs seem quite reasonable to me here.

If we need an example of the system failing when URIs are compared
literally, then it will be a little forced, and so unlikely to occur by
accident. In fact, among the deployed resources which actually use
namespaces and relative URIs, I would bet a case of liquid that there is
no actual extant example.  However, I can imagine (if the software
continued to compare literal strings) this situation being created as a
malicious attack. So suspend your disbelief and let us construct a
scenario....

- Over at example.org they cloned example.com's employee database
but of course filled it with different data. The database is
http://www.example.org/admin/staff/foo.
We are using this base address now.

- It has, relative to that URI, just the same accessories as virtual
resources: the namespace foo/ns and the employee table foo/employees.

- There is a hand-written XSL script foo/labels which
generates conference badges and other paraphernalia
for attendees when fed suitable information about them.
This has a template to spot anyone who has been designated
as an employee using the foo/ns namespace and give them
a special red pass with special privileges. It looks for anyone
who is described by the <staff:employee> element where xmlns:staff="foo/ns".

- The staff of example.com are all invited to a conference
at example.org, and to make labels for
them, the http://www.example.org/admin/staff/foo/labels script is used to
process http://internal.example.com/2000/05/foo/employees
to generate the labels. The XPath expression
used to search for <staff:employee> uses
literal comparison, and so regards the "foo/ns" reference
to http://internal.example.com/2000/05/foo/ns
as being the same as the "foo/ns"  (really
http://www.example.org/admin/staff/foo/ns) which it is looking for.
All the visitors end up getting red staff labels, which causes great
embarrassment. If the software had
only absolutized the URI before handling it, as it says in the current spec,
this would never have happened.
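
The failure in this scenario can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function
names are hypothetical, a plain ElementTree check stands in for the XSL
template, and both the visitors' document and the label script are
assumed to write the namespace as the same relative reference "ns",
each resolved against its own document URI. ElementTree keeps the
namespace name exactly as written, so it behaves like the literal
comparison described above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical document locations taken from the scenario above.
VISITOR_DOC_URI = "http://internal.example.com/2000/05/foo/employees"
STYLESHEET_URI = "http://www.example.org/admin/staff/foo/labels"
RELATIVE_NS = "ns"  # assumed: the same relative string in both places

visitor_xml = (
    '<staff:employee xmlns:staff="ns">'
    "<staff:name>A. Visitor</staff:name>"
    "</staff:employee>"
)


def is_staff_literal(xml_text):
    """Match on the namespace string exactly as written in the document."""
    root = ET.fromstring(xml_text)
    return root.tag == "{%s}employee" % RELATIVE_NS


def is_staff_absolutized(xml_text, doc_uri):
    """Resolve both namespace references before comparing them."""
    root = ET.fromstring(xml_text)
    ns, _, local = root.tag[1:].partition("}")
    wanted = urljoin(STYLESHEET_URI, RELATIVE_NS)
    return local == "employee" and urljoin(doc_uri, ns) == wanted


print(is_staff_literal(visitor_xml))                       # True
print(is_staff_absolutized(visitor_xml, VISITOR_DOC_URI))  # False
```

With literal comparison the visitor is taken for local staff and gets
the red pass; once both references are made absolute, the two
namespaces are seen to be different.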

I am sorry to belabour this point with a long example.  The early part
was to demonstrate that IMHO there are cases in which relative
URIs as namespaces are not only a matter of fact but also quite a
reasonable thing to do.

I have also heard arguments that it doesn't matter if the notion of
identity used for namespaces when considered as URIs and the one used
when they are compared by XPath or for well-formedness checking are
different. The example above is of an error due to this difference.

(I will reply to the rest of your message, Rick, in a separate message if I
have time today)

Tim Berners-Lee

Received on Saturday, 20 May 2000 02:55:04 UTC