Re: QName URI Scheme Re-Visited, Revised, and Revealing from David Allsopp on 2001-08-23 (www-rdf-interest@w3.org from August 2001)

From: David Allsopp <dallsopp@signal.qinetiq.com>
Date: Thu, 23 Aug 2001 16:50:46 +0100
CC: www-rdf-interest@w3.org
Message-ID: <3B852656.B70F055D@signal.qinetiq.com>
"Sean B. Palmer" wrote:

> > The remaining question is whether we are causing ourselves
> > problems when we are in both camps, e.g. trying to convert
> > plain XML 'legacy' documents into RDF.
> 
> You can't really convert "documents" into "data", but you can scrape the
> semantics out into RDF, 

That's what I meant.

> The only reason why you'd want to point at QNames is if you wanted to
> annotate them with Dublin Core, and store them in a DLG, for accessibility
> reasons, or something like that. 

I'm sure there must be more reasons. For example, round-tripping from
legacy XML documents, to RDF and back again.

> And now that we have Patrick's QN URI
> scheme, we can do so quite easily!
> 
> > For example, let's say we are processing XML files in
> > various schemas into one common format; it is possible,
> > although unlikely, that we will have QNames from
> > different sources that happen to map to the same URI.
> 
> Not if you use Patrick's new scheme. Remember, you can't use arbitrary XML
> in RDF, because the RDF parser will always parse it as RDF, obliterating
> the QNames, and turning them into URI references. 
> To use those QNames, you
> have to *convert* that XML instance into data (i.e. resources), 

I thought that's what I just said 8-).

> which is
> just as well because the methods for doing so vary from XML language to XML
> language. In other words, it depends how you scrape the data: remembering
> that different XML languages store data in different places, in different
> ways, and this is often unambiguous. 
                          ^^^^^^^^^^^
YM ambiguous, I presume.

From looking over this debate, I assume that until now people have in
fact been scraping XML data by concatenating the namespace URI and local
name of each QName to generate a URI? I think we agree that this is
risky. The problem is perhaps that, naively, that's the obvious way to
convert some things.
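
To make the naive approach concrete, here is a minimal sketch in Python
(my choice of language for illustration). The function name
qname_to_uri and the sample record are hypothetical; the only real
identifier is the Dublin Core element namespace. It relies on
ElementTree exposing each element's QName as "{namespace}local".

```python
import xml.etree.ElementTree as ET


def qname_to_uri(tag):
    """Naively turn an ElementTree tag '{namespace}local' into namespace + local."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace + local
    return tag  # element has no namespace, nothing to concatenate


# Hypothetical 'legacy' XML record using the Dublin Core namespace.
legacy = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Some report</dc:title>
  <dc:creator>A. N. Author</dc:creator>
</record>"""

root = ET.fromstring(legacy)

# Scrape each child element into a (property URI, literal value) pair.
scraped = [(qname_to_uri(child.tag), child.text) for child in root]

for uri, value in scraped:
    print(uri, "->", value)
```

This works here only because the namespace ends in '/'; nothing in the
function checks that the (namespace, local name) split can be recovered
from the resulting URI.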

> [...]
> > [*] However, I think there may still be a separate problem
> > if people choose RDF namespaces without a terminal '#',
> > as we can still get unintentional URI clashes on rare
> > occasions, if people choose very similar namespaces:
> > http://foo.com/discovery + ack
> > http://foo.com/discover + yack
> 
> No, because RDF has no notion of a "namespace" either. The things above are
> geared towards being XML namespaces, and that's fine. But to represent them
> as URIs, you'll have to map them onto a URI somehow.

Um, that's not what I meant. Taking the example above, suppose two
people at foo.com write some RDF in XML: one chooses
http://foo.com/discovery as the namespace and the other
http://foo.com/discover.

They then define some terms, in the false belief that they must have
defined unique URIs because they know that they have a different
namespace from any other department at foo.com. The concatenation of the
namespaces and terms creates clashing URIs (both pairs above give
http://foo.com/discoveryack), no?

For example:

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:a="http://foo.com/discovery"
  xmlns:b="http://foo.com/discover" >
  <rdf:Description about="http://foo.com/stuff.html">
    <b:yack>ABCDE</b:yack>
    <a:ack>FGHIJ</a:ack>
  </rdf:Description>
</rdf:RDF>

The above causes no error reports (using the Stanford RDF API), but you
only get one triple (the FGHIJ one, unsurprisingly).

[It would be an obvious mistake if you wrote both forms in one file, but
if you merged data from the two departments, this could happen.]
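
One way to catch this when merging is to check, before concatenating,
whether two different (namespace, local name) pairs produce the same
URI. A rough sketch in Python follows; find_clashes and the two
department lists are names I have made up for illustration, and this is
not anything the Stanford RDF API does.

```python
from collections import defaultdict


def find_clashes(qnames):
    """Return URIs that more than one distinct (namespace, local) pair maps to."""
    by_uri = defaultdict(set)
    for namespace, local in qnames:
        # Same plain concatenation that an RDF parser applies to QNames.
        by_uri[namespace + local].add((namespace, local))
    return {uri: sorted(pairs) for uri, pairs in by_uri.items() if len(pairs) > 1}


# Hypothetical term lists from two departments at foo.com.
department_a = [("http://foo.com/discovery", "ack")]
department_b = [("http://foo.com/discover", "yack")]

clashes = find_clashes(department_a + department_b)

for uri, pairs in clashes.items():
    print(uri, "<-", pairs)
```

With a terminal '#' on both namespaces the same check reports nothing,
since http://foo.com/discovery#ack and http://foo.com/discover#yack
differ.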

Regards,

David Allsopp,
QinetiQ
UK

-- 
/d{def}def/u{dup}d[0 -185 u 0 300 u]concat/q 5e-3 d/m{mul}d/z{A u m B u
m}d/r{rlineto}d/X -2 q 1{d/Y -2 q 2{d/A 0 d/B 0 d 64 -1 1{/f exch d/B
A/A z sub X add d B 2 m m Y add d z add 4 gt{exit}if/f 64 d}for f 64 div
setgray X Y moveto 0 q neg u 0 0 q u 0 r r r r fill/Y}for/X}for showpage
Received on Thursday, 23 August 2001 11:51:07 UTC