RE: Q to implementers: Resource identifiers - XML Names and/or (concatenated) URIs? (was RE: rdfs.isDefinedBy...) from Jeremy Gray on 2002-06-07 (www-rdf-interest@w3.org from June 2002)

From: Jeremy Gray <jeremy@jeremygray.ca>
Date: Thu, 6 Jun 2002 23:51:39 -0700
To: "'Dave Beckett'" <dave.beckett@bristol.ac.uk>
Cc: <www-rdf-interest@w3.org>
Message-ID: <000001c20def$c6d8d880$5a16b742@Dora9>
Thanks for the response, Dave. I really appreciate hearing back from someone
so active in the RDF community. Sadly, however, your response smells of
someone offended and/or confused by the reading of my message. I hoped to do
neither, and would really like to have a constructive discussion of these
issues. I hope you are open to the same. With that in mind I have some
comments and clarifications which I've included inline:

> > - if RDF/XML requires prefixes before each attribute, it is
> > simply not Namespaces in XML -compliant XML

Sorry, I exchanged two words - "not" and "simply".

What I meant to say was that "it is not simply Namespaces in XML -compliant
XML", intending to point at the commonly mentioned requirement that the
RDF/XML serialization specification adds: that all attributes be directly
prefixed (this is supported by the examples I've seen of late), whereas the
Namespaces in XML spec allows un-prefixed attributes to inherit the
currently-scoped namespace.

If, when stating that attributes must be prefixed, what is meant is that they
must, through one means or another, be expressed relative to a specified
namespace, explicitly or otherwise, then the wording could use an update to
say as much. Judging by recent mailing list postings and examples at the
least, it can easily be (and often is) read as requiring that all
attributes be expressed with an explicit namespace prefix, which would
be a requirement above and beyond Namespaces in XML.

I should have stated this expressly in my original post, and apologize for
any misunderstanding its exclusion may have caused.

Can you clarify for me the expected interpretation of this "prefixing"
issue?

> The XML qnames are used in RDF/XML as abbreviation mechanism for
> long URIs, and although collision has been recognised, it has not
> been considered a huge problem.  The point is to generate the URI,
> the qname used isn't important.

While I do agree that collisions will be uncommon, especially if creators of
namespaces and identifiers take appropriate steps to protect themselves from
collisions, and while I'm actually more concerned about the splitting side
of the equation (hence the provided example), I still feel that the whole
process is a pretty weak crutch given that full adoption of some kind of
namespace mechanism was so obvious and not at all inappropriate yet was not
pursued.

In addition, not all (I'd go further to say that initially relatively few)
creators of URIs that end up used in RDF will be doing so in terms of RDF's
current concatenation/splitting recommendations, nor should they be expected
to, and I would argue that our implementations should not punish them for
whatever has caused this issue to be unresolved to this point. <sarcasm>Talk
about a great way to promote RDF: Feed your identifiers into RDF! We'll
incorrectly equate them to each other and possibly those of other
organizations, and later output them as whole new identifiers, often in
namespaces you've never even heard of!</sarcasm>
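
To make the concatenation concern concrete, here is a minimal sketch. It assumes identifiers are formed by plain string concatenation of a namespace URI and a local name; the example.org URIs and the `concatenate` helper are hypothetical and purely illustrative.

```python
def concatenate(namespace_uri: str, local_name: str) -> str:
    """Form a resource URI by plain string concatenation."""
    return namespace_uri + local_name


# Hypothetical (namespace, local name) pairs from two unrelated vocabularies.
first = ("http://example.org/vocab/a", "bc")
second = ("http://example.org/vocab/ab", "c")

uri_first = concatenate(*first)
uri_second = concatenate(*second)

# The pairs differ, yet the concatenated URIs are identical.
print(first == second)          # False
print(uri_first == uri_second)  # True
```

Once only the concatenated string is kept, nothing records which of the two pairs it came from.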

Man, I've stayed up too late editing this email. :)

My understanding is that the intention was to avoid becoming dependent on
any other specification (in this case, Namespaces in XML), and I certainly
agree with the intent, but an analogous and compatible mechanism could
easily have been created to resolve concatenation/splitting issues cleanly
and consistently while offering other benefits also (e.g. performance,
memory footprint, etc.).

At the risk of offending you further and perhaps offending others
additionally, if you really want isolation from Namespaces in XML (outside
the scope of serialization), I would suggest (among other things) ceasing
use of the term "QName", especially in sentences like "QNames are just
punctuation" :) one which I've seen tossed about a bit recently. In fact,
even within the scope of serialization such a statement is disappointing at
best, frightening at worst.

Okay, raise your hand if I just offended you. :)

> > - the suggested URI -> XML Names splitting method is so flawed
> > as to not be responsibly implementable (i.e. you can't generate
> > valid LocalParts by splitting on non-Name characters, but can if
> > splitting on non-NCName characters. However, using the latter
> > method produces differently invalid results, e.g. if splitting
> > urn:NewsML:afp.com:20000811:010607144425.x6pxrl6k:1)
>
> Well, the schema and namespace issues which above, you don't want
> to discuss, could provide solutions to that.

I'm quite interested in discussing implementable solutions and these issues
with respect to them, and my original email was quite clear on that point.
What I did not want to do, however, was duplicate countless previous threads
which have already covered well (much more effectively than I could, I
admit) the various less implementational, more academic views on the
subject. I think I was clear on that as well but please accept my apologies
if you interpreted my comment differently.

Re: "... the schema and namespace issues ... could provide solutions ..."

Can you describe (or refer me to) your solution?
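
For reference, the splitting problem quoted above can be sketched roughly as follows. This is a simplified reading of the splitting approach, not the WG's wording: the character classes are ASCII-only approximations of the XML name productions, and the helper names `split_at_last_disallowed` and `is_ncname` are assumptions made for illustration.

```python
import re

# ASCII-only approximations (assumption: the real XML productions also
# admit many non-ASCII characters). Name characters include ':';
# NCName characters do not.
NAME_CHAR = re.compile(r"[A-Za-z0-9._:\-]")
NCNAME_CHAR = re.compile(r"[A-Za-z0-9._\-]")
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._\-]*\Z")


def split_at_last_disallowed(uri: str, allowed: re.Pattern) -> tuple[str, str]:
    """Split just after the last character that `allowed` does not match."""
    cut = len(uri)
    while cut > 0 and allowed.match(uri[cut - 1]):
        cut -= 1
    return uri[:cut], uri[cut:]


def is_ncname(candidate: str) -> bool:
    """Check the candidate against the approximate NCName pattern."""
    return NCNAME.match(candidate) is not None


uri = "urn:NewsML:afp.com:20000811:010607144425.x6pxrl6k:1"

# Splitting on non-Name characters: every character qualifies, so the
# whole URI (colons included) ends up as the local part.
name_ns, name_local = split_at_last_disallowed(uri, NAME_CHAR)
print(repr(name_ns), is_ncname(name_local))  # '' False

# Splitting on non-NCName characters: the local part is "1", which
# cannot start an NCName.
ncname_ns, ncname_local = split_at_last_disallowed(uri, NCNAME_CHAR)
print(ncname_local, is_ncname(ncname_local))  # 1 False
```

Neither split yields a usable LocalPart for this URI under these assumptions, which is the "differently invalid" outcome described in the quoted text.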

> > ...  Internally? Are you following the WG-recommended
> > concatenation and splitting processes? If not, what are you
> > doing instead?
>
> Yes.

Interesting that you say "Yes.", as it disagrees with your earlier ("... the
schema and namespace issues ... could provide solutions ...") and later
("By using namespace URIs to indicate when to split.") statements, through
both of which you seem to indicate that you, like me, are doing something
above and beyond RDF M&S + issue tracking, though we surely differ on the
mechanism and likely differ on the perceived effect on compliant behaviour.

> > My company, for example, intends to produce behaviour indicative
> > of full and correct interpretation of the Namespaces in XML
> > specification so that behaviour both inside and outside of our
> > system is consistent. It may not be strict RDF, but it has a
> > better chance of producing correct results. ...
>
> Can you say what you are doing that is different?  (Strict
> RDF=Strict RDF/XML I assume)

Certainly.

By "strict RDF" I meant pure RDF M&S 1.0 + issue tracking. Regarding "it
(our implementation) may not be strict RDF": Namespaces, including the
partitioning implied by them, are currently a first-class entity in our RDF
implementation - not just in the (de)serialization processes but also in the
application's RDF model itself.
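
As a rough illustration of what "namespaces as a first-class entity" could look like in a model, here is a minimal sketch. It is not the implementation referred to above, which is not described in this message; the `ResourceId` class and the example.org URIs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceId:
    """Hypothetical identifier that keeps the namespace partition explicit."""

    namespace: str
    local_name: str

    def as_uri(self) -> str:
        # Concatenate only at the boundary where a single URI is required.
        return self.namespace + self.local_name


a = ResourceId("http://example.org/vocab/a", "bc")
b = ResourceId("http://example.org/vocab/ab", "c")

# Equality and hashing use the (namespace, local_name) pair, so the two
# identifiers stay distinct even though their concatenations coincide.
print(a == b)                    # False
print(a.as_uri() == b.as_uri())  # True
print(len({a, b}))               # 2
```

The trade-off is the one raised in the thread: a model that compares pairs rather than whole URIs goes beyond what other RDF applications might expect.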

The product could be modified to operate in a more "traditional"
namespace-free mode, if needed, though I would dislike doing so for a number
of reasons (e.g. the loss of at least some of the various
namespace benefits I mentioned earlier). We have, however, already
considered the changes that would be required to do so.

From what you've intimated, I suspect your application might be providing
similar abilities, at least with respect to the XML (de)serialization
issues, though it sounds like you are accomplishing it through different
(but equally above and beyond pure RDF) means.

I would like to discuss with you the ways in which the differences between
our solutions might affect our corresponding applications, especially WRT
integration behaviour. Hopefully, we can keep things pragmatic and free from
corner cases :)

> > ... Since Namespaces exist within our RDF system (once again,
> > NOT for schema identification or resolution, just for
> > identifiers), ...
>
> (XML) namespaces aren't in the RDF model, so you are going beyond
> what other RDF apps might expect.

Yes, we are going beyond, no denying that. After a significant period of
consideration of the possible alternatives we decided that we can provide a
higher level of behavioural consistency and performance through our selected
implementation, while providing at least as much (if not significantly more)
external consistency with the third-party systems with which we
integrate.

While I am a full believer and supporter of the standardization process I,
for better or worse, am at the same time responsible for implementing and
delivering a quality product. As such, when choosing between delivering a
standards-conformant product which has known behavioural flaws vs.
delivering a not-quite-but-close-to-standards-conformant product which does
not, I simply have no choice and doubt anyone else in my situation really
does. Our customers appreciate conformance to standards but they _demand_
expectable, consistent results. Your situation may differ, however, and
there is plenty of room to agree to disagree on this point.

Worry not, though... We are not in the position or of the mind to "embrace
and exterminate/(introduce preferred insult here)". We have every intention
of actively pursuing standards conformance at every possible place in our
products and point in their life cycles, except, and I wish I didn't have to
say this but I do, when exact conformance has a detrimental effect for our
customers.

If, however, a solid case can be presented to counter our position (i.e.
illustrating that our solution has an even more significant detrimental
effect than the one it was designed to resolve), believe me when I say this:
I want to hear it.

> You seem to be conflating the XML (Infoset say, which models
> things like elements, attributes, namespaces) and RDF model
> (graph, URIs, no namespaces).

Absolutely not.

My interpretation may seem blurred to you due to our adoption of namespaces
as first-class entities for use in identifiers, but I assure you, it was
considered the lesser of the evils after much exploration. Having said that,
I WILL ensure appropriate pre-shipment changes to our products if reasonably
convinced otherwise.

We pride ourselves on having reached a very clear interpretational
separation between model and syntax. If I were to concede one detail about
my still unmentioned company, it would be that this very separation is the
conceptual base for our entire product line. Even that's probably saying too much at
this time. It took a lot of time and effort to wade through countless RDF
discussions and documents to reach a clear, confident understanding of and
position on this distinction. Many sources of such information have great
difficulty separating model and syntax, have difficulty adopting and using a
consistent conceptual model and set of terminology for the sake of
straightforward communication and education, are often based on different
needs in different domains viewed from different perspectives, etc. but
we've made it through the process alive and mostly unscathed. :)

To be honest, and to ramble a bit more, I am regularly frustrated to see
RDF's development slowed by persisting views of RDF which seem unable to
separate syntax from model, especially after such effort to do so for myself
and then for my colleagues. Case in point - a recent mailing list discussion
regarding RDF model -based query representations that was just getting
interesting ("interesting" defined as "unsurprisingly familiar" ;) ) until a
few posts dragged the serialization syntax in and basically killed the
thread. What a missed opportunity! Note to self: comb the archives some time
to see how many times that has happened to this one example.

I know this has been said before by others and as such is more or less
redundant, but while we're on the topic: I think the issues of my previous
two paragraphs and their effect on RDF's approachability have been and are a
real barrier to RDF adoption - let's be honest, M&S is ~three years old and
has had relatively* few implementations. Please note the use of the word
"relatively". Those I know who have been able to learn "RDF The Model"
before ever seeing "RDF The XML Syntax" reap greater benefits from both, do
so with a significantly easier learning curve, and can do so from a much
lower level of technical knowledge and experience (i.e. even our sales team
gets it :).

It's unfortunate that we, the RDF community, have put potential and new
users through such an unnecessarily difficult learning process. I am glad,
however, that this has been recognized and efforts are underway to correct
it (e.g. The Primer, updated spec documents, and such). I do, though,
sometimes wonder where RDF might be today had this problem never existed.

> > How are you addressing Namespaces in XML -related issues in and
> > around RDF?
>
> By using namespace URIs to indicate when to split.

As I remarked earlier, I find the inconsistency between this and one of your
earlier statements quite intriguing.

Would I be correct in assuming that you are providing some form of
"round-tripping hints", as it were, in order to assist in the RDF model ->
RDF/XML serialization process? Does your application go further than that,
as ours does? If not, can you enumerate (or refer me to such an enumeration)
some of the reasons why?
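
One possible reading of "using namespace URIs to indicate when to split" is sketched below. It is only a guess at the general idea, since the quoted reply does not describe the mechanism: the serializer is assumed to hold a set of known namespace URIs and to split a resource URI at the longest one that prefixes it. The function name `split_with_known_namespaces` is hypothetical.

```python
from typing import Iterable, Optional


def split_with_known_namespaces(
    uri: str, namespaces: Iterable[str]
) -> Optional[tuple[str, str]]:
    """Split `uri` at the longest known namespace URI that prefixes it.

    Returns None when no known namespace applies, leaving the caller to
    decide how (or whether) to serialize the URI as a qualified name.
    """
    matches = [
        ns for ns in namespaces if uri.startswith(ns) and len(uri) > len(ns)
    ]
    if not matches:
        return None
    namespace = max(matches, key=len)
    return namespace, uri[len(namespace):]


known = [
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://purl.org/dc/elements/1.1/",
]

print(split_with_known_namespaces("http://purl.org/dc/elements/1.1/title", known))
# ('http://purl.org/dc/elements/1.1/', 'title')

print(split_with_known_namespaces(
    "urn:NewsML:afp.com:20000811:010607144425.x6pxrl6k:1", known))
# None
```

Under this reading the hints only help for URIs whose namespace is already known, which is why the question of what happens beyond that point matters.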

Finally, for you, Dave, and for others reading this thread, I'd like to
again state that by no means have I intended or do I intend to offend or
criticize the values or results of the standardization process, the WG(s),
the community, or individual members of either.

My intention was and is to raise and discuss issues in context of
implementation, though these must, to a certain degree, be discussed also in
context of current standardization efforts (i.e. I know full well these
issues will be resolved eventually, and am confident that our products can
be easily brought in line with changes resulting from the standardization
process, but the WG(s) can't do anything about it between now and then and,
for better or worse, I still need to ship. :)

I do understand and admit that a number of things I wrote in my initial
email could have been delivered more clearly after additional editing, and
hope that Dave's reply was more the result of a misunderstanding than of a
fundamental disagreement.

It's also possible, now that I notice how late it is and how long I've spent
in front of this email, that you now, having read this, have more material
ripe for possible misinterpretation. :)

If you're wanting to dive headlong into a response on something I've said
but feel you just might have possibly misinterpreted it, please ask me to
clarify.

With all said and done, I hope the discussion can continue constructively
and I do apologize for any misunderstanding I've created thus far.

Jeremy Gray
Received on Friday, 7 June 2002 02:52:17 UTC