Re: XML Enriched N-Triples (XENT) from Sean B. Palmer on 2003-06-15 (www-rdf-interest@w3.org from June 2003)

From: Sean B. Palmer <sean@mysterylights.com>
Date: Sun, 15 Jun 2003 16:25:27 +0100
To: <jimbobbs@hotmail.com>
Cc: <www-rdf-interest@w3.org>
Message-ID: <00ad01c33352$5ace24e0$23bc0150@localhost>

> [...] the more experimentation, then the more likely some
> good variants will be created.

Quite. Whilst RDFCore are well overdue going by their chartered
timeline, and whilst the RDF Syntax specification is still not at CR,
RDF/XML is only undergoing a bug fix from the 1999 original; it's a
half-decade-old technology coming to fruition. It may be too widely
deployed now for any alternate serialization to seriously challenge
it, and that, like it or not, is going to put off a lot of people from
using RDF. It's a shame that the main barrier to alternate
serializations, and hence RDF's adoption, is an historical/political
accident.

With any format like RPV or XENT, or your own hypothetical YARS (Yet
Another...), the suitability of the language--how well it fits the
requirements of those who use RDF--is unfortunately a small factor.
The XML and RDF communities are full of a lot of people who have very
strong opinions about lots of things--by necessity, though it tends to
lead to some quite obsessive and heated likes/dislikes of various
constructs.

For example, as was noted to me on #rdfig [1], XENT itself is pretty
much a mix of constructs from Notation3/N-Triples and RDF/XML mashed
together into one proposal. I tried, of course, to take the best
features from each approach, but the problem is that proponents of RDF
serializations are usually very passionately one-sided about which
method they prefer (talking from my possibly incorrect experience
here). In other words, people tend to favor one serialization very
much over the others. So whilst you'd think that a compromise between
them would be a good idea, it'll probably just end up with almost
every established member of the RDF community snubbing it :-)

I note that even though QNames are used heavily in communications
about SW vocabularies, they're viewed as harmful in the motivation
section of RPV. We need them; we may as well deploy them.

> On abbreviating element names for URIs, there has been
> controversy [2].

A better reference is:-

http://www.w3.org/2001/tag/doc/qnameids-2002-07-15

There's a lot of FUD surrounding QNames, but at the end of the day
they just map prefixes to namespaces. For RDF, it's too handy an
abbreviation mechanism for URIs to pass up. RDF/XML and Notation3
would be lost without them, and N-Triples is basically too difficult to
write because it doesn't have them.
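
As a minimal sketch of that prefix-to-namespace mapping (the prefix
table and the function name are illustrative assumptions, not part of
XENT or of any specification), expansion is just a lookup plus a
concatenation:

```python
# Hypothetical prefix table, for illustration only.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def expand_qname(qname, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URI by concatenation."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError("not a QName: %r" % qname)
    if prefix not in prefixes:
        raise KeyError("unbound prefix: %r" % prefix)
    # RDF-style expansion: namespace URI followed by the local part.
    return prefixes[prefix] + local
```

For instance, expand_qname("dc:title") would give the Dublin Core
title URI under the table above.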

The TAG finding says that parsing costs of QNames in PCDATA may be
high. I've proved that, for Python and SAX at least, the opposite is
true. I suspect that this will be the case in many other languages
too. The TAG, in the finding above, say that since the approach is
widely deployed to good effect, it's "reasonable to use QNames in this
way".
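
For the curious, here is a rough sketch of how QNames in character
data can be resolved with Python's SAX interface. It is not the code
behind the measurement mentioned above and it measures nothing itself;
the class and function names are made up for illustration, and it
assumes QNames are whitespace-separated tokens inside leaf elements.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class QNameTextHandler(ContentHandler):
    """Collect URIs for prefix:local tokens found in element text."""

    def __init__(self):
        super().__init__()
        self.bindings = {}   # prefix -> stack of namespace URIs in scope
        self.buffer = []     # text chunks of the current leaf element
        self.resolved = []   # expanded URIs, in document order

    def startPrefixMapping(self, prefix, uri):
        self.bindings.setdefault(prefix, []).append(uri)

    def endPrefixMapping(self, prefix):
        self.bindings[prefix].pop()

    def startElementNS(self, name, qname, attrs):
        # Simplification: text before a child element is discarded.
        self.buffer = []

    def characters(self, content):
        self.buffer.append(content)

    def endElementNS(self, name, qname):
        for token in "".join(self.buffer).split():
            prefix, sep, local = token.partition(":")
            stack = self.bindings.get(prefix)
            if sep and stack:
                self.resolved.append(stack[-1] + local)
        self.buffer = []


def resolve_qnames(xml_bytes):
    """Parse XML bytes and return the URIs of QNames found in text."""
    handler = QNameTextHandler()
    parser = xml.sax.make_parser()
    # Namespace processing must be on to receive prefix mapping events.
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.resolved
```

The only work beyond what the XML processor already does is one
dictionary lookup per token, since SAX reports the prefix bindings
in scope as it goes.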

> I don't want to build a tokenizer for parsing apostrophes,
> white spaces, and other strings on top of another tokenizer,
> the XML processor.

Why not?

> IMHO, using two or more different escaping methods
> really mucks up the language.

I'm not actually sure what you mean by this--could you expand, please?
Actually, escaping should've been listed as a TODO in my original
announcement. But since XENT uses XML/RDF, it's basically going to
have to use entity escaping *instead* of the Python-esque \uHHHH
method. No big deal, and not an issue that would get in the way of any
standards track work on the format, IMO.
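
To make the difference between the two escaping styles concrete, here
is a small sketch that rewrites \uHHHH escapes as XML numeric
character references. The function name is hypothetical, and the
sketch deliberately ignores \UHHHHHHHH escapes, escaped backslashes,
and the separate need to escape "&" and "<" in XML.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def u_escapes_to_char_refs(text):
    """Rewrite \\uHHHH escapes as XML references of the form &#xHHHH;."""
    return _U_ESCAPE.sub(lambda m: "&#x%s;" % m.group(1).upper(), text)
```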

> I like your ideas; [...]

Thanks. Your feedback is appreciated.

> CC: Tim Bray - as with the original message

Hmm. I think that BCC is better since this thread is liable to go off
topic...

Cheers,

--
Sean B. Palmer, <http://purl.org/net/sbp/>
"phenomicity by the bucketful" - http://miscoranda.com/
Received on Sunday, 15 June 2003 11:25:32 UTC