RDF For Manipulating XML from Sean B. Palmer on 2003-07-01 (www-archive@w3.org from July 2003)

From: Sean B. Palmer <sean@mysterylights.com>
Date: Tue, 1 Jul 2003 02:46:59 +0100
To: <www-archive@w3.org>
Message-ID: <1057787081.IAA22192@phantom.w3.org>

XMLTRAMP [1], and the fact that [ :formerlyCalled "((Echo))" ] [2] and
RSS 2.0 [3] are non-RDF lead me to think that the infoset based
approaches to handling XML in RDF [4], [5] are the wrong way of going
about things.

Keep it simple, stupid.

What we want is some simple model of an XML document that covers 99%
of daily use. We want to be able to easily generate some simple
"readable" RDF from any XML document such that we can actually use the
resulting format for useful things.

So this is about data extraction, as usual, not pedantic nitpickingly
obfuscatory specification-oriented completeness.

It's interesting that XMLTRAMP uses __call__ for non-namespaced
attributes. It's a shame about XML namespace partitioning. Anyway, all
we really need is:-

:Element rdfs:subClassOf
     [ owl:onProperty :content; owl:allValuesFrom rdfs:Literal ] .
:RootElement rdfs:subClassOf :Element .
:Attribute a rdfs:Class; rdfs:subClassOf
     [ owl:onProperty :content; owl:allValuesFrom :ContentList ] .
:name rdfs:domain :Element, :Attribute; rdfs:range :QName .
:attributes rdfs:domain :Element; rdfs:range :AttributeList .
:content rdfs:domain :Element, :Attribute .
:QName _:example ("http://www.w3.org/1999/xhtml" "html") .
:AttributeList rdfs:comment "list of Attributes" .
:ContentList rdfs:comment "mixture of literals an Elements" .

Okay, but that's fairly obvious stuff. The exciting thing comes from
actually querying the data...

The problem is that we want to be able to access the data easily. We
want to be able to:-

* Get all attributes/child elements of an element according to some
template.
* Get a particular index or slice of child elements.
* Get the content of an element as an XMLLiteral.

So in short we need a set of DOM/TRAMP/whatever builtins for this sort
of stuff.

Hmm. There was a point to this rant, but it seems to have eluded me or
failed to materialize. Oh, I guess I want to write an RSS crawler in
RDF. Need HTTP mechanisms for that, though, and there's no way that'll
work in N3. Which reminds me that the Semantic Web needs an RDF
programming language.

Mmm... www-archive.

[1] http://www.aaronsw.com/2002/xmltramp/
[2] http://www.intertwingly.net/wiki/pie/
[3] http://backend.userland.com/rss2
[4] http://www.w3.org/2000/10/swap/infoset/xmod67
[5] http://infomesh.net/2002/03-10xslir/

--
Sean B. Palmer, <http://purl.org/net/sbp/>
"phenomicity by the bucketful" - http://miscoranda.com/

Received on Tuesday, 1 July 2003 02:35:59 UTC