Re: Transforming XML content into RDF assertions

From: Murray Spork <m.spork@qut.edu.au>
Date: Mon, 07 Oct 2002 11:57:07 +1000
To: www-rdf-interest@w3.org
Message-id: <3DA0E9F3.9040408@qut.edu.au>

m batsis wrote:

[...]

> IMHO the bottleneck is not in the transformation; this can be done in 
> many ways (XSLT, SAX, put your stuff here) or it may not happen at all. 
> The problems are in the complexity of such a transformation result or 
> the actual XML. Almost any XML can be interpreted as RDF but some of the 
> challenges are:
> 
>  * The amount of unneeded information and the process of filtering this 
> out of the transformation result.

The RSS 2.0 folks seem to be claiming the opposite - that an RDF syntax 
causes unnecessary verbosity and that their XML-only syntax is much cleaner.

>  * The result RDF can only be considered as a temporary graph that 
> cannot really be merged with others, unless there is a way to avoid 
> inconsistency between resource identifiers (which may be different for 
> the same resource in different transformations).

I don't understand why this would be the case - surely this is just a 
matter for how the XSLT transform is written?
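To illustrate, here is a minimal sketch in Python (rather than XSLT, purely for brevity). It assumes that every source document carries a stable id attribute on each item; the base URIs, the element names and the to_triples function are all hypothetical. If each transform derives the subject URI from that id in the same way, the resulting graphs can be merged by simple set union.

```python
import xml.etree.ElementTree as ET

# Hypothetical base URIs, chosen only for this illustration.
ITEM_BASE = "http://example.org/items/"
TERM_BASE = "http://example.org/terms/"

def to_triples(xml_text):
    """Map every <item id="..."> element to (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = set()
    for item in root.iter("item"):
        # The subject URI depends only on the id, not on the document layout.
        subject = ITEM_BASE + item.get("id")
        for child in item:
            triples.add((subject, TERM_BASE + child.tag, child.text))
    return triples

# Two differently structured documents that describe the same resource.
doc_a = '<feed><item id="42"><title>Hello</title></item></feed>'
doc_b = '<other><group><item id="42"><author>Murray</author></item></group></other>'

# Merging is a plain set union because both transforms agree on the subject URI.
merged = to_triples(doc_a) | to_triples(doc_b)
subjects = {s for s, _, _ in merged}
```

Both statements end up attached to the single subject http://example.org/items/42. The hard part, of course, is agreeing on the id in the first place, not writing the transform.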

> If the XML has been designed with RDF in mind (avoiding meaningless 
> containers and using URIs or IDs to identify what RDF sees as subjects) 
> then the problem is much easier to solve.

I guess the more general the XML -> RDF transform mechanism, the more 
likely you are to run into these problems. I've been much less ambitious: 
my transforms are pretty specific, and I write each one with a particular 
XML -> RDF mapping in mind. The structural problems you mention above 
then tend to disappear. Identification is always going to be problematic, 
I feel, but I think that problem is general to RDF and not necessarily 
specific to XML -> RDF transforms.

I do try to make the identification of resources (in my general XML 
documents) as RDF-friendly as possible. That means using URIRefs where 
possible (especially if the resource in question has been defined by 
another party), or ids that are easily transformable into URIRefs.
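As a rough sketch of what "easily transformable" could mean in practice (in Python; the to_uriref function, the default base URI and the list of recognised schemes are assumptions made for this example): identifiers that are already absolute URIs pass through untouched, and local ids are percent-encoded and appended to a fixed base.

```python
from urllib.parse import quote, urlparse

def to_uriref(identifier, base="http://example.org/id/"):
    """Return a URI reference for an identifier found in an XML document."""
    # Identifiers defined by another party are assumed to be absolute URIs already.
    if urlparse(identifier).scheme in ("http", "https", "urn"):
        return identifier
    # Local ids are percent-encoded and resolved against a fixed, hypothetical base.
    return base + quote(identifier.strip(), safe="")

external = to_uriref("http://purl.org/dc/elements/1.1/title")
local = to_uriref("item 42")
```

Here external is returned unchanged, while local becomes http://example.org/id/item%2042. Because the mapping is deterministic, running it twice over the same id always yields the same URIRef.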

In the case of RSS 2.0 (I haven't looked too deeply into this) I assumed 
that everyone would be using the same XSLT doc to do the RSS 2.0 -> RDF 
transform. That transform would embed the logic for identifying 
resources, and therefore everyone would still be using a consistent 
identification scheme (well, at least as consistent as RSS 0.9x).

[BTW - please don't take this as me supporting RSS 2.0]

> Personally, when designing XML schemas for clients, I find it much 
> easier to use something close to the RDF model than a fancy XML one with 
> sections in the document grouping statements etc. The result is simple 
> and predictable while code designed for it can be highly reusable.

I tend not to worry too much about what impact the structure of my XML 
doc will have on an RDF view. I just design the XML doc to do the job I 
want it to do; then, if I want to do RDF stuff with it afterwards, I'll 
write an XSLT sheet to handle that. But I can see that making my XML 
more RDF-friendly may allow for greater reuse (of the transform code at 
least).

Cheers,

-- 
Murray Spork
Centre for Information Technology Innovation (CITI)
The Redcone Project
Queensland University of Technology, Brisbane, Australia
Phone: +61-7-3864-9488
Email: m.spork@qut.edu.au
Web: http://redcone.gbst.com/
Received on Sunday, 6 October 2002 21:56:48 GMT
