Re: xml without rdf, but with an ontology [0]

[trimming CC's]

On Thu, 2005-01-13 at 09:37, Ian Davis wrote:
> On 13/01/2005 13:37, Henry Story wrote:
> > I am not sure how to model xml fragments correctly. But there may be 
> > transformations one can make
> > if one knew how to do that correctly. I think there will have to be a 
> > list of string and nodes in it.
> > There must be some solution to this...
> 
> I'm not convinced that there is a general solution (to the problem of 
> converting arbitrary XML to useful triples).

No.  The format and its spec have to express that they are RDF;
interpreting arbitrary XML as RDF is dangerous, though interesting. :)

But it would also be nice to have another XML serialization option
that's closer to what XML people tend to write, and thus hopefully a
way to lure people into adopting RDF as well...

> There are various approaches, all of which appear to have problems. I've 
> thought in the past about heuristics such as "if the tag contains 
> any character data then the object of the triple is a Literal or 
> XMLLiteral, otherwise it's a blank node". However, all of the following 
> need to be handled consistently and my heuristic fails:

Yes, I was thinking the parser would implement a heuristic along these
lines, with the notable exception of mixed content.  Modulo markup,
there's not a lot of mixed content.  And if the tools borked on
encountering it, I believe people would be willing to accept it as being
a bit weird.

A translation...

> <feed>

[ a :Feed;

>    <entry>
>     <extension>some content</extension>
>    </entry>

 :entry [ :extension "some content" ];

>    <entry>
>     <extension>some <term>mixed</term> content</extension>
>    </entry>

# error, as per above.

>    <entry>
>     <extension><term>mixed</term></extension>
>    </entry>

 :entry [ :extension [ :term "mixed" ] ];

>    <entry>
>     <extension />
>    </entry>

# ooh ...

 :entry [ :extension [] ];

# or maybe...

 :entry [ :extension rdf:nil ];

# ?

> </feed>

].
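
The translation above could be sketched roughly as follows. This is only
an illustration of the heuristic, not a proposed implementation: the
names (properties, to_n3, document_to_n3, MixedContentError) are made up
for the example, XML namespaces and attributes are ignored, the default
":" prefix is assumed to be declared elsewhere, the root element is
typed by capitalizing its tag name, and an empty element is arbitrarily
mapped to [] rather than rdf:nil.

```python
import xml.etree.ElementTree as ET


class MixedContentError(ValueError):
    """Raised when an element has both character data and child elements."""


def properties(elem):
    """Return (text, props) for elem, rejecting mixed content."""
    children = list(elem)
    # Character data can appear before the first child (elem.text)
    # or after any child (child.tail).
    chunks = [elem.text or ""] + [child.tail or "" for child in children]
    text = "".join(chunks).strip()
    if children and text:
        raise MixedContentError(f"mixed content in <{elem.tag}>")
    return text, [f":{child.tag} {to_n3(child)}" for child in children]


def to_n3(elem):
    """Return the N3 object for elem: a literal or a blank node."""
    text, props = properties(elem)
    if props:
        return "[ " + "; ".join(props) + " ]"
    if text:
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return "[]"  # empty element: assumption, rdf:nil is the other candidate


def document_to_n3(xml_text):
    """Translate a whole document; the root becomes a typed blank node."""
    root = ET.fromstring(xml_text)
    _, props = properties(root)
    typed = [f"a :{root.tag.capitalize()}"] + props
    return "[ " + "; ".join(typed) + " ] ."
```

Feeding it the feed above without the mixed-content entry yields one
blank node typed :Feed with three :entry properties; with the
mixed-content entry included it raises MixedContentError instead of
guessing.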

> I think the only way is to use an annotated schema to provide the 
> transformation hints.

Perhaps, though hopefully only to make certain hard things possible.  I
believe that 80% of the common uses of XML for describing data formats
can be done with schema-agnostic rules or simple in-band data.

...jsled

-- 
http://asynchronous.org/ - `a=jsled; b=asynchronous.org; echo ${a}@${b}`

Received on Thursday, 13 January 2005 15:09:24 UTC