- From: Paul Tyson <phtyson@sbcglobal.net>
- Date: Tue, 01 Jul 2008 23:36:57 -0500
- To: semantic-web@w3.org
Maciej, here is another way to look at this. It is not any simpler, but
it does illustrate a point of isomorphism between XML and RDF.
Take each of your XML samples and convert it to Infoset RDF. The first
sample would look like this:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xis: <http://www.w3.org/2001/04/infoset#> .

_:jA29510 a xis:Document ;
    xis:children _:jA29512 ;
    xis:documentElement _:jUd0e1 .

_:jA29512 a xis:InfoSetSeq ;
    rdf:_1 _:jUd0e1 .

_:jA29514 a xis:InfoSetSeq ;
    rdf:_1 _:jUd0e2 ;
    rdf:_2 _:jUd0e4 .

_:jA29516 a xis:InfoSetSeq ;
    rdf:_1 "Sensor220" .

_:jA29518 a xis:InfoSetSeq ;
    rdf:_1 _:jUd0e5 .

_:jA29520 a xis:InfoSetSeq ;
    rdf:_1 "E330" .

_:jUd0e1 a xis:Element ;
    xis:children _:jA29514 ;
    xis:localName "Sensor" .

_:jUd0e2 a xis:Element ;
    xis:children _:jA29516 ;
    xis:localName "name" .

_:jUd0e4 a xis:Element ;
    xis:children _:jA29518 ;
    xis:localName "isLocatedNearBy" .

_:jUd0e5 a xis:Element ;
    xis:children _:jA29520 ;
    xis:localName "Road" .
Then you could write a SPARQL query to get the information you wanted
from any of the three formats, by using a UNION of patterns. If later
you introduced a new XML structure you would add another UNION pattern
to your query. Or you could CONSTRUCT a new graph in the desired schema
from any of the various input schemas, again by using a UNION of
patterns in the WHERE clause.
You could of course do the same thing by writing an XSLT stylesheet to
convert any of your input formats to a single output format.
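To make the "single output format" idea concrete without a full
stylesheet, here is a rough sketch in Python using only the standard
library. It is not the XSLT itself, just the same normalization step:
each of the three layouts is reduced to one (sensor, road) pair. The
function names, the compact sample strings (with closing tags corrected)
and the uniform spelling "Sensor220" are my own assumptions for
illustration.

```python
import xml.etree.ElementTree as ET

# The three layouts from the quoted message, written compactly.
# Assumptions: closing tags are corrected and the sensor name is spelled
# "Sensor220" in all three so the results can be compared directly.
SAMPLES = [
    "<Sensor><name>Sensor220</name>"
    "<isLocatedNearBy><Road>E330</Road></isLocatedNearBy></Sensor>",
    "<Sensor><name>Sensor220</name>"
    "<isLocatedNearBy><Road><name>E330</name></Road></isLocatedNearBy></Sensor>",
    "<Road><name>E330</name>"
    "<hasSensor><Sensor><name>Sensor220</name></Sensor></hasSensor></Road>",
]


def _text(element):
    """Return the trimmed text of an element, or "" if it is missing."""
    if element is None:
        return ""
    return (element.text or "").strip()


def normalize(xml_text):
    """Reduce any of the three layouts to a (sensor_name, road_name) pair."""
    root = ET.fromstring(xml_text)
    if root.tag == "Sensor":
        sensor = _text(root.find("name"))
        road_el = root.find("isLocatedNearBy/Road")
        if road_el is None:
            raise ValueError("Sensor without isLocatedNearBy/Road")
        # Layout 2 keeps the road name in a <name> child;
        # layout 1 keeps it as the text of <Road> itself.
        road = _text(road_el.find("name")) or _text(road_el)
    elif root.tag == "Road":
        road = _text(root.find("name"))
        sensor = _text(root.find("hasSensor/Sensor/name"))
    else:
        raise ValueError("unrecognised root element: " + root.tag)
    return sensor, road


if __name__ == "__main__":
    for sample in SAMPLES:
        print(normalize(sample))
```

Each branch plays the role of one UNION pattern (or one XSLT template):
supporting a new XML structure means adding another branch.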
Any XML instance can be considered a compact, early-bound serialization
of an infoset RDF graph.
A simple, generic XSLT stylesheet can convert an arbitrary XML instance
to Infoset RDF. Here's a sample that does most of it.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xis="http://www.w3.org/2001/04/infoset#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    version="2.0">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Document node: point at the document element, then list all children -->
  <xsl:template match="/">
    <rdf:RDF>
      <xis:Document>
        <xis:documentElement rdf:nodeID="{generate-id(child::*[1])}"/>
        <xis:children>
          <xis:InfoSetSeq>
            <xsl:apply-templates/>
          </xis:InfoSetSeq>
        </xis:children>
      </xis:Document>
    </rdf:RDF>
  </xsl:template>

  <!-- Element: local name, optional attribute set, optional child sequence -->
  <xsl:template match="*">
    <rdf:li>
      <xis:Element rdf:nodeID="{generate-id()}">
        <xis:localName><xsl:value-of select="local-name()"/></xis:localName>
        <xsl:if test="@*">
          <xis:attributes>
            <xis:AttributeSet>
              <xsl:apply-templates select="@*"/>
            </xis:AttributeSet>
          </xis:attributes>
        </xsl:if>
        <xsl:if test="*|text()|comment()|processing-instruction()">
          <xis:children>
            <xis:InfoSetSeq>
              <xsl:apply-templates/>
            </xis:InfoSetSeq>
          </xis:children>
        </xsl:if>
      </xis:Element>
    </rdf:li>
  </xsl:template><!-- match="*" -->

  <!-- Attribute: local name and value -->
  <xsl:template match="@*">
    <rdf:li>
      <xis:Attribute>
        <xis:localName><xsl:value-of select="local-name()"/></xis:localName>
        <xis:normalizedValue><xsl:value-of select="."/></xis:normalizedValue>
      </xis:Attribute>
    </rdf:li>
  </xsl:template><!-- match="@*" -->

  <!-- Text node: emitted as a plain literal with whitespace normalized -->
  <xsl:template match="text()">
    <rdf:li>
      <xsl:value-of select="normalize-space(.)"/>
    </rdf:li>
  </xsl:template>

  <xsl:template match="comment()">
    <rdf:li>
      <xis:Comment>
        <xis:content>
          <xsl:value-of select="."/>
        </xis:content>
      </xis:Comment>
    </rdf:li>
  </xsl:template><!-- match="comment()" -->

  <!-- Processing instruction: local-name() returns the PI target -->
  <xsl:template match="processing-instruction()">
    <rdf:li>
      <xis:ProcessingInstruction>
        <xis:target>
          <xsl:value-of select="local-name()"/>
        </xis:target>
        <xis:content>
          <xsl:value-of select="."/>
        </xis:content>
      </xis:ProcessingInstruction>
    </rdf:li>
  </xsl:template>

</xsl:stylesheet>
Maciej Gawinecki wrote:
>
> In one of the articles comparing two data models, XML and RDF, I found a
> statement saying (I'm loosely citing from memory):
>
> Searching XML with an XPath query expression is easy if you know the
> schema of the document being queried. However, the same query will not
> work on a document which is differently structured but contains
> equivalent information. This can be solved by using the RDF model,
> which can then be queried with an RDQL or SPARQL query.
>
> Is it really true that XPath-based XML search is limited by document
> structure? Yes, that's why there is a great deal of research on keyword-based
> querying of XML documents (not knowing the schema in advance). But is RDF
> really better for this issue?
>
> I will try to give a few examples of what exactly I mean. [Of course, I'm
> omitting here the problem of knowing the name of a tag/property/resource;
> only the structure can be different.] Let's see two XML documents:
>
> <Sensor>
>   <name>Sensor220</name>
>   <isLocatedNearBy>
>     <Road>
>       E330
>     </Road>
>   </isLocatedNearBy>
> </Sensor>
>
> Here the road value can be checked with the XPath expression:
> //Sensor/isLocatedNearBy/Road
>
> And let's see a differently structured document (road defined by a name
> property)
>
> <Sensor>
>   <name>Sensor220</name>
>   <isLocatedNearBy>
>     <Road>
>       <name>E330</name>
>     </Road>
>   </isLocatedNearBy>
> </Sensor>
>
> With the XPath expression: //Sensor/isLocatedNearBy/Road/name
>
> Or yet another one (road is an ancestor tag of the sensor tag, not the
> opposite)
>
> <Road>
>   <name>E330</name>
>   <hasSensor>
>     <Sensor>
>       <name>Sensor 220</name>
>     </Sensor>
>   </hasSensor>
> </Road>
>
> XPath: //Road/name
>
> The same problem would arise with RDF. Let's see the first model:
>
> :Sensor220 :isLocatedNearBy :Road_E330 .
>
> The WHERE clause of a SPARQL query would then look like:
>
> ?s :isLocatedNearBy :Road_E330 .
>
> For other version we define a road with a specific value of hasName
> property:
>
> :Sensor220 :isLocatedNearBy :RoadXXX .
> :RoadXXX :hasName "E330" .
>
> the SPARQL query part:
>
> ?s :isLocatedNearBy ?r .
> ?r :hasName "E330" .
>
> or by analogy to the third XML representation (road "has" a sensor, not
> the opposite):
>
> :RoadXXX :hasName "E330" .
> :RoadXXX :hasSensor :Sensor220 .
>
> the SPARQL query part:
>
> ?r :hasName "E330" .
> ?r :hasSensor ?s .
>
> Can someone comment on it?
>
> Thanks,
> Maciej
>
>
>
>
Received on Wednesday, 2 July 2008 04:36:40 UTC