- From: Danny Ayers <danny666@virgilio.it>
- Date: Fri, 16 Aug 2002 20:44:31 +0200
- To: "RDF-Interest" <www-rdf-interest@w3.org>, "Seth Russell" <seth@robustai.net>
>Yes, I was going to suggest this schema too. But to convert any kind of XML
>into RDF, don't you still need a mapping from each element and attribute
>name in the XML document to the set of things defined by the RDF model?
>I've used translatesTo in the following example:
>
>foo
>    type infoset:Element;
>    translatesTo sru:Node.
>bar
>    type infoset:Attribute;
>    sru:translatesTo sru:Arrow;
>    sru:content rdfs:Resource.
>
>So that the following XML:
>
><foo>
>    <bar>http://robustai.net/sailor/</bar>
></foo>
>
>Would end up being the following RDF:
>
><rdf:description>
>    <bar resource="http://robustai.net/sailor/" />
></rdf:description>
>
>This needs a lot more work, but perhaps you get the drift. The point being
>that we can extract RDF from any given XML but just describing the XML
>schema in this manner. Alternatively you could just assume that Elements
>alternate between nodes and arrows as they nest and that all attributes are
>arrows ... but making this explicit should give us better results.

I've been attacking this general problem on a few levels, in different ways -
one of which is very like your translatesTo (I have a mapSource and a
mapTarget instead). Common to all of them is an internal representation that
in itself has virtually no constraints, just a decorated digraph.

For straight XML with arbitrary/unknown semantics, so far I've been taking it
in with vertices corresponding to elements, with these vertices carrying the
set of attributes from the element as a hashtable. The nesting of the tree is
interpreted as edges/arcs. It would be straightforward to include the
attributes as arc+nodes dangling off the element, but with the formats I've
played with so far this hasn't been needed.
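To make the "decorated digraph" reading of plain XML concrete, here is a minimal Python sketch. It is not the application described in this message: the `Vertex` class, the `xml_to_digraph` function and the `vertexN` naming are assumptions made for illustration. Each element becomes a vertex that carries its attributes as a dictionary, and nesting becomes a parent-to-child edge.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """One vertex per XML element; the element's attributes ride along as a dict."""
    vid: str
    label: str
    attributes: dict = field(default_factory=dict)


def xml_to_digraph(xml_text):
    """Return (vertices, edges) for an XML string.

    Edges are (parent_id, child_id) pairs derived from element nesting.
    Text content is ignored in this sketch.
    """
    vertices = []
    edges = []

    def visit(element, parent_id):
        # Ids are assigned in document order: vertex1, vertex2, ...
        vid = f"vertex{len(vertices) + 1}"
        vertices.append(Vertex(vid, element.tag, dict(element.attrib)))
        if parent_id is not None:
            edges.append((parent_id, vid))
        for child in element:
            visit(child, vid)

    visit(ET.fromstring(xml_text), None)
    return vertices, edges


if __name__ == "__main__":
    vertices, edges = xml_to_digraph('<a> <b x="4"/> </a>')
    for v in vertices:
        print(v.vid, v.label, v.attributes)
    print(edges)
```

Turning attributes into dangling arc+node pairs instead would only need an extra loop over `element.attrib` inside `visit`.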
The options are fairly wide open for taking the internal graph and making RDF
from it (probably with dangling nodes), and you can see what I wanted the
vocabulary for - something like:

source xml -

<a>
  <b x="4"/>
</a>

internal graph -

[a] -[parent]-> [b {x=4}]

output rdf -

(I wasted half an hour trying to manually work this out, then realised it'd
be quicker to hack the code - there are undoubtedly many mistakes. Anyone
happen to know how to prettyprint away the RDFNsIds in Jena, btw?)

<rdf:RDF
  xmlns:RDFNsId2='http://www.w3.org/2001/04/infoset#'
  xmlns:RDFNsId1='http://purl.org/puninj/2001/05/rgml-schema#'
  xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
  xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <RDFNsId1:Node rdf:about='vertex1'
    rdfs:label='a'
    rdf:type='http://www.w3.org/2001/04/infoset#Element'/>
  <RDFNsId1:Node rdf:about='vertex2'
    rdfs:label='b'
    rdf:type='http://www.w3.org/2001/04/infoset#Element'>
    <RDFNsId2:attributes>
      <RDFNsId2:AttributeSet>
        <rdf:_1 rdf:type='http://www.w3.org/2001/04/infoset#Attribute'
          RDFNsId2:localName='x'
          RDFNsId2:normalizedValue='4'/>
      </RDFNsId2:AttributeSet>
    </RDFNsId2:attributes>
  </RDFNsId1:Node>
  <RDFNsId1:Edge rdf:ID='e1'>
    <RDFNsId1:source rdf:resource='#vertex1'/>
    <RDFNsId1:target rdf:resource='#vertex2'/>
  </RDFNsId1:Edge>
</rdf:RDF>

The IDs are just local for now - imagine a http:// behind.

RGML is a minimal graph vocabulary.

Hmm - that is a little more verbose than

<a><b x="4"/></a>

Maybe it isn't such a good idea after all...

At the smartest (and least implemented) level, I've been putting together a
subsystem for taking a graph (or tree) with any semantics and mapping it to
the generalised internal graph using transformations specified in an RDF
file. This is rather like the RDFPath idea, but I think by allowing
processing instructions as well the system should be able to handle a wider
variety of input/output. Once the data is in the internal graph, it can be
mapped out again in the same declarative fashion.
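As a rough illustration of a declaratively specified mapping (translatesTo in the quoted message, mapSource/mapTarget here), the Python sketch below drives an XML-to-triples conversion from a small table. Everything in it is an assumption for illustration: a plain dictionary stands in for the RDF mapping file, the `MAPPING` and `xml_to_triples` names are hypothetical, and the intermediate internal graph is skipped so the example stays short.

```python
import xml.etree.ElementTree as ET
from itertools import count

# Hypothetical mapping table: element name -> role in the RDF graph.
# It mirrors the quoted example: foo is a node, bar is an arrow (property)
# whose text content is a resource rather than a literal.
MAPPING = {
    "foo": {"role": "node"},
    "bar": {"role": "arrow", "content": "resource"},
}


def xml_to_triples(xml_text, mapping):
    """Return a list of (subject, predicate, object) string triples."""
    triples = []
    blank_ids = count(1)

    def visit(element, subject):
        rule = mapping.get(element.tag)
        if rule is None:
            raise KeyError(f"no mapping rule for element <{element.tag}>")
        if rule["role"] == "node":
            # A node element introduces a fresh blank node; its children
            # are interpreted with that node as their subject.
            node = f"_:b{next(blank_ids)}"
            for child in element:
                visit(child, node)
        else:
            # An arrow element becomes a property of the enclosing node.
            if subject is None:
                raise ValueError(f"arrow <{element.tag}> has no enclosing node")
            text = (element.text or "").strip()
            if rule.get("content") == "resource":
                obj = f"<{text}>"
            else:
                obj = f'"{text}"'
            triples.append((subject, element.tag, obj))

    visit(ET.fromstring(xml_text), None)
    return triples


if __name__ == "__main__":
    xml_text = "<foo><bar>http://robustai.net/sailor/</bar></foo>"
    for triple in xml_to_triples(xml_text, MAPPING):
        print(triple)
```

Making the element roles explicit in the table avoids having to guess that elements alternate between nodes and arrows as they nest.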
This stuff very quickly flies off into the deep end, so I've spent quite a
lot of time hard-coding the transformations for various languages and trying
to get the core model in the application in a form that would be a good
compromise between versatility & easiness.

Incidentally, Manos mentioned redundancy, and my app is becoming a big friend
of that - below is the SVG generated at the same time from the input xml.

Cheers,
Danny.

<?xml version="1.0"?>
<?xml-stylesheet href="null" type="text/css"?>
<svg contentScriptType="text/ecmascript" zoomAndPan="magnify"
     contentStyleType="text/css" viewBox="0 0 800 600"
     preserveAspectRatio="xMidYMid meet"
     xmlns="http://www.w3.org/2000/svg" version="1.0">
  <defs>
    <marker refX="0" markerUnits="strokeWidth" refY="5" orient="auto"
            class="triangle" markerHeight="9" id="triangle"
            viewBox="0 0 10 10" preserveAspectRatio="xMidYMid meet"
            markerWidth="12">
      <path d="M 0 0 L 10 5 L 0 10 z"/>
    </marker>
  </defs>
  <g uri="edges">
    <g>
      <line x1="120" x2="120" y1="121" y2="186" class=""
            marker-end="url(#triangle)"/>
    </g>
    <g>
      <line x1="120" x2="120" y1="121" y2="186" class=""
            marker-end="url(#triangle)"/>
      <text class="" transform="translate(120,151)"> subclass </text>
    </g>
  </g>
  <g uri="vertices">
    <g transform="translate(120,91) scale(2,2)">
      <rect x="-50" width="100" y="-15" height="30" class="vertex"/>
      <text class="H2"> a </text>
    </g>
    <g transform="translate(120,216) scale(2,2)">
      <ellipse rx="50" ry="15" class="vertex"/>
      <text class="H2"> b </text>
    </g>
  </g>
  <g uri="adjuncts"/>
</svg>
Received on Friday, 16 August 2002 14:53:16 UTC