XML/RDF syntax equivalences from Andrea Chiodi on 2000-05-26 (www-rdf-interest@w3.org from May 2000)

From: Andrea Chiodi <andrea.chiodi@mail.inet.it>
Date: Fri, 26 May 2000 12:32:26 +0200
To: "'www-rdf-interest@w3.org'" <www-rdf-interest@w3.org>
Message-ID: <01BFC70E.75917600.andrea.chiodi@mail.inet.it>

I need to define an XML schema for my application. I'd love to apply RDF 
concepts, but I can't risk scaring my colleagues with triples, the semantic 
web and other philosophical stuff. So I'm trying to convince myself that I 
can write an XML form that will also be parseable as RDF.
This is my reasoning: please correct me if I'm wrong.

Let's take my XML form:

(1)
<author>
    <name> Peppo </name>
    <address>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
    </address>
</author>

I want to interpret it as RDF.
I can assume that all the tags are instances of rdf:Property, defining this 
basic schema:

<rdf:Property ID="author" />
<rdf:Property ID="address"/>
<rdf:Property ID="name" />
<rdf:Property ID="city" />
<rdf:Property ID="state"/>

Now, form (1) can be read as what RDFMS calls the 'second abbreviation 
form' of:

(2)
<rdf:Description>
<author>
  <rdf:Description>
    <name> Peppo </name>
    <address>
      <rdf:Description>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
      </rdf:Description>
    </address>
  </rdf:Description>
</author>
</rdf:Description>
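
The rewriting from (1) to (2) is mechanical, and can be sketched in a few lines. This is only an illustration, assuming Python's standard xml.etree.ElementTree; the names stripe_property and to_striped are hypothetical, and literal values are whitespace-trimmed, which the forms above do not require.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("rdf", RDF_NS)
DESCRIPTION = f"{{{RDF_NS}}}Description"

def stripe_property(prop):
    """Copy a property element, wrapping nested children in rdf:Description."""
    out = ET.Element(prop.tag)
    children = list(prop)
    if not children:
        # Leaf element: keep its text as a literal value (trimmed, by assumption).
        out.text = (prop.text or "").strip()
        return out
    node = ET.SubElement(out, DESCRIPTION)
    for child in children:
        node.append(stripe_property(child))
    return out

def to_striped(plain_xml):
    """Treat the root element as a property of one top-level rdf:Description."""
    top = ET.Element(DESCRIPTION)
    top.append(stripe_property(ET.fromstring(plain_xml)))
    return top

FORM_1 = """<author>
    <name> Peppo </name>
    <address>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
    </address>
</author>"""

striped = to_striped(FORM_1)
print(ET.tostring(striped, encoding="unicode"))
```

The output alternates Description nodes and property elements exactly as in (2).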

The resources in <rdf:Description> have no rdf:type. So, <city> and 
<address> are properties without a defined domain. However, their values 
are resources, so I should be allowed to assume that their rdf:type is 
rdf:Resource. The same form can be written:

(3)
<rdf:Description>
 <rdf:type resource="&rdf;Resource" />
 <author>
  <rdf:Description>
    <rdf:type resource="&rdf;Resource" />
    <name> Peppo </name>
    <address>
      <rdf:Description>
        <rdf:type resource="&rdf;Resource" />
        <city> PeppoCity </city>
        <state> PeppoLand </state>
      </rdf:Description>
    </address>
  </rdf:Description>
 </author>
</rdf:Description>

BTW, applying another abbreviation form I should be allowed to write:

(4)
<rdf:Resource>
<author>
  <rdf:Resource>
    <name> Peppo </name>
    <address>
      <rdf:Resource>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
      </rdf:Resource>
    </address>
  </rdf:Resource>
</author>
</rdf:Resource>

I have never seen 'Resource' used this way, but it seems correct. Any 
comments?

Now, let's say I want to better define my application's schema:

(6)
<rdfs:Class ID="Person" />
<rdfs:Class ID="Address" />

<rdf:Property ID="author"
  rdfs:range="Person" />
<rdf:Property ID="address"
  rdfs:domain="&myschema;Person"
  rdfs:range="Address" />
<rdf:Property ID="name"
  rdfs:domain="&myschema;Person" />
<rdf:Property ID="city"
  rdfs:domain="&myschema;Address" />
<rdf:Property ID="state"
  rdfs:domain="&myschema;Address" />

Now I can write my form as:

(7)
<rdf:Resource>
<author>
  <Person>
    <name> Peppo </name>
    <address>
      <Address>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
      </Address>
    </address>
  </Person>
</author>
</rdf:Resource>

This really represents an <author> property whose value is a <Person> 
object. It is true RDF (I hope). But it is exactly the same form as (1).

So, my hypothesis is:
- The form (1) is RDF.
- Having defined <rdf:Property xxx/> for each XML tag, a standard parser 
should be able to parse it.
- The types in <Description> can be left undefined, or assumed to be 
'rdf:Resource', but ...
- Having defined <rdfs:Class ID="x"> and <Property ID="p" rdfs:range="x">, 
if the (?standard?) RDF parser assumes the rdfs:range as the default type 
for the value of the property p (e.g. <Person> for <author>), it should 
generate the form (7) automatically.
- The root of the XML document (let it be our <author>) could be 
interpreted as a property applied to the outer <Description>, which is the 
only (top level) Description of the RDF document.
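
The range-as-default-type idea in the hypothesis above can be sketched as follows. This is not how any standard RDF parser is known to behave; it only shows the assumed rule. The RANGES table is a hypothetical stand-in for schema (6), the function names are made up, and the top-level node is left as an untyped rdf:Description.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("rdf", RDF_NS)
DESCRIPTION = f"{{{RDF_NS}}}Description"

# Hypothetical lookup standing in for schema (6): property -> rdfs:range class.
RANGES = {"author": "Person", "address": "Address"}

def type_by_range(prop, ranges):
    """Wrap the value of each nested property in a node typed by its range."""
    out = ET.Element(prop.tag)
    children = list(prop)
    if not children:
        # Leaf element: a literal value, trimmed by assumption.
        out.text = (prop.text or "").strip()
        return out
    # Fall back to an untyped rdf:Description when no range is declared.
    node = ET.SubElement(out, ranges.get(prop.tag, DESCRIPTION))
    for child in children:
        node.append(type_by_range(child, ranges))
    return out

def to_typed_form(plain_xml, ranges):
    """Apply the root property to a single untyped top-level node."""
    top = ET.Element(DESCRIPTION)
    top.append(type_by_range(ET.fromstring(plain_xml), ranges))
    return top

FORM_1 = ("<author><name> Peppo </name><address><city> PeppoCity </city>"
          "<state> PeppoLand </state></address></author>")

typed = to_typed_form(FORM_1, RANGES)
print(ET.tostring(typed, encoding="unicode"))
```

With the table above, the value of <author> comes out as a <Person> node and the value of <address> as an <Address> node, as in (7).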

May I continue, or am I already wrong? Please do not unsubscribe me #-(

There are some cases that seem special, where I can't find a clear mapping 
from XML to RDF.

(A) Attributes mixed with elements create a syntax problem, since this form 
is not allowed in RDF (BNF [6.1] in RDFMS):

(8)
<author sex="M">
    <name> Peppo </name>
    <address role="Home">
        <city> PeppoCity </city>
        <state> PeppoLand </state>
    </address>
</author>

I would love to consider this as equivalent to:

(9)
<author>
    <sex> M </sex>
    <name> Peppo </name>
    <address>
        <role> Home </role>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
    </address>
</author>
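
The equivalence between (8) and (9) amounts to a small preprocessing step. A minimal sketch, assuming Python's standard xml.etree.ElementTree; attributes_to_elements is a hypothetical name, and an element carrying both attributes and literal text is not handled.

```python
import xml.etree.ElementTree as ET

def attributes_to_elements(elem):
    """Return a copy of elem in which every attribute becomes a child element."""
    out = ET.Element(elem.tag)
    # Attributes first, in document order, so they precede the original children.
    for name, value in elem.attrib.items():
        ET.SubElement(out, name).text = value
    for child in elem:
        out.append(attributes_to_elements(child))
    if len(out) == 0:
        # Plain leaf: keep the literal text (trimmed, by assumption).
        out.text = (elem.text or "").strip()
    return out

FORM_8 = """<author sex="M">
    <name> Peppo </name>
    <address role="Home">
        <city> PeppoCity </city>
        <state> PeppoLand </state>
    </address>
</author>"""

form_9 = attributes_to_elements(ET.fromstring(FORM_8))
print(ET.tostring(form_9, encoding="unicode"))
```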

To be honest, I don't understand the reason why attributes and elements 
can't be mixed in a property.

A hack, however, could be to consider every attribute of a property as 
applied to the corresponding <Description>, as in:

(10)
<rdf:Description>
<author>
  <rdf:Description sex="M">
    <name> Peppo </name>
    <address>
      <rdf:Description role="Home">
        <city> PeppoCity </city>
        <state> PeppoLand </state>
      </rdf:Description>
    </address>
  </rdf:Description>
</author>
</rdf:Description>

This is allowed (BNF [6.3] in RDFMS). However, (9) will not be parsed by a 
standard RDF processor.
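
The hack can be sketched as a variant of the striping step: each property's attributes are carried over to the Description node that holds its value. Again this assumes Python's standard xml.etree.ElementTree; stripe_with_attributes is a hypothetical name, and attributes on leaf (literal-valued) properties are simply ignored.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("rdf", RDF_NS)
DESCRIPTION = f"{{{RDF_NS}}}Description"

def stripe_with_attributes(prop):
    """Wrap nested children in rdf:Description and move attributes onto it."""
    out = ET.Element(prop.tag)
    children = list(prop)
    if not children:
        # Leaf property: literal value only; its attributes are not handled here.
        out.text = (prop.text or "").strip()
        return out
    # The property's own attributes become attributes of the value node.
    node = ET.SubElement(out, DESCRIPTION, dict(prop.attrib))
    for child in children:
        node.append(stripe_with_attributes(child))
    return out

FORM_8 = """<author sex="M">
    <name> Peppo </name>
    <address role="Home">
        <city> PeppoCity </city>
        <state> PeppoLand </state>
    </address>
</author>"""

top = ET.Element(DESCRIPTION)
top.append(stripe_with_attributes(ET.fromstring(FORM_8)))
print(ET.tostring(top, encoding="unicode"))
```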

(B) Sometimes (e.g. in the P3P specification), I've seen empty elements 
used as values, like <male/> in:

(11)
<author>
    <sex> <male/> </sex>
    <name> Peppo </name>
</author>

Even in this case, I can define <Property ID="male" />, and interpret (11) 
in this way:

(12)
<author>
  <rdf:Description>
    <sex>
      <rdf:Description>
        <male/>
      </rdf:Description>
    </sex>
    <name> Peppo </name>
  </rdf:Description>
</author>

The value of <male> is empty, but it is nevertheless *defined* (while, in 
this example, <female/> is undefined). This is like in the C language: 
'#define male 1' and '#define male' are the same from the point of view of 
'#ifdef male'.

Anyway, the resulting RDF model is a bit strange: a reasonable schema will 
probably define
<Class ID="male" rdfs:domain="sex" />, not <Property ID="male" />.

Maybe, I can write:

<rdfs:Class ID="male"  rdfs:domain="sex" />
<rdfs:Class ID="female"  rdfs:domain="sex" />
<rdf:Property ID="sex"
  rdfs:range="male"
  rdfs:range="female"
/>

Will a standard RDF parser interpret it in this way?

(13)
<author>
  <rdf:Description>
    <sex>
      <rdf:Description>
        <rdf:type resource ="male" />
      </rdf:Description>
    </sex>
    <name> Peppo </name>
  </rdf:Description>
</author>

If this is true, I could *drive* the RDF parser through the XML form. I 
can force the interpretation of:

(14)
<book>
<author>
    <name> Peppo </name>
    <address>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
    </address>
</author>
</book>

with:

(15)
<rdfs:Class ID="book" />
<rdf:Property ID="author" />
<rdf:Property ID="address"/>
<rdf:Property ID="name" />
<rdf:Property ID="city" />
<rdf:Property ID="state"/>

to obtain (same as (2), but now the root is forced to be a class, not a 
property):

(16)
<rdf:Description>
<rdf:type resource="book" />
<author>
  <rdf:Description>
    <name> Peppo </name>
    <address>
      <rdf:Description>
        <city> PeppoCity </city>
        <state> PeppoLand </state>
      </rdf:Description>
    </address>
  </rdf:Description>
</author>
</rdf:Description>
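
To make the intended reading of (16) concrete, here is a rough sketch that walks the striped form and lists the statements it would carry. It is not a real RDF parser: it assumes Python's standard xml.etree.ElementTree, invents blank-node labels (_:b0, _:b1, ...), reads the unqualified resource attribute as written in (13), and trims literal values. The name node_triples is hypothetical.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = f"{{{RDF_NS}}}Description"
RDF_TYPE = f"{{{RDF_NS}}}type"

def node_triples(node, counter, triples):
    """Collect (subject, predicate, object) tuples for one node element."""
    subject = f"_:b{next(counter)}"  # invented blank-node label
    if node.tag != DESCRIPTION:
        # Typed node: the element name stands for the class.
        triples.append((subject, "rdf:type", node.tag))
    for prop in node:
        if prop.tag == RDF_TYPE:
            triples.append((subject, "rdf:type", prop.get("resource")))
        elif len(prop):
            # The property value is a nested node: recurse, then link to it.
            obj = node_triples(prop[0], counter, triples)
            triples.append((subject, prop.tag, obj))
        else:
            triples.append((subject, prop.tag, (prop.text or "").strip()))
    return subject

FORM_16 = """<rdf:Description
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:type resource="book"/>
  <author>
    <rdf:Description>
      <name> Peppo </name>
      <address>
        <rdf:Description>
          <city> PeppoCity </city>
          <state> PeppoLand </state>
        </rdf:Description>
      </address>
    </rdf:Description>
  </author>
</rdf:Description>"""

triples = []
node_triples(ET.fromstring(FORM_16), itertools.count(), triples)
for triple in triples:
    print(triple)
```

The root node ends up typed as book, with author as one of its properties, which is the interpretation (15) is meant to force.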


Thank you for reading this long posting!

---
Andrea Chiodi (andrea.chiodi@mail.inet.it)
Received on Friday, 26 May 2000 07:57:18 UTC