Re: straw proposal for mapping XML schema valid XML data to RIF frames

From: Gary Hallmark <gary.hallmark@oracle.com>
Date: Tue, 14 Oct 2008 22:21:21 -0700
Message-ID: <48F57DD1.70704@oracle.com>
To: Dave Reynolds <der@hplb.hpl.hp.com>
CC: RIF WG <public-rif-wg@w3.org>

Of course, for the case where there is more than one target namespace, 
one would need the abbreviated syntax, not the very abbreviated.  So 
yeah, I guess maybe we just bite the bullet and adopt their syntax, 
since it works for all cases.
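
To make the difference between the two forms concrete, here is a minimal Python sketch that builds both kinds of IRI. It only mimics the strings quoted below; the function names are made up for illustration, and concatenating several xmlns(...) parts for the multi-namespace case is an assumption, not something checked against the XSCD draft.

```python
def abbreviated_iri(schema_iri, bindings, path):
    """Abbreviated form: schema IRI, then xmlns(...) parts, then xscd(path).

    bindings maps a prefix to a namespace IRI. Joining several xmlns(...)
    parts back to back is an assumption for the multi-namespace case.
    """
    xmlns_parts = "".join(f"xmlns({p},{ns})" for p, ns in bindings.items())
    return f"{schema_iri}#{xmlns_parts}xscd({path})"


def very_abbreviated_iri(target_namespace, path):
    """Proposed "very abbreviated" form: target namespace, '#', bare path."""
    return f"{target_namespace}#{path}"


def strip_prefix(path, prefix):
    """Drop 'prefix:' from the start of each path step."""
    marker = prefix + ":"
    steps = path.split("/")
    return "/".join(s[len(marker):] if s.startswith(marker) else s for s in steps)


def component_iri(schema_iri, bindings, path):
    """Use the very abbreviated form only when the single bound namespace
    is the IRI to the left of '#'; otherwise fall back to the abbreviated form."""
    if len(bindings) == 1:
        prefix, ns = next(iter(bindings.items()))
        if ns == schema_iri:
            return very_abbreviated_iri(ns, strip_prefix(path, prefix))
    return abbreviated_iri(schema_iri, bindings, path)


one_ns = component_iri("http://example.org",
                       {"tns": "http://example.org"},
                       "/tns:shiporder/country")
two_ns = component_iri("http://example.org",
                       {"a": "http://example.org", "b": "http://example.org/other"},
                       "/a:shiporder/b:country")
print(one_ns)
print(two_ns)
```

With a single target namespace the short form is enough; as soon as a second namespace shows up, the prefixes have to be declared somewhere and the longer form is needed.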

Gary Hallmark wrote:
> Dave,
>
> Thanks for the suggestion -- I wasn't aware of this work.  The 
> abbreviated syntax looks verbose in the common case where there is a 
> target namespace, e.g. instead of an IRI like
>
> http://example.org/shiporder/shipto/country
>
> You get something like
>
> http://example.org#xmlns(tns,http://example.org)xscd(/tns:shiporder/country) 
>
>
> I think it is because they don't assume that the IRI on the left of 
> the '#' is in fact the target namespace.  I think that they should 
> consider a "very abbreviated syntax" e.g.
>
> http://example.org#/shiporder/shipto/country
>
> when http://example.org is the target namespace of the global element 
> or type (shiporder here).
>
> I do think it's a good idea to adopt a subset of their abbreviated 
> schema component paths (the part inside xscd(...)), e.g. using /~ to 
> denote the type axis and /@ to denote the attribute axis.  I would not 
> allow //.
>
> Dave Reynolds wrote:
>>
>> Hi Gary,
>>
>> Not studied this properly yet but one comment/question:
>>
>> Might it make sense to use the URI format from XML Schema Component 
>> Designators (the abbreviated syntax) for the class/property URIs?
>>
>> I realize that XSCD is still a work in progress, and has been in 
>> progress for a while, so we probably don't want a dependency. 
>> However, having our URIs compatible with their current proposals 
>> might be worth contemplating.
>>
>> Warning: I've very little knowledge or understanding of XSCD so this 
>> may be a stupid suggestion but at the least we probably want an 
>> answer to any future last call comment around "why not use XSCD".
>>
>> Dave
>>
>> Gary Hallmark wrote:
>>>
>>> This is in response to ACTION-591.
>>>
>>> Mapping XML Schema valid XML Data to RIF Frames
>>> ===============================================
>>>
>>> This is a strawman for mapping XML documents whose structure is 
>>> described by an XML schema[1] to and
>>> from RIF Core frames.
>>>
>>> An XML element has a type as defined by its schema.  The type can be 
>>> simple or complex.  Simple types
>>> can be atomic, lists, or unions.  Atomic types can be primitive 
>>> (e.g. xs:string), or they can be
>>> enumerations or restrictions of atomic types.  A complex type can 
>>> have attributes and content.  The
>>> content can be simple, or it can be a sequence, choice, or set of 
>>> elements.  An attribute has a name and
>>> a value.  The value has a simple type.  Types can be derived by 
>>> extension or restriction.
>>> Elements and types can be named globally or locally.  A local 
>>> element can be defined by referring to a
>>> global element. An element may be defined by referring to a global 
>>> type, or can include the type
>>> definition in the content of its own definition.  There are many 
>>> ways to write the "same" schema.
>>>
>>> Thus, XML schema is quite complex.  Here, we limit our concern to 
>>> mapping elements of complex type to
>>> frames. The only simple types we will handle are the primitive types 
>>> supported by RIF DTB. Our
>>> contribution is mainly to define how to construct IRIs of class 
>>> constants and slot name constants from
>>> the XML schema.
>>>
>>> This is a "strawman by example", so we start with an example 
>>> document (from [2]).
>>>
>>> Example XML Document
>>> --------------------
>>>
>>> <shiporder orderid="889923" xmlns="http://example.org">
>>> <orderperson>John Smith</orderperson>
>>> <shipto>
>>>  <name>Ola Nordmann</name>
>>>  <address>Langgt 23</address>
>>>  <city>4000 Stavanger</city>
>>>  <country>Norway</country>
>>> </shipto>
>>> <item>
>>>  <title>Empire Burlesque</title>
>>>  <note>Special Edition</note>
>>>  <quantity>1</quantity>
>>>  <price>10.90</price>
>>> </item>
>>> <item>
>>>  <title>Hide your heart</title>
>>>  <quantity>1</quantity>
>>>  <price>9.90</price>
>>> </item>
>>> </shiporder>
>>>
>>> We consider 3 ways to write a schema for the above document.
>>>
>>> 1. One big element
>>> ------------------
>>>
>>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>           targetNamespace="http://example.org">
>>>
>>> <xs:element name="shiporder">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element name="orderperson" type="xs:string"/>
>>>   <xs:element name="shipto">
>>>    <xs:complexType>
>>>     <xs:sequence>
>>>      <xs:element name="name" type="xs:string"/>
>>>      <xs:element name="address" type="xs:string"/>
>>>      <xs:element name="city" type="xs:string"/>
>>>      <xs:element name="country" type="xs:string"/>
>>>     </xs:sequence>
>>>    </xs:complexType>
>>>   </xs:element>
>>>   <xs:element name="item" maxOccurs="unbounded">
>>>    <xs:complexType>
>>>     <xs:sequence>
>>>      <xs:element name="title" type="xs:string"/>
>>>      <xs:element name="note" type="xs:string" minOccurs="0"/>
>>>      <xs:element name="quantity" type="xs:positiveInteger"/>
>>>      <xs:element name="price" type="xs:decimal"/>
>>>     </xs:sequence>
>>>    </xs:complexType>
>>>   </xs:element>
>>>  </xs:sequence>
>>>  <xs:attribute name="orderid" type="xs:string" use="required"/>
>>> </xs:complexType>
>>> </xs:element>
>>>
>>> </xs:schema>
>>>
>>> Using the above schema, we represent the shiporder using the 
>>> following RIF-PS:
>>>
>>> Prefix(tns http://example.org)
>>>
>>> _obj1#<tns:/shiporder>
>>> _obj1[<tns:/shiporder@orderid>     -> "889923"
>>>      <tns:/shiporder/orderperson> -> "John Smith"
>>>      <tns:/shiporder/shipto>      -> _obj2
>>>      <tns:/shiporder/item>        -> _obj3
>>>      <tns:/shiporder/item>        -> _obj4
>>> ]
>>>
>>> _obj2#<tns:/shiporder/shipto>
>>> _obj2[<tns:/shiporder/shipto/name>    -> "Ola Nordmann"
>>>      <tns:/shiporder/shipto/address> -> "Langgt 23"
>>>      <tns:/shiporder/shipto/city>    -> "4000 Stavanger"
>>>      <tns:/shiporder/shipto/country> -> "Norway"
>>> ]
>>>
>>> _obj3#<tns:/shiporder/item>
>>> _obj3[<tns:/shiporder/item/title>    -> "Empire Burlesque"
>>>      <tns:/shiporder/item/note>     -> "Special Edition"
>>>      <tns:/shiporder/item/quantity> -> 1
>>>      <tns:/shiporder/item/price>    -> 10.90
>>> ]
>>>
>>> _obj4#<tns:/shiporder/item>
>>> _obj4[<tns:/shiporder/item/title>    -> "Hide your heart"
>>>      <tns:/shiporder/item/quantity> -> 1
>>>      <tns:/shiporder/item/price>    -> 9.90
>>> ]
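
As a rough illustration of how the frames above could be produced mechanically, here is a Python sketch that walks the instance document and names classes and slots by their path from the root, as in the "one big element" style. It is only a sketch under assumptions: it does not read the schema at all, so every leaf value stays a string (the typed 1 and 10.90 above would need the schema types), and to_frames, local and the tuple layout are made-up names.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of names."""
    return tag.rsplit("}", 1)[-1]


def to_frames(root, prefix="tns"):
    """Return a list of (object id, class IRI, [(slot IRI, value), ...])."""
    frames = []
    counter = 0

    def visit(elem, path):
        nonlocal counter
        counter += 1
        oid = f"_obj{counter}"
        slots = []
        frames.append((oid, f"<{prefix}:{path}>", slots))
        # Attributes become slots named with '@'.
        for name, value in elem.attrib.items():
            slots.append((f"<{prefix}:{path}@{local(name)}>", value))
        # Child elements become slots named with '/'.
        for child in elem:
            child_path = f"{path}/{local(child.tag)}"
            if len(child) or child.attrib:
                # Complex content: the slot value is a new object.
                slots.append((f"<{prefix}:{child_path}>", visit(child, child_path)))
            else:
                # Simple content: the slot value is the text, untyped here.
                slots.append((f"<{prefix}:{child_path}>", (child.text or "").strip()))
        return oid

    visit(root, "/" + local(root.tag))
    return frames


# A shortened version of the example document.
DOC = """<shiporder orderid="889923" xmlns="http://example.org">
<orderperson>John Smith</orderperson>
<shipto><name>Ola Nordmann</name><country>Norway</country></shipto>
<item><title>Empire Burlesque</title><quantity>1</quantity></item>
</shiporder>"""

frames = to_frames(ET.fromstring(DOC))
for oid, cls, slots in frames:
    print(f"{oid}#{cls}")
    for slot, value in slots:
        print(f"  {slot} -> {value}")
```

Object ids are handed out in document order, so shipto becomes _obj2 and the first item becomes _obj3, as in the frames above.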
>>>
>>>
>>> 2. Refs to Global Elements and Attributes
>>> -----------------------------------------
>>>
>>> A second equivalent schema uses the following style.
>>>
>>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>           targetNamespace="http://example.org">
>>>
>>> <!-- definition of simple elements -->
>>> <xs:element name="orderperson" type="xs:string"/>
>>> <xs:element name="name" type="xs:string"/>
>>> <xs:element name="address" type="xs:string"/>
>>> <xs:element name="city" type="xs:string"/>
>>> <xs:element name="country" type="xs:string"/>
>>> <xs:element name="title" type="xs:string"/>
>>> <xs:element name="note" type="xs:string"/>
>>> <xs:element name="quantity" type="xs:positiveInteger"/>
>>> <xs:element name="price" type="xs:decimal"/>
>>>
>>> <!-- definition of attributes -->
>>> <xs:attribute name="orderid" type="xs:string"/>
>>>
>>> <!-- definition of complex elements -->
>>> <xs:element name="shipto">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element ref="name"/>
>>>   <xs:element ref="address"/>
>>>   <xs:element ref="city"/>
>>>   <xs:element ref="country"/>
>>>  </xs:sequence>
>>> </xs:complexType>
>>> </xs:element>
>>> <xs:element name="item">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element ref="title"/>
>>>   <xs:element ref="note" minOccurs="0"/>
>>>   <xs:element ref="quantity"/>
>>>   <xs:element ref="price"/>
>>>  </xs:sequence>
>>> </xs:complexType>
>>> </xs:element>
>>>
>>> <xs:element name="shiporder">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element ref="orderperson"/>
>>>   <xs:element ref="shipto"/>
>>>   <xs:element ref="item" maxOccurs="unbounded"/>
>>>  </xs:sequence>
>>>  <xs:attribute ref="orderid" use="required"/>
>>> </xs:complexType>
>>> </xs:element>
>>>
>>> </xs:schema>
>>>
>>> Using the above schema, we represent the shiporder using the 
>>> following RIF-PS:
>>>
>>> Prefix(tns http://example.org)
>>>
>>> _obj1#<tns:/shiporder>
>>> _obj1[<tns:@orderid>     -> "889923"
>>>      <tns:/orderperson> -> "John Smith"
>>>      <tns:/shipto>      -> _obj2
>>>      <tns:/item>        -> _obj3
>>>      <tns:/item>        -> _obj4
>>> ]
>>>
>>> _obj2#<tns:/shipto>
>>> _obj2[<tns:/name>    -> "Ola Nordmann"
>>>      <tns:/address> -> "Langgt 23"
>>>      <tns:/city>    -> "4000 Stavanger"
>>>      <tns:/country> -> "Norway"
>>> ]
>>>
>>> _obj3#<tns:/item>
>>> _obj3[<tns:/title>    -> "Empire Burlesque"
>>>      <tns:/note>     -> "Special Edition"
>>>      <tns:/quantity> -> 1
>>>      <tns:/price>    -> 10.90
>>> ]
>>>
>>> _obj4#<tns:/item>
>>> _obj4[<tns:/title>    -> "Hide your heart"
>>>      <tns:/quantity> -> 1
>>>      <tns:/price>    -> 9.90
>>> ]
>>>
>>> 3. Named Types
>>> --------------
>>>
>>> The third style of schema uses named types.
>>>
>>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>           targetNamespace="http://example.org">
>>>
>>> <xs:simpleType name="stringtype">
>>> <xs:restriction base="xs:string"/>
>>> </xs:simpleType>
>>>
>>> <xs:simpleType name="inttype">
>>> <xs:restriction base="xs:positiveInteger"/>
>>> </xs:simpleType>
>>>
>>> <xs:simpleType name="dectype">
>>> <xs:restriction base="xs:decimal"/>
>>> </xs:simpleType>
>>>
>>> <xs:simpleType name="orderidtype">
>>> <xs:restriction base="xs:string">
>>>  <xs:pattern value="[0-9]{6}"/>
>>> </xs:restriction>
>>> </xs:simpleType>
>>>
>>> <xs:complexType name="shiptotype">
>>> <xs:sequence>
>>>  <xs:element name="name" type="stringtype"/>
>>>  <xs:element name="address" type="stringtype"/>
>>>  <xs:element name="city" type="stringtype"/>
>>>  <xs:element name="country" type="stringtype"/>
>>> </xs:sequence>
>>> </xs:complexType>
>>>
>>> <xs:complexType name="itemtype">
>>> <xs:sequence>
>>>  <xs:element name="title" type="stringtype"/>
>>>  <xs:element name="note" type="stringtype" minOccurs="0"/>
>>>  <xs:element name="quantity" type="inttype"/>
>>>  <xs:element name="price" type="dectype"/>
>>> </xs:sequence>
>>> </xs:complexType>
>>>
>>> <xs:complexType name="shipordertype">
>>> <xs:sequence>
>>>  <xs:element name="orderperson" type="stringtype"/>
>>>  <xs:element name="shipto" type="shiptotype"/>
>>>  <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
>>> </xs:sequence>
>>> <xs:attribute name="orderid" type="orderidtype" use="required"/>
>>> </xs:complexType>
>>>
>>> <xs:element name="shiporder" type="shipordertype"/>
>>>
>>> </xs:schema>
>>>
>>> Using the above schema, we represent the shiporder using the 
>>> following RIF-PS:
>>>
>>> Prefix(tns http://example.org)
>>>
>>> _obj1#<tns:/shiporder>
>>> <tns:/shiporder>##<tns:/shipordertype>
>>> _obj1[<tns:/shipordertype@orderid>     -> "889923"
>>>      <tns:/shipordertype/orderperson> -> "John Smith"
>>>      <tns:/shipordertype/shipto>      -> _obj2
>>>      <tns:/shipordertype/item>        -> _obj3
>>>      <tns:/shipordertype/item>        -> _obj4
>>> ]
>>>
>>> _obj2#<tns:/shiptotype>
>>> _obj2[<tns:/shiptotype/name>    -> "Ola Nordmann"
>>>      <tns:/shiptotype/address> -> "Langgt 23"
>>>      <tns:/shiptotype/city>    -> "4000 Stavanger"
>>>      <tns:/shiptotype/country> -> "Norway"
>>> ]
>>>
>>> _obj3#<tns:/itemtype>
>>> _obj3[<tns:/itemtype/title>    -> "Empire Burlesque"
>>>      <tns:/itemtype/note>     -> "Special Edition"
>>>      <tns:/itemtype/quantity> -> 1
>>>      <tns:/itemtype/price>    -> 10.90
>>> ]
>>>
>>> _obj4#<tns:/itemtype>
>>> _obj4[<tns:/itemtype/title>    -> "Hide your heart"
>>>      <tns:/itemtype/quantity> -> 1
>>>      <tns:/itemtype/price>    -> 9.90
>>> ]
>>>
>>> General Rules
>>> -------------
>>>
>>> 1. order of sequences is not preserved
>>>
>>> 2. cardinality (minOccurs, maxOccurs) is ignored
>>>
>>> 3. simple types are ignored
>>>
>>> 4. the IRI for an element e (IRI(e)) is given by
>>>  a. <tns:/e> if e is a global element, where tns is the 
>>> targetNamespace of the schema
>>>  b. <C:/e> otherwise, where C is the IRI of the containing 
>>> complexType of e
>>>
>>> 5. the IRI for an attribute a (IRI(a)) is given by
>>>  a. <tns:@a> if a is a global attribute
>>>  b. <C:@a> otherwise, where C is the IRI of the containing 
>>> complexType of a
>>>
>>> 6. the IRI for a complexType c (IRI(c)) is given by
>>>  a. <tns:/c> if c is a global complexType
>>>  b. <E:c> otherwise, where E is the IRI of the element containing c
>>>
>>> 7. an instance of an XML element e with complexType c maps to an 
>>> object _o that is a member of IRI(e).
>>> I.e., _o#IRI(e). If IRI(e) != IRI(c), then additionally we have the 
>>> axiom IRI(e)##IRI(c).
>>>
>>> 8. an element f contained in e (whether in a sequence, choice, or 
>>> all) is a frame slot of _o named
>>> IRI(f).  E.g. _o[IRI(f)->...]
>>>
>>> 9. if complexType sub extends a complexType sup, then 
>>> IRI(sub)##IRI(sup)
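
Rules 4 to 6 can be sketched as one small recursive function. This is a hedged Python illustration, not a normative reading: the Component class and the curie function are hypothetical, and rule 6b is read here the way example 1 uses it, i.e. an anonymous complexType simply shares the IRI of the element that contains it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    kind: str                             # "element", "attribute" or "complexType"
    name: str                             # local name; "" for an anonymous complexType
    parent: Optional["Component"] = None  # None means a global declaration


def curie(c, prefix="tns"):
    """Return the CURIE (without angle brackets) for a schema component."""
    sep = "@" if c.kind == "attribute" else "/"
    if c.parent is None:
        # Rules 4a, 5a, 6a: global components hang directly off tns.
        return f"{prefix}:{sep}{c.name}"
    container = curie(c.parent, prefix)
    if c.kind == "complexType":
        # Rule 6b, as read from example 1 (an assumption): an anonymous
        # complexType takes the IRI of its containing element.
        return container
    # Rules 4b, 5b: local elements and attributes extend the IRI of the
    # containing complexType.
    return f"{container}{sep}{c.name}"


# Style 1: everything nested inside the global shiporder element.
shiporder = Component("element", "shiporder")
shiporder_type = Component("complexType", "", shiporder)
shipto = Component("element", "shipto", shiporder_type)
shipto_type = Component("complexType", "", shipto)
country = Component("element", "country", shipto_type)
orderid = Component("attribute", "orderid", shiporder_type)

# Style 3: a local element inside a named global complexType.
shipordertype = Component("complexType", "shipordertype")
item = Component("element", "item", shipordertype)

print(curie(country))   # tns:/shiporder/shipto/country
print(curie(orderid))   # tns:/shiporder@orderid
print(curie(item))      # tns:/shipordertype/item
```

A global attribute, Component("attribute", "orderid"), comes out as tns:@orderid, matching the second example.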
>>>
>>> Issues
>>> ------
>>>
>>> Slot names are not disjoint from class names.
>>>
>>> We could of course map much more schema information to axioms.  E.g. 
>>> maxOccurs=1 could be expressed as
>>> ?x=?y :- _o[slot1->?x slot1->?y].  But that's not Core.  Are there 
>>> other things expressible in Core that
>>> we should map?
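
For comparison, the maxOccurs=1 constraint can at least be checked outside the rule language. The Python sketch below is an assumption-laden illustration: it reports slots that carry more than one distinct value, which is a check rather than the equality-deriving rule written above, and the function name and data layout are invented.

```python
from collections import defaultdict


def max_occurs_violations(slots, single_valued):
    """Return {slot: values} for single-valued slots with several distinct values.

    slots is a list of (slot name, value) pairs taken from one frame;
    single_valued is the set of slot names declared with maxOccurs=1.
    """
    seen = defaultdict(set)
    for slot, value in slots:
        seen[slot].add(value)
    return {s: sorted(v) for s, v in seen.items()
            if s in single_valued and len(v) > 1}


obj1_slots = [("shipto", "_obj2"), ("item", "_obj3"), ("item", "_obj4")]

ok = max_occurs_violations(obj1_slots, {"shipto"})
bad = max_occurs_violations(obj1_slots, {"shipto", "item"})
print(ok)
print(bad)
```

Here item is allowed to repeat in the schema, so only the second call, which wrongly treats item as single-valued, reports anything.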
>>>
>>> Should we care if the trailing char of tns is '/'?  Or should we use 
>>> '#' instead of the first '/' in the CURIE?  Or should we use '/' or 
>>> '/@' instead of '@' for attributes?
>>>
>>> Neither '#' nor '##' is legal in the conclusion in Core.  Probably 
>>> they should be allowed in ground
>>> facts.
>>>
>>> Because we don't capture all the schema constraints, it may be 
>>> impossible to serialize a collection of
>>> frames computed by a seemingly consistent ruleset into a 
>>> schema-valid XML document.
>>>
>>> Fully-striped XML data doesn't need a schema, and probably should 
>>> follow an RDF-style mapping (not
>>> covered here).
>>>
>>>
>>> [1] http://www.w3.org/TR/xmlschema-0/
>>> [2] http://www.w3schools.com/Schema/schema_example.asp
>>>
>>
>>
>>
>
Received on Wednesday, 15 October 2008 05:22:43 GMT
