Re: straw proposal for mapping XML schema valid XML data to RIF frames

Hi Gary,

I haven't studied this properly yet, but I have one comment/question:

Might it make sense to use the URI format from XML Schema Component 
Designators (the abbreviated syntax) for the class/property URIs?

I realize that XSCD is still a work in progress, and has been in 
progress for a while, so we probably don't want a dependency. However, 
having our URIs compatible with their current proposals might be worth 
contemplating.

Warning: I have very little knowledge or understanding of XSCD, so this
may be a stupid suggestion, but at the least we probably want an answer
to any future Last Call comment along the lines of "why not use XSCD".

Dave

Gary Hallmark wrote:
> 
> This is in response to ACTION-591.
> 
> Mapping XML Schema valid XML Data to RIF Frames
> ===============================================
> 
> This is a strawman for mapping XML documents whose structure is
> described by an XML schema[1] to and from RIF Core frames.
> 
> An XML element has a type as defined by its schema.  The type can be
> simple or complex.  Simple types can be atomic, lists, or unions.
> Atomic types can be primitive (e.g. xs:string), or they can be
> enumerations or restrictions of atomic types.  A complex type can have
> attributes and content.  The content can be simple, or it can be a
> sequence, choice, or set of elements.  An attribute has a name and a
> value.  The value has a simple type.  Types can be derived by
> extension or restriction.  Elements and types can be named globally or
> locally.  A local element can be defined by referring to a global
> element.  An element may be defined by referring to a global type, or
> can include the type definition in the content of its own definition.
> There are many ways to write the "same" schema.
> 
> Thus, XML schema is quite complex.  Here, we limit our concern to
> mapping elements of complex type to frames.  The only simple types we
> will handle are the primitive types supported by RIF DTB.  Our
> contribution is mainly to define how to construct IRIs of class
> constants and slot name constants from the XML schema.
> 
> This is a "strawman by example", so we start with an example document 
> (from [2]).
> 
> Example XML Document
> --------------------
> 
> <shiporder orderid="889923" xmlns="http://example.org">
> <orderperson>John Smith</orderperson>
> <shipto>
>  <name>Ola Nordmann</name>
>  <address>Langgt 23</address>
>  <city>4000 Stavanger</city>
>  <country>Norway</country>
> </shipto>
> <item>
>  <title>Empire Burlesque</title>
>  <note>Special Edition</note>
>  <quantity>1</quantity>
>  <price>10.90</price>
> </item>
> <item>
>  <title>Hide your heart</title>
>  <quantity>1</quantity>
>  <price>9.90</price>
> </item>
> </shiporder>
> 
> We consider 3 ways to write a schema for the above document.
> 
> 1. One big element
> ------------------
> 
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>           targetNamespace="http://example.org">
> 
> <xs:element name="shiporder">
> <xs:complexType>
>  <xs:sequence>
>   <xs:element name="orderperson" type="xs:string"/>
>   <xs:element name="shipto">
>    <xs:complexType>
>     <xs:sequence>
>      <xs:element name="name" type="xs:string"/>
>      <xs:element name="address" type="xs:string"/>
>      <xs:element name="city" type="xs:string"/>
>      <xs:element name="country" type="xs:string"/>
>     </xs:sequence>
>    </xs:complexType>
>   </xs:element>
>   <xs:element name="item" maxOccurs="unbounded">
>    <xs:complexType>
>     <xs:sequence>
>      <xs:element name="title" type="xs:string"/>
>      <xs:element name="note" type="xs:string" minOccurs="0"/>
>      <xs:element name="quantity" type="xs:positiveInteger"/>
>      <xs:element name="price" type="xs:decimal"/>
>     </xs:sequence>
>    </xs:complexType>
>   </xs:element>
>  </xs:sequence>
>  <xs:attribute name="orderid" type="xs:string" use="required"/>
> </xs:complexType>
> </xs:element>
> 
> </xs:schema>
> 
> Using the above schema, we represent the shiporder using the following 
> RIF-PS:
> 
> Prefix(tns http://example.org)
> 
> _obj1#<tns:/shiporder>
> _obj1[<tns:/shiporder@orderid>     -> "889923"
>      <tns:/shiporder/orderperson> -> "John Smith"
>      <tns:/shiporder/shipto>      -> _obj2
>      <tns:/shiporder/item>        -> _obj3
>      <tns:/shiporder/item>        -> _obj4
> ]
> 
> _obj2#<tns:/shiporder/shipto>
> _obj2[<tns:/shiporder/shipto/name>    -> "Ola Nordmann"
>      <tns:/shiporder/shipto/address> -> "Langgt 23"
>      <tns:/shiporder/shipto/city>    -> "4000 Stavanger"
>      <tns:/shiporder/shipto/country> -> "Norway"
> ]
> 
> _obj3#<tns:/shiporder/item>
> _obj3[<tns:/shiporder/item/title>    -> "Empire Burlesque"
>      <tns:/shiporder/item/note>     -> "Special Edition"
>      <tns:/shiporder/item/quantity> -> 1
>      <tns:/shiporder/item/price>    -> 10.90
> ]
> 
> _obj4#<tns:/shiporder/item>
> _obj4[<tns:/shiporder/item/title>    -> "Hide your heart"
>      <tns:/shiporder/item/quantity> -> 1
>      <tns:/shiporder/item/price>    -> 9.90
> ]
> 
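The style-1 mapping above can be sketched in Python as a walk over the
instance document. This is only a rough illustration under stated
assumptions, none of which come from the proposal itself: the names
to_frames and local are hypothetical; an element is treated as complex
when it has child elements or attributes, because the sketch does not
read the schema; the tns prefix is expanded by plain string
concatenation; and all slot values stay strings, so quantity and price
are not typed the way they are in the RIF-PS above.

```python
import xml.etree.ElementTree as ET
from itertools import count


def local(tag):
    """Strip the '{namespace}' part that ElementTree puts before a name."""
    return tag.rsplit("}", 1)[-1]


def to_frames(root, tns):
    """Return (memberships, slots) using path-based (style 1) names.

    memberships: list of (object, class IRI)
    slots:       list of (object, slot IRI, value or object)
    """
    ids = count(1)
    memberships, slots = [], []

    def visit(elem, path):
        # Objects are numbered in document order: _obj1, _obj2, ...
        obj = f"_obj{next(ids)}"
        memberships.append((obj, path))
        for name, value in elem.attrib.items():
            slots.append((obj, f"{path}@{local(name)}", value))
        for child in elem:
            child_path = f"{path}/{local(child.tag)}"
            # Assumption: children or attributes imply a complex type.
            if len(child) or child.attrib:
                slots.append((obj, child_path, visit(child, child_path)))
            else:
                slots.append((obj, child_path, (child.text or "").strip()))
        return obj

    visit(root, f"{tns}/{local(root.tag)}")
    return memberships, slots
```

Calling to_frames(ET.fromstring(doc), "http://example.org") on the
example document gives one membership per complex element and one slot
triple per attribute or child element. Sequence order and cardinality
play no role, in line with general rules 1 and 2.
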
> 
> 2. Refs to Global Elements and Attributes
> -----------------------------------------
> 
> A second equivalent schema uses the following style.
> 
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>           targetNamespace="http://example.org">
> 
> <!-- definition of simple elements -->
> <xs:element name="orderperson" type="xs:string"/>
> <xs:element name="name" type="xs:string"/>
> <xs:element name="address" type="xs:string"/>
> <xs:element name="city" type="xs:string"/>
> <xs:element name="country" type="xs:string"/>
> <xs:element name="title" type="xs:string"/>
> <xs:element name="note" type="xs:string"/>
> <xs:element name="quantity" type="xs:positiveInteger"/>
> <xs:element name="price" type="xs:decimal"/>
> 
> <!-- definition of attributes -->
> <xs:attribute name="orderid" type="xs:string"/>
> 
> <!-- definition of complex elements -->
> <xs:element name="shipto">
> <xs:complexType>
>  <xs:sequence>
>   <xs:element ref="name"/>
>   <xs:element ref="address"/>
>   <xs:element ref="city"/>
>   <xs:element ref="country"/>
>  </xs:sequence>
> </xs:complexType>
> </xs:element>
> <xs:element name="item">
> <xs:complexType>
>  <xs:sequence>
>   <xs:element ref="title"/>
>   <xs:element ref="note" minOccurs="0"/>
>   <xs:element ref="quantity"/>
>   <xs:element ref="price"/>
>  </xs:sequence>
> </xs:complexType>
> </xs:element>
> 
> <xs:element name="shiporder">
> <xs:complexType>
>  <xs:sequence>
>   <xs:element ref="orderperson"/>
>   <xs:element ref="shipto"/>
>   <xs:element ref="item" maxOccurs="unbounded"/>
>  </xs:sequence>
>  <xs:attribute ref="orderid" use="required"/>
> </xs:complexType>
> </xs:element>
> 
> </xs:schema>
> 
> Using the above schema, we represent the shiporder using the following 
> RIF-PS:
> 
> Prefix(tns http://example.org)
> 
> _obj1#<tns:/shiporder>
> _obj1[<tns:@orderid>     -> "889923"
>      <tns:/orderperson> -> "John Smith"
>      <tns:/shipto>      -> _obj2
>      <tns:/item>        -> _obj3
>      <tns:/item>        -> _obj4
> ]
> 
> _obj2#<tns:/shipto>
> _obj2[<tns:/name>    -> "Ola Nordmann"
>      <tns:/address> -> "Langgt 23"
>      <tns:/city>    -> "4000 Stavanger"
>      <tns:/country> -> "Norway"
> ]
> 
> _obj3#<tns:/item>
> _obj3[<tns:/title>    -> "Empire Burlesque"
>      <tns:/note>     -> "Special Edition"
>      <tns:/quantity> -> 1
>      <tns:/price>    -> 10.90
> ]
> 
> _obj4#<tns:/item>
> _obj4[<tns:/title>    -> "Hide your heart"
>      <tns:/quantity> -> 1
>      <tns:/price>    -> 9.90
> ]
> 
> 3. Named Types
> --------------
> 
> The third style of schema uses named types.
> 
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>           targetNamespace="http://example.org">
> 
> <xs:simpleType name="stringtype">
> <xs:restriction base="xs:string"/>
> </xs:simpleType>
> 
> <xs:simpleType name="inttype">
> <xs:restriction base="xs:positiveInteger"/>
> </xs:simpleType>
> 
> <xs:simpleType name="dectype">
> <xs:restriction base="xs:decimal"/>
> </xs:simpleType>
> 
> <xs:simpleType name="orderidtype">
> <xs:restriction base="xs:string">
>  <xs:pattern value="[0-9]{6}"/>
> </xs:restriction>
> </xs:simpleType>
> 
> <xs:complexType name="shiptotype">
> <xs:sequence>
>  <xs:element name="name" type="stringtype"/>
>  <xs:element name="address" type="stringtype"/>
>  <xs:element name="city" type="stringtype"/>
>  <xs:element name="country" type="stringtype"/>
> </xs:sequence>
> </xs:complexType>
> 
> <xs:complexType name="itemtype">
> <xs:sequence>
>  <xs:element name="title" type="stringtype"/>
>  <xs:element name="note" type="stringtype" minOccurs="0"/>
>  <xs:element name="quantity" type="inttype"/>
>  <xs:element name="price" type="dectype"/>
> </xs:sequence>
> </xs:complexType>
> 
> <xs:complexType name="shipordertype">
> <xs:sequence>
>  <xs:element name="orderperson" type="stringtype"/>
>  <xs:element name="shipto" type="shiptotype"/>
>  <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
> </xs:sequence>
> <xs:attribute name="orderid" type="orderidtype" use="required"/>
> </xs:complexType>
> 
> <xs:element name="shiporder" type="shipordertype"/>
> 
> </xs:schema>
> 
> Using the above schema, we represent the shiporder using the following 
> RIF-PS:
> 
> Prefix(tns http://example.org)
> 
> _obj1#<tns:/shiporder>
> <tns:/shiporder>##<tns:/shipordertype>
> _obj1[<tns:/shipordertype@orderid>     -> "889923"
>      <tns:/shipordertype/orderperson> -> "John Smith"
>      <tns:/shipordertype/shipto>      -> _obj2
>      <tns:/shipordertype/item>        -> _obj3
>      <tns:/shipordertype/item>        -> _obj4
> ]
> 
> _obj2#<tns:/shiptotype>
> _obj2[<tns:/shiptotype/name>    -> "Ola Nordmann"
>      <tns:/shiptotype/address> -> "Langgt 23"
>      <tns:/shiptotype/city>    -> "4000 Stavanger"
>      <tns:/shiptotype/country> -> "Norway"
> ]
> 
> _obj3#<tns:/itemtype>
> _obj3[<tns:/itemtype/title>    -> "Empire Burlesque"
>      <tns:/itemtype/note>     -> "Special Edition"
>      <tns:/itemtype/quantity> -> 1
>      <tns:/itemtype/price>    -> 10.90
> ]
> 
> _obj4#<tns:/itemtype>
> _obj4[<tns:/itemtype/title>    -> "Hide your heart"
>      <tns:/itemtype/quantity> -> 1
>      <tns:/itemtype/price>    -> 9.90
> ]
> 
> General Rules
> -------------
> 
> 1. order of sequences is not preserved
> 
> 2. cardinality (minOccurs, maxOccurs) is ignored
> 
> 3. simple types are ignored
> 
> 4. the IRI for an element e (IRI(e)) is given by
>  a. <tns:/e> if e is a global element, where tns is the
>     targetNamespace of the schema
>  b. <C:/e> otherwise, where C is the IRI of the containing complexType of e
> 
> 5. the IRI for an attribute a (IRI(a)) is given by
>  a. <tns:@a> if a is a global attribute
>  b. <C:@a> otherwise, where C is the IRI of the containing complexType of a
> 
> 6. the IRI for a complexType c (IRI(c)) is given by
>  a. <tns:/c> if c is a global complexType
>  b. <E:c> otherwise, where E is the IRI of the element containing c
> 
> 7. an instance of an XML element e with complexType c maps to an object
>    _o that is a member of IRI(e), i.e., _o#IRI(e).  If IRI(e) != IRI(c),
>    then additionally we have the axiom IRI(e)##IRI(c).
> 
> 8. an element f contained in e (whether in a sequence, choice, or all)
>    is a frame slot of _o named IRI(f), e.g. _o[IRI(f)->...]
> 
> 9. if complexType sub extends a complexType sup, then IRI(sub)##IRI(sup)
> 
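Rules 4 to 6 above can be read as a small recursive function. The
following Python sketch is one possible reading, not a definitive
implementation. Assumptions: the Component class and the iri function
are hypothetical names; <tns:/e> is taken to expand to the
targetNamespace string followed by "/e"; and for an anonymous
complexType (rule 6b) the sketch reuses the IRI of the enclosing
element, which is what the style-1 example does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    """Hypothetical, minimal stand-in for a schema component."""
    kind: str                             # "element", "attribute", "complexType"
    name: Optional[str]                   # None for an anonymous complexType
    parent: Optional["Component"] = None  # None means the component is global


def iri(c: Component, tns: str) -> str:
    """Build the class or slot IRI for c under rules 4 to 6."""
    if c.kind == "complexType":
        if c.parent is None:
            return f"{tns}/{c.name}"      # rule 6a: global complexType
        # Rule 6b, as read from the examples: an anonymous type shares
        # the IRI of the element that contains it (an assumption).
        return iri(c.parent, tns)
    sep = "@" if c.kind == "attribute" else "/"
    if c.parent is None:
        return f"{tns}{sep}{c.name}"      # rules 4a and 5a: global
    # Rules 4b and 5b: append to the IRI of the containing complexType.
    return f"{iri(c.parent, tns)}{sep}{c.name}"
```

Here the parent of a local element or attribute is its containing
complexType, and the parent of an anonymous complexType is its
containing element. With that convention the function reproduces the
slot names used in all three example mappings.
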
> Issues
> ------
> 
> Slot names are not disjoint from class names.
> 
> We could of course map much more schema information to axioms.  E.g.
> maxOccurs=1 could be expressed as ?x=?y :- _o[slot1->?x slot1->?y].
> But that's not Core.  Are there other things expressible in Core that
> we should map?
> 
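Outside the rule language, the maxOccurs=1 constraint mentioned above
can at least be checked after the fact. The Python sketch below treats
it as an integrity check over (object, slot, value) triples rather than
as an equality-deriving rule, which is a different technique from the
axiom in the text. The function name max_one_violations and the triple
layout are hypothetical.

```python
from collections import defaultdict


def max_one_violations(triples, slot):
    """Return {object: values} for objects with more than one distinct
    value in the given slot, i.e. where maxOccurs=1 would be violated."""
    values = defaultdict(set)
    for obj, s, val in triples:
        if s == slot:
            values[obj].add(val)
    return {obj: vals for obj, vals in values.items() if len(vals) > 1}
```

Run against the shiporder frames, the item slot of _obj1 would be
reported (two distinct values) while the shipto slot would not.
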
> Should we care if the trailing char of tns is '/'?  Or should we use
> '#' instead of the first '/' in the curie?  Or should we use '/' or
> '/@' instead of '@' for attributes?
> 
> Neither '#' nor '##' is legal in the conclusion in Core.  Probably
> they should be allowed in ground facts.
> 
> Because we don't capture all the schema constraints, it may be
> impossible to serialize a collection of frames computed by a seemingly
> consistent ruleset into a schema-valid XML document.
> 
> Fully-striped XML data doesn't need a schema, and probably should
> follow an RDF-style mapping (not covered here).
> 
> 
> [1] http://www.w3.org/TR/xmlschema-0/
> [2] http://www.w3schools.com/Schema/schema_example.asp
> 

Received on Tuesday, 14 October 2008 15:35:12 UTC