Re: straw proposal for mapping XML schema valid XML data to RIF frames

Dave,

Thanks for the suggestion -- I wasn't aware of this work.  The
abbreviated syntax looks verbose in the common case where there is a
target namespace, e.g. instead of an IRI like

http://example.org/shiporder/shipto/country

You get something like

http://example.org#xmlns(tns,http://example.org)xscd(/tns:shiporder/shipto/country)

I think it is because they don't assume that the IRI on the left of the 
'#' is in fact the target namespace.  I think that they should consider 
a "very abbreviated syntax" e.g.

http://example.org#/shiporder/shipto/country

when http://example.org is the target namespace of the global element or 
type (shiporder here).
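
For comparison, the three IRI forms can be generated mechanically.  This
is only a rough Python sketch: the function names are hypothetical, and
the XSCD-style form follows the shape of the example above (prefix on the
first step only) rather than the full XSCD grammar.

```python
def path_iri(tns, steps):
    """Plain path form: target namespace followed by the element path."""
    return tns.rstrip("/") + "/" + "/".join(steps)


def xscd_iri(tns, steps, prefix="tns"):
    """XSCD-like form: xmlns() binds a prefix, xscd() carries the path.

    Assumption: only the first (global) step is prefixed.
    """
    path = "/" + "/".join([prefix + ":" + steps[0]] + list(steps[1:]))
    return "{0}#xmlns({1},{0})xscd({2})".format(tns, prefix, path)


def very_abbreviated_iri(tns, steps):
    """Proposed form: the IRI left of '#' is taken to be the target namespace."""
    return tns + "#/" + "/".join(steps)


steps = ["shiporder", "shipto", "country"]
print(path_iri("http://example.org", steps))
print(xscd_iri("http://example.org", steps))
print(very_abbreviated_iri("http://example.org", steps))
```

The very abbreviated form drops the xmlns() binding entirely, which only
works if the part before '#' really is the target namespace.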

I do think it's a good idea to adopt a subset of their abbreviated schema 
component paths (stuff inside xscd(...)).  E.g. using /~ to denote the 
type axis and /@ to denote the attribute axis.  I would not allow //.
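
To make the subset concrete, here is a small Python sketch of a checker
that accepts only /name, /@name and /~name steps and rejects //.  The
function name and the simplified (ASCII-only) notion of a name are my own
assumptions, not anything taken from the XSCD draft.

```python
import re

# One step: optional axis marker ('@' attribute, '~' type), then an optional
# prefix and a local name.  Simplified, ASCII-only approximation of XML names.
_STEP = re.compile(r"^[@~]?(?:[A-Za-z_][\w.-]*:)?[A-Za-z_][\w.-]*$", re.ASCII)


def is_allowed_path(path):
    """Return True if path uses only /name, /@name and /~name steps."""
    if not path.startswith("/") or "//" in path:
        return False
    return all(_STEP.match(step) for step in path[1:].split("/"))


print(is_allowed_path("/shiporder/shipto/country"))  # element steps only
print(is_allowed_path("/~shipordertype/@orderid"))   # type and attribute axes
print(is_allowed_path("/shiporder//country"))        # '//' is not allowed
```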

Dave Reynolds wrote:
>
> Hi Gary,
>
> Not studied this properly yet but one comment/question:
>
> Might it make sense to use the URI format from XML Schema Component 
> Designators (the abbreviated syntax) for the class/property URIs?
>
> I realize that XSCD is still a work in progress, and has been in 
> progress for a while, so we probably don't want a dependency. However, 
> having our URIs compatible with their current proposals might be worth 
> contemplating.
>
> Warning: I've very little knowledge or understanding of XSCD so this 
> may be a stupid suggestion but at the least we probably want an answer 
> to any future last call comment around "why not use XSCD".
>
> Dave
>
> Gary Hallmark wrote:
>>
>> This is in response to ACTION-591.
>>
>> Mapping XML Schema valid XML Data to RIF Frames
>> ===============================================
>>
>> This is a strawman for mapping XML documents whose structure is
>> described by an XML schema[1] to and from RIF Core frames.
>>
>> An XML element has a type as defined by its schema.  The type can be
>> simple or complex.  Simple types can be atomic, lists, or unions.
>> Atomic types can be primitive (e.g. xs:string), or they can be
>> enumerations or restrictions of atomic types.  A complex type can have
>> attributes and content.  The content can be simple, or it can be a
>> sequence, choice, or set of elements.  An attribute has a name and a
>> value.  The value has a simple type.  Types can be derived by extension
>> or restriction.  Elements and types can be named globally or locally.
>> A local element can be defined by referring to a global element.  An
>> element may be defined by referring to a global type, or can include
>> the type definition in the content of its own definition.  There are
>> many ways to write the "same" schema.
>>
>> Thus, XML schema is quite complex.  Here, we limit our concern to
>> mapping elements of complex type to frames.  The only simple types we
>> will handle are the primitive types supported by RIF DTB.  Our
>> contribution is mainly to define how to construct IRIs of class
>> constants and slot name constants from the XML schema.
>>
>> This is a "strawman by example", so we start with an example document 
>> (from [2]).
>>
>> Example XML Document
>> --------------------
>>
>> <shiporder orderid="889923" xmlns="http://example.org">
>> <orderperson>John Smith</orderperson>
>> <shipto>
>>  <name>Ola Nordmann</name>
>>  <address>Langgt 23</address>
>>  <city>4000 Stavanger</city>
>>  <country>Norway</country>
>> </shipto>
>> <item>
>>  <title>Empire Burlesque</title>
>>  <note>Special Edition</note>
>>  <quantity>1</quantity>
>>  <price>10.90</price>
>> </item>
>> <item>
>>  <title>Hide your heart</title>
>>  <quantity>1</quantity>
>>  <price>9.90</price>
>> </item>
>> </shiporder>
>>
>> We consider 3 ways to write a schema for the above document.
>>
>> 1. One big element
>> ------------------
>>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>           targetNamespace="http://example.org">
>>
>> <xs:element name="shiporder">
>> <xs:complexType>
>>  <xs:sequence>
>>   <xs:element name="orderperson" type="xs:string"/>
>>   <xs:element name="shipto">
>>    <xs:complexType>
>>     <xs:sequence>
>>      <xs:element name="name" type="xs:string"/>
>>      <xs:element name="address" type="xs:string"/>
>>      <xs:element name="city" type="xs:string"/>
>>      <xs:element name="country" type="xs:string"/>
>>     </xs:sequence>
>>    </xs:complexType>
>>   </xs:element>
>>   <xs:element name="item" maxOccurs="unbounded">
>>    <xs:complexType>
>>     <xs:sequence>
>>      <xs:element name="title" type="xs:string"/>
>>      <xs:element name="note" type="xs:string" minOccurs="0"/>
>>      <xs:element name="quantity" type="xs:positiveInteger"/>
>>      <xs:element name="price" type="xs:decimal"/>
>>     </xs:sequence>
>>    </xs:complexType>
>>   </xs:element>
>>  </xs:sequence>
>>  <xs:attribute name="orderid" type="xs:string" use="required"/>
>> </xs:complexType>
>> </xs:element>
>>
>> </xs:schema>
>>
>> Using the above schema, we represent the shiporder using the 
>> following RIF-PS:
>>
>> Prefix(tns http://example.org)
>>
>> _obj1#<tns:/shiporder>
>> _obj1[<tns:/shiporder@orderid>     -> "889923"
>>      <tns:/shiporder/orderperson> -> "John Smith"
>>      <tns:/shiporder/shipto>      -> _obj2
>>      <tns:/shiporder/item>        -> _obj3
>>      <tns:/shiporder/item>        -> _obj4
>> ]
>>
>> _obj2#<tns:/shiporder/shipto>
>> _obj2[<tns:/shiporder/shipto/name>    -> "Ola Nordmann"
>>      <tns:/shiporder/shipto/address> -> "Langgt 23"
>>      <tns:/shiporder/shipto/city>    -> "4000 Stavanger"
>>      <tns:/shiporder/shipto/country> -> "Norway"
>> ]
>>
>> _obj3#<tns:/shiporder/item>
>> _obj3[<tns:/shiporder/item/title>    -> "Empire Burlesque"
>>      <tns:/shiporder/item/note>     -> "Special Edition"
>>      <tns:/shiporder/item/quantity> -> 1
>>      <tns:/shiporder/item/price>    -> 10.90
>> ]
>>
>> _obj4#<tns:/shiporder/item>
>> _obj4[<tns:/shiporder/item/title>    -> "Hide your heart"
>>      <tns:/shiporder/item/quantity> -> 1
>>      <tns:/shiporder/item/price>    -> 9.90
>> ]
>>
>>
>> 2. Refs to Global Elements and Attributes
>> -----------------------------------------
>>
>> A second equivalent schema uses the following style.
>>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>           targetNamespace="http://example.org">
>>
>> <!-- definition of simple elements -->
>> <xs:element name="orderperson" type="xs:string"/>
>> <xs:element name="name" type="xs:string"/>
>> <xs:element name="address" type="xs:string"/>
>> <xs:element name="city" type="xs:string"/>
>> <xs:element name="country" type="xs:string"/>
>> <xs:element name="title" type="xs:string"/>
>> <xs:element name="note" type="xs:string"/>
>> <xs:element name="quantity" type="xs:positiveInteger"/>
>> <xs:element name="price" type="xs:decimal"/>
>>
>> <!-- definition of attributes -->
>> <xs:attribute name="orderid" type="xs:string"/>
>>
>> <!-- definition of complex elements -->
>> <xs:element name="shipto">
>> <xs:complexType>
>>  <xs:sequence>
>>   <xs:element ref="name"/>
>>   <xs:element ref="address"/>
>>   <xs:element ref="city"/>
>>   <xs:element ref="country"/>
>>  </xs:sequence>
>> </xs:complexType>
>> </xs:element>
>> <xs:element name="item">
>> <xs:complexType>
>>  <xs:sequence>
>>   <xs:element ref="title"/>
>>   <xs:element ref="note" minOccurs="0"/>
>>   <xs:element ref="quantity"/>
>>   <xs:element ref="price"/>
>>  </xs:sequence>
>> </xs:complexType>
>> </xs:element>
>>
>> <xs:element name="shiporder">
>> <xs:complexType>
>>  <xs:sequence>
>>   <xs:element ref="orderperson"/>
>>   <xs:element ref="shipto"/>
>>   <xs:element ref="item" maxOccurs="unbounded"/>
>>  </xs:sequence>
>>  <xs:attribute ref="orderid" use="required"/>
>> </xs:complexType>
>> </xs:element>
>>
>> </xs:schema>
>>
>> Using the above schema, we represent the shiporder using the 
>> following RIF-PS:
>>
>> Prefix(tns http://example.org)
>>
>> _obj1#<tns:/shiporder>
>> _obj1[<tns:@orderid>     -> "889923"
>>      <tns:/orderperson> -> "John Smith"
>>      <tns:/shipto>      -> _obj2
>>      <tns:/item>        -> _obj3
>>      <tns:/item>        -> _obj4
>> ]
>>
>> _obj2#<tns:/shipto>
>> _obj2[<tns:/name>    -> "Ola Nordmann"
>>      <tns:/address> -> "Langgt 23"
>>      <tns:/city>    -> "4000 Stavanger"
>>      <tns:/country> -> "Norway"
>> ]
>>
>> _obj3#<tns:/item>
>> _obj3[<tns:/title>    -> "Empire Burlesque"
>>      <tns:/note>     -> "Special Edition"
>>      <tns:/quantity> -> 1
>>      <tns:/price>    -> 10.90
>> ]
>>
>> _obj4#<tns:/item>
>> _obj4[<tns:/title>    -> "Hide your heart"
>>      <tns:/quantity> -> 1
>>      <tns:/price>    -> 9.90
>> ]
>>
>> 3. Named Types
>> --------------
>>
>> The third style of schema uses named types.
>>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>           targetNamespace="http://example.org">
>>
>> <xs:simpleType name="stringtype">
>> <xs:restriction base="xs:string"/>
>> </xs:simpleType>
>>
>> <xs:simpleType name="inttype">
>> <xs:restriction base="xs:positiveInteger"/>
>> </xs:simpleType>
>>
>> <xs:simpleType name="dectype">
>> <xs:restriction base="xs:decimal"/>
>> </xs:simpleType>
>>
>> <xs:simpleType name="orderidtype">
>> <xs:restriction base="xs:string">
>>  <xs:pattern value="[0-9]{6}"/>
>> </xs:restriction>
>> </xs:simpleType>
>>
>> <xs:complexType name="shiptotype">
>> <xs:sequence>
>>  <xs:element name="name" type="stringtype"/>
>>  <xs:element name="address" type="stringtype"/>
>>  <xs:element name="city" type="stringtype"/>
>>  <xs:element name="country" type="stringtype"/>
>> </xs:sequence>
>> </xs:complexType>
>>
>> <xs:complexType name="itemtype">
>> <xs:sequence>
>>  <xs:element name="title" type="stringtype"/>
>>  <xs:element name="note" type="stringtype" minOccurs="0"/>
>>  <xs:element name="quantity" type="inttype"/>
>>  <xs:element name="price" type="dectype"/>
>> </xs:sequence>
>> </xs:complexType>
>>
>> <xs:complexType name="shipordertype">
>> <xs:sequence>
>>  <xs:element name="orderperson" type="stringtype"/>
>>  <xs:element name="shipto" type="shiptotype"/>
>>  <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
>> </xs:sequence>
>> <xs:attribute name="orderid" type="orderidtype" use="required"/>
>> </xs:complexType>
>>
>> <xs:element name="shiporder" type="shipordertype"/>
>>
>> </xs:schema>
>>
>> Using the above schema, we represent the shiporder using the 
>> following RIF-PS:
>>
>> Prefix(tns http://example.org)
>>
>> _obj1#<tns:/shiporder>
>> <tns:/shiporder>##<tns:/shipordertype>
>> _obj1[<tns:/shipordertype@orderid>     -> "889923"
>>      <tns:/shipordertype/orderperson> -> "John Smith"
>>      <tns:/shipordertype/shipto>      -> _obj2
>>      <tns:/shipordertype/item>        -> _obj3
>>      <tns:/shipordertype/item>        -> _obj4
>> ]
>>
>> _obj2#<tns:/shiptotype>
>> _obj2[<tns:/shiptotype/name>    -> "Ola Nordmann"
>>      <tns:/shiptotype/address> -> "Langgt 23"
>>      <tns:/shiptotype/city>    -> "4000 Stavanger"
>>      <tns:/shiptotype/country> -> "Norway"
>> ]
>>
>> _obj3#<tns:/itemtype>
>> _obj3[<tns:/itemtype/title>    -> "Empire Burlesque"
>>      <tns:/itemtype/note>     -> "Special Edition"
>>      <tns:/itemtype/quantity> -> 1
>>      <tns:/itemtype/price>    -> 10.90
>> ]
>>
>> _obj4#<tns:/itemtype>
>> _obj4[<tns:/itemtype/title>    -> "Hide your heart"
>>      <tns:/itemtype/quantity> -> 1
>>      <tns:/itemtype/price>    -> 9.90
>> ]
>>
>> General Rules
>> -------------
>>
>> 1. order of sequences is not preserved
>>
>> 2. cardinality (minOccurs, maxOccurs) is ignored
>>
>> 3. simple types are ignored
>>
>> 4. the IRI for an element e (IRI(e)) is given by
>>  a. <tns:/e> if e is a global element, where tns is the
>>     targetNamespace of the schema
>>  b. <C:/e> otherwise, where C is the IRI of the containing
>>     complexType of e
>>
>> 5. the IRI for an attribute a (IRI(a)) is given by
>>  a. <tns:@a> if a is a global attribute
>>  b. <C:@a> otherwise, where C is the IRI of the containing
>>     complexType of a
>>
>> 6. the IRI for a complexType c (IRI(c)) is given by
>>  a. <tns:/c> if c is a global complexType
>>  b. <E:c> otherwise, where E is the IRI of the element containing c
>>
>> 7. an instance of an XML element e with complexType c maps to an
>>    object _o that is a member of IRI(e).  I.e., _o#IRI(e).  If
>>    IRI(e) != IRI(c), then additionally we have the axiom
>>    IRI(e)##IRI(c).
>>
>> 8. an element f contained in e (whether in a sequence, choice, or
>>    all) is a frame slot of _o named IRI(f).  E.g. _o[IRI(f)->...]
>>
>> 9. if complexType sub extends a complexType sup, then
>>    IRI(sub)##IRI(sup)
>>
>> Issues
>> ------
>>
>> Slot names are not disjoint from class names.
>>
>> We could of course map much more schema information to axioms.  E.g.
>> maxOccurs=1 could be expressed as
>> ?x=?y :- _o[slot1->?x slot1->?y].  But that's not Core.  Are there
>> other things expressible in Core that we should map?
>>
>> Should we care if the trailing char of tns is '/'?  or should we use
>> '#' instead of the first '/' in the curie?  or should we use '/' or
>> '/@' instead of '@' for attributes?
>>
>> Neither '#' nor '##' is legal in the conclusion in Core.  Probably
>> they should be allowed in ground facts.
>>
>> Because we don't capture all the schema constraints, it may be
>> impossible to serialize a collection of frames computed by a seemingly
>> consistent ruleset into a schema-valid XML document.
>>
>> Fully-striped XML data doesn't need a schema, and probably should
>> follow an RDF-style mapping (not covered here).
>>
>>
>> [1] http://www.w3.org/TR/xmlschema-0/
>> [2] http://www.w3schools.com/Schema/schema_example.asp
>>
>
>
>
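
For the "one big element" style quoted above, the class and slot IRIs
follow the document path, so the mapping can be sketched without reading
the schema at all.  This is a rough Python illustration only: the helper
names are hypothetical, the document is a shortened copy of the shiporder
example, and every value is kept as a string because no schema types are
consulted.

```python
import xml.etree.ElementTree as ET

DOC = """<shiporder orderid="889923" xmlns="http://example.org">
  <orderperson>John Smith</orderperson>
  <shipto><name>Ola Nordmann</name><country>Norway</country></shipto>
  <item><title>Empire Burlesque</title><quantity>1</quantity></item>
  <item><title>Hide your heart</title><quantity>1</quantity></item>
</shiporder>"""


def local(tag):
    """Strip the '{namespace}' part that ElementTree puts before a name."""
    return tag.rsplit("}", 1)[-1]


def to_frames(elem, path="", frames=None):
    """Map elem to an object id, appending (id, class, slots) to frames."""
    if frames is None:
        frames = []
    cls = path + "/" + local(elem.tag)
    oid = "_obj%d" % (len(frames) + 1)
    slots = []
    frames.append((oid, cls, slots))
    for name, value in elem.attrib.items():
        slots.append((cls + "@" + local(name), '"%s"' % value))
    for child in elem:
        slot = cls + "/" + local(child.tag)
        if len(child) or child.attrib:   # complex content: nested object
            slots.append((slot, to_frames(child, cls, frames)[0]))
        else:                            # simple content: literal value
            slots.append((slot, '"%s"' % (child.text or "")))
    return oid, frames


def render(frames):
    """Print the frames in the RIF-PS-like notation used in this thread."""
    lines = []
    for oid, cls, slots in frames:
        lines.append("%s#<tns:%s>" % (oid, cls))
        body = "\n      ".join("<tns:%s> -> %s" % s for s in slots)
        lines.append("%s[%s\n]" % (oid, body))
    return "\n".join(lines)


_, frames = to_frames(ET.fromstring(DOC))
print(render(frames))
```

The other two schema styles need the schema itself, since the slot names
there depend on whether an element is a global ref or belongs to a named
type.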

Received on Wednesday, 15 October 2008 05:18:30 UTC