Re: straw proposal for mapping XML schema valid XML data to RIF frames

Of course, for the case where there is more than one target namespace, 
one would need the abbreviated syntax, not the very abbreviated.  So 
yeah, I guess maybe we just bite the bullet and adopt their syntax, 
since it works for all cases.
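
To make the choice concrete, here is a rough Python sketch of when each form applies. It is only an illustration under stated assumptions, not something taken from the XSCD draft: component_iri and the ns0/ns1 prefixes are made-up names, the second namespace is invented for the example, the xmlns() part is written in the prefix=namespace form of the XPointer xmlns() scheme, and prefixing every step with its namespace is my simplification.

```python
def component_iri(base, steps):
    """Build an IRI for a schema component path.

    base  -- the IRI to the left of '#'
    steps -- list of (namespace, local_name) pairs, outermost first
    """
    # Collect the distinct namespaces in order of first use.
    namespaces = []
    for ns, _ in steps:
        if ns not in namespaces:
            namespaces.append(ns)

    if namespaces == [base]:
        # "Very abbreviated": every step lives in the namespace named by base.
        return base + "#/" + "/".join(name for _, name in steps)

    # Otherwise declare one (hypothetical) prefix per namespace and use
    # the abbreviated xmlns(...)xscd(...) form.
    prefixes = {ns: f"ns{i}" for i, ns in enumerate(namespaces)}
    decls = "".join(f"xmlns({p}={ns})" for ns, p in prefixes.items())
    path = "/".join(f"{prefixes[ns]}:{name}" for ns, name in steps)
    return f"{base}#{decls}xscd(/{path})"


TNS = "http://example.org"
OTHER = "http://example.com/addr"  # invented second namespace

single = component_iri(TNS, [(TNS, "shiporder"), (TNS, "shipto"), (TNS, "country")])
mixed = component_iri(TNS, [(TNS, "shiporder"), (OTHER, "country")])
print(single)
print(mixed)
```

With a single target namespace the short form is enough; as soon as a second namespace shows up the function has to fall back to the prefixed form, which is the case I was worried about.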

Gary Hallmark wrote:
> Dave,
>
> Thanks for the suggestion -- I wasn't aware of this work.  The
> abbreviated syntax looks verbose in the common case where there is a
> target namespace, e.g. instead of an IRI like
>
> http://example.org/shiporder/shipto/country
>
> You get something like
>
> http://example.org#xmlns(tns=http://example.org)xscd(/tns:shiporder/country)
>
> I think it is because they don't assume that the IRI on the left of 
> the '#' is in fact the target namespace.  I think that they should 
> consider a "very abbreviated syntax" e.g.
>
> http://example.org#/shiporder/shipto/country
>
> when http://example.org is the target namespace of the global element 
> or type (shiporder here).
>
> I do think it's a good idea to adopt a subset of their abbreviated
> schema component paths (stuff inside xscd(...)).  E.g. using /~ to 
> denote the type axis and /@ to denote the attribute axis.  I would not 
> allow //.
>
> Dave Reynolds wrote:
>>
>> Hi Gary,
>>
>> Not studied this properly yet but one comment/question:
>>
>> Might it make sense to use the URI format from XML Schema Component 
>> Designators (the abbreviated syntax) for the class/property URIs?
>>
>> I realize that XSCD is still a work in progress, and has been in 
>> progress for a while, so we probably don't want a dependency. 
>> However, having our URIs compatible with their current proposals 
>> might be worth contemplating.
>>
>> Warning: I've very little knowledge or understanding of XSCD so this 
>> may be a stupid suggestion but at the least we probably want an 
>> answer to any future last call comment around "why not use XSCD".
>>
>> Dave
>>
>> Gary Hallmark wrote:
>>>
>>> This is in response to ACTION-591.
>>>
>>> Mapping XML Schema valid XML Data to RIF Frames
>>> ===============================================
>>>
>>> This is a strawman for mapping XML documents whose structure is
>>> described by an XML schema[1] to and from RIF Core frames.
>>>
>>> An XML element has a type as defined by its schema.  The type can be
>>> simple or complex.  Simple types can be atomic, lists, or unions.
>>> Atomic types can be primitive (e.g. xs:string), or they can be
>>> enumerations or restrictions of atomic types.  A complex type can
>>> have attributes and content.  The content can be simple, or it can
>>> be a sequence, choice, or set of elements.  An attribute has a name
>>> and a value.  The value has a simple type.  Types can be derived by
>>> extension or restriction.
>>>
>>> Elements and types can be named globally or locally.  A local element
>>> can be defined by referring to a global element.  An element may be
>>> defined by referring to a global type, or can include the type
>>> definition in the content of its own definition.  There are many
>>> ways to write the "same" schema.
>>>
>>> Thus, XML schema is quite complex.  Here, we limit our concern to
>>> mapping elements of complex type to frames.  The only simple types
>>> we will handle are the primitive types supported by RIF DTB.  Our
>>> contribution is mainly to define how to construct IRIs of class
>>> constants and slot name constants from the XML schema.
>>>
>>> This is a "strawman by example", so we start with an example 
>>> document (from [2]).
>>>
>>> Example XML Document
>>> --------------------
>>>
>>> <shiporder orderid="889923" xmlns="http://example.org">
>>> <orderperson>John Smith</orderperson>
>>> <shipto>
>>>  <name>Ola Nordmann</name>
>>>  <address>Langgt 23</address>
>>>  <city>4000 Stavanger</city>
>>>  <country>Norway</country>
>>> </shipto>
>>> <item>
>>>  <title>Empire Burlesque</title>
>>>  <note>Special Edition</note>
>>>  <quantity>1</quantity>
>>>  <price>10.90</price>
>>> </item>
>>> <item>
>>>  <title>Hide your heart</title>
>>>  <quantity>1</quantity>
>>>  <price>9.90</price>
>>> </item>
>>> </shiporder>
>>>
>>> We consider 3 ways to write a schema for the above document.
>>>
>>> 1. One big element
>>> ------------------
>>>
>>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>           targetNamespace="http://example.org">
>>>
>>> <xs:element name="shiporder">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element name="orderperson" type="xs:string"/>
>>>   <xs:element name="shipto">
>>>    <xs:complexType>
>>>     <xs:sequence>
>>>      <xs:element name="name" type="xs:string"/>
>>>      <xs:element name="address" type="xs:string"/>
>>>      <xs:element name="city" type="xs:string"/>
>>>      <xs:element name="country" type="xs:string"/>
>>>     </xs:sequence>
>>>    </xs:complexType>
>>>   </xs:element>
>>>   <xs:element name="item" maxOccurs="unbounded">
>>>    <xs:complexType>
>>>     <xs:sequence>
>>>      <xs:element name="title" type="xs:string"/>
>>>      <xs:element name="note" type="xs:string" minOccurs="0"/>
>>>      <xs:element name="quantity" type="xs:positiveInteger"/>
>>>      <xs:element name="price" type="xs:decimal"/>
>>>     </xs:sequence>
>>>    </xs:complexType>
>>>   </xs:element>
>>>  </xs:sequence>
>>>  <xs:attribute name="orderid" type="xs:string" use="required"/>
>>> </xs:complexType>
>>> </xs:element>
>>>
>>> </xs:schema>
>>>
>>> Using the above schema, we represent the shiporder using the 
>>> following RIF-PS:
>>>
>>> Prefix(tns http://example.org)
>>>
>>> _obj1#<tns:/shiporder>
>>> _obj1[<tns:/shiporder@orderid>     -> "889923"
>>>      <tns:/shiporder/orderperson> -> "John Smith"
>>>      <tns:/shiporder/shipto>      -> _obj2
>>>      <tns:/shiporder/item>        -> _obj3
>>>      <tns:/shiporder/item>        -> _obj4
>>> ]
>>>
>>> _obj2#<tns:/shiporder/shipto>
>>> _obj2[<tns:/shiporder/shipto/name>    -> "Ola Nordmann"
>>>      <tns:/shiporder/shipto/address> -> "Langgt 23"
>>>      <tns:/shiporder/shipto/city>    -> "4000 Stavanger"
>>>      <tns:/shiporder/shipto/country> -> "Norway"
>>> ]
>>>
>>> _obj3#<tns:/shiporder/item>
>>> _obj3[<tns:/shiporder/item/title>    -> "Empire Burlesque"
>>>      <tns:/shiporder/item/note>     -> "Special Edition"
>>>      <tns:/shiporder/item/quantity> -> 1
>>>      <tns:/shiporder/item/price>    -> 10.90
>>> ]
>>>
>>> _obj4#<tns:/shiporder/item>
>>> _obj4[<tns:/shiporder/item/title>    -> "Hide your heart"
>>>      <tns:/shiporder/item/quantity> -> 1
>>>      <tns:/shiporder/item/price>    -> 9.90
>>> ]
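
To make the path-based naming above concrete, here is a minimal Python sketch (not part of the proposal) that derives the same class and slot IRIs directly from an instance document. It assumes the "one big element" schema style, so every IRI is simply the path from the root. It has no schema to consult, so it treats any element with children or attributes as complex and leaves all values as strings rather than typing them. The names to_frames and local, and the shortened DOC, are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from itertools import count

# Shortened version of the example document.
DOC = """<shiporder orderid="889923" xmlns="http://example.org">
 <orderperson>John Smith</orderperson>
 <shipto>
  <name>Ola Nordmann</name>
  <country>Norway</country>
 </shipto>
 <item><title>Empire Burlesque</title><quantity>1</quantity></item>
 <item><title>Hide your heart</title><quantity>1</quantity></item>
</shiporder>"""


def local(tag):
    """Strip the '{namespace}' part ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def to_frames(elem, class_iri, ids, frames):
    """Map one complex element to a frame and return its object id."""
    obj = f"_obj{next(ids)}"
    slots = []
    frames.append((obj, class_iri, slots))
    # Attributes become slots named <class>@<attribute>.
    for name, value in elem.attrib.items():
        slots.append((f"{class_iri}@{name}", value))
    # Child elements become slots named <class>/<child>.
    for child in elem:
        child_iri = f"{class_iri}/{local(child.tag)}"
        if len(child) or child.attrib:
            # Nested complex content: the slot value is another object.
            slots.append((child_iri, to_frames(child, child_iri, ids, frames)))
        else:
            slots.append((child_iri, (child.text or "").strip()))
    return obj


root = ET.fromstring(DOC)
frames = []
to_frames(root, "tns:/" + local(root.tag), count(1), frames)
for obj, cls, slots in frames:
    print(f"{obj}#<{cls}>")
    for slot, value in slots:
        print(f"  <{slot}> -> {value}")
```

Because the slot name repeats for each item, the two item elements simply show up as two values of the same slot, as in the frames above.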
>>>
>>>
>>> 2. Refs to Global Elements and Attributes
>>> -----------------------------------------
>>>
>>> A second equivalent schema uses the following style.
>>>
>>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>           targetNamespace="http://example.org">
>>>
>>> <!-- definition of simple elements -->
>>> <xs:element name="orderperson" type="xs:string"/>
>>> <xs:element name="name" type="xs:string"/>
>>> <xs:element name="address" type="xs:string"/>
>>> <xs:element name="city" type="xs:string"/>
>>> <xs:element name="country" type="xs:string"/>
>>> <xs:element name="title" type="xs:string"/>
>>> <xs:element name="note" type="xs:string"/>
>>> <xs:element name="quantity" type="xs:positiveInteger"/>
>>> <xs:element name="price" type="xs:decimal"/>
>>>
>>> <!-- definition of attributes -->
>>> <xs:attribute name="orderid" type="xs:string"/>
>>>
>>> <!-- definition of complex elements -->
>>> <xs:element name="shipto">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element ref="name"/>
>>>   <xs:element ref="address"/>
>>>   <xs:element ref="city"/>
>>>   <xs:element ref="country"/>
>>>  </xs:sequence>
>>> </xs:complexType>
>>> </xs:element>
>>> <xs:element name="item">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element ref="title"/>
>>>   <xs:element ref="note" minOccurs="0"/>
>>>   <xs:element ref="quantity"/>
>>>   <xs:element ref="price"/>
>>>  </xs:sequence>
>>> </xs:complexType>
>>> </xs:element>
>>>
>>> <xs:element name="shiporder">
>>> <xs:complexType>
>>>  <xs:sequence>
>>>   <xs:element ref="orderperson"/>
>>>   <xs:element ref="shipto"/>
>>>   <xs:element ref="item" maxOccurs="unbounded"/>
>>>  </xs:sequence>
>>>  <xs:attribute ref="orderid" use="required"/>
>>> </xs:complexType>
>>> </xs:element>
>>>
>>> </xs:schema>
>>>
>>> Using the above schema, we represent the shiporder using the 
>>> following RIF-PS:
>>>
>>> Prefix(tns http://example.org)
>>>
>>> _obj1#<tns:/shiporder>
>>> _obj1[<tns:@orderid>     -> "889923"
>>>      <tns:/orderperson> -> "John Smith"
>>>      <tns:/shipto>      -> _obj2
>>>      <tns:/item>        -> _obj3
>>>      <tns:/item>        -> _obj4
>>> ]
>>>
>>> _obj2#<tns:/shipto>
>>> _obj2[<tns:/name>    -> "Ola Nordmann"
>>>      <tns:/address> -> "Langgt 23"
>>>      <tns:/city>    -> "4000 Stavanger"
>>>      <tns:/country> -> "Norway"
>>> ]
>>>
>>> _obj3#<tns:/item>
>>> _obj3[<tns:/title>    -> "Empire Burlesque"
>>>      <tns:/note>     -> "Special Edition"
>>>      <tns:/quantity> -> 1
>>>      <tns:/price>    -> 10.90
>>> ]
>>>
>>> _obj4#<tns:/item>
>>> _obj4[<tns:/title>    -> "Hide your heart"
>>>      <tns:/quantity> -> 1
>>>      <tns:/price>    -> 9.90
>>> ]
>>>
>>> 3. Named Types
>>> --------------
>>>
>>> The third style of schema uses named types.
>>>
>>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>           targetNamespace="http://example.org">
>>>
>>> <xs:simpleType name="stringtype">
>>> <xs:restriction base="xs:string"/>
>>> </xs:simpleType>
>>>
>>> <xs:simpleType name="inttype">
>>> <xs:restriction base="xs:positiveInteger"/>
>>> </xs:simpleType>
>>>
>>> <xs:simpleType name="dectype">
>>> <xs:restriction base="xs:decimal"/>
>>> </xs:simpleType>
>>>
>>> <xs:simpleType name="orderidtype">
>>> <xs:restriction base="xs:string">
>>>  <xs:pattern value="[0-9]{6}"/>
>>> </xs:restriction>
>>> </xs:simpleType>
>>>
>>> <xs:complexType name="shiptotype">
>>> <xs:sequence>
>>>  <xs:element name="name" type="stringtype"/>
>>>  <xs:element name="address" type="stringtype"/>
>>>  <xs:element name="city" type="stringtype"/>
>>>  <xs:element name="country" type="stringtype"/>
>>> </xs:sequence>
>>> </xs:complexType>
>>>
>>> <xs:complexType name="itemtype">
>>> <xs:sequence>
>>>  <xs:element name="title" type="stringtype"/>
>>>  <xs:element name="note" type="stringtype" minOccurs="0"/>
>>>  <xs:element name="quantity" type="inttype"/>
>>>  <xs:element name="price" type="dectype"/>
>>> </xs:sequence>
>>> </xs:complexType>
>>>
>>> <xs:complexType name="shipordertype">
>>> <xs:sequence>
>>>  <xs:element name="orderperson" type="stringtype"/>
>>>  <xs:element name="shipto" type="shiptotype"/>
>>>  <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
>>> </xs:sequence>
>>> <xs:attribute name="orderid" type="orderidtype" use="required"/>
>>> </xs:complexType>
>>>
>>> <xs:element name="shiporder" type="shipordertype"/>
>>>
>>> </xs:schema>
>>>
>>> Using the above schema, we represent the shiporder using the 
>>> following RIF-PS:
>>>
>>> Prefix(tns http://example.org)
>>>
>>> _obj1#<tns:/shiporder>
>>> <tns:/shiporder>##<tns:/shipordertype>
>>> _obj1[<tns:/shipordertype@orderid>     -> "889923"
>>>      <tns:/shipordertype/orderperson> -> "John Smith"
>>>      <tns:/shipordertype/shipto>      -> _obj2
>>>      <tns:/shipordertype/item>        -> _obj3
>>>      <tns:/shipordertype/item>        -> _obj4
>>> ]
>>>
>>> _obj2#<tns:/shiptotype>
>>> _obj2[<tns:/shiptotype/name>    -> "Ola Nordmann"
>>>      <tns:/shiptotype/address> -> "Langgt 23"
>>>      <tns:/shiptotype/city>    -> "4000 Stavanger"
>>>      <tns:/shiptotype/country> -> "Norway"
>>> ]
>>>
>>> _obj3#<tns:/itemtype>
>>> _obj3[<tns:/itemtype/title>    -> "Empire Burlesque"
>>>      <tns:/itemtype/note>     -> "Special Edition"
>>>      <tns:/itemtype/quantity> -> 1
>>>      <tns:/itemtype/price>    -> 10.90
>>> ]
>>>
>>> _obj4#<tns:/itemtype>
>>> _obj4[<tns:/itemtype/title>    -> "Hide your heart"
>>>      <tns:/itemtype/quantity> -> 1
>>>      <tns:/itemtype/price>    -> 9.90
>>> ]
>>>
>>> General Rules
>>> -------------
>>>
>>> 1. order of sequences is not preserved
>>>
>>> 2. cardinality (minOccurs, maxOccurs) is ignored
>>>
>>> 3. simple types are ignored
>>>
>>> 4. the IRI for an element e (IRI(e)) is given by
>>>  a. <tns:/e> if e is a global element, where tns is the
>>>     targetNamespace of the schema
>>>  b. <C:/e> otherwise, where C is the IRI of the containing
>>>     complexType of e
>>>
>>> 5. the IRI for an attribute a (IRI(a)) is given by
>>>  a. <tns:@a> if a is a global attribute
>>>  b. <C:@a> otherwise, where C is the IRI of the containing
>>>     complexType of a
>>>
>>> 6. the IRI for a complexType c (IRI(c)) is given by
>>>  a. <tns:/c> if c is a global complexType
>>>  b. <E:c> otherwise, where E is the IRI of the element containing c
>>>
>>> 7. an instance of an XML element e with complexType c maps to an
>>>    object _o that is a member of IRI(e), i.e., _o#IRI(e).  If
>>>    IRI(e) != IRI(c), then additionally we have the axiom
>>>    IRI(e)##IRI(c).
>>>
>>> 8. an element f contained in e (whether in a sequence, choice, or
>>>    all) is a frame slot of _o named IRI(f), e.g. _o[IRI(f)->...]
>>>
>>> 9. if complexType sub extends a complexType sup, then
>>>    IRI(sub)##IRI(sup)
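
Rules 4 to 7 can be read as a handful of string-building functions. The Python sketch below is one possible reading, not a definitive one. The function names are invented, "tns" stands for the declared prefix, and the IRI of an anonymous complexType is taken to be the IRI of its enclosing element, which is my interpretation of rule 6b based on the first example, where no ## axiom appears.

```python
def global_iri(name, tns="tns"):
    """Rules 4a and 6a: a global element or complexType is <tns:/name>."""
    return f"{tns}:/{name}"


def local_element_iri(container_type_iri, name):
    """Rule 4b: a local element is named relative to its containing complexType."""
    return f"{container_type_iri}/{name}"


def attribute_iri(name, container_type_iri=None, tns="tns"):
    """Rule 5: <tns:@a> for a global attribute, <C:@a> for a local one."""
    if container_type_iri is None:
        return f"{tns}:@{name}"
    return f"{container_type_iri}@{name}"


def anonymous_type_iri(element_iri):
    """Rule 6b as interpreted here: an anonymous type reuses its element's IRI."""
    return element_iri


def membership_axioms(obj, element_iri, type_iri):
    """Rule 7: membership in IRI(e), plus IRI(e)##IRI(c) when the two differ."""
    axioms = [f"{obj}#<{element_iri}>"]
    if element_iri != type_iri:
        axioms.append(f"<{element_iri}>##<{type_iri}>")
    return axioms


# Named-types style: global element shiporder of global type shipordertype.
order_type = global_iri("shipordertype")
order_elem = global_iri("shiporder")
print(local_element_iri(order_type, "shipto"))
print(attribute_iri("orderid", order_type))
print(membership_axioms("_obj1", order_elem, order_type))

# One-big-element style: the anonymous type shares the element's IRI,
# so only the membership fact is produced.
print(membership_axioms("_obj1", order_elem, anonymous_type_iri(order_elem)))
```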
>>>
>>> Issues
>>> ------
>>>
>>> Slot names are not disjoint from class names.
>>>
>>> We could of course map much more schema information to axioms.  E.g.
>>> maxOccurs=1 could be expressed as ?x=?y :- _o[slot1->?x slot1->?y].
>>> But that's not Core.  Are there other things expressible in Core
>>> that we should map?
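
Outside the rule language, the same maxOccurs=1 constraint can at least be checked after the fact. The Python sketch below is only an approximation of the rule above: it flags an object that holds two distinct values in one slot, which corresponds to the equality rule only if distinct constants are assumed to denote distinct things. multi_valued and the flat (object, slot, value) triples are invented for the illustration.

```python
from collections import defaultdict


def multi_valued(facts, slot):
    """Return {object: sorted values} for objects with more than one
    distinct value in the given slot."""
    seen = defaultdict(set)
    for obj, name, value in facts:
        if name == slot:
            seen[obj].add(value)
    return {obj: sorted(vals) for obj, vals in seen.items() if len(vals) > 1}


# Slot values of _obj1 from the first example, as flat triples.
facts = [
    ("_obj1", "tns:/shiporder/shipto", "_obj2"),
    ("_obj1", "tns:/shiporder/item", "_obj3"),
    ("_obj1", "tns:/shiporder/item", "_obj4"),
]

shipto_check = multi_valued(facts, "tns:/shiporder/shipto")
item_check = multi_valued(facts, "tns:/shiporder/item")
print(shipto_check)
print(item_check)
```

The shipto slot passes, while item (declared maxOccurs="unbounded" in the schema) would be reported if it were wrongly treated as single-valued.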
>>>
>>> Should we care if the trailing char of tns is '/'?  Or should we use
>>> '#' instead of the first '/' in the CURIE?  Or should we use '/' or
>>> '/@' instead of '@' for attributes?
>>>
>>> Neither '#' nor '##' is legal in the conclusion in Core.  Probably
>>> they should be allowed in ground facts.
>>>
>>> Because we don't capture all the schema constraints, it may be
>>> impossible to serialize a collection of frames computed by a
>>> seemingly consistent ruleset into a schema-valid XML document.
>>>
>>> Fully-striped XML data doesn't need a schema, and probably should
>>> follow an RDF-style mapping (not covered here).
>>>
>>>
>>> [1] http://www.w3.org/TR/xmlschema-0/
>>> [2] http://www.w3schools.com/Schema/schema_example.asp
>>>
>>
>>
>>
>

Received on Wednesday, 15 October 2008 05:22:43 UTC