- From: Gary Hallmark <gary.hallmark@oracle.com>
- Date: Mon, 13 Oct 2008 16:05:27 -0700
- To: RIF WG <public-rif-wg@w3.org>
This is in response to ACTION-591.

Mapping XML Schema valid XML Data to RIF Frames
===============================================

This is a strawman for mapping XML documents whose structure is
described by an XML schema[1] to and from RIF Core frames.

An XML element has a type as defined by its schema. The type can be
simple or complex. Simple types can be atomic, lists, or unions. Atomic
types can be primitive (e.g. xs:string), or they can be enumerations or
restrictions of atomic types. A complex type can have attributes and
content. The content can be simple, or it can be a sequence, choice, or
set of elements. An attribute has a name and a value. The value has a
simple type. Types can be derived by extension or restriction. Elements
and types can be named globally or locally. A local element can be
defined by referring to a global element. An element may be defined by
referring to a global type, or can include the type definition in the
content of its own definition. There are many ways to write the "same"
schema.

Thus, XML schema is quite complex. Here, we limit our concern to mapping
elements of complex type to frames. The only simple types we will handle
are the primitive types supported by RIF DTB. Our contribution is mainly
to define how to construct IRIs of class constants and slot name
constants from the XML schema.

This is a "strawman by example", so we start with an example document
(from [2]).

Example XML Document
--------------------

<shiporder orderid="889923" xmlns="http://example.org">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>

We consider 3 ways to write a schema for the above document.

1. One big element
------------------

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="address" type="xs:string"/>
              <xs:element name="city" type="xs:string"/>
              <xs:element name="country" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="note" type="xs:string" minOccurs="0"/>
              <xs:element name="quantity" type="xs:positiveInteger"/>
              <xs:element name="price" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

Using the above schema, we represent the shiporder using the following
RIF-PS:

Prefix(tns http://example.org)

_obj1#<tns:/shiporder>
_obj1[<tns:/shiporder@orderid> -> "889923"
      <tns:/shiporder/orderperson> -> "John Smith"
      <tns:/shiporder/shipto> -> _obj2
      <tns:/shiporder/item> -> _obj3
      <tns:/shiporder/item> -> _obj4 ]

_obj2#<tns:/shiporder/shipto>
_obj2[<tns:/shiporder/shipto/name> -> "Ola Nordmann"
      <tns:/shiporder/shipto/address> -> "Langgt 23"
      <tns:/shiporder/shipto/city> -> "4000 Stavanger"
      <tns:/shiporder/shipto/country> -> "Norway" ]

_obj3#<tns:/shiporder/item>
_obj3[<tns:/shiporder/item/title> -> "Empire Burlesque"
      <tns:/shiporder/item/note> -> "Special Edition"
      <tns:/shiporder/item/quantity> -> 1
      <tns:/shiporder/item/price> -> 10.90 ]

_obj4#<tns:/shiporder/item>
_obj4[<tns:/shiporder/item/title> -> "Hide your heart"
      <tns:/shiporder/item/quantity> -> 1
      <tns:/shiporder/item/price> -> 9.90 ]

2. Refs to Global Elements and Attributes
-----------------------------------------

A second equivalent schema uses the following style.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org">

  <!-- definition of simple elements -->
  <xs:element name="orderperson" type="xs:string"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="address" type="xs:string"/>
  <xs:element name="city" type="xs:string"/>
  <xs:element name="country" type="xs:string"/>
  <xs:element name="title" type="xs:string"/>
  <xs:element name="note" type="xs:string"/>
  <xs:element name="quantity" type="xs:positiveInteger"/>
  <xs:element name="price" type="xs:decimal"/>

  <!-- definition of attributes -->
  <xs:attribute name="orderid" type="xs:string"/>

  <!-- definition of complex elements -->
  <xs:element name="shipto">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="address"/>
        <xs:element ref="city"/>
        <xs:element ref="country"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="item">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element ref="note" minOccurs="0"/>
        <xs:element ref="quantity"/>
        <xs:element ref="price"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="orderperson"/>
        <xs:element ref="shipto"/>
        <xs:element ref="item" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="orderid" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

Using the above schema, we represent the shiporder using the following
RIF-PS:

Prefix(tns http://example.org)

_obj1#<tns:/shiporder>
_obj1[<tns:@orderid> -> "889923"
      <tns:/orderperson> -> "John Smith"
      <tns:/shipto> -> _obj2
      <tns:/item> -> _obj3
      <tns:/item> -> _obj4 ]

_obj2#<tns:/shipto>
_obj2[<tns:/name> -> "Ola Nordmann"
      <tns:/address> -> "Langgt 23"
      <tns:/city> -> "4000 Stavanger"
      <tns:/country> -> "Norway" ]

_obj3#<tns:/item>
_obj3[<tns:/title> -> "Empire Burlesque"
      <tns:/note> -> "Special Edition"
      <tns:/quantity> -> 1
      <tns:/price> -> 10.90 ]

_obj4#<tns:/item>
_obj4[<tns:/title> -> "Hide your heart"
      <tns:/quantity> -> 1
      <tns:/price> -> 9.90 ]

3. Named Types
--------------

The third style of schema uses named types.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org">

  <xs:simpleType name="stringtype">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <xs:simpleType name="inttype">
    <xs:restriction base="xs:positiveInteger"/>
  </xs:simpleType>

  <xs:simpleType name="dectype">
    <xs:restriction base="xs:decimal"/>
  </xs:simpleType>

  <xs:simpleType name="orderidtype">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{6}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:complexType name="shiptotype">
    <xs:sequence>
      <xs:element name="name" type="stringtype"/>
      <xs:element name="address" type="stringtype"/>
      <xs:element name="city" type="stringtype"/>
      <xs:element name="country" type="stringtype"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="itemtype">
    <xs:sequence>
      <xs:element name="title" type="stringtype"/>
      <xs:element name="note" type="stringtype" minOccurs="0"/>
      <xs:element name="quantity" type="inttype"/>
      <xs:element name="price" type="dectype"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="shipordertype">
    <xs:sequence>
      <xs:element name="orderperson" type="stringtype"/>
      <xs:element name="shipto" type="shiptotype"/>
      <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
    </xs:sequence>
    <xs:attribute name="orderid" type="orderidtype" use="required"/>
  </xs:complexType>

  <xs:element name="shiporder" type="shipordertype"/>
</xs:schema>

Using the above schema, we represent the shiporder using the following
RIF-PS:

Prefix(tns http://example.org)

_obj1#<tns:/shiporder>
<tns:/shiporder>##<tns:/shipordertype>
_obj1[<tns:/shipordertype@orderid> -> "889923"
      <tns:/shipordertype/orderperson> -> "John Smith"
      <tns:/shipordertype/shipto> -> _obj2
      <tns:/shipordertype/item> -> _obj3
      <tns:/shipordertype/item> -> _obj4 ]

_obj2#<tns:/shiptotype>
_obj2[<tns:/shiptotype/name> -> "Ola Nordmann"
      <tns:/shiptotype/address> -> "Langgt 23"
      <tns:/shiptotype/city> -> "4000 Stavanger"
      <tns:/shiptotype/country> -> "Norway" ]

_obj3#<tns:/itemtype>
_obj3[<tns:/itemtype/title> -> "Empire Burlesque"
      <tns:/itemtype/note> -> "Special Edition"
      <tns:/itemtype/quantity> -> 1
      <tns:/itemtype/price> -> 10.90 ]

_obj4#<tns:/itemtype>
_obj4[<tns:/itemtype/title> -> "Hide your heart"
      <tns:/itemtype/quantity> -> 1
      <tns:/itemtype/price> -> 9.90 ]

General Rules
-------------

1. order of sequences is not preserved
2. cardinality (minOccurs, maxOccurs) is ignored
3. simple types are ignored
4. the IRI for an element e (IRI(e)) is given by
   a. <tns:/e> if e is a global element, where tns is the
      targetNamespace of the schema
   b. <C:/e> otherwise, where C is the IRI of the containing
      complexType of e
5. the IRI for an attribute a (IRI(a)) is given by
   a. <tns:@a> if a is a global attribute
   b. <C:@a> otherwise, where C is the IRI of the containing
      complexType of a
6. the IRI for a complexType c (IRI(c)) is given by
   a. <tns:/c> if c is a global complexType
   b. <E:c> otherwise, where E is the IRI of the element containing c
7. an instance of an XML element e with complexType c maps to an object
   _o that is a member of IRI(e). I.e., _o#IRI(e). If IRI(e) != IRI(c),
   then additionally we have the axiom IRI(e)##IRI(c).
8. an element f contained in e (whether in a sequence, choice, or all)
   is a frame slot of _o named IRI(f). E.g. _o[IRI(f)->...]
9. if complexType sub extends a complexType sup, then
   IRI(sub)##IRI(sup)

Issues
------

Slot names are not disjoint from class names.

We could of course map much more schema information to axioms. E.g.
maxOccurs=1 could be expressed as

  ?x=?y :- _o[slot1->?x slot1->?y].

But that's not Core. Are there other things expressible in Core that we
should map?

Should we care if the trailing char of tns is '/'? or should we use '#'
instead of the first '/' in the curie? or should we use '/' or '/@'
instead of '@' for attributes?

Neither '#' nor '##' is legal in the conclusion in Core. Probably they
should be allowed in ground facts.

Because we don't capture all the schema constraints, it may be
impossible to serialize a collection of frames computed by a seemingly
consistent ruleset into a schema-valid XML document.

Fully-striped XML data doesn't need a schema, and probably should follow
an RDF-style mapping (not covered here).

[1] http://www.w3.org/TR/xmlschema-0/
[2] http://www.w3schools.com/Schema/schema_example.asp
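
As an illustration only, General Rules 4 to 6 can be sketched in Python as a walk over the schema that collects class and slot names as CURIE strings. The function name schema_iris, the string representation, and the decision to give an anonymous complexType the IRI of its enclosing element (one possible reading of rule 6b that agrees with the first example) are assumptions of this sketch, not part of the proposal. It does not resolve type= references, so the subclass axioms of rules 7 and 9 are not produced.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def schema_iris(xsd_text, prefix="tns"):
    """Return (classes, slots) as sets of CURIE strings for one schema."""
    classes, slots = set(), set()

    def local_name(qname):
        # Drop an optional namespace prefix such as "tns:" from a ref value.
        return qname.split(":")[-1]

    def walk(node, type_iri):
        # Visit the content of the complexType whose IRI is type_iri.
        for child in node:
            if child.tag == XS + "element":
                if "ref" in child.attrib:      # rule 4a: ref to a global element
                    iri = f"{prefix}:/{local_name(child.attrib['ref'])}"
                else:                          # rule 4b: local element
                    iri = f"{type_iri}/{child.attrib['name']}"
                slots.add(iri)
                inline = child.find(XS + "complexType")
                if inline is not None:         # assumed reading of rule 6b
                    classes.add(iri)
                    walk(inline, iri)
            elif child.tag == XS + "attribute":
                if "ref" in child.attrib:      # rule 5a: ref to a global attribute
                    slots.add(f"{prefix}:@{local_name(child.attrib['ref'])}")
                else:                          # rule 5b: local attribute
                    slots.add(f"{type_iri}@{child.attrib['name']}")
            else:
                # sequence, choice, all, ...: still the same containing type
                walk(child, type_iri)

    for top in ET.fromstring(xsd_text):
        name = top.attrib.get("name")
        if top.tag == XS + "complexType":      # rule 6a: global complexType
            classes.add(f"{prefix}:/{name}")
            walk(top, f"{prefix}:/{name}")
        elif top.tag == XS + "element":
            inline = top.find(XS + "complexType")
            if inline is not None:             # global element, anonymous type
                classes.add(f"{prefix}:/{name}")
                walk(inline, f"{prefix}:/{name}")
    return classes, slots


# A cut-down version of the "one big element" schema.
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

classes, slots = schema_iris(XSD)
print(sorted(classes))
print(sorted(slots))
```

On this cut-down schema the slot names come out in the same form as in the first RIF-PS example, e.g. tns:/shiporder@orderid and tns:/shiporder/shipto/name.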
Received on Monday, 13 October 2008 23:07:32 UTC