- From: Changhai Ke <cke@ilog.fr>
- Date: Tue, 14 Oct 2008 12:17:14 +0200
- To: "Gary Hallmark" <gary.hallmark@oracle.com>, "RIF WG" <public-rif-wg@w3.org>
Gary and all,

I fully agree with the 3 styles of XML schema. This leads me to think that PRD should also start a schema, because we'll be faced with a question of style. The BNF syntax is good for discussion, but isn't enough.

Note that I think one important thing for the PRD schema will be its reusability and also its extensibility. With this criterion in mind, the third style is my preference. But let's reserve this discussion for later.

This e-mail is linked to Action-591. Does the section "General rules" constitute your proposal for referencing XML elements from PRD? It's also not clear what the answer to issue-37 is.

Changhai

-----Original Message-----
From: public-rif-wg-request@w3.org [mailto:public-rif-wg-request@w3.org] On Behalf Of Gary Hallmark
Sent: Tuesday, 14 October 2008 01:05
To: RIF WG
Subject: straw proposal for mapping XML schema valid XML data to RIF frames

This is in response to ACTION-591.

Mapping XML Schema valid XML Data to RIF Frames
===============================================

This is a strawman for mapping XML documents whose structure is described by an XML schema[1] to and from RIF Core frames.

An XML element has a type as defined by its schema. The type can be simple or complex. Simple types can be atomic, lists, or unions. Atomic types can be primitive (e.g. xs:string), or they can be enumerations or restrictions of atomic types. A complex type can have attributes and content. The content can be simple, or it can be a sequence, choice, or set of elements. An attribute has a name and a value. The value has a simple type. Types can be derived by extension or restriction. Elements and types can be named globally or locally. A local element can be defined by referring to a global element. An element may be defined by referring to a global type, or can include the type definition in the content of its own definition. There are many ways to write the "same" schema. Thus, XML schema is quite complex.
Here, we limit our concern to mapping elements of complex type to frames. The only simple types we will handle are the primitive types supported by RIF DTB. Our contribution is mainly to define how to construct IRIs of class constants and slot name constants from the XML schema.

This is a "strawman by example", so we start with an example document (from [2]).

Example XML Document
--------------------

<shiporder orderid="889923" xmlns="http://example.org">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>

We consider 3 ways to write a schema for the above document.

1. One big element
------------------

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://example.org">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="address" type="xs:string"/>
              <xs:element name="city" type="xs:string"/>
              <xs:element name="country" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="note" type="xs:string" minOccurs="0"/>
              <xs:element name="quantity" type="xs:positiveInteger"/>
              <xs:element name="price" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

Using the above schema, we represent the shiporder using the following RIF-PS:

Prefix(tns http://example.org)
_obj1#<tns:/shiporder>
_obj1[<tns:/shiporder@orderid> -> "889923"
      <tns:/shiporder/orderperson> -> "John Smith"
      <tns:/shiporder/shipto> -> _obj2
      <tns:/shiporder/item> -> _obj3
      <tns:/shiporder/item> -> _obj4
]
_obj2#<tns:/shiporder/shipto>
_obj2[<tns:/shiporder/shipto/name> -> "Ola Nordmann"
      <tns:/shiporder/shipto/address> -> "Langgt 23"
      <tns:/shiporder/shipto/city> -> "4000 Stavanger"
      <tns:/shiporder/shipto/country> -> "Norway"
]
_obj3#<tns:/shiporder/item>
_obj3[<tns:/shiporder/item/title> -> "Empire Burlesque"
      <tns:/shiporder/item/note> -> "Special Edition"
      <tns:/shiporder/item/quantity> -> 1
      <tns:/shiporder/item/price> -> 10.90
]
_obj4#<tns:/shiporder/item>
_obj4[<tns:/shiporder/item/title> -> "Hide your heart"
      <tns:/shiporder/item/quantity> -> 1
      <tns:/shiporder/item/price> -> 9.90
]

2. Refs to Global Elements and Attributes
-----------------------------------------

A second equivalent schema uses the following style.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://example.org">
  <!-- definition of simple elements -->
  <xs:element name="orderperson" type="xs:string"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="address" type="xs:string"/>
  <xs:element name="city" type="xs:string"/>
  <xs:element name="country" type="xs:string"/>
  <xs:element name="title" type="xs:string"/>
  <xs:element name="note" type="xs:string"/>
  <xs:element name="quantity" type="xs:positiveInteger"/>
  <xs:element name="price" type="xs:decimal"/>
  <!-- definition of attributes -->
  <xs:attribute name="orderid" type="xs:string"/>
  <!-- definition of complex elements -->
  <xs:element name="shipto">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="address"/>
        <xs:element ref="city"/>
        <xs:element ref="country"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="item">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element ref="note" minOccurs="0"/>
        <xs:element ref="quantity"/>
        <xs:element ref="price"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="orderperson"/>
        <xs:element ref="shipto"/>
        <xs:element ref="item" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="orderid" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

Using the above schema, we represent the shiporder using the following RIF-PS:

Prefix(tns http://example.org)
_obj1#<tns:/shiporder>
_obj1[<tns:@orderid> -> "889923"
      <tns:/orderperson> -> "John Smith"
      <tns:/shipto> -> _obj2
      <tns:/item> -> _obj3
      <tns:/item> -> _obj4
]
_obj2#<tns:/shipto>
_obj2[<tns:/name> -> "Ola Nordmann"
      <tns:/address> -> "Langgt 23"
      <tns:/city> -> "4000 Stavanger"
      <tns:/country> -> "Norway"
]
_obj3#<tns:/item>
_obj3[<tns:/title> -> "Empire Burlesque"
      <tns:/note> -> "Special Edition"
      <tns:/quantity> -> 1
      <tns:/price> -> 10.90
]
_obj4#<tns:/item>
_obj4[<tns:/title> -> "Hide your heart"
      <tns:/quantity> -> 1
      <tns:/price> -> 9.90
]

3. Named Types
--------------

The third style of schema uses named types.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://example.org">
  <xs:simpleType name="stringtype">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:simpleType name="inttype">
    <xs:restriction base="xs:positiveInteger"/>
  </xs:simpleType>
  <xs:simpleType name="dectype">
    <xs:restriction base="xs:decimal"/>
  </xs:simpleType>
  <xs:simpleType name="orderidtype">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{6}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="shiptotype">
    <xs:sequence>
      <xs:element name="name" type="stringtype"/>
      <xs:element name="address" type="stringtype"/>
      <xs:element name="city" type="stringtype"/>
      <xs:element name="country" type="stringtype"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="itemtype">
    <xs:sequence>
      <xs:element name="title" type="stringtype"/>
      <xs:element name="note" type="stringtype" minOccurs="0"/>
      <xs:element name="quantity" type="inttype"/>
      <xs:element name="price" type="dectype"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="shipordertype">
    <xs:sequence>
      <xs:element name="orderperson" type="stringtype"/>
      <xs:element name="shipto" type="shiptotype"/>
      <xs:element name="item" maxOccurs="unbounded" type="itemtype"/>
    </xs:sequence>
    <xs:attribute name="orderid" type="orderidtype" use="required"/>
  </xs:complexType>
  <xs:element name="shiporder" type="shipordertype"/>
</xs:schema>

Using the above schema, we represent the shiporder using the following RIF-PS:

Prefix(tns http://example.org)
_obj1#<tns:/shiporder>
<tns:/shiporder>##<tns:/shipordertype>
_obj1[<tns:/shipordertype@orderid> -> "889923"
      <tns:/shipordertype/orderperson> -> "John Smith"
      <tns:/shipordertype/shipto> -> _obj2
      <tns:/shipordertype/item> -> _obj3
      <tns:/shipordertype/item> -> _obj4
]
_obj2#<tns:/shiptotype>
_obj2[<tns:/shiptotype/name> -> "Ola Nordmann"
      <tns:/shiptotype/address> -> "Langgt 23"
      <tns:/shiptotype/city> -> "4000 Stavanger"
      <tns:/shiptotype/country> -> "Norway"
]
_obj3#<tns:/itemtype>
_obj3[<tns:/itemtype/title> -> "Empire Burlesque"
      <tns:/itemtype/note> -> "Special Edition"
      <tns:/itemtype/quantity> -> 1
      <tns:/itemtype/price> -> 10.90
]
_obj4#<tns:/itemtype>
_obj4[<tns:/itemtype/title> -> "Hide your heart"
      <tns:/itemtype/quantity> -> 1
      <tns:/itemtype/price> -> 9.90
]

General Rules
-------------

1. order of sequences is not preserved
2. cardinality (minOccurs, maxOccurs) is ignored
3. simple types are ignored
4. the IRI for an element e (IRI(e)) is given by
   a. <tns:/e> if e is a global element, where tns is the targetNamespace of the schema
   b. <C:/e> otherwise, where C is the IRI of the containing complexType of e
5. the IRI for an attribute a (IRI(a)) is given by
   a. <tns:@a> if a is a global attribute
   b. <C:@a> otherwise, where C is the IRI of the containing complexType of a
6. the IRI for a complexType c (IRI(c)) is given by
   a. <tns:/c> if c is a global complexType
   b. <E:c> otherwise, where E is the IRI of the element containing c
7. an instance of an XML element e with complexType c maps to an object _o that is a member of IRI(e). I.e., _o#IRI(e). If IRI(e) != IRI(c), then additionally we have the axiom IRI(e)##IRI(c).
8. an element f contained in e (whether in a sequence, choice, or all) is a frame slot of _o named IRI(f). E.g. _o[IRI(f)->...]
9. if complexType sub extends a complexType sup, then IRI(sub)##IRI(sup)

Issues
------

Slot names are not disjoint from class names.

We could of course map much more schema information to axioms. E.g. maxOccurs=1 could be expressed as ?x=?y :- _o[slot1->?x slot1->?y]. But that's not Core. Are there other things expressible in Core that we should map?

Should we care if the trailing char of tns is '/'? or should we use '#' instead of the first '/' in the curie? or should we use '/' or '/@' instead of '@' for attributes?

Neither '#' nor '##' is legal in the conclusion in Core. Probably they should be allowed in ground facts.

Because we don't capture all the schema constraints, it may be impossible to serialize a collection of frames computed by a seemingly consistent ruleset into a schema-valid XML document.

Fully-striped XML data doesn't need a schema, and probably should follow an RDF-style mapping (not covered here).

[1] http://www.w3.org/TR/xmlschema-0/
[2] http://www.w3schools.com/Schema/schema_example.asp
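
As a rough illustration of General Rules 4 to 6, the IRI construction could be sketched in Python as below. This is only a sketch under several assumptions that are not part of the proposal: the function name schema_iris is hypothetical, the prefix is fixed to "tns", an anonymous complexType is given the same IRI as its containing element (which is what the "One big element" example suggests), nested model groups are not handled, and the schema is a trimmed version of the first style.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
GROUPS = {XS + "sequence", XS + "choice", XS + "all"}

# Trimmed version of the "One big element" schema (assumption: shortened
# for illustration only).
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def schema_iris(schema_xml, prefix="tns"):
    """Return the CURIEs assigned to elements and attributes (rules 4-6)."""
    iris = []

    def visit_attribute(attr, container):
        # Rule 5: global (or referenced) attributes hang off the namespace,
        # local ones off the IRI of the containing complexType.
        name = (attr.get("ref") or attr.get("name")).split(":")[-1]
        is_global = container is None or attr.get("ref") is not None
        iris.append(f"{prefix}:@{name}" if is_global else f"{container}@{name}")

    def visit_element(elem, container):
        # Rule 4: same distinction for elements, with '/' as separator.
        name = (elem.get("ref") or elem.get("name")).split(":")[-1]
        is_global = container is None or elem.get("ref") is not None
        iri = f"{prefix}:/{name}" if is_global else f"{container}/{name}"
        iris.append(iri)
        anonymous = elem.find(XS + "complexType")
        if anonymous is not None:
            # Assumption for rule 6b: an anonymous type reuses the element IRI.
            walk_type(anonymous, iri)

    def walk_type(ctype, type_iri):
        # Only direct attributes and one level of sequence/choice/all.
        for child in ctype:
            if child.tag == XS + "attribute":
                visit_attribute(child, type_iri)
            elif child.tag in GROUPS:
                for member in child:
                    if member.tag == XS + "element":
                        visit_element(member, type_iri)

    for top in ET.fromstring(schema_xml):
        if top.tag == XS + "element":
            visit_element(top, None)
        elif top.tag == XS + "attribute":
            visit_attribute(top, None)
        elif top.tag == XS + "complexType":
            # Rule 6a: a global complexType is named from the namespace.
            walk_type(top, f"{prefix}:/{top.get('name')}")
    return iris


for iri in schema_iris(SCHEMA):
    print(iri)
```

For the trimmed schema this yields names such as tns:/shiporder/shipto/name and tns:/shiporder@orderid, in line with the first RIF-PS listing. Because an element with ref is treated as global and a named complexType is walked from the top level, the same function should also produce the shorter names of the second and third styles, though that is not exercised here.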
Received on Tuesday, 14 October 2008 10:18:59 UTC