- From: Anders W. Tell <anderst@toolsmiths.se>
- Date: Wed, 10 May 2000 09:47:27 +0200
- To: WWW XML Schema Comment <www-xml-schema-comments@w3.org>
- CC: xml-dev@xml.org
Problem:
----------
A common phenomenon that surfaces now and then in the markup world is what some authors call "Micro-parsing". This is the situation where Schema writers define an XML attribute to contain structured information, which creates a need for customized parsers, hence the term.

Two examples are:

XPath expression in XSL:
    match="/cars/car[@name='volvo']"

Path in SVG:
    <path d="M 100 100 L 140 100 L 120 140 z"/>

Is this not a paradox? A markup language which cannot be used for markup anymore? Of course all markup languages have a limit, and maybe XML's limit has been reached.

Why:
What are the reasons for encoding complex information in a single attribute? The reasons I have seen so far are:

* compression, which produces smaller XML streams (SVG paths,...)
* usage of attribute strings for readability (XPath expressions,...)
* usage of attribute strings for compactness (XPath expressions,...)
* ...

The following suggestion is an attempt to "internalize" these encoding scenarios, to capture as much as possible of the encoding information inside XML Schemas instead of relying on externally created and managed documentation.

Another side effect of the proposal is that it is now possible to have DOM access to structured attributes as if they were encoded as XML elements. For Grove enthusiasts it is also possible to view (with a little effort ;)) attributes as hierarchical nodes.

So here goes...

Solution:
- - - - - - - - - - - - - - - - - - -
First a few initial short definitions:

* Encoding "Stereotype" <=> something that should be encoded. It is defined by an information model, which may be defined in terms of one or more information items (nodes/properties,...).

* Encoding "Form" <=> principles for how nodes/properties in a Stereotype's information model must be encoded as strings or XML elements.
  (the following suggestion implies two forms, one for attribute encoding and one for XML element encoding)

* "Attribute-Micro-Parser" <=> a software artifact which encodes and decodes XML attribute strings to/from XML elements.

- - - - - - - - - - - - - - - - - - -

* Add a new XML Schema data type which represents "MicroParsed" attribute values. Make it a subtype of "string" with all its facets. Schema writers can now derive their own MicroParsed data types, one for each stereotype they want to encode as an attribute.

* In this new data type add a reference to a complexType. This referenced schema defines how to encode the contents (information model) of the attribute string ("attribute form") as an XML element tree ("element form").
  Note: Maybe this reference should be a new facet for the string data type.
  Note: With this design it is possible to encode the same stereotype as either an XML attribute string or an XML element tree in documents.

* In this new data type add a reference to the attribute's "form specification", i.e. where to find more information on how to construct attribute strings from the underlying information model.

* All available information in the stereotype's information model MUST be encoded in the "element form", and the information encoded in the "attribute form" MUST be a "subset" of the information encoded in the element form (similar to applying a grove plan before encoding as an attribute). The "element form" is considered a "complete" encoding form (it contains all information in the information model).

* Information set:
  Add an extra optional property to the attribute information item.
    property: "parsed" sequence<element-info-item[zero or one]>

* Recommend that all Schema authors first create an information model for the stereotype, then create the encoding "form" for the primary XML element encoding, and last the corresponding XML attribute string form.
* DOM framework
  Create a new software artifact called "DOMAttributeMicroParser"

    interface DOMAttributeMicroParser {
        readonly attribute string name;
        readonly attribute string namespace;

        /* Parse the attribute string and create the corresponding element tree */
        long parse(in DOMAttribute from, out DOMElement to);

        /* Traverse the element tree and create the corresponding attribute string expression */
        long construct(in DOMElement from, out DOMAttribute to);
    };

* DOM framework [Optional]
  Create a subclass of DOM Attribute called "DOMParsedAttribute"

    interface DOMParsedAttribute : DOMAttribute {
        attribute DOMElement fParsed; /* parsed attribute */
    };

All comments are welcome.

Best Regards
Anders W. Tell

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/      Anders W. Tell       /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
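
To make the DOMAttributeMicroParser idea above concrete, here is a minimal Python sketch of its two operations, applied to the SVG path example from the problem statement. It is only an illustration under stated assumptions: the element names (path-data, moveto, lineto, closepath) and the function names parse/construct are hypothetical choices, not something the proposal or the SVG specification defines, and only the M, L and z commands are handled.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping from SVG path commands to "element form" names.
# The original proposal does not define these names; they are assumptions.
COMMAND_TO_TAG = {"M": "moveto", "L": "lineto"}
TAG_TO_COMMAND = {tag: cmd for cmd, tag in COMMAND_TO_TAG.items()}


def parse(attr_value):
    """Attribute form -> element form: build an element tree from a path string."""
    root = ET.Element("path-data")
    tokens = re.findall(r"[A-Za-z]|-?\d+(?:\.\d+)?", attr_value)
    i = 0
    while i < len(tokens):
        cmd = tokens[i]
        if cmd in COMMAND_TO_TAG:
            if i + 2 >= len(tokens):
                raise ValueError(f"command {cmd!r} needs two coordinates")
            x, y = tokens[i + 1], tokens[i + 2]
            float(x), float(y)  # raises ValueError if a coordinate is not numeric
            ET.SubElement(root, COMMAND_TO_TAG[cmd], x=x, y=y)
            i += 3
        elif cmd in ("z", "Z"):
            ET.SubElement(root, "closepath")
            i += 1
        else:
            raise ValueError(f"unsupported path token: {cmd!r}")
    return root


def construct(root):
    """Element form -> attribute form: rebuild the path string from the tree."""
    parts = []
    for child in root:
        if child.tag == "closepath":
            parts.append("z")
        else:
            parts.append(f"{TAG_TO_COMMAND[child.tag]} {child.get('x')} {child.get('y')}")
    return " ".join(parts)


if __name__ == "__main__":
    d = "M 100 100 L 140 100 L 120 140 z"
    tree = parse(d)
    print(ET.tostring(tree, encoding="unicode"))  # the element form
    print(construct(tree))                        # back to the attribute form
```

In this sketch the element form carries exactly the information of the attribute form, so the round trip is lossless for the supported commands. A real micro-parser would be driven by the referenced complexType and the "form specification" rather than by a hard-coded table.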
Received on Wednesday, 10 May 2000 03:46:29 UTC