example of xml-notation (was Re: SD1 - Short End Tags [fmt])

>  From: Jean Paoli <jeanpa@microsoft.com>

>  Structured data introduces a lot of tags, which can overwhelm the
>  actual data content.  
...
>  Current syntax:
> 
>  <ORDERS>
> 	 <ITEM>
> 		 <AUTHOR>Dantzig</AUTHOR>
> 		 <PRICE>5.95</PRICE>
> 	 </ITEM>
> 	 <ITEM>
> 		 <AUTHOR>Wardlaw, Lee</AUTHOR>
> 		 <PRICE>6.58</PRICE>
> 	 </ITEM>
> 	 <ITEM>
> 		 <AUTHOR>Bonner</AUTHOR>
> 		 <PRICE>5.95</PRICE>
> 	 </ITEM>
> 	 <ITEM>
> 		 <AUTHOR>Milne, A. A.</AUTHOR>
> 		 <PRICE>12.50</PRICE>
> 	 </ITEM>
>  </ORDERS>
> 
>  Proposed syntax :
... 
>  <ORDERS>
> 	 <ITEM>
> 		 <AUTHOR>Dantzig</>
> 		 <PRICE>5.95</>
> 	 </ITEM>
> 	 <ITEM>
> 		 <AUTHOR>Wardlaw, Lee</>
> 		 <PRICE>6.58</>
> 	 </ITEM>
> 	 <ITEM>
> 		 <AUTHOR>Bonner</>
> 		 <PRICE>5.95</>
> 	 </ITEM>
> 	 <ITEM>
> 		 <AUTHOR>Milne, A. A.</>
> 		 <PRICE>12.50</>
> 	 </ITEM>
>  </ORDERS>

Response syntax using notations:

<!NOTATION comma-delimited-data
	PUBLIC "IDN//W3.ORG//NOTATION XML comma delimited data//EN">
<!NOTATION pipe-delimited-data
	PUBLIC "IDN//W3.ORG//NOTATION XML pipe delimited data//EN">
<!ELEMENT orders - - #PCDATA)>
<!ATTLIST orders
	xml-notation  
		NOTATION  (comma-delimited-data | pipe-delimited-data)  
			#REQUIRED
	xml-record-check  
		NUMBER #REQUIRED
	xml-field-check  
		NUMBER #REQUIRED
	xml-record-name
		NAME #IMPLIED
	xml-field-names 
		NAMES #IMPLIED
	xml-field-sqltypes
		NAMES #REQUIRED
	xml-field-sql-sizes
		NUMBERS #REQUIRED
>

-----------------------------------------------XML Instance---
<?XML version="1.0" ?>
<ORDERS 
	xml-notation="pipe-delimited-data"
	xml-record-check="4"
	xml-field-check="2"  
	xml-record-name="item"
	xml-field-names="author  price" 
	xml-field-sqltypes="varchar  decimal"
	xml-field-sqlsizes="72  2">
Dantzig|5.95
Wardlaw, Lee|6.58
Bonner|5.95
Milne, A. A|12.50
</ORDERS>
----------------------------------------------------------------------

That reduces overhead to approach 200 char overhead + 1 char per field.
So an 400 record database would have 1000 overhead characters.
In  the original that would have about (32+ 15) x400.  (ooops
gatta go, the next door house is buring down....)

Surely the relational database people would prefer this to  "<*" or "</>"
Plus it lets you add typing and sizing, and is current valid SGML.

It also uses notations rather than PIs. Tim's typing proposal at
 http://www.textuality.com/xml/typing.html
uses element attributes in a similar way.  (Bert's version has them
in processing instructions, perhaps based on a older version.)
I am taking Tim's strong typing attributes, but applying them to the
database root element, essentially.

This is obviously only good for relational and fixed-field tables,
not for arbitrary object-databases. And we'd have to figure out 
what to do with subfields.  But the arbitrary databases are
well-served by XML anyway.

I suggest that what XML 1.0 needs, in the spirit of simplicity which has
got us so far already, is merely to cope with embedded pipe- or comma-
delimited formats.  This has the advantage of not being too grandious,
and keeps SGML capatibility.   (For those people who don't like it that
the fields are not available as nodes in SGML groves under my scheme,
I'd say that 

1)  they can be if you use shortrefs;
2)  this is no different in approach to CALS colspecs, so it is current art. 

Rick Jelliffe

Received on Saturday, 17 May 1997 07:31:16 UTC