- From: Martin J. Duerst <duerst@w3.org>
- Date: Fri, 12 May 2000 16:38:15 +0900
- To: "Joseph M. Reagle Jr." <reagle@w3.org>
- Cc: www-xml-schema-comments@w3.org, "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>, cmsmcq@w3.org
At 00/05/11 13:00 -0400, Joseph M. Reagle Jr. wrote: >At 04:53 PM 5/11/00 +0900, Martin J. Duerst wrote: > >- Because the 'boolean' datatype has four lexical values (true, false, > > 1, 0; this is in the spec, no kidding) instead of two lexical values, > > that means that additional effort (at least) is necessary if somebody > > wants to create a schema for some data containing boolean values. > >Martin, thank you for reminding me about this. I recall you've mentioned >this before and I believe we had an agreement from Michael to do something >about ensuring a consistent lexical representation of data types. I can't >find a URL for that agreement (I think it was sometime last year) but I can >find evidence that the WG was trying to satisfy that requirement (for >floating points at least): > >3.2.3 - 3.2.5 Lexical notation of floating-point numbers >[Where the author requested other notations] >AM>> This argument was made by several people but there was a strong >AM>> sentiment for a single >AM>> lexical representation. >http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000AprJun/0043.h >tml > >However, I'm not sure to what extent this is a problem (I'm expressing >ignorance, not arguing it isn't). If an XML instance uses '0' that is >signed, is there and expectation that since the schema permits a 'false' as >well, intermediary processors would change it? Not necessarily, but that may well happen. We already see in the DSig group that people want to use the DOM, and don't want to keep around e.g. whether an attribute was single-quoted or double-quoted. As we move up the semantic ladder (well, it feels more like a very flat slope, but that's a different issue), exactly the same will very easily happen one step higher. >I appreciate this might >happen with character code mappings, but I tend to view schema's as >constraints on permissible values, and not a processor (in the vein of >infoset/C14N/DOM). Constraints on permissible values is one function, and probably the most important one the way the spec is written. But for datatypes, the 'infoset' aspect is already there, and C14N is what we are just discussing here, and would be very very easy to add at this point in time compared to having to start another group,... in a few months. Something like DOM is not done yet, but conversion from data to XML streaming and back is an important application of XML Schema, probably the most important one. >(For instance, just because a schema permits an >unconstrained string, one wouldn't presume it would change the string ...?) There is a clear difference between changing the value, and producing a different lexical representation for the same value. Changing from 0 to false is the later, changing the string is the former. > >- If there is some way to express that elements of the same type > > have to appear in a certain order (don't know whether this is in > > the spec or not), this will also help to create schemata that can > > be used to validate data and then feed that data into XML DSig > > without any or without much processing. > > > >In other words, try to make sure that for appropriately designed > >XML Schemas, no additional 'data canonicalization' step is necessary > >to sign some data. > >I don't quite follow. Element of the same element type? Can you give an >example? Well, let's assume you have a list of students, with student id, birthday, and a boolean for 'male' (gender). The task is to produce a signable XML document from this data. In order for the sign to be reproducable, the XML document has to be exactly the same for the same data. Assuming that the structure looks something like <student><id>...</id><birthday>date</birthday><male>boolean</male></student> ... ... and is described as an XML Schema, the 'missing pieces' for the above task are to make sure the students are always in the same order (e.g. by id) and that date and boolean are always in a canonical form (and of course that the underlying XML is in C14N). Probably the above is not the most appropriate example, but I hope you get the idea. Regards, Martin.
Received on Friday, 12 May 2000 04:29:30 UTC