Re: Make it easy to create signable schemas (was: Re: XML Signature WG's review of XML Schema)

At 00/05/11 13:00 -0400, Joseph M. Reagle Jr. wrote:
>At 04:53 PM 5/11/00 +0900, Martin J. Duerst wrote:
>  >- Because the 'boolean' datatype has four lexical values (true, false,
>  >   1, 0; this is in the spec, no kidding) instead of two lexical values,
>  >   that means that additional effort (at least) is necessary if somebody
>  >   wants to create a schema for some data containing boolean values.
>
>Martin, thank you for reminding me about this. I recall you've mentioned
>this before and I believe we had an agreement from Michael to do something
>about ensuring a consistent lexical representation of data types. I can't
>find a URL for that agreement (I think it was sometime last year) but I can
>find evidence that the WG was trying to satisfy that requirement (for
>floating points at least):
>
>3.2.3 - 3.2.5 Lexical notation of floating-point numbers
>[Where the author requested other notations]
>AM>> This argument was made by several people but there was a strong
>AM>> sentiment for a single
>AM>> lexical representation.
>http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000AprJun/0043.h
>tml
>
>However, I'm not sure to what extent this is a problem (I'm expressing
>ignorance, not arguing it isn't). If an XML instance uses '0' that is
>signed, is there and expectation that since the schema permits a 'false' as
>well, intermediary processors would change it?

Not necessarily, but that may well happen. We already see
in the DSig group that people want to use the DOM, and
don't want to keep around e.g. whether an attribute was
single-quoted or double-quoted. As we move up the semantic
ladder (well, it feels more like a very flat slope, but
that's a different issue), exactly the same will very
easily happen one step higher.


>I appreciate this might
>happen with character code mappings, but I tend to view schema's as
>constraints on permissible values, and not a processor (in the vein of
>infoset/C14N/DOM).

Constraints on permissible values is one function, and probably
the most important one the way the spec is written. But for datatypes,
the 'infoset' aspect is already there, and C14N is what we are
just discussing here, and would be very very easy to add at this
point in time compared to having to start another group,...
in a few months. Something like DOM is not done yet, but
conversion from data to XML streaming and back is an
important application of XML Schema, probably the most important
one.


>(For instance, just because a schema permits an
>unconstrained string, one wouldn't presume it would change the string ...?)

There is a clear difference between changing the value,
and producing a different lexical representation for the
same value. Changing from 0 to false is the later,
changing the string is the former.


>  >- If there is some way to express that elements of the same type
>  >   have to appear in a certain order (don't know whether this is in
>  >   the spec or not), this will also help to create schemata that can
>  >   be used to validate data and then feed that data into XML DSig
>  >   without any or without much processing.
>  >
>  >In other words, try to make sure that for appropriately designed
>  >XML Schemas, no additional 'data canonicalization' step is necessary
>  >to sign some data.
>
>I don't quite follow. Element of the same element type? Can you give an
>example?

Well, let's assume you have a list of students, with student id,
birthday, and a boolean for 'male' (gender). The task is to
produce a signable XML document from this data. In order for
the sign to be reproducable, the XML document has to be exactly
the same for the same data. Assuming that the structure looks
something like
<student><id>...</id><birthday>date</birthday><male>boolean</male></student>
...
...
and is described as an XML Schema, the 'missing pieces' for the
above task are to make sure the students are always in the same
order (e.g. by id) and that date and boolean are always in
a canonical form (and of course that the underlying XML is
in C14N).

Probably the above is not the most appropriate example, but I hope
you get the idea.


Regards,  Martin.

Received on Friday, 12 May 2000 04:29:30 UTC