RE: property value clarification

At 4:45 PM -0400 10/30/98, Babich, Alan wrote:
>Jim Whitehead wrote:

>I also agree with Jim Davis that the server should
>process the xml:lang attribute correctly so that DASL
>can do queries, etc. However, that is a separate
>discussion from XML attributes "being part of the
>property value".

Alright so far...
>The following is what I believe is necessary for correct
>processing of the xml:lang attribute:
>(1) Most (but perhaps not all) servers will always
>store a particular string property in one character set
>and one language. Many servers will store *all* string
>properties in the same character set and language.
>That implies that, in general, servers need to be free
>to translate string constants to whatever they need
>to store the value.

There are issues here: most character encodings cover only a subset of
Unicode, so there are legitimate property values that cannot be translated
into the encodings a server is likely to use. This is not to say that we
should prohibit internal translation, but we should require that _if_ the
encoding changes, _then_ the characters will be preserved in the result
returned by the server.

Transcoding is a reality, but it's also a minefield, and servers that do it
should be held accountable for not corrupting data. Furthermore, even
transcodable character-set encodings are not always invertible: the
precomposed characters in Unicode are a canonical example.
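
To make the precomposed-character point concrete, here is a rough sketch in
Python. The store_as_nfc and fits_latin1 helpers are hypothetical stand-ins
for whatever a server might do internally; nothing in the DAV drafts
prescribes them. It assumes a server that normalizes to NFC, and separately
one that stores Latin-1.

```python
import unicodedata

decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # the single precomposed code point for the same glyph

def store_as_nfc(value: str) -> str:
    """Hypothetical server-side step: normalize to NFC before storing."""
    return unicodedata.normalize("NFC", value)

def fits_latin1(value: str) -> bool:
    """True if every character survives a hypothetical Latin-1 store."""
    try:
        value.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

returned = store_as_nfc(decomposed)
print(returned == precomposed)   # the server hands back the precomposed form
print(returned == decomposed)    # which is not the sequence the client sent
print(fits_latin1(precomposed), fits_latin1(decomposed))
```

The client that sent the decomposed form cannot get it back from the
normalized value alone, which is the sense in which the mapping is not
invertible.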

>(4) I don't believe that there is or should be a
>requirement to get back the exact literal characters
>of a DAV property, let alone some XML attributes as well.
>(a) Given the whitespace handling rules of XML, that is
>already unlikely. Whether XML whitespace is part of the
>value or not is an issue when a property is stored.
>Should DAV require discarded XML whitespace from
>PROPPATCH to be returned for numeric, string, and
>datetime properties? It is a better design if the
>server doesn't bother to remember the discarded XML
>whitespace.

XML explicitly does _not_ ignore whitespace in any situation. When
validating against a DTD (and only when validating against a DTD) it flags
data (whitespace) that would be ignored by an SGML processor, and that many
applications will also want to ignore.

The flagging often confuses people, but it is an application issue as to
whether the data can be ignored. Since WebDAV _has_ no DTD for all
properties (since additional tags are not an error), there is _no_
ignorable whitespace in any DAV property.

DAV servers are free to reorder attributes and so forth, but they _cannot_
correctly discard whitespace in any situation. Whitespace is a perpetual
problem, mostly because attempts to "intelligently" ignore whitespace
always seem unintelligent for some other application. XML avoids this
problem by removing such "intelligence".

>(b) Most servers are going to store
>numbers and datetimes in binary form. So when you
>get a number or datetime back, it might have a
>mathematically equivalent value, but the exact
>characters in the response might be different. That
>ought to be good enough.

That's an issue that depends on the semantics of the tags (as might the
ignoring of whitespace).

It seems to me that we are conflating some critical issues:

1. Live properties may be rewritten in any arbitrary semantics-preserving
way. Since the server knows the semantics of those tags, there is no
protocol question involved.

2. Dead properties cannot be transformed in the same way (if the server
guessed that 4/28/61 was my birthdate, and it was a filesystem path, there
would be bad consequences for the client when it attempted to access the
file "April 28, 1961").

Many of the issues you raise are only appropriate for live properties.

Many of the more conservative issues that I am raising are equally critical
for dead properties. (If the server does not understand the property, it
should keep its hands off).
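
A small sketch of the hazard, in Python. Both functions are hypothetical:
overeager_store stands for a server that guesses at semantics it does not
know, opaque_store for one that keeps its hands off.

```python
from datetime import datetime

def overeager_store(value: str) -> str:
    """Hypothetical server that treats anything shaped like m/d/yy as a date."""
    try:
        return datetime.strptime(value, "%m/%d/%y").date().isoformat()
    except ValueError:
        return value

def opaque_store(value: str) -> str:
    """Hypothetical server that returns a dead property exactly as stored."""
    return value

path = "4/28/61"              # the client meant a relative filesystem path
print(overeager_store(path))  # rewritten into a date the client never sent
print(opaque_store(path))     # unchanged
```

Note that the guess is not even a reliable date: Python's %y maps 61 to
2061, not 1961, so the "helpful" rewrite is wrong under both readings.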


>(c) Since we never had the invariant condition
>to get back the exact characters of the value,
>we haven't lost anything.

For dead properties, there are only a very few transformations that might be
sensible for some servers, transcoding among them. The information on
"canonical XML" at http://www.jclark.com/ might be a good starting point
for picking these properties.

I think byte-equivalence is a tricky issue, but parser-result equivalence
is a requirement for dead properties.
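
One way to picture the difference, as a sketch rather than a proposed
definition: compare what the parser reports instead of comparing bytes. The
same_parse helper and the x:author property below are assumptions for
illustration, written against Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

def same_parse(a: ET.Element, b: ET.Element) -> bool:
    """Compare names, attributes, character data and children recursively."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(same_parse(x, y) for x, y in zip(a, b))
    )

# What the client stored (hypothetical property in a made-up namespace).
stored = ('<x:author xmlns:x="http://example.com/ns" xml:lang="en" id="1">'
          'Jim <x:b/></x:author>')
# Attributes reordered, quotes changed, empty element written out in full.
returned = ("<x:author id='1' xml:lang='en' xmlns:x='http://example.com/ns'>"
            "Jim <x:b></x:b></x:author>")
# Trailing whitespace in the content silently dropped.
trimmed = returned.replace("Jim ", "Jim")

print(stored == returned)                                        # bytes differ
print(same_parse(ET.fromstring(stored), ET.fromstring(returned)))
print(same_parse(ET.fromstring(stored), ET.fromstring(trimmed)))
```

The first rewrite is harmless under this test; the second is not, because
the character data itself has changed.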

None of this need apply to the root element of the property, as I said
before, since that is defined by DAV. It's the sub-elements of non-WebDAV
dead properties that are unrestricted, and semantically opaque to the
server.

>So I think we can be silent on whether or not XML
>attributes are considered part of the value
>without causing DASL query problems.

Depends on what you mean by problems, I guess.

  -- David
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://www.dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

Received on Saturday, 31 October 1998 10:00:00 UTC