RE: Issue JW17/JW24b from Julian Reschke on 2003-01-09 (www-webdav-dasl@w3.org from January to March 2003)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Thu, 9 Jan 2003 18:23:35 +0100
To: <www-webdav-dasl@w3.org>
Message-ID: <JIEGINCHMLABHJBIGKBCOEEFGCAA.julian.reschke@gmx.de>

Hi,

This issue is about the relation between DAV:contains and character sets.

"5.13 should discuss handling of queries when character set differs.
Some text on handling character sets would be helpful."

Some observations:

a) the character set of the query condition is well known (because we're
using XML) -- basically, the query processor just has a sequence of Unicode
characters.

b) even text files may be difficult -- the server may not know their
encoding, and even if it does, it's unclear how matching will work. For
instance, take a file of MIME type "text/xml; charset=ISO-8859-1":

<foo>&amp;</foo>

which of the two below will match?

1) <contains>&amp;</contains>
2) <contains>&amp;amp;</contains>

(the answer depends on whether the server searches the raw text of the file
or the XML Infoset obtained by parsing it).
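
To make both observations concrete, here is a minimal Python sketch. It
assumes that DAV:contains behaves like a plain substring test, which the
draft does not require; real servers may use word-based or other matching.
The helper names (query_text, matches_raw, matches_infoset) are hypothetical
and only serve this illustration.

```python
import xml.etree.ElementTree as ET

# The stored resource from the example above, as raw bytes in ISO-8859-1.
RAW_BYTES = "<foo>&amp;</foo>".encode("iso-8859-1")


def query_text(query_xml: bytes) -> str:
    """Return the search string of a <contains> element after XML parsing."""
    return ET.fromstring(query_xml).text or ""


def matches_raw(needle: str, raw: bytes, charset: str) -> bool:
    """Assumed semantics: substring test on the decoded, unparsed file."""
    return needle in raw.decode(charset)


def matches_infoset(needle: str, raw: bytes) -> bool:
    """Assumed semantics: substring test on the character data after parsing."""
    return needle in "".join(ET.fromstring(raw).itertext())


# Observation a): the encoding of the query document does not matter, the
# query processor sees the same Unicode string either way.
latin1_query = (
    '<?xml version="1.0" encoding="iso-8859-1"?><contains>caf\u00e9</contains>'
).encode("iso-8859-1")
utf8_query = (
    '<?xml version="1.0" encoding="utf-8"?><contains>caf\u00e9</contains>'
).encode("utf-8")
same_text = query_text(latin1_query) == query_text(utf8_query)

# Observation b): the two candidate queries against the stored resource.
q1 = query_text(b"<contains>&amp;</contains>")      # search string "&"
q2 = query_text(b"<contains>&amp;amp;</contains>")  # search string "&amp;"

for label, needle in (("1)", q1), ("2)", q2)):
    print(
        label,
        repr(needle),
        "raw:", matches_raw(needle, RAW_BYTES, "iso-8859-1"),
        "infoset:", matches_infoset(needle, RAW_BYTES),
    )
```

Under this substring assumption, query 2) can only match the raw text, while
query 1) matches the parsed content and also happens to match the raw text,
because "&" is the first character of "&amp;". A server with different
matching rules may well behave differently.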


Conclusion: it's hard to define, and very hard to mandate a specific
behaviour. IMHO, the spec should be silent on this issue.

Julian



[1]
<http://greenbytes.de/tech/webdav/draft-reschke-webdav-search-latest.html#rfc.issue.JW17/JW24b>

--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

Received on Thursday, 9 January 2003 12:24:11 UTC