RE: Issue JW24d (xml:lang) from Julian Reschke on 2003-01-16 (www-webdav-dasl@w3.org from January to March 2003)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Thu, 16 Jan 2003 12:46:00 +0100
To: "Julian Reschke" <julian.reschke@gmx.de>, "Jim Whitehead" <ejw@cse.ucsc.edu>, <www-webdav-dasl@w3.org>
Message-ID: <JIEGINCHMLABHJBIGKBCOEFOGDAA.julian.reschke@gmx.de>

OK,

here's a proposal for adding xml:lang-aware operators that doesn't affect
string equality and collations (and therefore IMHO makes more sense).

Operators:

DAV:language-defined (operates on prop)
DAV:language-matches (operates on prop and string literal, according to
XPath 1.0 lang() function)


Example: find all resources where the property "foobar" has an English value
containing the string "ask".

<and xmlns="DAV:">
  <like><prop><foobar xmlns=''/></prop><literal>%ask%</literal></like>
  <language-matches>
    <prop><foobar xmlns=''/></prop><literal>en</literal>
  </language-matches>
</and>

(Note that the XPath lang() function is language-subtype-aware -- the
condition would also match a property value in language "en-US").
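
A minimal sketch of the matching rule that DAV:language-matches would
inherit from XPath 1.0 lang(), assuming the property's effective xml:lang
value has already been resolved by the server. The function name
lang_matches and its parameters are hypothetical and only illustrate the
rule; they are not part of the proposal.

```python
from typing import Optional


def lang_matches(xml_lang: Optional[str], wanted: str) -> bool:
    """Hypothetical helper mimicking XPath 1.0 lang() semantics.

    True if the xml:lang value equals the wanted language, or starts
    with it followed by '-', ignoring case.
    """
    if not xml_lang:
        # No xml:lang in scope: lang() is false.
        return False
    have = xml_lang.lower()
    want = wanted.lower()
    return have == want or have.startswith(want + "-")


if __name__ == "__main__":
    for tag in ("en", "en-US", "EN-gb", "no", "eng", None):
        print(repr(tag), lang_matches(tag, "en"))
```

With this rule, the literal "en" matches a value tagged "en" or "en-US",
but not one tagged "no" (the Norwegian "ask" from the earlier example) or
one with no language tag at all.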

Julian


--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

> -----Original Message-----
> From: www-webdav-dasl-request@w3.org
> [mailto:www-webdav-dasl-request@w3.org]On Behalf Of Julian Reschke
> Sent: Tuesday, January 14, 2003 6:27 PM
> To: Jim Whitehead; www-webdav-dasl@w3.org
> Subject: RE: Issue JW24d (xml:lang)
>
>
>
> Jim,
>
> we'd have to define what "performing the search using the language
> information" actually means.
>
> The fact is that XQuery/XSLT 2.0 currently doesn't handle this issue
> either, although that working group definitely has many more resources
> to come up with a solution. I think we shouldn't try to be more clever
> than those who are experts in the query/I18N domain. Instead, we should
> keep DAV:basicsearch minimal, and ensure that the grammar is extensible
> enough to add these features at a later point in time (for instance, by
> adding a "collation" attribute).
>
> Julian
>
> --
> <green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
>
> > -----Original Message-----
> > From: www-webdav-dasl-request@w3.org
> > [mailto:www-webdav-dasl-request@w3.org]On Behalf Of Jim Whitehead
> > Sent: Tuesday, January 14, 2003 6:09 PM
> > To: www-webdav-dasl@w3.org
> > Subject: RE: Issue JW24d (xml:lang)
> >
> >
> >
> > > I think Jim already provided a good example, but RFC2277 has another
> > > similar one:  one might reasonably wish to do a large search for
> > > documents with the name of a specific tree in Norwegian.  The name of
> > > the tree is 'ask'.  It's useless to get all the English documents with
> > > the word 'ask' in response to that query.  If there *are* body or
> > > properties typed as Norwegian, then our search syntax must be able to
> > > specify that the search engine should match these first.
> >
> > Though I'm no expert on ideographic languages, I think there might be
> > cases where the meaning of a specific Unicode character might vary
> > depending on the language tag.
> >
> > As for a specific proposal, here's the sketch of one:
> >
> >            |   server can do lang     server cannot do
> >            |   specific searching     lang spec. searching
> > -----------+----------------------------------------------
> > xml:lang   |   perform search using   either:
> > present    |   xml:lang info          (a) reject request
> >            |                          (b) use default search
> >            |                          but inform client that
> >            |                          xml:lang was ignored
> >            |
> > xml:lang   |   perform search using   perform search using
> > not present|   server's default       server's default
> >            |   search technique       search technique
> >            |   (character-match,
> >            |   indep. of language)
> >
> > I don't think it makes sense to make use/non-use of language information
> > discoverable.
> >
> > - Jim
> >
>

Received on Thursday, 16 January 2003 06:46:33 UTC