Re: [Bug 10008] New: Use of Unicode blocks that no longer exist in regular expressions.

On 24 Jun 2010, at 08:11, C. M. Sperberg-McQueen wrote:
> ... section G.1.1 Character Class Escapes [says]
>
>    When the implementation supports multiple versions of the Unicode
>    database, and they differ in salient respects (e.g. different
>    properties are assigned to the same character in different versions
>    of the database), then it is ·implementation-defined· which set of
>    property definitions is used for any given assessment episode.
>
> ...
>
> XSD 1.1 requires you to document how you determine which version of
> the database to use in interpreting block names.  It does not, as far
> as I can see, require anything further.  (It does not, for example,
> appear to require that you always use the same version within a given
> validation, though as a user I think I'd rather that you did.)

I should read more carefully.  The phrase "is used for any given
assessment episode" does seem to convey the expectation that an
implementation should interpret all regexes in a given validation
according to the same version of the Unicode database.

I'm still not sure that it explicitly *requires* it, though.  If
for example two separately maintained schema documents assume
different versions of the Unicode database -- one writes \p{IsGreek}
and the other \p{IsGreekandCoptic}, say -- then it's hard to see
how an implementation could limit itself to a single version of
the database in a schema composed from those two schema documents.
So I'd argue that it cannot and should not be *required*, though
of course it's probably simpler all around if a single version
of the database is used for any given validation.
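
To make the difficulty concrete, here is a minimal Python sketch, not
taken from any actual processor.  The two block tables, the version
labels "older" and "newer", and the function names resolve_block and
unresolved are all assumptions made up for illustration; the only
thing the sketch relies on is that the same block appears under one
name in one version of the database and under another name in a later
one.

```python
# Hypothetical, heavily abridged block tables for two versions of the
# Unicode database. The version labels are placeholders, not real
# version numbers; only the one renamed block is listed.
BLOCKS_BY_VERSION = {
    "older": {"Greek": (0x0370, 0x03FF)},
    "newer": {"GreekandCoptic": (0x0370, 0x03FF)},
}


def resolve_block(name, versions):
    """Return (version, (first, last)) from the first listed version that
    defines the block `name`, or None if none of them does."""
    for version in versions:
        code_range = BLOCKS_BY_VERSION[version].get(name)
        if code_range is not None:
            return version, code_range
    return None


def unresolved(names, versions):
    """Return the block names that no listed version defines."""
    return [n for n in names if resolve_block(n, versions) is None]


# Block names taken from \p{IsGreek} and \p{IsGreekandCoptic}.
names = ["Greek", "GreekandCoptic"]

print(unresolved(names, ["older"]))           # ['GreekandCoptic']
print(unresolved(names, ["newer"]))           # ['Greek']
print(unresolved(names, ["newer", "older"]))  # []
```

With either table alone, one of the two block names goes unrecognized;
only a lookup that is allowed to consult both versions resolves both
names.  Whether a real implementation would want a fallback order like
this one is a separate design question.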

Sorry for missing this aspect of the issue in my earlier mail.

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************

Received on Thursday, 24 June 2010 15:35:02 UTC