Character reference in regular expressions

In XML Schema Datatypes, Appendix F, the spec says that character
references can appear in regular expressions. However, there seems to be
no need for them.

First, character references are expanded by the XML processor before the
value is passed to the regular expression parser. Who wants to write

<pattern value="[&amp;#x4d;]">

instead of simply writing

<pattern value="[&#x4d;]">
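To illustrate the first point, here is a minimal sketch using Python's
xml.etree.ElementTree as a stand-in for the XML processor (the choice of
parser and the variable names are my assumptions, not anything the spec
prescribes). It shows what string would reach the regular expression
parser in each case.

```python
import xml.etree.ElementTree as ET

# Plain character reference: the XML processor expands it, so the
# regular expression parser only ever sees the character "M".
direct = ET.fromstring('<pattern value="[&#x4d;]"/>').get("value")

# Escaped ampersand: the XML processor yields the literal text "&#x4d;",
# which a regex-level character reference would then have to decode.
escaped = ET.fromstring('<pattern value="[&amp;#x4d;]"/>').get("value")

print(direct)
print(escaped)
```

With the plain form the pattern is already "[M]" after XML parsing, so a
second layer of character references in the regular expression grammar
adds nothing for patterns written in XML.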


Second, "[&#x4d;]" (written "[&amp;#x4d;]" in XML) can be parsed as both
one "XmlCharRef" and six "XmlChar"s.
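The two readings can be sketched as follows. This is only an
illustration of the ambiguity, not the grammar from the spec; the
regular expression used to recognize a character reference and the
variable names are my own assumptions.

```python
import re

# What the regular expression parser sees between "[" and "]".
body = "&#x4d;"

# Reading 1: one XmlCharRef, i.e. the single character U+004D.
ref = re.fullmatch(r"&#x([0-9A-Fa-f]+);", body)
as_char_ref = {chr(int(ref.group(1), 16))}

# Reading 2: six XmlChars, each taken literally.
as_literal = set(body)

print(sorted(as_char_ref))
print(sorted(as_literal))
```

The first reading gives a character class that matches only "M"; the
second gives one that matches any of "&", "#", "x", "4", "d" and ";" and
does not match "M" at all.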


If there is a need to use UCS code points in regular expressions (for
patterns not written in XML), they should be expressed using an escape
sequence, such as \u... .
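As a rough analogy only, Python's re module already has a backslash
escape of this kind, written \uXXXX with four hex digits. That syntax is
Python's own convention, not something XML Schema defines, and I use it
here just to show that a backslash escape passes through an XML
processor untouched and has only one possible reading.

```python
import re
import xml.etree.ElementTree as ET

# The backslash escape is not special to XML, so the attribute value
# reaches the regular expression parser exactly as written.
xml_source = r'<pattern value="[\u004d]"/>'
pattern = ET.fromstring(xml_source).get("value")

# Python's re module decodes \u004d itself as the character U+004D.
matches_m = re.fullmatch(pattern, "M") is not None
matches_backslash = re.fullmatch(pattern, "\\") is not None

print(pattern, matches_m, matches_backslash)
```

Because the escape is resolved only by the regular expression parser,
there is no interaction with XML character references and no second way
to read the pattern.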

---
Satoshi Nakamura <snakamura@infoteria.co.jp>
Infoteria Corporation

Received on Tuesday, 27 March 2001 02:07:50 UTC