- From: Tim Bray <tbray@textuality.com>
- Date: Fri, 11 Apr 1997 08:08:07 -0700
- To: w3c-sgml-wg@w3.org
At 08:04 PM 10/04/97 -0700, Paul Grosso wrote: >In discussions with others over that last couple days, I've come >to the conclusion we should consider added to xml-link the capability >to address into data character content (aka dataloc). I had a conversation on this subject with Adam Bosworth, Microsoft IE VIP - runs the org that Jean works in. Adam: (waving XML-link) What do we have to do to get regular expressions added to the addressing here? Tim: Ask for it. But we're worried that we don't know how to do regular expressions in Unicode. Adam: Oh. (nods, looks looks worried) Tim: Somebody at Microsoft must know about this stuff. Adam: Yes; Notepad and so on in NT do Unicode; we'll check it out and get back to you. BTW, I, like James, am very dubious about character and token counting. Among other things, it's just hopelessly Eurocentric. If we could get regexps, that would solve a lot of problems; but I don't think it would solve Peter M-R's problems; he really seems to want to count molecules. Is there a regexp lib in Java that operates on their 16-bit chars? Hmmm... -Tim
Received on Friday, 11 April 1997 11:20:21 UTC