- From: Wendy Chisholm <wendy@w3.org>
- Date: Thu, 16 Sep 2004 13:20:01 -0400
- To: Martin Duerst <duerst@w3.org>
- Cc: wai-gl <w3c-wai-gl@w3.org>, Richard Ishida <ishida@w3.org>
Hello Martin and Richard,

Thank you for your quick responses.

>> Proposed definitions to address issue 673 [1]. Notes and references
>> at [2]. These are not perfect, but lay the basis for tomorrow's
>> teleconference.
>>
>> text
>> A sequence of characters included in the Unicode character set. Refer
>> to Characters in Extensible Markup Language (XML) 1.0 (Third Edition)
>> for more specific information about the accepted character range.
>
>
> I think it would be better to define 'text' as a sequence of characters,
> and then say that characters are those included in Unicode, rather than
> to do all this in a single sentence, in order to separate the different
> issues. The first part ('text is a sequence of characters') is the 'real'
> definition, the second part is pinning down the term 'character' with
> some operational means. You could also just refer to e.g. a definition
> of 'character' in an abstract sense.
>

You describe the approach taken in the XML specification. I chose not
to do it that way (and to refer to the XML definition instead) because
WCAG 2.0 is a less technical document than XML, so I think the single
definition makes sense for our audience. Is there a technical issue
with doing it this way?

> Also, for the accepted character range, you might want to point
> to a specific production, production [2]. But you then have the
> problem that this also includes a lot of unassigned codepoints,
> which I'm not sure you want to include.
>

No, we don't want to include unassigned codepoints. My understanding of
the XML spec is that it excludes these, and that is why I propose
referencing the XML spec. In response to Richard's comment, I propose
referencing XML 1.1 instead of 1.0:
http://w3.org/TR/2004/REC-xml11-20040204/#charsets
Is this correct?

>> In this document, we use "Unicode" to refer to the Unicode character set
>
> Unicode
>
> Please don't use the term 'character set' as such. It has been misused
> too often. Better e.g. use "coded character set".
>
>

How about:

Unicode: "Unicode provides a unique number for every character, no
matter what the platform, no matter what the program, no matter what
the language."
The Unicode Consortium
http://www.unicode.org/standard/WhatIsUnicode.html

There are at least three possible encodings for Unicode: UTF-8, UTF-16,
and UTF-32. [not sure if we need to mention encodings?]

Best,
--wendy

--
wendy a chisholm
world wide web consortium
web accessibility initiative
http://www.w3.org/WAI/
/--
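
The question of whether the XML character range admits unassigned code points can be checked mechanically. The following is only a minimal sketch, not anything proposed in the thread: it assumes the Char production [2] of XML 1.0 (XML 1.1 widens the lower end of the range), and it treats "unassigned" as Unicode general category Cn in whatever Unicode version ships with the Python interpreter. The names XML10_CHAR_RANGES, is_xml10_char, is_assigned and is_text are hypothetical helpers introduced for illustration.

```python
import unicodedata

# Char production [2] of XML 1.0, written as inclusive code point ranges:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML10_CHAR_RANGES = [
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]


def is_xml10_char(cp: int) -> bool:
    """True if the code point falls inside the XML 1.0 Char production."""
    return any(lo <= cp <= hi for lo, hi in XML10_CHAR_RANGES)


def is_assigned(cp: int) -> bool:
    """True unless the general category is Cn (assumption: Cn is used as
    the test for 'unassigned'; it also covers noncharacters)."""
    return unicodedata.category(chr(cp)) != "Cn"


def is_text(s: str) -> bool:
    """Hypothetical stricter reading of 'text': every character is both
    an XML 1.0 Char and an assigned code point."""
    return all(is_xml10_char(ord(c)) and is_assigned(ord(c)) for c in s)


if __name__ == "__main__":
    # U+0378 is unassigned in recent Unicode versions, yet it lies
    # inside the [#x20-#xD7FF] range of the Char production.
    print(is_xml10_char(0x378), is_assigned(0x378))

    # One code point (U+00E9), three encodings with different byte lengths.
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        print(enc, len("\u00e9".encode(enc)))
```

Under these assumptions the range check alone accepts unassigned code points, so excluding them would need a separate test such as is_assigned. The last loop shows that the choice of encoding changes the bytes but not the character, which is why the definition of text can stay at the level of code points.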
Received on Thursday, 16 September 2004 17:21:10 UTC