
Re: Checking a UTF-16 instance against a UTF-8 Schema

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: 02 Jul 2002 09:48:58 +0100
To: Mark Feblowitz <mfeblowitz@frictionless.com>
Cc: "Xmlschema-Dev (E-mail)" <xmlschema-dev@w3.org>
Message-ID: <f5bofdqikg5.fsf@cogsci.ed.ac.uk>

Mark Feblowitz <mfeblowitz@frictionless.com> writes:

> Probably a FAQ, but not one I've encountered.
> Can all available validating parsers validate an XML instance represented in
> UTF-16 against an XML Schema represented in UTF-8? 

Unless they are non-conformant at the XML level, they should.
Infosets contain characters, not encodings, so the encoding should be
invisible at the schema validation level.
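
A minimal sketch of this point, assuming Python's standard xml.etree.ElementTree parser as a stand-in for any conformant XML processor (the sample document and variable names are hypothetical, not from the original thread). The same document is serialized once as UTF-8 and once as UTF-16; after parsing, both yield identical characters, which is all a schema validator gets to see.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content with accented French characters.
text = "café crème à la française"
template = '<?xml version="1.0" encoding="{enc}"?><note lang="fr">{body}</note>'

# Same document, two encodings; each declares its own encoding.
utf8_bytes = template.format(enc="UTF-8", body=text).encode("utf-8")
utf16_bytes = template.format(enc="UTF-16", body=text).encode("utf-16")  # adds a BOM

# The byte streams differ...
assert utf8_bytes != utf16_bytes

# ...but the parser decodes both to the same characters.
root8 = ET.fromstring(utf8_bytes)
root16 = ET.fromstring(utf16_bytes)

print(root8.tag == root16.tag)
print(root8.attrib == root16.attrib)
print(root8.text == root16.text)
```

Since the encoding is resolved by the parser before validation starts, the schema document and the instance document can each use whatever encoding they declare.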

> Would anything special need to be done to achieve the validation, or would
> it suffice for each of them to have their encodings appropriately indicated?

That should do it.

> The intent here is to use a given Schema, represented in UTF-8, to validate
> an instance document that uses the same element and attribute labels, yet
> the element and/or attribute content requires UTF-16, e.g., strings
> containing accented French.

(That doesn't require UTF-16 -- UTF-8 works fine for any Unicode character.)
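
As a small illustration, assuming Python's built-in codecs (the sample string is hypothetical), accented French text and even a character outside the Basic Multilingual Plane round-trip through UTF-8 without loss:

```python
# Hypothetical sample: accented French plus one character beyond the BMP.
sample = "Élève, garçon, œuvre, où, naïve"
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

# Encoding to UTF-8 and decoding again restores the original characters.
round_trip = sample.encode("utf-8").decode("utf-8")
clef_bytes = clef.encode("utf-8")

print(round_trip == sample)
print("é".encode("utf-8"))   # b'\xc3\xa9' (two bytes)
print(len(clef_bytes))       # 4 (bytes)
```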

  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
          W3C Fellow 1999--2002, part-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Tuesday, 2 July 2002 04:49:01 UTC
