RE: Diacritic Signs from Jon Hanna on 2003-01-22 (www-rdf-interest@w3.org from January 2003)

From: Jon Hanna <jon@spin.ie>
Date: Wed, 22 Jan 2003 18:36:58 -0000
To: <www-rdf-interest@w3.org>
Message-ID: <NDBBLCBLIMDOPKMOPHLHEEMNEMAA.jon@spin.ie>

> So, I added the "iso-8859-1" encoding declaration, and it worked, but ONLY
> when I retrieved the RDF document from a web server using the "Parse URI"
> feature in the RDF Validator.  When I cut and paste via a browser window, I
> get the same error.  Any thoughts as to why?

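Perhaps the character set used for the transmission wasn't iso-8859-1? When
you paste into a form, the browser submits the text in whatever encoding the
page itself uses, which need not match the encoding declaration in your
document.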
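
To illustrate the kind of mismatch I mean, here is a minimal sketch in Python
(not Perl, but the idea carries over). The sample string and the helper name
decodes_as are my own inventions for illustration; I don't know what bytes the
validator actually received.

```python
# The same text produces different bytes in different encodings.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")  # e-acute is the single byte 0xE9
utf8_bytes = "caf\u00e9".encode("utf-8")         # e-acute is the two bytes 0xC3 0xA9


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if the bytes are valid in the given encoding (hypothetical helper)."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as(latin1_bytes, "iso-8859-1"))  # True
print(decodes_as(latin1_bytes, "utf-8"))       # False: a lone 0xE9 is not valid UTF-8

# Going the other way does not fail, it silently produces the wrong characters.
print(utf8_bytes.decode("iso-8859-1"))
```

If the declared encoding and the bytes actually sent disagree, a parser will
either reject the document or misread the accented characters.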

> Also, I anticipate adding additional languages in the future which go beyond
> the characters in 8859.  Thus I would prefer to generate files encoded in
> UTF-8.  Any tips on how to do this?  I'm using PERL and various text editors
> to generate my XML.

Any proper XML app/component/module should be able to read in from a variety
of character sets and write out in a variety which AT A MINIMUM would
include UTF-8 and UTF-16 (if it can't read those it is not following the
spec).
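
As a rough sketch of letting an XML library do the conversion, here is one way
it might look in Python with the standard xml.etree.ElementTree module
(Python 3.8 or later for the xml_declaration argument). The sample document
and the function name to_utf8 are made up for illustration, and ElementTree
may rename namespace prefixes on output, so treat this as a demonstration of
the idea rather than a recipe for your RDF files.

```python
import xml.etree.ElementTree as ET

# A small document stored as ISO-8859-1 bytes, with a matching declaration.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?>'
    "<name>Jos\u00e9</name>"
).encode("iso-8859-1")


def to_utf8(xml_bytes: bytes) -> bytes:
    """Parse XML in its declared encoding and re-serialize it as UTF-8."""
    root = ET.fromstring(xml_bytes)  # the parser honours the encoding declaration
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


utf8_doc = to_utf8(latin1_doc)
print(utf8_doc.decode("utf-8"))
```

The point is that the parser reads the declared encoding and the serializer
writes whichever one you ask for, so your own code never handles raw bytes.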

Probably the most convenient encoding to work with in your own code would be
UCS-2 (which "looks" like UTF-16 as long as you don't use characters above
U+FFFF).
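
To show what "looks like UTF-16" means in practice, here is a small Python
sketch. The helper name utf16_units is hypothetical, and the two sample
characters are just ones I picked: the trademark sign (U+2122) and a musical
symbol above U+FFFF (U+1D11E).

```python
def utf16_units(ch: str) -> list:
    """Return the big-endian UTF-16 code units of a string as hex strings."""
    data = ch.encode("utf-16-be")
    return [data[i:i + 2].hex() for i in range(0, len(data), 2)]


# At or below U+FFFF: one 16-bit unit, identical to the UCS-2 form.
print(utf16_units("\u2122"))      # ['2122']

# Above U+FFFF: UTF-16 needs a surrogate pair, which UCS-2 cannot express.
print(utf16_units("\U0001D11E"))  # ['d834', 'dd1e']
```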

If you are stuck with a character set that doesn't include the character you
want you can always use numeric character references like &#x2122; (hex) or
&#8482; (decimal) for the trademark symbol. As long as your character encoding
includes the characters <>"&#; (plus the digits and letters needed for the
numbers) you will be able to write anything, but this will be inconvenient if
you use many such characters.
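
If you do go that route you needn't type the references by hand. A minimal
sketch in Python, using the built-in "xmlcharrefreplace" error handler; the
function names and the sample text are mine, and note that this does not
escape markup characters such as < or &, which you would still have to handle
separately.

```python
def to_decimal_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_hex_refs(text: str) -> str:
    """Same idea, but emitting hexadecimal references instead."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


sample = "Acme\u2122 caf\u00e9"
print(to_decimal_refs(sample))  # Acme&#8482; caf&#233;
print(to_hex_refs(sample))      # Acme&#x2122; caf&#xE9;
```

The result is pure ASCII, so it survives whatever encoding the file ends up
being saved or transmitted in.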

Received on Wednesday, 22 January 2003 13:37:17 UTC