
Re: SVG 1.2 Comment: image/svg+xml;charset=""

From: Chris Lilley <chris@w3.org>
Date: Wed, 24 Nov 2004 21:57:06 +0100
Message-ID: <1021481323.20041124215706@w3.org>
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: www-svg@w3.org

On Wednesday, November 24, 2004, 5:23:20 PM, Bjoern wrote:

BH> * Chris Lilley wrote:
>>BH> Are you saying that the proposed registration would avoid that, i.e.,
>>BH> that http://www.bjoernsworld.de/temp/utf8-or-iso-8859-1.svg must be
>>BH> considered UTF-8 encoded by all implementations?
>>
>>Tell me, what happens if you read that file from local disk on your
>>server, or run XSLT on it? How do you tell the implementations to ignore
>>the well-formedness error?

BH> XML 1.0 defines no constraint (neither wf-constraint nor fatal errors)
BH> that would consider the document in this state non-compliant, the doc
BH> is both legal UTF-8 and legal ISO-8859-1.

Yes, when I read that I had not realized that the document had been
carefully constructed to avoid using byte sequences illegal in either
encoding. So the content would 'merely' be mangled, rather than
producing a well-formedness error. In the general case that won't
happen, which is why I asked what you would do when reading from local
disk.
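
To illustrate the distinction, here is a minimal Python sketch (not taken
from the document under discussion; the byte strings and the helper name
is_valid_utf8 are assumptions chosen for illustration). It shows a byte
sequence that is legal in both encodings but decodes to different text,
and one that is legal ISO-8859-1 but illegal UTF-8:

```python
# Bytes 0xC3 0xA9 are legal in both encodings but mean different things.
ambiguous = b"\xc3\xa9"
as_utf8 = ambiguous.decode("utf-8")         # one character: e with acute
as_latin1 = ambiguous.decode("iso-8859-1")  # two characters: A-tilde, copyright sign

# A lone 0xE9 is e-acute in ISO-8859-1 but an incomplete sequence in UTF-8.
latin1_only = b"caf\xe9"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without error (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

In the first case a processor that guesses wrong silently produces mangled
text; in the second, a UTF-8 decoder has to reject the input outright.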

BH> Regarding the other questions,
BH> I would not do that,

Uh-huh. I should perhaps have said 'what do you propose the software
should do'.

BH> but if I did I would tell the relevant tool of the
BH> higher-level encoding information

So basically, you propose never needing or using the xml encoding
declaration *at all*?

BH> just like I do when processing HTML

Ah, I see. As Robin earlier suggested, just because HTML had many
problems in this area is no reason to foist them on XML.

To quote from a quote on Anne's blog:

  We threw away everything but the pieces that were known to work and
  added pretty-good Unicode support, i.e. something else that had been
  proven to work.

I'm trying to preserve the pretty-good support, which you already agreed
had good interoperability, and you seem to be trying to move it towards
the known less interoperable area for all processing, not just
over-the-wire processing.

BH> documents, e.g. when using HTML Tidy on a UTF-8 encoded document I use
BH> the -utf8 command line switch

And you know this is the encoding how?

BH> as I have not yet implemented something
BH> else. For other tools, I've implemented encoding detection for HTML/XML
BH> documents in the HTML::Encoding Perl module (which, btw, would consider
BH> the document cited above ISO-8859-1, honoring the charset parameter).

You get a charset parameter from local disk? What filesystem are you
using, BeOS?
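
For a file read from local disk, the in-band encoding declaration is the
only label available. A rough Python sketch of reading it follows; the
function name declared_encoding and the regular expression are assumptions
for illustration, it handles only ASCII-compatible encodings (no UTF-16
detection), and it is not a full implementation of the XML 1.0 autodetection
appendix:

```python
import re

# Matches an XML declaration with an encoding pseudo-attribute at the very
# start of the byte stream. Simplified: ASCII-compatible encodings only.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, else the default."""
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        data = data[3:]
    match = _XML_DECL.match(data)
    if match is None:
        return default  # no declaration: XML falls back to UTF-8
    return match.group(1).decode("ascii").lower()
```

With no out-of-band charset parameter to consult, a tool reading the file
from disk can only rely on something like the above.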


-- 
 Chris Lilley                    mailto:chris@w3.org
 Chair, W3C SVG Working Group
 Member, W3C Technical Architecture Group
Received on Wednesday, 24 November 2004 20:57:07 UTC
