Re: [css3-syntax][css21] More problems with determining the character encoding

On Mon, Oct 22, 2012 at 8:56 PM, Henri Sivonen <hsivonen@iki.fi> wrote:

> On Mon, Oct 22, 2012 at 3:39 PM, Glenn Adams <glenn@skynav.com> wrote:
> >>  * Please prohibit authors from using and implementations from
> >> supporting encodings that are not in the Encoding Standard.
> >
> > Can't prohibit author behavior. Can only say what to do if author does
> > something you don't like.
>
> You can also define the artifacts resulting from certain author
> behavior as non-conforming.
>
> >> (http://encoding.spec.whatwg.org/) If normatively referencing the
> >> Encoding Standard is politically or procedurally infeasible, please at
> >> least prohibit implementations from supporting non-ASCII-compatible
> >> encodings other than variants of UTF-16.
> >
> > Can't prohibit implementations from supporting whatever they like.
>
> I meant defining implementations that support encodings not listed in
> the Encoding Standard as non-conforming.
>

That's not how standards work. You can't prohibit an implementation from
doing something that is out of scope of a standard. At least, that is the
case in the real world of standards and implementations.


>
> > Can't ban author or implementation behavior. Can only define what to do
> > when behavior is conformant or not.
>
> Right.
>
> >>  * If there is no BOM, no @charset, no HTTP-level charset and no
> >> charset attribute on the linking element, and the encoding of the
> >> referring document or style sheet is ASCII-compatible, please define
> >> that the encoding is inherited from the referrer. If the encoding of
> >> the referrer is UTF-16, please define that the inherited encoding is
> >> UTF-8.
> >
> > That makes no sense. There is no relationship between the encoding of a
> > referring document and a referenced document.
>
> Inheriting the encoding is existing implementation behavior.
>

Really? What implementation(s)? Please provide (or reference) a test file
if so.
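
For concreteness, the inheritance rule in the quoted bullet can be read
roughly as in the sketch below. This is only an illustration of the proposal
as worded, not a description of any implementation. The function name, the
parameter names, and the set of UTF-16 labels are assumptions, and precedence
among BOM, @charset, HTTP charset and the linking element's attribute is
deliberately not modeled.

```python
# Hypothetical sketch of the proposed inheritance rule; all names are illustrative.
UTF16_LABELS = {"utf-16", "utf-16le", "utf-16be"}  # assumed label set


def inherited_encoding(declared, referrer_encoding):
    """Return the encoding inherited from the referrer, or None if the rule
    does not apply because some declaration is present.

    declared: labels from BOM, @charset, HTTP charset and the linking
    element's charset attribute (None where absent).
    """
    if any(label is not None for label in declared):
        return None  # a declaration exists; the inheritance rule is not reached
    if referrer_encoding.lower() in UTF16_LABELS:
        return "utf-8"  # proposal: a UTF-16 referrer yields UTF-8
    return referrer_encoding  # proposal: inherit the ASCII-compatible encoding


nothing_declared = [None, None, None, None]
print(inherited_encoding(nothing_declared, "windows-1252"))
print(inherited_encoding(nothing_declared, "UTF-16LE"))
```

Under this reading, a style sheet with no declarations linked from a
windows-1252 document is decoded as windows-1252, and one linked from a
UTF-16 document is decoded as UTF-8.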


> >>  * Please make the encoding declared using @charset have no effect
> >> unless the string "@charset" is represented as its ASCII bytes.
> >
> > If CSS2.1 already defines behavior for a BOMless interpretation of the
> > encoding of @charset that allows inferring encoding, then that definition
> > should be maintained, not removed.
>
> Even when the result is known to be nonsensical?
>

Why do you feel it is bad/wrong behavior?


>
> >>  * If it is determined that supporting BOMless UTF-16 that has
> >> @charset is needed for Web compatibility, please base the sniffing on
> >> the 0x00 bytes intertwined in "@charset" and not on whatever follows
> >> "@charset".
> >
> > What is your rationale for this constraint?
>
> If whatever follows @charset is not a UTF-16 label, honoring the
> label makes @charset itself decode into a sequence of characters
> that is non-conforming in CSS.
>

Please provide a sample input byte sequence with explanation.
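
One input of the kind under discussion might look like the sketch below. It
is an illustration constructed for this point, not a test case from any
implementation: a BOMless UTF-16LE style sheet whose @charset names
windows-1252. The helper name sniff_bomless_utf16 is hypothetical, and
Python's cp1252 codec stands in for windows-1252.

```python
# Illustrative only: a BOMless UTF-16LE sheet whose @charset names a
# non-UTF-16 label.
def sniff_bomless_utf16(data: bytes):
    """Hypothetical sniffer keyed on the 0x00 bytes interleaved in "@charset"."""
    if data.startswith("@charset".encode("utf-16-le")):
        return "utf-16le"
    if data.startswith("@charset".encode("utf-16-be")):
        return "utf-16be"
    return None


# "utf-16-le" emits no BOM, so the sheet starts directly with 40 00 63 00 ...
sheet = '@charset "windows-1252";'.encode("utf-16-le")
print(sheet[:8].hex(" "))  # 40 00 63 00 68 00 61 00

# Honoring the label: every 0x00 byte becomes U+0000, so the text no longer
# begins with the eight characters "@charset".
honored = sheet.decode("cp1252")
print(repr(honored[:8]))
print(honored.startswith("@charset"))  # False

# Sniffing on the interleaved zero bytes ignores the label entirely.
print(sniff_bomless_utf16(sheet))  # utf-16le
```

Decoded per the label, the rule is "@", U+0000, "c", U+0000 and so on rather
than "@charset", whereas sniffing on the zero bytes inside "@charset" itself
identifies the sheet as UTF-16LE regardless of what the label says.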

Received on Monday, 22 October 2012 14:41:16 UTC