Re: [CSS21] WD 4.2: end-of-string vs. end-of-stylesheet from Peter Moulder on 2011-03-07 (www-style@w3.org from March 2011)

From: Peter Moulder <peter.moulder@monash.edu>
Date: Tue, 08 Mar 2011 03:21:07 +1100
To: www-style@w3.org
Message-id: <20110307162107.GA20019@bowman.infotech.monash.edu.au>

On Fri, Jan 07, 2011 at 04:44:31PM -0500, Boris Zbarsky wrote:

> >In any case, the conflict between these rules as currently stated needs
> >removing one way or another.
> 
> Where is the conflict?  You parse one character at a time.  If you
> get to one of [\n\r\f] while inside a string, you use the
> "unexpected end-of-string" rule.  If you get to EOF you use the
> "unexpected end of stylesheet" rule.

The conflict is that the existing tokenization text doesn't require
that BAD_STRING end in (or be followed by) [\r\n\f].  So an unterminated
string that runs to EOF could (at least by my understanding) validly be
tokenized as a BAD_STRING token, whose specified behaviour would conflict
with what the "unexpected end of style sheet" text says.
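
To make the two readings concrete, here is a minimal sketch of a string
scanner in Python.  It is not taken from the spec or from any user agent:
the function name scan_string and the flag eof_closes_string are
hypothetical, and escape handling is simplified (no hex escapes, and
\r\n is not treated as a single newline).  The flag chooses between the
"unexpected end of style sheet" reading (close the string at EOF) and the
BAD_STRING reading of the same input.

```python
NEWLINES = "\n\r\f"

def scan_string(css, start, eof_closes_string=True):
    """Scan a quoted string whose opening quote is at css[start].

    Returns (token_type, value, next_index).  Hypothetical helper,
    simplified for illustration.
    """
    quote = css[start]
    out = []
    i = start + 1
    while i < len(css):
        ch = css[i]
        if ch == quote:
            # Properly terminated string.
            return ("STRING", "".join(out), i + 1)
        if ch in NEWLINES:
            # Unescaped newline: the "unexpected end of string" case.
            return ("BAD_STRING", "".join(out), i)
        if ch == "\\" and i + 1 < len(css):
            nxt = css[i + 1]
            if nxt not in NEWLINES:
                out.append(nxt)  # simple escape: keep the next character
            # an escaped newline is a line continuation and adds nothing
            i += 2
            continue
        out.append(ch)
        i += 1
    # Reached EOF inside the string: this is where the two readings differ.
    kind = "STRING" if eof_closes_string else "BAD_STRING"
    return (kind, "".join(out), i)
```

With this sketch, '"Hello' followed by a newline always yields BAD_STRING,
whereas '"Hello' followed directly by EOF yields STRING or BAD_STRING
depending only on which reading the flag selects.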


(The remainder of this message doesn't have much of interest.)

> >and it may be useful to clarify whether the correct behaviour
> >for the end-of-stylesheet example depends on whether there's a newline
> >character between "Hello" and the end of the stylesheet.
> 
> I think that would be obvious if "end of line" were clarified.

It's common to read one bit of the specification (such as just the
"unexpected end of style sheet" bullet point) without reading all the
other text around it.

I agree that a person who was conscious of the "end of line" rule would
be able to work out what the "end of style sheet" text actually meant.
However, in my experience, relying on such consciousness of other parts
of a spec leads to misinterpretations and misimplementations.

Accordingly, I suggest clarifying both parts, so that each may be
understood even without considering the other.

> >It may be useful to clarify what is meant by "reaching the end of a
> >line"
> 
> Yes, agreed.  The intent is pretty clearly "reach a [\n\r\f]"

(Well, I wouldn't have called it clear which takes precedence before
having tested things, but yes, I agree that's what it should be.)

> With regard to your main point, I'm not sure what the "problem" is
> [regarding different operating systems' representations of text
> files and how they're divided into lines].

It's not a very important point, and I don't suggest doing anything
about it now that we're at this point in CSS2.1 development.
I'll explain the concern for the curious, but everyone else can skip
the rest of this message.

I was considering these provisions to be an attempt to deal relatively
gracefully with the case of a truncated file.  (Boris Zbarsky seems to
think otherwise, and really I wouldn't know, but I'll nevertheless go
on to explain what I was thinking.)

Considered this way, the "problem" is as follows.  Suppose the operating
system's text file representation has no explicit line terminator
character, and the runtime system transforms the file so that the user
agent sees each line terminated by a \n character.  That would defeat
this attempt [if in fact that is the intent of this rule]: such a user
agent would always see the truncated string as a BAD_STRING token rather
than as a truncated STRING token.
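
As a rough illustration of that scenario, the Python sketch below models
a runtime that presents each record of a file as a line ending in \n.
Both function names are hypothetical, and the scanner is deliberately
crude (double-quoted strings only, no escapes or comments); it only
reports whether the first unterminated string is cut off by a newline or
by EOF.

```python
def records_to_stream(records):
    # Hypothetical runtime: every record is presented as a line ending in "\n".
    return "".join(record + "\n" for record in records)

def open_string_ends_at(text):
    """Report how the first unterminated double-quoted string ends.

    Returns "newline", "eof", or None if every string is closed.
    """
    in_string = False
    for ch in text:
        if in_string:
            if ch == '"':
                in_string = False
            elif ch in "\n\r\f":
                return "newline"  # would be tokenized as BAD_STRING
        elif ch == '"':
            in_string = True
    return "eof" if in_string else None

truncated = 'p { content: "Hel'
print(open_string_ends_at(truncated))
print(open_string_ends_at(records_to_stream([truncated])))
```

The raw truncated text ends its string at EOF, but once the modelled
runtime has appended \n to the last record, the same truncation is seen
as a newline inside the string.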

pjrm.
Received on Monday, 7 March 2011 16:21:39 UTC