Re: skipping and ignoring

On Wed, Sep 23, 2009 at 12:44 PM, Robin Berjon <robin@berjon.com> wrote:
> Hi,
>
> while writing tests, we've hit upon something that could use a little
> clarification: the distinction between skip and ignore.
>
> One interpretation that we can come to is that the two terms mean the same
> thing for files and attributes, but for XML element processing "ignore"
> descends into the content whereas "skip" just moves on to the next. This is
> consistent for instance with the specification indicating that element
> content inside <name> has to be ignored. It is, however, not consistent with
> its application to <description> or <icon> examples whereby those that don't
> match the locale are said to be ignored (logically it would be skipped —
> even though descending into the subtree would likely do nothing).
>
> Another interpretation is that when something is ignored the UA must act as
> if it hadn't even been there in the first place, whereas when skipping it
> ought to not process it but remember it has seen it. This interpretation is
> built on the fact that the definitions say that ignore causes the UA to "act
> as if [what is being ignored] is not present" whereas skip is to "proceed to
> the next element".
>
> It becomes less abstract if you look at the following conformance statements.
>
> In "10.1.19 Algorithm to Process a Configuration Document", step 11, part "A
> content element", the following normative assertion is made: "If this is not
> the first content element encountered by the user agent, then the user agent
> must skip this element." A few lines later it is followed by "If the src
> attribute of the content element is absent, then the user agent must skip
> this element."
>
> Take the following configuration:
>
>  <content/>
>  <content src="perfectly-good-start-file.html"/>
>
> You see the first. It doesn't have an @src so you skip it. You reach the
> second. It's perfectly serviceable, but it's not the first. Or is it? If you
> consider the first one to have been ignored, then you have to act as if it
> wasn't there.

Ok, I see the confusion.

> But instead of ignored it says skipped — and it's not clear
> whether skipped has the same meaning.

Good point. The second must not be processed because it is not the
first. It doesn't matter that it is serviceable. It might just be that I
used ignore where skip was intended.

> If the second element is not taken into account, then we have a potential
> problem with forward compatibility. Let's imagine that we have v2 out, for
> which the following is correct:
>
>  <content uri='http://berjon.com/cool-widgets/dahut'/>
>  <content src="perfectly-good-start-file.html"/>
>
> Clearly the desired behaviour is for v2 runtimes to process the first, and
> v1 runtimes to fall back to the second.

IMO the correct behavior would be for src attributes to take URIs and
for the second to be skipped. However, I'm sure you can dream up other
examples.

The "only ever use the first, even if b0rked" behavior is based on
HTML's behavior (particularly the <title> element). I'm happy to break
ranks with HTML parsing if that is what the WG thinks would be best.
That said, it's a pretty big change to the parsing model, but if it
future-proofs us, then it might be worth it.
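
To make the two behaviors concrete, here is a rough sketch of how a v1
runtime might pick the start file under each reading. It is only an
illustration, not the algorithm from the spec: the function name
pick_start_file, the first_wins flag, and the bare <widget> wrapper
(without the widgets namespace) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def pick_start_file(config_xml, first_wins):
    """Return the src a v1 runtime would use, or None if there is none.

    first_wins=True  -> "skip" reading: only the first content element is
                        ever considered, even if it is unusable.
    first_wins=False -> "ignore" reading: an unusable content element is
                        treated as if it were not present at all.
    """
    root = ET.fromstring(config_xml)
    for element in root.findall("content"):
        src = element.get("src")
        if src is not None:
            return src
        if first_wins:
            # The first content element has been seen, so every later
            # one is "not the first" and must be skipped.
            return None
    return None


# Hypothetical configurations, adapted from the examples in this thread.
V1_CONFIG = """<widget>
  <content/>
  <content src="perfectly-good-start-file.html"/>
</widget>"""

V2_CONFIG = """<widget>
  <content uri="http://berjon.com/cool-widgets/dahut"/>
  <content src="perfectly-good-start-file.html"/>
</widget>"""
```

With first_wins=True both configurations yield None, so a v1 runtime
never reaches the fallback. With first_wins=False both yield
"perfectly-good-start-file.html", which is the forward-compatible
behavior Robin describes.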

> The same issue applies to other elements that refer to the skip/ignore
> distinction. We believe that some editorial improvements to those
> definitions would be welcome.

Agreed. I'll work on improving those, but that depends on whether we
change the parsing behavior to match what you suggested above.

-- 
Marcos Caceres
http://datadriven.com.au

Received on Wednesday, 23 September 2009 14:52:48 UTC