Re: [widgets] white space handling

On Mon, Dec 21, 2009 at 9:43 AM, Cyril Concolato
<cyril.concolato@enst.fr> wrote:
> Hi Robin,
>
> Le 18/12/2009 18:01, Robin Berjon a écrit :
>>
>> On Dec 18, 2009, at 16:36 , Cyril Concolato wrote:
>>>
>>> Le 18/12/2009 15:58, Robin Berjon a écrit :
>>>>
>>>> P+C doesn't tie processors to a particular version of XML, and lists its
>>>> white space characters accordingly (and defensively). If you're certain that
>>>> you will only ever get content that comes from a conforming XML 1.0
>>>> implementation, then you probably don't need to check for this.
>>>
>>> I don't read it like that. P&C explicitly references XML 1.0 and never
>>> mentions 1.1. So I thought the behavior was conformant to 1.0. It's fine if
>>> the spec also handles 1.1 but it should be mentioned. The rationale for
>>> the choice of space characters should also be indicated, and the differences
>>> between XML 1.0 and XML 1.1 should be documented.
>>
>> I beg to differ. I think that we should build specifications that can
>> handle future changes to the stack
>
> I'm fine with that.
>
>> without listing all the versions that are supported.
>
> Citing what you support doesn't restrict you to that. I think it helps in
> understanding a spec.
>
>> P+C is built for XML 1.0, and it's great that it has the resilience to
>> handle changes to 1.1 without a hitch — but who knows what XML 4.2 might
>> add? We can't guarantee that it'll work, but we can try (and if it does
>> work, I don't think that we should list it either). I certainly don't think
>> that it's the right place to document potential differences between versions
>> of XML — as your XHTML example shows, that kind of information goes stale.
>
> If you're explicitly citing dated versions of the spec, since they're cast
> in stone, I don't see how they can go stale.
>
>
>> Furthermore, I didn't say that the differences between XML 1.0 and 1.1 are
>> the rationale for this choice — I was merely indicating that using 1.1 you
>> could get such characters and that P+C's robustness against that was a plus.
>> I wasn't in Marcos's brain when that part was written but my specification
>> exegesis antennae suspect that the listed class of characters corresponds to
>> the Unicode white space character class (and therefore to what Unicode-aware
>> processors would consider white space, notably \s in regular expressions).
>
> Well, you know my concern. I want to understand the spec in order to
> implement it properly. I'm not asking for any new normative statement, nor
> any change to the existing ones. I would be fine with informative notes
> explaining the intent of some choices. For example, as you know, I'm
> implementing an SVG UA and a P&C UA, and I want to know what's reusable and
> what's common without doing XML archaeology. Such notes would help me, and I
> suspect they would help others. Nothing more.
>

In the spec, I made that choice to be compatible with Unicode version
5. My intention was not to break XML parsers, but it sucks if that is
what happened. I personally don't know how to proceed here.

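For anyone comparing the two definitions, here is a minimal Python sketch of
the difference under discussion. The XML 1.0 set is the four characters of
the S production. The Unicode set is an assumption: it is the White_Space
list as published around Unicode 5 (including U+180E, which later Unicode
versions dropped from that class) and should be checked against the spec
text before relying on it. The names XML10_WHITE_SPACE, UNICODE_WHITE_SPACE,
strip_xml10 and strip_unicode are made up for illustration.

```python
# XML 1.0 "S" production: space, tab, carriage return, line feed.
XML10_WHITE_SPACE = frozenset("\u0020\u0009\u000D\u000A")

# ASSUMPTION: Unicode White_Space code points as of Unicode 5.x.
# Verify against the specification before using this list.
UNICODE_WHITE_SPACE = frozenset(
    [chr(cp) for cp in range(0x0009, 0x000E)]    # U+0009..U+000D
    + [chr(cp) for cp in range(0x2000, 0x200B)]  # U+2000..U+200A
    + list("\u0020\u0085\u00A0\u1680\u180E\u2028\u2029\u202F\u205F\u3000")
)


def strip_xml10(text):
    """Trim only the characters XML 1.0 treats as white space."""
    return text.strip("".join(XML10_WHITE_SPACE))


def strip_unicode(text):
    """Trim every character in the assumed Unicode white space set."""
    return text.strip("".join(UNICODE_WHITE_SPACE))


# Characters a Unicode-aware processor trims but an XML 1.0 one does not.
extra = UNICODE_WHITE_SPACE - XML10_WHITE_SPACE

# A value padded with NO-BREAK SPACE and EM SPACE, plus ordinary white space.
sample = "\u00A0 widget name\u2003\n"
xml_result = strip_xml10(sample)
unicode_result = strip_unicode(sample)
```

With this sample, the XML 1.0 trim removes only the trailing line feed and
leaves the no-break space and em space in place, while the Unicode trim
reduces the value to "widget name". That gap is what an implementation
sharing code between an SVG UA and a P&C UA would need to account for.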

-- 
Marcos Caceres
http://datadriven.com.au

Received on Wednesday, 6 January 2010 20:59:08 UTC