Re: change proposal for issue-86, was: ISSUE-86 - atom-id-stability - Chairs Solicit Proposals from Julian Reschke on 2010-04-15 (public-html@w3.org from April 2010)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Thu, 15 Apr 2010 11:56:07 +0200
To: Maciej Stachowiak <mjs@apple.com>
CC: Sam Ruby <rubys@intertwingly.net>, Ian Hickson <ian@hixie.ch>, "public-html@w3.org WG" <public-html@w3.org>
Message-ID: <4BC6E2B7.6050809@gmx.de>

On 15.04.2010 11:41, Maciej Stachowiak wrote:
>> I think it would be acceptable if different HTML->Atom converters
>> produce different IDs for the same news entries. But it's not easy to
>> tell without fully understanding what this feature is good for.
>>
>> The bigger problem is when the *same* converter produces different IDs
>> for the same input on each run, which Ian's text currently sort-of
>> allows ("SHOULD") under certain circumstances (ref BCP14). That would
>> lead to the kind of problems Sam mentioned already.
>
> I'm not talking about different converters in this case though. I'm
> talking about the same converter (or importer if you prefer) run
> multiple times. Based on what you said, it sounds like it would be
> conforming to Atom to make a tool that lets you import the same plain
> text repeatedly but gives it a different atom:id every time. Yet on the

No, I didn't say that. At least I didn't intend to.

> ...

>>> What in the Atom spec distinguishes importing multiple times with the
>>> same tool, from importing multiple times with different tools? Why would
>>> the latter be exempted from the persistent ID requirement, but not the
>>> former?
>>
>> As I said before: pulling any data into an Atom feed puts it into a
>> certain context, and requires deriving certain metadata. Requiring
>> *every* converter to produce the same atom:id essentially means that
>> they need to produce the *same* atom entry (for some value of
>> sameness). In general, that will only be possible if the data you
>> start with already has all the metadata required by Atom, which would
>> include the atom:id.
>
> It looks to me like you didn't answer my question. Let me try again.
>
> Different conversion tools are (apparently) allowed by the Atom spec to
> produce different atom:ids if the input doesn't already contain an
> atom:id. Then why aren't multiple invocations of the same tool allowed

It depends on what they do. If, by converting, they produce something 
sufficiently different from the source, then I'd even *expect* a new 
atom:id to be assigned.

> to do so? Why is it "the same entry" when created without full metadata
> by multiple runs of the same tool, but it isn't "the same entry" if you
> use two different tools? How would you even define "the same conversion

Because producing the same entry is the whole *point* of running the
same source through the same converter.

> tool" - is it the same tool if I upgrade from 1.0 to 1.1? Is it the same
> if I run it once on my laptop, and then later on my desktop?

It depends. If the upgrade from 1.0 to 1.1 sufficiently changes the 
output, then maybe yes.

Also, "sameness" of the converter may depend on the code base, but also 
on local data.

> (If we can't get down to clear answers here, then I'll abandon this
> thread, since we seem to have a critical mass of agreement for
> removing the feature from the W3C spec anyway. But I am suspicious of
> the flexible way in which the Atom conformance criteria are being
> interpreted.)

I agree that it's not easy to apply Atom requirements to every possible 
use case, some of which may not have been considered when writing the 
spec. Pointing out areas where that is the case is instructive, but not 
necessarily helpful in talking about the concrete case here.

So I'd recommend focusing on those criteria that *are* clear. And I
maintain that an atom:id that varies across multiple runs of a given
algorithm for the same source and the same converter (as in same code,
same config, same machine, same user, ...) is definitely broken.
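
To make the distinction concrete, here is a minimal sketch in Python. It
is an illustration only, not the algorithm from the HTML draft or
anything the Atom spec mandates; the function names and the choice of a
name-based UUID over "source URL plus entry key" are assumptions made
for the example. The first function yields the same atom:id on every
run for the same input, the second one is the kind of behaviour
described above as broken.

```python
import uuid


def stable_atom_id(source_url: str, entry_key: str) -> str:
    """Hypothetical helper: derive an atom:id purely from the input.

    Same source URL and same entry key always give the same name-based
    (version 5) UUID, so repeated runs of the converter agree.
    """
    name = f"{source_url}#{entry_key}"
    return uuid.uuid5(uuid.NAMESPACE_URL, name).urn


def unstable_atom_id() -> str:
    """Hypothetical counter-example: a fresh random UUID on every call,
    so the same entry gets a different atom:id on each run."""
    return uuid.uuid4().urn
```

Note that a different converter could legitimately pick a different
derivation (another namespace, another key), and thus different IDs,
without either of them varying between its own runs.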

Best regards, Julian

Received on Thursday, 15 April 2010 09:56:48 UTC