Re: Extensibility strategies, was: Deciding in public (Was: SVGWG SVG-in-HTML proposal) from Julian Reschke on 2008-08-05 (public-html@w3.org from August 2008)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Tue, 05 Aug 2008 12:52:34 +0200
To: Ian Hickson <ian@hixie.ch>
CC: 'HTML WG' <public-html@w3.org>
Message-ID: <489830F2.6090409@gmx.de>
Ian Hickson wrote:
> ...
> What I don't understand is why while I think it's fine for people who wish 
> to use URIs to do so, you seem to think it's not ok for people who _don't_ 
> wish to use URIs to _not_ do so.
> ...

Because I believe that URIs are the safest and most widely used means of
disambiguation, and mixing URIs with free-form identifiers is just
asking for clashes.

> Are you seriously telling me that you are worried that people will mint 
> class names intended for non-private use that happen to look exactly like 
> URIs in your domain space or URN space?

If you think it is so unlikely, why not make a statement that something 
that parses as a URI must be a URI under the control of the party 
minting the identifier?

> If you think that is at all likely, why is it not likely that people will 
> invent class names with your domain name or URN space just to spite you, 
> if we required everyone to use URIs?

Of course that's possible. But that would be intentional.

>>>> It's the problem that distributed extensibility is solving.
>>> Is it a problem _worth_ solving? Is communicating with a few people 
>>> who might maybe invent clashing class names such a high burden that we 
>>> need to optimise our technology stack to avoid it?
>> Yes.
> 
> Why?

I already told you:

"In the one case, you communicate with other people who want to share
this vocabulary. In the other case, you need to communicate with
*everybody else*, including future generations."

> Not at all. You only have to communicate with the people who have any 
> chance of clashing with you, for example if you are using a scheme like 
> "word.example.com", you only have to talk with people on "example.com", or 
> if you are using a scheme like "ab.cd-word" you only have to talk to 
> people on "ab.cd" and "cd.ab".
> 
> Is this an undue burden? Why?

For instance, in many cases it will be hard to find the right people to
talk to. Or they may not exist yet, but will 12 months from now.

And to understand with whom I may have clashes, I will have to come up 
with an algorithm that actually gives me all possible candidates. How is 
this supposed to work if the format is free-form?

>>>> I agree that some microformats do fine, but I disagree that they're 
>>>> doing fine as a generic solution.
>>> They're not a generic solution. However, they _use_ the generic 
>>> solutions that HTML provides -- the class attribute, the rel 
>>> attribute, and other HTML extension mechanisms -- to extend the 
>>> language in a controlled manner. They don't have to be the only group 
>>> inventing extensions. They are evidence that the HTML extension 
>>> mechanism is functional enough to create entire vocabularies and 
>>> deploy them on major sites.
>> Again, it doesn't scale, as far as I can tell.
> 
> Why would they scale any worse than any other scheme? I don't understand 
> what it is you think doesn't scale here. Don't forget that you don't have 
> to use the naming scheme Microformats use. You can use URIs if you like.

URIs could work, as long as there's an agreement about the format of 
these identifiers. Of course, as others have pointed out as well, it's 
still a totally ugly way to achieve the goal.

>>> To paraphrase jwz, if you try to solve extensibility with URIs, and 
>>> then try to solve the problems of URIs with prefixes, now you've got 
>>> three problems. Prefixes don't solve the URI problem, they just 
>>> exacerbate it,
>> There is no "URI problem".
> 
> If there's no problem, what are prefixes solving?

The verbosity problem. I wasn't aware that you call that "the URI problem".

> The URI problem is that they are too verbose.
> 
> (They have other problems as well, as I said recently and as you quoted 
> above, e.g. people get confused with relative URIs, people who use HTTP 
> URIs -- most people, in practice, for namespaces -- get confused about 
> when or how to dereference them, etc.)

You don't dereference them (automatically). Put that into the spec, and 
there will be no confusion.

>>> as has been detailed in depth in the last few days -- people find 
>>> prefixes inordinately confusing, they add a level of indirection where 
>>> none is
>> People also find class names for CSS confusing.
> 
> Indeed. Let's learn from our mistakes instead of adding more.

So, out of curiosity, what would be a better design for CSS?

Because you need to consider both the advantages and disadvantages of
a solution. In the case of naming, you seem to focus entirely on the
drawbacks.

>>> needed, they separate the declaration and the use of the name, they 
>>> split names in two, they introduce their own problem with clashes, and 
>>> in
>> Which problem with clashes? Could you elaborate?
> 
> To demonstrate this in XML, consider this document:
> 
>    <a:foo xmlns:a="http://a.example.com/">
>     <a:bar/>
>    </a:foo>
> 
> ...and now consider this fragment:
> 
>    <a:quux xmlns:a="mailto:ian@hixie.ch">
>     <a:bar/>
>    </a:quux>
> 
> If I try to copy the <a:bar/> from one document to another, then it'll 
> change namespace unless I go out of my way to fix up the prefix 
> declarations.
> 
> Or similarly, if I have:
> 
>    <x xmlns:a="http://a.example.com/">
>     <z a:test="..."/>
>    </x>
> 
> ...and:
> 
>    <x xmlns:a="mailto:ian@hixie.ch">
>     <z a:test="..."/>
>    </x>
> 
> ...and I try to merge these two documents, then unless I fix up the 
> prefixes, I end up with a problem with the two attributes.
> 
> None of these problems exist if you don't have prefixes.

Yes, that can happen.

That being said, I've been working with XML + namespaces since these
specs came out, and I have never *ever* run into that problem in
practice. Probably because in practice, people use consistent prefixes.
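
For readers who want to see the effect concretely, here is a minimal
sketch of the first pair of fragments quoted above, assuming Python's
standard xml.etree.ElementTree parser. The helper name child_names is
made up for illustration. It only shows that the same prefixed name
expands to two different namespace-qualified names, depending on the
declaration in scope:

```python
import xml.etree.ElementTree as ET

# The two fragments from the quoted example, reduced to one line each.
DOC_ONE = '<a:foo xmlns:a="http://a.example.com/"><a:bar/></a:foo>'
DOC_TWO = '<a:quux xmlns:a="mailto:ian@hixie.ch"><a:bar/></a:quux>'


def child_names(xml_text):
    """Return the expanded '{namespace}local' names of the root's children."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


names_one = child_names(DOC_ONE)
names_two = child_names(DOC_TWO)

# Both children are written as <a:bar/>, but they are different names.
print(names_one)
print(names_two)
print(names_one == names_two)
```

Copying the serialized <a:bar/> text from one document into the other
changes which of the two expanded names it denotes, which is the case
described above.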

>>> That's only true if you think that others are likely to actually 
>>> develop class names that look like URIs that clash with your own, 
>>> something that I believe is far, far, far less likely than the risk of
>>> clashes with URIs themselves, e.g. with a domain expiring, being 
>>> registered by someone else, and having that new owner mint clashing 
>>> names.
>> Again: if you fear that issue, don't use a URI scheme that depends on 
>> the "current" state of DNS, so choose a URN.
>>
>> It seems you're consistently confusing "URI" with "http URL".
> 
> How do you propose to have a distributed extension system with URNs, if 
> you're not using the domain name system to guarantee uniqueness? Aren't 
> you just trading one central repository (the HTML WG) for another (the URN 
> registry)? Could you elaborate on how you see this working?

I recommend reading the URN specs. For instance, the namespaces below
"urn:isbn", "urn:ietf" and "urn:uuid" are controlled by different
entities (or, in the case of UUIDs, not controlled at all).

In particular, the "tag:" URI scheme solves the DNS issue by including a
date in the identifier (I did mention that before, didn't I???)
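
As a rough illustration of both options, here is a small Python sketch
that mints identifiers without a central registry. The function names,
the authority "example.com" and the name "ext/rating" are assumptions
made up for the example. The tag: form follows the
"tag:authority,date:specific" shape, and the UUID form uses Python's
standard uuid module:

```python
import uuid
from datetime import date


def mint_tag_uri(authority, day, specific):
    """Build 'tag:<authority>,<YYYY-MM-DD>:<specific>'.

    The date pins the name to whoever held the authority on that day,
    so a later change of domain ownership does not affect it.
    """
    return "tag:{},{}:{}".format(authority, day.isoformat(), specific)


def mint_uuid_urn():
    """Build a 'urn:uuid:...' name from a random UUID; nobody assigns it."""
    return uuid.uuid4().urn


example_tag = mint_tag_uri("example.com", date(2008, 8, 5), "ext/rating")
print(example_tag)
print(mint_uuid_urn())
```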

> At this point, to be honest, I've lost track of what you're trying to 
> solve and why the existing mechanisms don't solve them. If there is no 
> risk of people making up URNs that clash with yours, then why can't you 
> just use URNs in the class attribute as your extension mechanism?

I could. But I'd prefer

- the spec to give rules for the formats of these names, so clashes are 
avoided (if you use a URI, use one you have authority over), and

- a more compact syntax (be it QName, CURIE, whatever).

And of course I'd really prefer a mechanism that uses the same syntax as
other markup languages (or is as close to them as possible).
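
To sketch what a more compact syntax could amount to, here is a small
Python example that expands a prefixed name into a full identifier, in
the spirit of QNames or CURIEs. It is not an implementation of either
spec. The prefix table, the prefix "ex" and the function name
expand_compact_name are all hypothetical:

```python
# Hypothetical prefix table; in real markup this mapping would be
# declared in the document or fixed by the spec.
PREFIXES = {"ex": "tag:example.com,2008-08-05:"}


def expand_compact_name(name, prefixes):
    """Expand 'prefix:reference' by concatenating the mapped base and reference."""
    prefix, sep, reference = name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError("no declared prefix in {!r}".format(name))
    return prefixes[prefix] + reference


expanded = expand_compact_name("ex:rating", PREFIXES)
print(expanded)
```

The author writes the short form; the long, unambiguous identifier is
what actually gets compared.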

BR, Julian
Received on Tuesday, 5 August 2008 10:53:24 UTC