RE: Advice on making IRI document suitable for reference by HTML (and other specs)

(personal response)

Erik wrote:
> >
> > Since the IRIbis work is likely to take a long time, my recommendation

Really? I thought the goal was to finish this work quickly...


Roy replied:
> 
> That is not even an option.  The options are to (a) fix HTML5 so that it
> only defines what is needed for HTML and defers to the relevant (and
> far more standard) specifications for the output; (b) fix IRIbis so that
> it defines what is sufficient for HTML5 and replace all such definitions
> in HTML5 with pointers to IRIbis; or, (c) terminate HTML5 as a standards
> effort.

I think that, ideally, we'd like to manage it so that HTML5 doesn't end up defining some new quondam IRI surrogate that is somehow different from an IRI (we have plenty of these already and they are a problem, cf. "LEIRI"). If HTML5 doesn't end up referencing IRIbis for at least what Roy is calling the "output", then both specifications have, I think, failed. I tend to think that:

(a) IRIbis should focus on defining resource identifiers, and HTML5 should reference IRIbis everywhere that a URI-like identifier is wanted. That is, IRIbis defines what is (and is not) a valid identifier, how to escape invalid values, and so forth. If HTML5 is forced to define (as opposed to repeat) any of this, then something is very likely wrong with IRIbis.

(b) HTML5 should focus on defining how to process HTML. This includes taking what Roy refers to as the input (the contents of an href, for example) and parsing/processing it to produce an IRI. Some "Web references" are not valid IRIs. HTML5 should say how to handle these. Some "Web references" may contain characters not valid in an IRI. HTML5 should say how (or whether) to handle this. And so forth.

(c) Where IRIbis has done a good job of this already, HTML5's "say how to handle this" will be a pointer to IRIbis. But I rather suspect that many of these cases will be specific to HTML5 and will be difficult to abstract into IRIbis. For example, trimming whitespace from an IRI reference is specific to parsing HTML and has nothing to do with the address itself. By contrast, how to encode a space inside an IRI should probably be dealt with by IRIbis.
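
To make the split in (c) concrete, here is a rough sketch in Python. It is purely illustrative and not taken from either draft: the function names are hypothetical, the whitespace set is my assumption about what an HTML parser would strip, and percent-encoding the space as %20 is just one plausible escaping rule that an identifier spec might choose.

```python
# Illustrative sketch only; all names here are hypothetical.

# Assumed set of whitespace an HTML parser strips from an attribute
# value: tab, line feed, form feed, carriage return, space.
HTML_WHITESPACE = "\t\n\f\r "


def html_extract_reference(attr_value: str) -> str:
    """HTML-layer step: trim leading and trailing whitespace.

    This is about parsing the document, not about the address itself.
    """
    return attr_value.strip(HTML_WHITESPACE)


def iri_escape_spaces(reference: str) -> str:
    """Identifier-layer step: escape spaces left inside the reference.

    Assumes the identifier spec chooses percent-encoding (%20).
    Non-ASCII characters are left untouched, as an IRI permits them.
    """
    return reference.replace(" ", "%20")


def process_href(attr_value: str) -> str:
    """Run the HTML-specific step, then hand off to the identifier step."""
    return iri_escape_spaces(html_extract_reference(attr_value))
```

The point of keeping the two functions apart is that only the first one would need to live in HTML5; the second is the kind of rule HTML5 could replace with a pointer to IRIbis.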

Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG

Internationalization is not a feature.
It is an architecture.

Received on Thursday, 31 December 2009 22:25:38 UTC