RE: IRI Templates and Bidi Characters

James M Snell wrote:
> Brian Smith wrote:
> > [snip]
> > This all seems to make it difficult to create a valid BIDI IRI 
> > template using a regular text editor (one that doesn't know 
> > anything about IRI templates). In particular, the implicit LTR 
> > overrides mean that the editor of a BIDI IRI template will see
> > something differently from how the processor processes it,
> > right? The only way the editor can present the template
> > correctly is if it inserts explicit overrides. If explicit
> > overrides are necessary anyway for accurate editing, why are
> > implicit overrides needed?
> Yes, it does make it more difficult and the explicit 
> overrides are required for the editor to render the template 
> correctly.  The implicit overrides are needed in case the 
> explicit overrides are not provided.

This is the part that I don't understand. Why isn't it sufficient to just use the BIDI algorithm and require explicit overrides where they are needed? IRIs need special processing because the BIDI formatting characters cannot appear in them, and those characters are prohibited because they are invisible and make IRIs difficult to compare. IRIs have special BIDI rendering requirements to make them easier for end-users to read and to make up for the restriction against including formatting characters. But those concerns don't really apply to IRI templates. Instead, being able to accurately edit IRI templates in source code and documentation is the primary concern.

> > I don't know a lot about BIDI, but I would think it would be a lot 
> > simpler to use the exact same rules as IRIs, remove all implicit 
> > overrides, suggest when explicit overrides should be provided, and 
> > specify when and how overrides are inserted/coalesced 
> > during the substitution phase. Is there a reason that wouldn't work?
> > 
> The rules specified by RFC3987 are not sufficient as they 
> lead to some rather unfortunate visual effects in templates 
> that contain a mix of RTL and LTR characters.


> The main difficulty here is the mixture of LTR and RTL 
> characters -- which is specifically why rfc3987 indicates 
> that components SHOULD NOT mix LTR/RTL characters.  With 
> {...} tokens, however, it is impossible to avoid mixing 
> characters so we have to jump through some hoops to get 
> things to render properly.
> Regardless of any of this, explicit bidi formatting codes 
> have to be stripped from the template prior to processing so 
> that part is already covered :-)

This is my point. If the BIDI formatting codes are going to be stripped, and templates must be in logical order anyway, then the spec could just say that IRI templates can contain any valid overrides (or markup, like in [X]HTML) as necessary to display them correctly, but the overrides will be stripped by the IRI template processor. This makes IRI templates WYSIWYG in editors, and much simpler to understand and implement.
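
To make the stripping step concrete, here is a rough sketch in Python of what I have in mind. It is only an illustration, not anything from the draft: the function name is made up, and I am assuming the set of characters to strip is the seven formatting characters that RFC 3987 (section 4.1) forbids in IRIs (LRM, RLM, LRE, RLE, PDF, LRO, RLO).

```python
import re

# Assumption: strip exactly the seven bidi formatting characters that
# RFC 3987 section 4.1 says must not appear in IRIs:
# LRM, RLM, LRE, RLE, PDF, LRO, RLO.
BIDI_FORMATTING = "\u200e\u200f\u202a\u202b\u202c\u202d\u202e"
_BIDI_RE = re.compile("[" + BIDI_FORMATTING + "]")


def strip_bidi_formatting(template: str) -> str:
    """Hypothetical helper: return the template with all explicit bidi
    formatting characters removed, leaving the logical order untouched."""
    return _BIDI_RE.sub("", template)


# A template as an author might save it from a text editor, with an
# explicit LRO ... PDF pair wrapped around a token for display purposes.
edited = "http://example.org/\u202d{foo}\u202c/bar"
processed = strip_bidi_formatting(edited)
```

The author sees whatever the editor's ordinary BIDI handling shows, overrides included, and the processor only ever sees the logical-order string without them.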

- Brian

Received on Monday, 3 December 2007 00:04:47 UTC