Re: Bidir text and Unicode

And it’s not just bidi – it also applies to language specification(s).

From: Laurent Le Meur <laurent.lemeur@edrlab.org>
Date: Tuesday, June 27, 2017 at 8:50 AM
To: Ivan Herman <ivan@w3.org>
Cc: W3C Publishing Working Group <public-publ-wg@w3.org>
Subject: Re: Bidir text and Unicode
Resent-From: <public-publ-wg@w3.org>
Resent-Date: Tuesday, June 27, 2017 at 8:51 AM

Yes, my thoughts are related to the possible choice of JSON (JSON-LD) for expressing metadata.
I'll open an issue in that scope.

L

On 27 Jun 2017, at 17:45, Ivan Herman <ivan@w3.org> wrote:


On 27 Jun 2017, at 17:27, Laurent Le Meur <laurent.lemeur@edrlab.org> wrote:

A question was raised during the F2F meeting in NYC about the proper internationalization of UTF-8 metadata values (e.g. the book title).

I quote Ivan from the minutes: "On the i18n side, we will need to be careful about ids, uris, iris, etc. w/respect to i18n char-sets. Another area we need to be careful about is metadata, which also have issues with the char-sets for the actual text content. One example is mixing bidi text in the metadata content."

Reading http://www.iamcal.com/understanding-bidirectional-text/, I see a use of the HTML dir attribute, which will not be available natively in a JSON manifest; so we may have to create a JSON dir attribute representing document order. I also see the "implicit marker characters" (Left-to-Right Mark and Right-to-Left Mark), which help tailor the direction of "neutral" characters, and the "explicit markers", which describe a local text direction.

Therefore it appears that the only item we need to add to a JSON manifest to ensure proper rendering of international text is a "document order" (a dir attribute that can be injected in the HTML rendering of the metadata values).
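
As a rough illustration of that idea (a sketch only, nothing the group has decided): assume each metadata entry in the manifest is an object with a "value" and an optional "dir" member. Both member names, the "auto" default, and the helper render_metadata_value are hypothetical and are not taken from any specification. A reading system could then carry the direction over to the HTML dir attribute when it renders the value:

```python
import html
import json

# Values accepted by the HTML dir attribute.
ALLOWED_DIRS = {"ltr", "rtl", "auto"}


def render_metadata_value(entry: dict) -> str:
    """Render one metadata entry as an HTML span carrying its base direction.

    `entry` is assumed to look like {"value": "...", "dir": "rtl"};
    this shape is hypothetical, chosen only for illustration.
    """
    direction = entry.get("dir", "auto")  # assumed default when "dir" is absent
    if direction not in ALLOWED_DIRS:
        raise ValueError(f"unsupported dir value: {direction!r}")
    # Escape the text so that metadata cannot inject markup.
    return f'<span dir="{direction}">{html.escape(entry["value"])}</span>'


# Hypothetical manifest fragment: a Hebrew title containing Latin text.
manifest = json.loads(
    '{"title": {"value": "\\u05e1\\u05e4\\u05e8 HTML & CSS", "dir": "rtl"}}'
)
title_html = render_metadata_value(manifest["title"])
print(title_html)
```

The marker characters mentioned above would still travel inside the string value itself; the dir member only supplies the base direction that a plain JSON string cannot express.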

Any thoughts on this before I create a GitHub issue on the subject?

That is indeed the core of the problem as I understand it. B.t.w., another resource is:

https://www.w3.org/International/articles/inline-bidi-markup/

However… although it may be good to have the issue recorded, the exact solution depends on other issues, mainly the kind of serialization that we will use, and we just agreed that this decision will have to be taken later. (E.g., JSON-LD is somewhat different insofar as it reflects RDF, and RDF is not really good for this either…) Any solution will have to be seen in a larger context because this is an issue that is not publication-specific either; the I18N people may have a preferred general solution for JSON, or they will have one…

Ivan





Laurent Le Meur


----
Ivan Herman, W3C
Publishing@W3C Technical Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704

Received on Tuesday, 27 June 2017 15:53:07 UTC