Re: Comment on Requirements for Language and Direction Metadata in Data Formats (Editor's Draft 17 July 2017)

Section 4.1, 2nd paragraph.

This concerns the use of Hebrew examples to avoid issues related to the 
display of cursive characters. Granted, the tests are about RTL scripts, 
whatever the language is. However, there are Arabic words like وردة whose 
letters don't join (the word is composed of letters that do not join on 
their left). In reverse order it is ةدرو and it can be used as an example.

A short sentence at the end of § 4.1.1, test #1, could say that a similar 
test for the Arabic string "ةدرو!" should look like this if displayed correctly: 
!وردة
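
To make the proposed test string unambiguous, here is a minimal Python sketch (my own illustration, not part of the draft). It builds the reversed word from escaped code points, so the logical order does not depend on how a mail client or editor displays it, and lists each character's bidi class. The names WORD, describe and test_string are hypothetical. Python's unicodedata module does not expose joining type, so the sketch shows only the logical order and the bidi classes, not the joining behaviour.

```python
import unicodedata

# The word from this message in logical (in-memory) order:
# WAW, REH, DAL, TEH MARBUTA.
WORD = "\u0648\u0631\u062F\u0629"


def describe(text):
    """Return (code point, Unicode name, bidi class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]


# Reverse the code points, then append "!" as in the suggested test string.
reversed_word = WORD[::-1]
test_string = reversed_word + "!"

for row in describe(test_string):
    print(*row)
```

The four letters report bidi class AL and the exclamation mark reports ON, which is why the position of "!" depends on the base direction applied to the string.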

The point is that an Arabic reader, or someone targeting the Arabic-speaking 
world, might skim the paper and conclude that Arabic script is little 
affected by the problem, or not affected at all.

Najib


On 24/07/17 16:48, Najib Tounsi wrote:
>
> - §2.1 1st para.
>
> "a dedicated system with an interface that *allows* base direction to 
> be specified during input "
> s/allows/*asks for*/
> "If you are lucky", an input interface will ask you to specify the 
> direction of the string you type. If it only allows this, you may not 
> specify it, unless there is a default, and that may not be what the user wants.
>
> - 2nd para.
> "When a string is created, it's necessary to [..] take steps [..] to 
> set the string up in a way that communicates the language/base direction."
> I would add "that *ALSO* communicates the language/base direction."
>
> - §2.3 "Decoding information" is related to §4.2 "The main issue".
> In 4.2, "The main issue is how a consumer of a string will know what 
> base direction should be used for that string". I understand this to 
> apply *if* the base direction is *not* communicated with the string.
>
> If my understanding is correct, I suggest adding something like "See 
> §4.2" at the end of the sentence "Even if no action is taken by the 
> producer, the consumer must decide what rules to follow in order to 
> decide on the appropriate base direction/language." in section 2.3 
> "Decoding information". Indeed, this section is about decoding the 
> [producer] information, since "the consumer of the string [has to] 
> understand how the producer did". So, the sentence might seem a little 
> out of context.
>
> Najib
>
>
> On 24/07/17 15:14, Najib Tounsi wrote:
>> Hello,
>>
>> Here are a few comments on the document.
>>
>>
>> - Producer and consumer are used in a well-defined way (section 2). 
>> These terms appear twice earlier, in §1.1, and even if the context 
>> is clear, I think it would be better to indicate there that these 
>> words will be defined later (section 2 in this case).
>>
>>
>> - §1.1, 2nd paragraph after the JSON example,
>> "[..] For each of the fields containing natural language text [..] 
>> there will be a language attribute and base direction stored as 
>> metadata [..]"
>> "These *data fields* are used in a variety of ways [..]"
>> I read this as "These *metadata* are used ...". The wording "data 
>> fields" may be ambiguous with the JSON fields containing text.
>> Besides, you say "the data structure provides no place to store these 
>> [i.e. metadata]."
>>
>> - Last paragraph before §1.2
>>
>> "They [producer and consumer] may have other considerations, such as 
>> field length, that are affected by the insertion of additional 
>> *controls* or markup".
>> Or (among "other considerations") the fact that those controls may 
>> use different escape sequence, e.g. ‎ instead of \u200e.
>> (Your example : "authors": [ "\u200eHerman Melville" ], // contains 
>> LRM as first character)
>>
>> Regards,
>> Najib
>>
>>
>
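
As a rough illustration of the point about escapes and field length in the earliest message quoted above, here is a small Python sketch. It is only an assumption-laden example, not something taken from the draft: the HTML character reference `&lrm;` is my own choice of an alternative escape, and the variable names are hypothetical. It shows that the same LRM character can arrive under different escape forms, and that each form changes the length of the field differently.

```python
import html
import json

LRM = "\u200e"  # LEFT-TO-RIGHT MARK

plain = "Herman Melville"
marked = LRM + plain  # LRM as first character, as in the draft's example

# JSON escape form: json.dumps escapes non-ASCII characters by default.
json_escaped = json.dumps(marked)
# Same JSON value with the raw character left unescaped.
json_raw = json.dumps(marked, ensure_ascii=False)
# An HTML-style character reference (assumed alternative escape form).
html_form = "&lrm;" + plain

# Both escaped forms decode to the same string.
assert json.loads(json_escaped) == marked
assert html.unescape(html_form) == marked

baseline = json.dumps(plain)
print("decoded length grows by", len(marked) - len(plain))
print("escaped JSON grows by", len(json_escaped) - len(baseline))
print(
    "raw UTF-8 JSON grows by",
    len(json_raw.encode("utf-8")) - len(baseline.encode("utf-8")),
    "bytes",
)
```

A consumer that only understands one of these forms, or that enforces a length limit on the serialized field, would treat the two producers differently even though the decoded strings are identical.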

Received on Monday, 24 July 2017 22:38:49 UTC