Re: An additional data category

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 9 May 2012 21:18:40 +0200
Message-ID: <CAL58czr9k+9pdh0xWuU35t7K4PYTc17hhWh7Jv4WBtxCX7ENwg@mail.gmail.com>
To: public-multilingualweb-lt@w3.org
Arle, all

the i18n core working group discussed this today and gave me an action item
to express concerns about the proposals from the Unicode ULI TC. i18n core
will ask ULI to be involved in the coordination of this. So before this is
resolved, Arle or others, please do not add the new data category to the
draft. It is sufficient to keep track of this as an issue.

FYI, I deleted the following from the draft:

=== Inline Segmentation Marker ===
:*ITS 2.0 should consider compatibility with inline markup denoting segment
boundaries, such as the joiner/nonjoiner discussion in unicode forum

I deleted it because the i18n core group will discuss this with ULI before
any requirement is added. Arle, that way we may also save a lot of time
discussing this.



2012/5/9 Dave Lewis <dave.lewis@cs.tcd.ie>

> Hi Arle,
> I think it would be worth including this. The need for a segment marker as
> mark-up is coming up a lot in discussion on the list in relation to other
> data categories, e.g. idValue, targetpointer. To date, fragment
> identification in ITS has been opportunistic, i.e. we add attributes if
> there is an existing element or an XPath concoction that enables it.
> However, many of the new data categories will really only deliver benefit
> if they can be applied comprehensively across all segments in a document.
> A clear segment mark-up is then a possible solution for implementers who
> want to fully reap these benefits and are willing to bear the mark-up
> overhead involved.
> cheers,
> Dave
> On 09/05/2012 10:10, Arle Lommel wrote:
>> Hi all,
>> I am going to add one more data category set to the list. I was involved
>> with the meeting of the Unicode Technical Committee (UTC) yesterday in the
>> context of a proposal to add two characters to Unicode to allow for
>> overriding of default UAX #29 segmentation behavior. Because of feedback
>> from the W3C Internationalization Activity, the recommendation for these
>> proposed characters will be that they are for use in plain text
>> environments only. The UTC strongly urged that if Unicode adopts the
>> proposed characters that somebody develop a functionally comparable markup
>> solution so that there is parity in markup and nonmarkup environments.
>> Since I just happened to know of an appropriate standards activity for that
>> sort of thing ;-) I thought I'd make a proposal for consideration. I'll
>> post it in the next few days for discussion and consideration.
>> If you all think it's terrible, then I can say I tried, but if you think
>> it's worth consideration, then we may have a good home for this.
>> Thanks,
>> Arle

Felix Sasaki
DFKI / W3C Fellow
Received on Wednesday, 9 May 2012 19:19:06 UTC
