Re: ITS as a microformat (proposal)

From: Felix Sasaki <felix.sasaki@fh-potsdam.de>
Date: Thu, 16 Jul 2009 01:59:10 +0900
Message-ID: <ba4134970907150959j30d14abaw3a2d082946f2697b@mail.gmail.com>
To: public-i18n-its-ig@w3.org
Hi all,

I have made some progress on implementing microformats in ITS; see
http://fabday.fh-potsdam.de/~sasaki/its/itsmf2xliff.html
This form accepts HTML input, which is processed as follows:

1) Conversion to XHTML via tidy
2) Processing of ITS "microformats", currently expressed only as the class
attribute with the value "notranslate" or "translate". Processing here means
just conversion to its:translate.
3) Further "translate" processing, also taking the global rules for XHTML
from http://www.w3.org/TR/xml-i18n-bp/#relating-its-plus-xhtml into account
4) Conversion of the output to XLIFF
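
As a rough illustration of step 2 only, here is a minimal Python sketch. It
is not the code behind the form above; it assumes the input is already
well-formed XHTML (i.e. step 1 is done), and the function name
classes_to_its_translate and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
ITS_NS = "http://www.w3.org/2005/11/its"  # ITS 1.0 namespace

# Assumed mapping from the two class values to its:translate values.
CLASS_TO_TRANSLATE = {"translate": "yes", "notranslate": "no"}


def classes_to_its_translate(xhtml: str) -> str:
    """Add its:translate to every element carrying one of the two class tokens."""
    ET.register_namespace("", XHTML_NS)
    ET.register_namespace("its", ITS_NS)
    root = ET.fromstring(xhtml)
    for element in root.iter():
        # A class attribute is a whitespace-separated list of tokens.
        for token in element.get("class", "").split():
            if token in CLASS_TO_TRANSLATE:
                element.set(f"{{{ITS_NS}}}translate", CLASS_TO_TRANSLATE[token])
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample input.
sample = (
    '<p xmlns="http://www.w3.org/1999/xhtml">Run '
    '<code class="notranslate">tidy</code> first.</p>'
)
result = classes_to_its_translate(sample)
print(result)
```

The class attribute is left in place, so the original markup can still be
recovered when converting back to the input format.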

To be done: conversion back to the possibly non-XML input format,
implementation of more data categories as microformats, and use of other
extensibility mechanisms such as RDFa.

Comments very welcome.

Felix

P.S.: I also made a fix to the ITS General Decorator; see version 0.3
at http://www.w3.org/International/its/wiki/ITS_General_Decorator#Downloads


2009/7/3 Felix Sasaki <felix.sasaki@fh-potsdam.de>

>
>
> 2009/7/3 Jirka Kosek <jirka@kosek.cz>
>
>> Felix Sasaki wrote:
>> > Interesting approach ... esp. since validation is kind of lax. What would
>> > you or others think of the following approach: define a grammar (below in
>> > ABNF form) to parse the ITS local data categories, e.g. like this:
>> >
>> > ITSMF = itsprefix [translate] [terminology] [localizationNote] [directionality]
>> > itsprefix = "its"
>> > translate = "-translate-" ("yes" / "no")
>> > terminology = "-term" ["-termInfoRef:" IRI] ; IRI production from RFC 3987
>> > localizationNote = ...
>> > directionality = ...
>> >
>> > That is, have the translate, terminology, localization note and
>> > directionality data categories all "packed" in a class attribute.
>>
>> Seems a little bit like markup abuse, but microformats are all about
>> abuse, after all ;-)
>
>
> Exactly :)
>
>
>
>>
>>
>> But I don't think that IRIs should be encoded inside a class name, e.g.
>> the content of termInfoRef. This way the IRI is not exposed as some kind of
>> link in the HTML representation and user agents can't directly act on it.
>
>
>
> Understood, but do we want user agents to act on ITS information, or do we
> want that information just for further processing by, e.g., localization
> tools? Not sure ...
>
>
>
>> But for
>> other categories your approach might work well.
>
>
> Thanks, I will continue work on this.
>
> Felix
>
>
>>
>>
>>                                Jirka
>>
>> --
>> ------------------------------------------------------------------
>>  Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
>> ------------------------------------------------------------------
>>       Professional XML consulting and training services
>>  DocBook customization, custom XSLT/XSL-FO document processing
>> ------------------------------------------------------------------
>>  OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
>> ------------------------------------------------------------------
>>
>>
>
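
To make the grammar quoted above more concrete, here is a minimal Python
sketch of how a single class token could be parsed. It covers only the
translate and terminology productions, since the other two are elided in
the grammar. The IRI is approximated as any run of non-whitespace
characters rather than the full RFC 3987 production, and the names
ITSMF_PATTERN and parse_itsmf are hypothetical.

```python
import re
from typing import Optional

# Only the productions spelled out in the quoted grammar are handled;
# localizationNote and directionality are elided there and not guessed here.
# Assumption: the IRI is matched loosely as non-whitespace, not per RFC 3987.
ITSMF_PATTERN = re.compile(
    r"its"
    r"(?:-translate-(?P<translate>yes|no))?"
    r"(?P<term>-term(?:-termInfoRef:(?P<term_info_ref>\S+))?)?"
)


def parse_itsmf(token: str) -> Optional[dict]:
    """Parse one class token; return None if it is not an ITS token."""
    match = ITSMF_PATTERN.fullmatch(token)
    if match is None:
        return None
    return {
        "translate": match.group("translate"),       # "yes", "no" or None
        "term": match.group("term") is not None,     # True if "-term" is present
        "termInfoRef": match.group("term_info_ref"),  # IRI string or None
    }


def parse_class_attribute(value: str) -> Optional[dict]:
    """Return the parse of the first ITS token in a class attribute, if any."""
    for token in value.split():
        parsed = parse_itsmf(token)
        if parsed is not None:
            return parsed
    return None
```

For example, parse_class_attribute("note its-translate-no") would yield a
dict with translate set to "no", while a class value without any token
starting with "its" yields None.
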
Received on Wednesday, 15 July 2009 16:59:50 GMT
