[Bug 12417] HTML5 is missing attribute for specifying translatability of content


--- Comment #33 from Felix Sasaki <felix.sasaki@dfki.de> 2011-07-29 10:36:26 UTC ---
Hello Jörn,

(In reply to comment #32)
> This is a very interesting thread. The request for an additional markup element
> or just a new attribute/value pair is an important issue for the global
> multilingual web. This issue is related to the possible consumption process of
> information encoded in HTML, and here we have to distinguish, for example, the
> following three use cases: (1) (human) user wants a translation into her
> language, (2) NLP application (searching, trawling, analyzing) wants to provide
> multilingual results, and (3) integration into a localization and translation
> process chain. In a first approximation, the introduction of a new HTML5
> language element sounds feasible and appropriate. However, this might end up
> with additional requests for the markup of terminology, sentence boundaries,
> semantic constructs, etc. which are all legitimate demands with convincing use
> cases, i.e. to effectively guide (machine) translation applications and to
> enhance the output quality of these applications. We already had the elements
> "acronym" and "abbr" in HTML 4, and now in HTML5 only "abbr" has survived. So
> for me it is not a good idea to just introduce new syntactic sugar.

HTML5 added the "syntactic sugar" of a "spellcheck" attribute (see this thread)
because there is a clear use case and there are implementations. I think you
can say the same about a translate mechanism; see the list of implementations
and groups interested in this at
http://www.w3.org/Bugs/Public/show_bug.cgi?id=12417#c30 ,
and the people in this thread. So "translate" is much more central for many
users than what you mentioned above, e.g. "semantic constructs" or
"sentence boundaries".

> Let's analyse a bit more the possible use cases regarding what HTML5 already
> has on board as a potential solution, and also let's bear in mind that HTML5 is
> about web technologies and accessibility (see "wai-aria") which to some extent
> is included in the above translation scenario requirements.
> One solution is with styling, for example: <p class="translatable"
> lang="en-US">...<b class="term">semantic styling</b>...</p>. This solution was
> already proposed in this thread, and it seems not optimal for our intended
> application scenario because it may have side effects with traditional css
> styling apporaches.
> Another possibility is with microdata, for example: constructs with
> "itemscope", "itemprop", "itemref" and "itemid" including itemtype attributes,
> and the use of existing (or new) microdata vocabularies. This approach is
> pretty much in line with the discussions of a semantic HTML5, and is backed by
> the use cases above.
> In summary, it turns out that we need to establish some "best practices" for
> specifying the translatability of content, and that web translation
> applications should be guided through them. Therefore, I suggest that we also
> discuss a possible microdata approach. I am looking forward to your opinions.

As Yves said in this thread, it is OK to think about various mechanisms. The
key is to have agreement on one solution, and to have it available in the HTML
DOM, as in the "spellcheck" use case. This would not be the case for
microdata or RDFa, since, as you know, neither is part of the HTML5
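
To illustrate why a single, agreed mechanism in the markup itself is useful to
tools, here is a minimal sketch of a consumer that collects only translatable
text. It assumes an attribute named "translate" with the values "yes" and "no"
that is inherited by descendant elements; the value names, the inheritance
rule, and the class and function names are assumptions for illustration only,
not something settled in this thread or taken from any implementation listed
there.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must not change the inheritance stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class TranslatableTextExtractor(HTMLParser):
    """Collect text whose nearest 'translate' setting is not "no".

    Assumption: translate="no" switches translation off for an element and
    its descendants, translate="yes" switches it back on. Markup with
    omitted end tags (e.g. an unclosed <p>) is not handled by this sketch.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = [True]   # one inherited state per open element
        self.segments = []     # translatable text runs, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        state = self._stack[-1]            # default: inherit from the parent
        attrs = dict(attrs)
        if "translate" in attrs:
            value = (attrs["translate"] or "").strip().lower()
            if value == "no":
                state = False
            elif value in ("yes", ""):
                state = True
            # any other value: keep the inherited state
        self._stack.append(state)

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS and len(self._stack) > 1:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack[-1]:
            self.segments.append(text)


def translatable_segments(html):
    """Return the text runs a translation tool would be allowed to touch."""
    parser = TranslatableTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.segments


sample = ('<p>Click <code translate="no">Save</code> to store '
          '<span translate="no">the <b translate="yes">file</b></span>.</p>')
print(translatable_segments(sample))
```

A class-based convention such as class="translatable" could be read the same
way, but every tool would first have to agree on the class names; with one
attribute in the language itself, that agreement comes with the specification.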

Received on Friday, 29 July 2011 10:36:33 UTC