RE: Transliteration-only content

Hello Shaun,

It's good to see the work you are doing with Mallard and ITS.

I did look at Christian's proposal for autoLanguageProcessingRule and it looks like a sensible way to specify some of the actions that need to be performed on the source document. I can see how this would be used in your context. But there are two aspects I'm not sure about:

=== a) What to do with the output?

I know it's not really ITS's problem: ITS's job is to identify the nodes that need to be transliterated and stop there. But from a practical viewpoint, how can this information be carried over to the next step? As you say, there is currently no way to represent it in PO. There is none in XLIFF either, or in any other translation format that I know of.

This shouldn't stop us from having the ITS data category. I'm just wondering how much it will be used since 

=== b) What processing expectation is attached to this?

I wonder about the semantics attached to the values 'transliteration' and 'machineTranslation'. Maybe you or Christian already have a precise idea.

For example: does that mean such content must be *only* transliterated? If so, should it also be marked as translate='no'? If the next step provides MT capability, should any content marked as 'transliteration' be kept as transliteration only, even if there is a way to do an actual translation for it?

In other words, I'm wondering what processing people will attach to those labels.

Maybe more importantly, it also seems that both values would apply to 'translatable' text. They really look like additional qualifiers for translatable content. I would imagine they could also be just a new optional attribute in the translateRule data category.

Cheers,
-ys

Received on Sunday, 1 May 2011 13:45:10 UTC