RE: [ISSUE-55] Re: updates provenance mapping and best practive in ITS-XLIFF mapping

From: Yves Savourel <ysavourel@enlaso.com>
Date: Fri, 8 Feb 2013 05:47:52 -0700
To: <public-multilingualweb-lt@w3.org>
Message-ID: <004201ce05fa$82c3b570$884b2050$@com>

Hi Dave, all,

[BTW: I think ITS->XLIFF mapping discussions should probably be visible to the whole group so we have public archives, and others can follow it if they wish too. So I'm posting this to the list.]

Now, a few comments on the BP section on Provenance in the wiki:


> If the its:provenanceRecords element referenced by a its:provenanceRecordsRef contains 
> any of the translation or translation revision related attributes, namely: its:person, 
> its:personRef, its:org, its:orgRef, its:tool, its:toolRef, its:revPerson, its:revPersonRef, 
> its:revOrg, its:revOrgRef, its:revTool or its:revToolRef, then the its:provenanceRecordsRef 
> should only be used as local of global annotation selecting xlf:target or xlf:bin-target 
> elements or a xlf:mrk inline markup within either of those XLIFF elements. 
> This is because the provenance mark-up in this case is appropriate only to translated text.

I don't understand "... only be used as local of global annotation selecting xlf:target..."

What does "local of global annotation" mean? What is a "global annotation" for that matter? Do you mean a global rule?

Do we really want to have global rules?
They are *a lot* more difficult to support than local attributes. And for the XLIFF mapping, support for them would have to be mandatory for all tools. Otherwise (if tools can support only one type of rule, as with ITS in XML/HTML5), two different tools may get different results.

BTW: provRef can have several IRIs, right?
If so, maybe it should be named provRefs to make that clear.


> If the its:provenanceRecords element referenced by a its:provenanceRecordsRef contains 
> only the provRef attribute, then the its:provenanceRecordsRef may be used as local of global annotation 
> selecting any XLIFF elements, since the its:provRef attribute may point to an external provenance 
> records that could relate to an activity that resulted in textual 
> content of any of the elements in an XLIFF file.

I'm really worried about having provenance on any element.

If an application supports the 'official' ITS mapping in XLIFF, it would have to store and maintain the information all over the place. It's one thing to pass through some user-defined attribute you don't touch and another to actively support something.
XLIFF is an interchange format: it is meant to be mapped to whatever internal object model a tool uses. We can't add something like provenance on every single XLIFF element; it would be a nightmare to support.

One can't prevent a tool from putting and using provenance info anywhere extensions are allowed, and that's fine. But then anything outside the scope of the 'official' ITS->XLIFF mapping is supported only by that tool.


> If, as the result of additional activities upon an XLIFF file results in values in 
> a its:provenanceRecord that forks from that of other elements referencing the same 
> its:provenanceRecords, then that its:provenanceRecords must be copied to a new 
> element with a distinct id, while the reference attribute for the element(s)
> concerned is changed to refer to this new its:provenanceRecords id.

The first its:provenanceRecord should probably be its:provenanceRecords, no?
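
To make the forking rule quoted above concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the helper name fork_records, the use of xml:id for the "distinct id", the placement of its:provenanceRecords in the XLIFF header, and the sample record attributes are all hypothetical and not taken from the wiki text.

```python
import copy
import xml.etree.ElementTree as ET

ITS = "http://www.w3.org/2005/11/its"
XLF = "urn:oasis:names:tc:xliff:document:1.2"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # assumption: ids are xml:id
RECORDS = f"{{{ITS}}}provenanceRecords"
REF = f"{{{ITS}}}provenanceRecordsRef"

# Hypothetical sample: two targets share one its:provenanceRecords element.
SAMPLE = f"""<xliff xmlns="{XLF}" xmlns:its="{ITS}" version="1.2">
<file original="a" source-language="en" target-language="fr" datatype="plaintext">
<header><its:provenanceRecords xml:id="pr1">
<its:provenanceRecord tool="MT-A"/>
</its:provenanceRecords></header>
<body>
<trans-unit id="1"><source>a</source>
<target its:provenanceRecordsRef="#pr1">A</target></trans-unit>
<trans-unit id="2"><source>b</source>
<target its:provenanceRecordsRef="#pr1">B</target></trans-unit>
</body></file></xliff>"""

def fork_records(root, element, new_id):
    """Copy the records that `element` references and repoint `element` to the copy."""
    ref = element.get(REF)
    if not ref:
        raise ValueError("element has no provenanceRecordsRef")
    old_id = ref.lstrip("#")
    for parent in root.iter():
        for child in list(parent):
            if child.tag == RECORDS and child.get(XML_ID) == old_id:
                clone = copy.deepcopy(child)
                clone.set(XML_ID, new_id)
                # Place the copy right after the original records element.
                parent.insert(list(parent).index(child) + 1, clone)
                element.set(REF, "#" + new_id)
                return clone
    raise ValueError(f"no provenanceRecords with id {old_id!r}")
```

A tool would call fork_records just before adding a new record for one target, so the other elements that still point at the original id are not affected.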



-----Original Message-----
From: Yves Savourel [mailto:ysavourel@enlaso.com] 
Sent: Thursday, February 07, 2013 8:52 PM
To: 'Dave Lewis'
Cc: 'Dr. David Filip'; 'Leroy Finn'; 'Phil Ritchie'
Subject: RE: [ISSUE-55] Re: updates provenance mapping and best practive in ITS-XLIFF mapping

Hi Dave, all,

> I was also looking at LQR. This would I guess apply structurally to 
> file and trans-unit. It is not in the spirit of the ITS definition that 
> it should apply to something inline, but we can't rule that out so I 
> guess we need a mrk version also?

It's a bit all or nothing: if some data categories like mt-confidence or provenance can be set at the mrk level then LQR should too, no? But then the question is: Do we need that much granularity?

>> - I agree with LQI. Actually it seems LQI could be on trans-unit, 
>> source or target, and mrk.
>
> The only issue for annotation trans-unit is that if fully loaded with 
> source, target, alt-trans, then an ITS process doesn't know which 
> content it applies to - though the common sense answer is that in the 
> target.

There are cases that are difficult to map. For example, when comparing a target with a source and finding things that should match and don't, we need to be able to highlight both spans.
But I guess in those cases we could put the info on the target and somehow have extensions to highlight the source.


> Also for this data category, I'm not sure if we really need the option 
> to annotate a mrk with the equivalents of its:translate="yes"
> using mtype="x-its-Translate-Yes". There's no its-like override 
> relationship, as I understand it, between a trans-unit translate="no" 
> and mtype="x-its-Translate-Yes" in any children mrk element - the 'no'
> applies presumably to the whole unit. So in the case of a source 
> document unit being:
> <p translate="no">David Cameron <span translate="yes">is the leader of 
> the</span> Tories.</p> this does _not_ map correctly to:
> <trans-unit translate="no">
>  <source>David Cameron <mrk mtype="x-its-Translate-Yes">is the leader 
> of the</mrk> Tories.</source> </trans-unit> correct? So I think 
> mtype="x-its-Translate-Yes" might be redundant.
> Instead I guess we should use the following?
> <trans-unit>
>  <source><mrk mtype="protected">David Cameron</mrk>is the leader of 
> the <mrk mtype="protected">Tories.</mrk></source>
> </trans-unit>

Indeed.
The problem remains for multiple levels of embedding...

By the way, many filters will do something like this:

<source><x id='1'/>is the leader of the<x id='2'/></source>
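
As an illustration of the inverted "protected" mapping discussed above, here is a minimal Python sketch that resolves nested translate="yes|no" values into flat runs and wraps the non-translatable ones. The function names runs and to_source are hypothetical, and this is one possible approach under the assumption that flattening into runs is acceptable, not something the mapping defines.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def runs(elem, inherited=True):
    """Yield (text, translatable) runs, resolving nested translate="yes|no"."""
    flag = {"yes": True, "no": False}.get(elem.get("translate"), inherited)
    if elem.text:
        yield elem.text, flag
    for child in elem:
        yield from runs(child, flag)
        if child.tail:
            # Text after a child belongs to the scope of the current element.
            yield child.tail, flag

def to_source(elem):
    """Build a <source> string where non-translatable runs are protected."""
    parts = []
    for text, translatable in runs(elem):
        text = escape(text)
        if translatable:
            parts.append(text)
        else:
            parts.append(f'<mrk mtype="protected">{text}</mrk>')
    return "<source>" + "".join(parts) + "</source>"
```

Because each run carries its own resolved flag, deeper nesting only produces more runs. A filter that prefers placeholders could emit an <x/> element for each non-translatable run instead of a mrk.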



> - storage size: mmm... it's one of those scoping problem: 
> ...
> For allowedcharacters, doing this as a global rule would in the XLIFF 
> would make sense.

Mmm... global rules in XLIFF would open a big can of worms.
This points to an important aspect of the mapping.
It should provide the mapping and state that no ITS constructs other than the ones defined for the mapping are expected by the tools processing such XLIFF documents.
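
To illustrate that last point, here is a minimal Python sketch of a check a tool could run on an incoming XLIFF document. The allow-list MAPPED and the function name unmapped_its are hypothetical; the real set would be whatever the mapping ends up defining.

```python
import xml.etree.ElementTree as ET

ITS_NS = "{http://www.w3.org/2005/11/its}"

# Hypothetical allow-list: the actual mapping would define this set.
MAPPED = {"provenanceRecords", "provenanceRecord", "provenanceRecordsRef"}

def unmapped_its(root, mapped=MAPPED):
    """Return sorted local names of ITS elements/attributes not in `mapped`."""
    found = set()
    for elem in root.iter():
        # Check the element name and all of its attribute names.
        for name in [elem.tag, *elem.attrib]:
            if isinstance(name, str) and name.startswith(ITS_NS):
                local = name[len(ITS_NS):]
                if local not in mapped:
                    found.add(local)
    return sorted(found)
```

A tool could warn on, or simply ignore, anything this returns, since such constructs fall outside the 'official' mapping.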

cheers,
-yves
Received on Friday, 8 February 2013 12:48:23 UTC