- From: David Lewis <dave.lewis@cs.tcd.ie>
- Date: Tue, 01 May 2012 21:54:00 +0100
- To: public-multilingualweb-lt@w3.org
- Message-ID: <4FA04D68.9080102@cs.tcd.ie>
Hi Yves,

I think you are right that I am thinking more along the lines of XLIFF id than XLIFF resname, but perhaps not exactly in the way you characterise it.

I am thinking in terms of an id that can be used to track the progress of a specific segment in the content documents (let's park the use of multi-segment translation units in XLIFF for the moment) against the corresponding XLIFF id. However, I'm specifically concerned with the round-trip use cases where the document may pass from a CMS to an XLIFF cycle and back again several times. The use cases I see for this are driven by the need for more continuous translation, pipelined at the granularity of the segment rather than the document, rather than once-off hand-overs of documents between processes.

Possible use cases might be:

1) A document is having its source revised and is being translated at the same time. Readiness of different elements is signalled in the document using the readiness/processTrigger data category, which is monitored by an LSP that provides updates of segments to be translated based on these flags and distributes translations using XLIFF. Consistent mapping between all segments and XLIFF translation unit ids is required to ensure that new, modified and deleted trans-units are correctly updated and kept in sequence.

2) Translations from one LSP may be undergoing monolingual review through direct access to the target on the CMS, while selected bilingual translation review is being conducted in parallel by another LSP. Feedback from both reviews may need to be routed back to the translating LSP, so document element-to-XLIFF mappings would need to be reliably maintained for the two sets of XLIFF ids operated by two different LSPs.
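To make the bookkeeping in use case 1 a little more concrete, here is a minimal Python sketch. It assumes each revision of the document has already been reduced to a dictionary from a stable segment id to source text; the function name `diff_segments` and the sample ids are hypothetical and only illustrate the idea.

```python
def diff_segments(previous, current):
    """Compare two {segment id: source text} snapshots of one document.

    Returns the ids whose trans-units would need to be created, re-sent
    for translation, or retired. This only works if ids are stable
    across revisions, which is the requirement discussed above.
    """
    shared = previous.keys() & current.keys()
    return {
        "new": sorted(current.keys() - previous.keys()),
        "modified": sorted(k for k in shared if previous[k] != current[k]),
        "deleted": sorted(previous.keys() - current.keys()),
    }


# Hypothetical snapshots of the same document before and after a revision.
before = {"p1": "Hello world.", "p2": "Old text.", "p3": "To be removed."}
after = {"p1": "Hello world.", "p2": "Revised text.", "p4": "Brand new."}
changes = diff_segments(before, after)
```

Here `changes` reports `p4` as new, `p2` as modified and `p3` as deleted, while `p1` is left alone. Without persistent ids, the revised `p2` could not be told apart from a deletion followed by an unrelated insertion.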
In these sorts of use cases, where there is ongoing round-tripping between the CMS and TMS/XLIFF, the need for consistent mapping between the source document on the CMS and the versions LSPs have may soften the assumption that clients won't be willing to add additional elements to the document on the CMS. One can imagine that any augmented versions of the content documents would live on a 'staging' CMS while they are subject to preparation, translation and review, but prior to publication.

So, this implies a need for an id that is indeed relevant just to the localization process, but that nevertheless needs to support a persistent mapping between CMS element and trans-unit id, potentially over several CMS-TMS round trips.

The difference from resname, as I understand it, is that resname is optional and in a sense best effort: if you can't map a trans-unit back to a particular element in the source, you can still try to translate the string, you just lose some contextual info. So it doesn't have the requirement to comprehensively maintain a mapping between *all* trans-units and source content elements in the way I think the above use cases require.

Hope that explains the requirement I had in mind a bit more clearly.

Finally, I'm not sure that in any of these cases we are talking about an explicit id data category, are we? Would the implementation in fact be rules for generating and maintaining the mapping between source elements and XLIFF ids? Very speculatively, these could be expressed as some cascading rules for using:

1st) existing ids if present;
2nd) combo rules of ID and element names as in your updated text;
3rd) if allowed, new ids in existing elements;
4th) if allowed, new elements with specific ids;
5th) some sort of external hashing pointer (e.g. http://nlp2rdf.org/nif-1-0#toc-nif-recipe-context-hash-based-uris);
6th) some sort of character count-based pointer (e.g. http://nlp2rdf.org/nif-1-0#toc-nif-recipe-offset-based-uris).
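As a rough illustration of how such a cascade might behave, here is a minimal Python sketch using the standard `xml.etree.ElementTree` module. Everything in it is an assumption made for the example: the function names, the `u1`-style generated ids and the `h-` prefix are hypothetical, only elements with leading text are treated as segments, and the hash is a plain SHA-1 of the segment text rather than the NIF context-hash recipe. Rules 4 and 6 are left out for brevity.

```python
import hashlib
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def own_id(elem):
    """Rule 1: an id already present on the element (xml:id or plain id)."""
    return elem.get(XML_ID) or elem.get("id")


def ancestor_path(elem, parents):
    """Rule 2: nearest ancestor id plus element names with sibling positions."""
    steps = []
    node = elem
    while node in parents:
        parent = parents[node]
        same_tag = [child for child in parent if child.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        anchor = own_id(parent)
        if anchor:
            return anchor + "/" + "/".join(reversed(steps))
        node = parent
    return None  # no ancestor carries an id


def resolve_ids(root, allow_new_ids=False):
    """Map a trans-unit id to (rule number, element), cheapest rule first."""
    parents = {child: parent for parent in root.iter() for child in parent}
    mapping = {}
    counter = 0
    for elem in root.iter():
        text = (elem.text or "").strip()
        if not text:
            continue  # simplification: only elements with leading text count
        unit_id, rule = own_id(elem), 1
        if unit_id is None:
            unit_id, rule = ancestor_path(elem, parents), 2
        if unit_id is None and allow_new_ids:
            # Rule 3: write a new id into the existing element.
            # A real tool would also check for clashes with existing ids.
            counter += 1
            unit_id, rule = f"u{counter}", 3
            elem.set("id", unit_id)
        if unit_id is None:
            # Rule 5 (simplified): hash of the text; identical strings collide.
            digest = hashlib.sha1(text.encode("utf-8")).hexdigest()[:12]
            unit_id, rule = "h-" + digest, 5
        mapping[unit_id] = (rule, elem)
    return mapping


SAMPLE = '<doc><sec id="s1"><p>One</p><p id="p2">Two</p></sec><note>Three</note></doc>'
mapping = resolve_ids(ET.fromstring(SAMPLE))
```

For this sample, the second paragraph keeps its own id `p2` (rule 1), the first is addressed as `s1/p[1]` (rule 2), and the note falls through to a hash-based id because no ancestor carries an id and new ids were not allowed. Recording which rule produced each id is one way the ruleset in use could be kept alongside the document.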
It would be a ruleset applicable to the document that we would need to record.

cheers,
Dave

On 01/05/2012 15:20, Yves Savourel wrote:
> I guess what we need to clarify is what are the requirements of the ID value we are discussing.
>
> To me it should be:
> - unique at least within the document
> - the value should be the same in new versions of the document
>
> That's because the type of tasks I would use it for are tasks across versions of the same document.
>
> But Dave, you are maybe thinking of something different: how to get an ID valid for a given document during its localization cycle. In other word a value that doesn't need to survive after the document is done.
>
> In other words you are thinking XLIFF 'id' and I'm thinking XLIFF 'resname'.
>
> Cheers,
> -ys
Received on Tuesday, 1 May 2012 20:54:32 UTC