- From: Yves Savourel <ysavourel@enlaso.com>
- Date: Thu, 6 Nov 2014 08:30:28 -0700
- To: "'public-i18n-its-ig'" <public-i18n-its-ig@w3.org>
Hi Felix,

I have nothing against this, but is there a reason for using offsets? I suppose it would make the output more compact when you have large chunks of text.

Using offsets would make the creation of the output slightly more complicated (at least from a Java parser viewpoint).

Cheers,
-yves

-----Original Message-----
From: Felix Sasaki [mailto:felix@sasakiatcf.com]
Sent: Thursday, November 6, 2014 5:24 AM
To: Yves Savourel
Cc: public-i18n-its-ig
Subject: Re: ACTION-54: Try to come up with example of xliff+its test format / output

Hi Yves, all,

thanks, this looks good. One suggestion. Currently you are copying strings from the input in the path description, in case of text nodes; e.g.

> /xliff/file/unit/source/"DATA "

maybe one could also work with character offsets, e.g.

/xliff/file/unit/source/#char=0,5

And say that a tool that generates the output should preserve white space when generating the offsets? The syntax #char=0,5 is not important, just a way to identify the offsets.

Cheers,

- Felix

On 04.11.2014 at 04:19, Yves Savourel <ysavourel@enlaso.com> wrote:

> Hi all,
>
> Following up on this action item:
>
> The initial thought was to use a text file with two columns:
> - The first one with XLIFF's fragment identifier.
> - The second with the ITS data for the given element.
>
> But I have since realized that not all locations with ITS data will have an
> ID, so it may be better to use something different for the first column,
> closer to what we have with the ITS test output.
>
> The first column would be the 'path' of the element, up to the unit,
> then, depending on the type of node, some additional
> information: fragId for the markers, quoted text for the text. Because
> XLIFF may have overlapping markers, we also need to represent the text
> nodes, as they may show inherited information.
>
> For example for the Translate data category the following file:
>
> <?xml version="1.0"?>
> <xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.1" srcLang="en"
>  xmlns:xits="urn:oasis:names:tc:xliff:xits:2.1">
>  <file id="f1" translate="no">
>   <unit id="u1">
>    <segment>
>     <source>Source 1.</source>
>    </segment>
>   </unit>
>   <unit id="u2" translate="yes">
>    <segment>
>     <source>Text <mrk id="m1" translate="no">DATA <mrk id="m2" translate="yes">text </mrk>DATA </mrk> text.</source>
>    </segment>
>   </unit>
>   <unit id="u3" translate="yes">
>    <segment>
>     <source><sm id="m1" translate="yes"/>Text <sm id="m2" translate="no"/>DATA <em startRef="m1"/>DATA <em startRef="m2"/>text.</source>
>    </segment>
>   </unit>
>  </file>
> </xliff>
>
> Would result in the following output:
>
> /xliff translate=yes
> /xliff/file translate=no
> /xliff/file/unit translate=no
> /xliff/file/unit/source/"Source 1." translate=no
> /xliff/file/unit translate=yes
> /xliff/file/unit/source/"Text " translate=yes
> /xliff/file/unit/source/{START:/f=f1/u=u2/m1} translate=no
> /xliff/file/unit/source/"DATA " translate=no
> /xliff/file/unit/source/{START:/f=f1/u=u2/m2} translate=yes
> /xliff/file/unit/source/"text " translate=yes
> /xliff/file/unit/source/{END:/f=f1/u=u2/m2} translate=no
> /xliff/file/unit/source/"DATA " translate=no
> /xliff/file/unit/source/{END:/f=f1/u=u2/m1} translate=yes
> /xliff/file/unit/source/" text." translate=yes
> /xliff/file/unit translate=yes
> /xliff/file/unit/source/{START:/f=f1/u=u3/m1} translate=yes
> /xliff/file/unit/source/"Text " translate=yes
> /xliff/file/unit/source/{START:/f=f1/u=u3/m2} translate=no
> /xliff/file/unit/source/"DATA " translate=no
> /xliff/file/unit/source/{END:/f=f1/u=u3/m1} translate=no
> /xliff/file/unit/source/"DATA " translate=no
> /xliff/file/unit/source/{END:/f=f1/u=u3/m2} translate=yes
> /xliff/file/unit/source/"text." translate=yes
>
> The start markers would show the metadata for the node, the end
> markers would show the metadata that applies after the marker is closed
> (or both start and end can show the metadata for the span they denote:
> it doesn't really matter).
>
> This is just something to start with; feedback and better ideas are welcome.
>
> In the spirit of implementing things early and often, I've implemented a
> new command in the Lynx tool that creates the test file.
> You can do for example:
>
> C:/>lynx -its translate myFile.xlf
>
> This will generate myFile.xlf.txt with the test results (and output
> them on the console). Just type -its ? to get the list of the data
> categories currently supported.
>
> The latest version of Lynx is here:
> http://okapi.opentag.com/snapshots/okapi-xliffLib_all-platforms_1.1-SNAPSHOT.zip
>
> Cheers,
> -yves
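
To make the offset idea in this thread concrete, here is a rough Python sketch of how a generator might replace the quoted text with character offsets for the u2 source in the example. Several things are assumptions and not part of the proposal: offsets are counted over the concatenated text of the whole <source> element (rather than per text node), the end offset is exclusive, and the helper names text_nodes and offset_entries are made up for illustration. The #char=start,end syntax is only the placeholder used in the thread. White space is preserved as-is, as the suggestion requires.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"


def text_nodes(elem):
    """Yield the text nodes below elem in document order (hypothetical helper)."""
    if elem.text:
        yield elem.text
    for child in elem:
        yield from text_nodes(child)
        if child.tail:
            yield child.tail


def offset_entries(source, base="/xliff/file/unit/source"):
    """Return (path, text) pairs, one per text node of a <source> element.

    Assumption: offsets run over the concatenated text of the element,
    with an exclusive end, and white space is kept untouched.
    """
    entries = []
    pos = 0
    for text in text_nodes(source):
        end = pos + len(text)
        entries.append((f"{base}/#char={pos},{end}", text))
        pos = end
    return entries


# The <source> of unit u2 from the example file in this thread.
SAMPLE = (
    f'<source xmlns="{XLIFF_NS}">Text <mrk id="m1" translate="no">DATA '
    '<mrk id="m2" translate="yes">text </mrk>DATA </mrk> text.</source>'
)

for path, text in offset_entries(ET.fromstring(SAMPLE)):
    print(path, repr(text))
# /xliff/file/unit/source/#char=0,5 'Text '
# /xliff/file/unit/source/#char=5,10 'DATA '
# /xliff/file/unit/source/#char=10,15 'text '
# /xliff/file/unit/source/#char=15,20 'DATA '
# /xliff/file/unit/source/#char=20,26 ' text.'
```

With this convention the two "DATA " nodes get distinct identifiers (5,10 and 15,20), which the quoted-text form cannot distinguish on its own. The marker entries ({START:...} and {END:...}) are left out of the sketch; they would be emitted as in the original proposal.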
Received on Thursday, 6 November 2014 15:31:01 UTC