Re: Standoff markup issue

From: Leroy Finn <finnle@tcd.ie>
Date: Mon, 10 Dec 2012 17:44:31 +0000
Message-ID: <CAMYWBwv-SmXEzmvrMrG8AzfQ=9qiV6ce8Wkh5sFA8gFqmEzZWA@mail.gmail.com>
To: Pablo Nieto Caride <pablo.nieto@linguaserve.com>
Cc: Fredrik Liden <fliden@enlaso.com>, Yves Savourel <ysavourel@enlaso.com>, Multilingual Web LT-TESTS Public <public-multilingualweb-lt-tests@w3.org>
Hey Yves, Pablo and Fredrik,

I have updated the Provenance output for standoff markup. The updates
are reflected in the GitHub repository.

Leroy


On 5 December 2012 10:15, Leroy Finn <finnle@tcd.ie> wrote:

> I will implement the standoff markup output in the style described by
> Fredrik. I have to make some other minor corrections to other
> data categories so I will do them first and then do this update. Thanks for
> the feedback.
>
> Thanks,
> Leroy
>
>
> On 5 December 2012 09:05, Pablo Nieto Caride <pablo.nieto@linguaserve.com> wrote:
>
>> Hi all,
>>
>> Thank you Yves and Fredrik! I would go for suggestion 1, removing the
>> [x] part if there is just one record per element.
>>
>> Cheers,
>> Pablo.
>>
>> -----Original Message-----
>> From: Fredrik Liden [mailto:fliden@enlaso.com]
>> Sent: Tuesday, 04 December 2012 21:54
>> To: Yves Savourel; 'Multilingual Web LT-TESTS Public'
>> Subject: RE: Standoff markup issue
>>
>> Actually, the output for Provenance and Localization Quality Issue is
>> currently inconsistent. So, in line with Yves' notes below, perhaps we can
>> just update Provenance to follow the same logic as LocQuality, unless there
>> was a specific reason to make it different. I'm guessing we're still making
>> some updates to the test files in those categories. So far I can only see one
>> example, Provenance HTML example 3, that actually has more than one standoff
>> record, but on the draft page there are several examples.
>>
>> As for the format of multiple records:
>>
>> Currently the provenance3html output looks like this:
>> .....
>> /html/head[1]/script[1]/its:provenanceRecords[2]
>> /html/head[1]/script[1]/its:provenanceRecords[2]/@its:version
>> /html/head[1]/script[1]/its:provenanceRecords[2]/@xml:id
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]      orgRef="http://www.legaltrans-ex.com/"  person="John Doe"       provRef="http://www.vistatec.com/job-12-7-15-X31/reviewed/prov/re8573469"       revOrgRef="http://www.vistatec.com/"    revPerson="Tommy Atkins"
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@orgRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@person
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@provRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revOrgRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revPerson
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]      revOrgRef="http://john-smith.qa.example.com"    revPerson="John Smith"
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revOrgRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revPerson
>> /html/body[1]
>> /html/body[1]/p[1]      provenanceRecordsRef="#pr1"
>> /html/body[1]/p[1]/@its-provenance-records-ref
>> /html/body[1]/p[2]      provenanceRecordsRef="#pr2"
>> /html/body[1]/p[2]/@its-provenance-records-ref
>> .....
>>
>> So assuming we change to follow LocQuality it would be something like:
>> /html/head[1]/script[1]/its:provenanceRecords[2]
>> /html/head[1]/script[1]/its:provenanceRecords[2]/@its:version
>> /html/head[1]/script[1]/its:provenanceRecords[2]/@xml:id
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@orgRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@person
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@provRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revOrgRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revPerson
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revOrgRef
>>
>> /html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revPerson
>> /html/body[1]
>> /html/body[1]/p[1]      provenanceRecordsRef="#pr1"
>> /html/body[1]/p[1]/@its-provenance-records-ref
>> /html/body[1]/p[2]      provenanceRecordsRef="#pr2"     orgRef[1]="http://www.legaltrans-ex.com/"       person[1]="John Doe"    provRef[1]="http://www.vistatec.com/job-12-7-15-X31/reviewed/prov/re8573469"    revOrgRef[1]="http://www.vistatec.com/" revPerson[1]="Tommy Atkins"     revOrgRef[2]="http://john-smith.qa.example.com" revPerson[2]="John Smith"
>> /html/body[1]/p[2]/@its-provenance-records-ref
>>
>> So suggestions for format of multiple records per node are:
>>
>> 1.
>> /element        attrRef="#pr2" attr_a[1]="val"  attr_b[1]="val"
>> attr_c[1]="val" attr_b[2]="val" attr_c[2]="val"
>> Note: This assumes the records are read in order and that attributes are
>> sorted by the [x] index first and by attribute name second.
>>
>> 2.
>> /element        attrRef="#pr2" [attr_a="val", attr_b="val",
>> attr_c="val"][attr_b="val", attr_c="val"]
>> Note: Something like this perhaps, where the bracket position identifies
>> the record number.
>>
>> Any variation should be fine as long as we determine the syntax and sort
>> order. If there's only one record, we could possibly remove the index part [x].
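
For reference, here is a minimal Python sketch of how the information part of a line could be built in the style of suggestion 1. It is only an illustration under some assumptions: the function name format_records and the tab separator are mine, each record is taken to be a plain dict of attribute names to values, and the [x] index is dropped when a node has a single record, as proposed above.

```python
def format_records(ref_attr, ref_value, records):
    """Build the information part of one output line (suggestion 1 style).

    records is a list of dicts, one per stand-off record, in document order.
    """
    parts = [f'{ref_attr}="{ref_value}"']
    single = len(records) == 1
    # Sort by record index first, then by attribute name within each record.
    for index, record in enumerate(records, start=1):
        for name in sorted(record):
            # Drop the [x] index when there is only one record on the node.
            key = name if single else f"{name}[{index}]"
            parts.append(f'{key}="{record[name]}"')
    return "\t".join(parts)


two_records = [
    {"person": "John Doe", "orgRef": "http://www.legaltrans-ex.com/"},
    {"revPerson": "John Smith"},
]
line = format_records("provenanceRecordsRef", "#pr2", two_records)
```

With the two records above, the attributes come out as orgRef[1], person[1], revPerson[2], in that order. Passing a single record produces plain attribute names with no index.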
>>
>> Cheers,
>> Fredrik
>>
>> -----Original Message-----
>> From: Yves Savourel
>> Sent: Tuesday, December 04, 2012 10:15 AM
>> To: 'Multilingual Web LT-TESTS Public'
>> Subject: RE: Standoff markup issue
>>
>> Hi Leroy,
>>
>> Thanks for the example.
>> I have a few notes:
>>
>> There are two issues IMO with this output, or more exactly there are two
>> reasons not to use this representation:
>>
>> -- a) So far we have set the information (the right part after the path)
>> on the element or attribute to which that information applies. In your
>> example the stand-off information applies to the p elements. That is, IMO,
>> where the information should be output, not on the stand-off nodes
>> themselves.
>>
>> I even think it's vital to do that because it forces the implementations
>> to resolve the reference. This reference is a reference to ITS data, not a
>> reference to some data carried by a piece of ITS information, as most xyzRef
>> attributes are.
>>
>> -- b) We cannot set the output information on the stand-off nodes like in
>> this example because the stand-off nodes may be in a different document.
>> The stand-off reference attributes like locQualityIssuesRef or
>> provenanceRecordsRef may point to an external XML or HTML document where
>> that stand-off markup resides. So an implementation needs to fetch the
>> document, open it, get the stand-off data and output that on the node to
>> which it applies. And yes: that is not something easy to implement.
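
To make the resolution step concrete, here is a rough Python sketch using the standard xml.etree.ElementTree module. It only handles the simple case where the reference is a same-document fragment such as "#pr2"; fetching an external document is left out. The function name resolve_provenance and the sample document are made up for illustration, and the sketch assumes XML input with the ITS namespace http://www.w3.org/2005/11/its.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
REF_ATTR = f"{{{ITS_NS}}}provenanceRecordsRef"


def resolve_provenance(root):
    """Return (element tag, list of record dicts) for each referencing element."""
    # Index the stand-off containers by their xml:id.
    containers = {
        el.get(XML_ID): el
        for el in root.iter(f"{{{ITS_NS}}}provenanceRecords")
        if el.get(XML_ID)
    }
    resolved = []
    for el in root.iter():
        ref = el.get(REF_ATTR)
        if ref is None or not ref.startswith("#"):
            continue  # no reference, or an external document (not handled here)
        container = containers.get(ref[1:])
        if container is None:
            continue  # dangling reference
        records = [
            dict(rec.attrib)
            for rec in container.findall(f"{{{ITS_NS}}}provenanceRecord")
        ]
        resolved.append((el.tag, records))
    return resolved


# Hypothetical sample document, not one of the test suite files.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <its:provenanceRecords xml:id="pr2">
    <its:provenanceRecord person="John Doe" orgRef="http://www.legaltrans-ex.com/"/>
    <its:provenanceRecord revPerson="John Smith"/>
  </its:provenanceRecords>
  <p its:provenanceRecordsRef="#pr2">Some text</p>
</doc>"""

resolved = resolve_provenance(ET.fromstring(SAMPLE))
```

The result attaches the records to the p element that carries the reference, which is where the output information would then be written.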
>>
>>
>> Look at it from a different viewpoint: We should have the same set of
>> information output on a given node whether the ITS data are represented as
>> local markup or stand-off markup. The only difference is that for stand-off
>> markup we may have several items of that type of information.
>>
>> Fredrik is looking at some examples of possible output and will post an
>> email later.
>>
>> cheers,
>> -yves
>>
Received on Monday, 10 December 2012 17:45:00 GMT