
RE: Standoff markup issue

From: Pablo Nieto Caride <pablo.nieto@linguaserve.com>
Date: Wed, 5 Dec 2012 10:05:24 +0100
To: "'Fredrik Liden'" <fliden@enlaso.com>, "'Yves Savourel'" <ysavourel@enlaso.com>, "'Multilingual Web LT-TESTS Public'" <public-multilingualweb-lt-tests@w3.org>
Message-ID: <1d6101cdd2c7$a98409a0$fc8c1ce0$@linguaserve.com>

Hi all,

Thank you, Yves and Fredrik! I would go for suggestion 1, dropping the [x] part when there is just one record per element.

Cheers,
Pablo.

-----Original Message-----
From: Fredrik Liden [mailto:fliden@enlaso.com]
Sent: Tuesday, December 04, 2012 21:54
To: Yves Savourel; 'Multilingual Web LT-TESTS Public'
Subject: RE: Standoff markup issue

Actually, the output for Provenance and Localization Quality Issue is currently inconsistent. So, in line with Yves' notes below, perhaps we can just update Provenance to follow the same logic as LocQuality, unless there was a specific reason to make it different. I'm guessing we're still making some updates to the test files in those categories. So far I can see only one example, Provenance HTML example 3, that actually has more than one stand-off record, but there are several examples on the draft page.

As for the format of multiple records:

Currently the provenance3html output looks like this:
.....
/html/head[1]/script[1]/its:provenanceRecords[2]
/html/head[1]/script[1]/its:provenanceRecords[2]/@its:version
/html/head[1]/script[1]/its:provenanceRecords[2]/@xml:id
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]	orgRef="http://www.legaltrans-ex.com/"	person="John Doe"	provRef="http://www.vistatec.com/job-12-7-15-X31/reviewed/prov/re8573469"	revOrgRef="http://www.vistatec.com/"	revPerson="Tommy Atkins"
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@orgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@person
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@provRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revPerson
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]	revOrgRef="http://john-smith.qa.example.com"	revPerson="John Smith"
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revPerson
/html/body[1]
/html/body[1]/p[1]	provenanceRecordsRef="#pr1"
/html/body[1]/p[1]/@its-provenance-records-ref
/html/body[1]/p[2]	provenanceRecordsRef="#pr2" 
/html/body[1]/p[2]/@its-provenance-records-ref
.....

So assuming we change to follow LocQuality it would be something like:
/html/head[1]/script[1]/its:provenanceRecords[2]
/html/head[1]/script[1]/its:provenanceRecords[2]/@its:version
/html/head[1]/script[1]/its:provenanceRecords[2]/@xml:id
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]	
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@orgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@person
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@provRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revPerson
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]	
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revPerson
/html/body[1]
/html/body[1]/p[1]	provenanceRecordsRef="#pr1"
/html/body[1]/p[1]/@its-provenance-records-ref
/html/body[1]/p[2]	provenanceRecordsRef="#pr2"	orgRef[1]="http://www.legaltrans-ex.com/"	person[1]="John Doe"	provRef[1]="http://www.vistatec.com/job-12-7-15-X31/reviewed/prov/re8573469"	revOrgRef[1]="http://www.vistatec.com/"	revPerson[1]="Tommy Atkins"	revOrgRef[2]="http://john-smith.qa.example.com"	revPerson[2]="John Smith"
/html/body[1]/p[2]/@its-provenance-records-ref

So suggestions for format of multiple records per node are:

1. 
/element	attrRef="#pr2"	attr_a[1]="val"	attr_b[1]="val"	attr_c[1]="val"	attr_b[2]="val"	attr_c[2]="val"
Note: This assumes the records are read in order and that attributes are sorted by the [x] index first and by attribute name second.

2. 
/element	attrRef="#pr2" [attr_a="val", attr_b="val", attr_c="val"][attr_b="val", attr_c="val"]
Note: Something like this, perhaps, where the position of each bracketed group identifies the record number.

Any variation should be fine as long as we settle on the syntax and the sort order. If there is only one record, we could drop the index part [x].
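
As an illustration of suggestion 1, here is a minimal Python sketch of how a test harness might serialize the records for one node. The function name `format_records` and its parameters are hypothetical, not part of any existing tool. It assumes the fields are tab-separated, that records are taken in document order, and that attribute names within a record are sorted by plain string comparison (so uppercase letters sort before lowercase ones). The [x] index is dropped when there is only one record.

```python
def format_records(ref_name, ref_value, records):
    """Serialize the stand-off records that apply to one node (suggestion 1).

    ref_name / ref_value: the reference attribute, e.g. provenanceRecordsRef="#pr2".
    records: list of dicts (attribute name -> value), in document order.
    """
    fields = ['{}="{}"'.format(ref_name, ref_value)]
    use_index = len(records) > 1  # drop [x] when there is a single record
    for position, record in enumerate(records, start=1):
        for name in sorted(record):  # index first, attribute name second
            key = "{}[{}]".format(name, position) if use_index else name
            fields.append('{}="{}"'.format(key, record[name]))
    return "\t".join(fields)


records = [
    {"orgRef": "http://www.legaltrans-ex.com/", "person": "John Doe"},
    {"revOrgRef": "http://john-smith.qa.example.com", "revPerson": "John Smith"},
]
print("/html/body[1]/p[2]\t" + format_records("provenanceRecordsRef", "#pr2", records))
```

Whatever sort order is finally agreed on, it only needs to replace the `sorted(record)` call; the rest of the sketch is unaffected.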

Cheers,
Fredrik

-----Original Message-----
From: Yves Savourel 
Sent: Tuesday, December 04, 2012 10:15 AM
To: 'Multilingual Web LT-TESTS Public'
Subject: RE: Standoff markup issue

Hi Leroy,

Thanks for the example.
I have a few notes:

There are two issues IMO with this output, or more exactly there are two reasons not to do this representation:

-- a) So far we have set the information (the right part after the path) on the element or attribute to which that information applies. In your example the stand-off information applies to the p elements. That is, IMO, where the information should be output, not on the stand-off nodes themselves.

I even think it's vital to do that because it forces the implementations to resolve the reference. This reference is a reference to ITS data, not a reference to some data carried by a piece of ITS information, as most xyzRef attributes are.

-- b) We cannot set the output information on the stand-off nodes like in this example because the stand-off nodes may be in a different document. The stand-off reference attributes like locQualityIssuesRef or provenanceRecordsRef may point to external XML or HTML documents where that stand-off markup resides. So an implementation needs to fetch the document, open it, get the stand-off data and output that on the node to which it applies. And yes: that is not easy to implement.
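
To make the resolution step concrete, here is a rough Python sketch for the simplest case only: an XML document whose reference points to stand-off markup in the same document. The sample document, the `doc` and `p` element names, and the function name `resolve_provenance` are made up for illustration. External documents and the HTML case (stand-off markup inside a script element) are deliberately not handled; a real implementation would have to fetch and parse the referenced document as described above.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample document, not one of the test-suite files.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <its:provenanceRecords xml:id="pr1">
    <its:provenanceRecord person="John Doe" orgRef="http://www.legaltrans-ex.com/"/>
    <its:provenanceRecord revPerson="John Smith"/>
  </its:provenanceRecords>
  <p its:provenanceRecordsRef="#pr1">Some text</p>
</doc>"""


def resolve_provenance(root, element):
    """Return the provenance records that apply to `element`, in document order."""
    ref = element.get("{%s}provenanceRecordsRef" % ITS_NS)
    if ref is None or not ref.startswith("#"):
        return []  # no reference, or an external document (not handled here)
    target_id = ref[1:]
    for container in root.iter("{%s}provenanceRecords" % ITS_NS):
        if container.get(XML_ID) == target_id:
            records = container.findall("{%s}provenanceRecord" % ITS_NS)
            return [dict(record.attrib) for record in records]
    return []  # dangling reference


root = ET.fromstring(SAMPLE)
p = root.find("p")
for record in resolve_provenance(root, p):
    print(sorted(record.items()))
```

The records are returned for the p element, not for the stand-off nodes, so the output can be attached to the node the information applies to.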


Look at it from a different viewpoint: We should have the same set of information output on a given node whether the ITS data are represented as local markup or stand-off markup. The only difference is that for stand-off markup we may have several items of that type of information.

Fredrik is looking at some examples of possible output and will post an email later.

cheers,
-yves
Received on Wednesday, 5 December 2012 09:05:57 GMT
