
RE: Standoff markup issue

From: Fredrik Liden <fliden@enlaso.com>
Date: Tue, 4 Dec 2012 13:54:16 -0700
To: Yves Savourel <ysavourel@enlaso.com>, 'Multilingual Web LT-TESTS Public' <public-multilingualweb-lt-tests@w3.org>
Message-ID: <assp.0685c7218f.4236658BB877A542A66660614300B1858013D0ECCA@orion.helios.local>
Actually, the output for Provenance and Localization Quality Issue is currently inconsistent. So, in line with Yves's notes below, perhaps we can just update Provenance to follow the same logic as LocQuality, unless there was a specific reason to make it different. I'm guessing we're still making some updates to the test files in those categories. So far I can only see one example, Provenance HTML example 3, that actually has more than one standoff record, but on the draft page there are several examples.

As for the format of multiple records:

Currently the provenance3html output looks like this:
.....
/html/head[1]/script[1]/its:provenanceRecords[2]
/html/head[1]/script[1]/its:provenanceRecords[2]/@its:version
/html/head[1]/script[1]/its:provenanceRecords[2]/@xml:id
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]	orgRef="http://www.legaltrans-ex.com/"	person="John Doe"	provRef="http://www.vistatec.com/job-12-7-15-X31/reviewed/prov/re8573469"	revOrgRef="http://www.vistatec.com/"	revPerson="Tommy Atkins"
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@orgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@person
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@provRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revPerson
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]	revOrgRef="http://john-smith.qa.example.com"	revPerson="John Smith"
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revPerson
/html/body[1]
/html/body[1]/p[1]	provenanceRecordsRef="#pr1"
/html/body[1]/p[1]/@its-provenance-records-ref
/html/body[1]/p[2]	provenanceRecordsRef="#pr2" 
/html/body[1]/p[2]/@its-provenance-records-ref
.....

So, assuming we change it to follow LocQuality, it would be something like:
/html/head[1]/script[1]/its:provenanceRecords[2]
/html/head[1]/script[1]/its:provenanceRecords[2]/@its:version
/html/head[1]/script[1]/its:provenanceRecords[2]/@xml:id
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]	
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@orgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@person
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@provRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[1]/@revPerson
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]	
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revOrgRef
/html/head[1]/script[1]/its:provenanceRecords[2]/its:provenanceRecord[2]/@revPerson
/html/body[1]
/html/body[1]/p[1]	provenanceRecordsRef="#pr1"
/html/body[1]/p[1]/@its-provenance-records-ref
/html/body[1]/p[2]	provenanceRecordsRef="#pr2" orgRef[1]="http://www.legaltrans-ex.com/"	person[1]="John Doe"	provRef[1]="http://www.vistatec.com/job-12-7-15-X31/reviewed/prov/re8573469"	revOrgRef[1]="http://www.vistatec.com/"	revPerson[1]="Tommy Atkins" revOrgRef[2]="http://john-smith.qa.example.com"	revPerson[2]="John Smith"
/html/body[1]/p[2]/@its-provenance-records-ref

So my suggestions for the format of multiple records per node are:

1. 
/element	attrRef="#pr2" attr_a[1]="val"	attr_b[1]="val" attr_c[1]="val"	attr_b[2]="val" attr_c[2]="val"
Note: This assumes the records are read in order and that attributes are sorted by the [x] index first, then by attribute name.

2. 
/element	attrRef="#pr2" [attr_a="val", attr_b="val", attr_c="val"][attr_b="val", attr_c="val"]
Note: Something like this, perhaps, where the bracket position identifies the record number.

Any variation should be fine as long as we agree on the syntax and sort order. If there's only one record, we could possibly drop the index part [x].
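
To make the two options concrete, here is a minimal Python sketch that builds both output lines from a list of resolved records. The function names, the dict-per-record input, and the use of a tab as the only field separator are assumptions for illustration; none of this is an agreed test-suite format.

```python
def format_indexed(path, ref_name, ref_value, records):
    """Option 1: name[i]="val" fields, ordered by record index, then attribute name."""
    fields = [f'{ref_name}="{ref_value}"']
    single = len(records) == 1  # assumption: drop the [x] index for a single record
    for i, record in enumerate(records, start=1):
        for name in sorted(record):
            suffix = "" if single else f"[{i}]"
            fields.append(f'{name}{suffix}="{record[name]}"')
    return "\t".join([path] + fields)


def format_bracketed(path, ref_name, ref_value, records):
    """Option 2: one [...] group per record; the group position is the record number."""
    groups = "".join(
        "[" + ", ".join(f'{name}="{record[name]}"' for name in sorted(record)) + "]"
        for record in records
    )
    return f'{path}\t{ref_name}="{ref_value}"\t{groups}'


# Hypothetical input: two records already resolved for /html/body[1]/p[2].
records = [
    {"revPerson": "Tommy Atkins", "orgRef": "http://www.legaltrans-ex.com/"},
    {"revPerson": "John Smith"},
]
indexed_line = format_indexed("/html/body[1]/p[2]", "provenanceRecordsRef", "#pr2", records)
bracketed_line = format_bracketed("/html/body[1]/p[2]", "provenanceRecordsRef", "#pr2", records)
```

Both functions sort attribute names with Python's default string ordering, which happens to match the order in the provenance example above; whatever ordering we settle on would need to be stated explicitly.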

Cheers,
Fredrik

-----Original Message-----
From: Yves Savourel 
Sent: Tuesday, December 04, 2012 10:15 AM
To: 'Multilingual Web LT-TESTS Public'
Subject: RE: Standoff markup issue

Hi Leroy,

Thanks for the example.
I have a few notes:

There are two issues IMO with this output, or more exactly there are two reasons not to use this representation:

-- a) So far we have set the information (the right part after the path) on the element or attribute to which that information applies. So in your example the stand-off information applies to the p elements. That is, IMO, where the information should be output, not on the stand-off nodes themselves.

I even think it's vital to do that because it forces the implementations to resolve the reference. This reference is a reference to ITS data, not a reference to some data carried by a piece of ITS information, as most xyzRef attributes are.

-- b) We cannot set the output information on the stand-off nodes like in this example because the stand-off nodes may be in a different document. The stand-off reference attributes like locQualityIssuesRef or provenanceRecordsRef may point to an external XML or HTML document where that stand-off markup resides. So an implementation needs to fetch the document, open it, get the stand-off data and output it on the node to which it applies. And yes: that is not something easy to implement.


Look at it from a different viewpoint: we should have the same set of information output on a given node whether the ITS data are represented as local markup or stand-off markup. The only difference is that for stand-off markup we may have several items of that type of information.
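
As an illustration of resolving the reference so the data lands on the node it applies to, here is a minimal Python sketch for the simplest case only: an XML document whose provenanceRecordsRef points to stand-off markup in the same file. The sample document, the function name, and the decision to return an empty list for external references are assumptions; fetching an external document, and HTML with stand-off markup inside a script element, are not handled.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample document, for illustration only.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <its:provenanceRecords xml:id="pr2">
    <its:provenanceRecord revPerson="Tommy Atkins" revOrgRef="http://www.vistatec.com/"/>
    <its:provenanceRecord revPerson="John Smith"/>
  </its:provenanceRecords>
  <p its:provenanceRecordsRef="#pr2">Some text</p>
</doc>"""


def resolve_provenance(root, element):
    """Return the records referenced by element, as a list of attribute dicts."""
    ref = element.get(f"{{{ITS_NS}}}provenanceRecordsRef")
    if ref is None or not ref.startswith("#"):
        return []  # assumption: references to external documents are out of scope here
    for container in root.iter(f"{{{ITS_NS}}}provenanceRecords"):
        if container.get(XML_ID) == ref[1:]:
            return [
                dict(record.attrib)
                for record in container.findall(f"{{{ITS_NS}}}provenanceRecord")
            ]
    return []  # the reference did not match any stand-off container


root = ET.fromstring(SAMPLE)
records = resolve_provenance(root, root.find("p"))
```

The resolved list is what would be written on the p line of the output, so the stand-off nodes themselves carry no values.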

Fredrik is looking at some examples of possible output and will post an email later.

cheers,
-yves



Received on Tuesday, 4 December 2012 20:55:11 GMT
