RE: [ACTION-314] Check global rules for provenance

From: Pablo Nieto Caride <pablo.nieto@linguaserve.com>
Date: Fri, 23 Nov 2012 10:32:01 +0100
To: "'Yves Savourel'" <ysavourel@enlaso.com>, <public-multilingualweb-lt@w3.org>
Message-ID: <14f401cdc95d$640baa40$2c22fec0$@linguaserve.com>
Thank you Yves, +1 to everything you say. Still, I think people will keep using IDs in selectors because, as you said, they can sometimes be very handy.

As for the similarity of toolRef and toolsRef, I'm not sure. I think toolsRef would work, but I also think the two mean different things. Perhaps we should just change the name, because it can be confusing. Besides, you can associate an ID with a tool, but also with a person or an organization.


Thanks for the pointer Felix.

So looking at http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#provenance and the examples, here are a few notes on provenance:

I don't think we can use tool-id as an equivalent to toolRef. In XLIFF the tool-id value is a "text" that is an ID. In ITS toolRef is an IRI. Example 62 shows the mismatch: we have its:toolRefPointer="//phase-group[@phase-name='P1']/@tool-id" but tool-id="T1", not tool-id="#T1".
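
To make the difference between the two values concrete, here is a minimal sketch using Python's standard urljoin. The base URL is a hypothetical document location chosen for illustration only; the point is just that "T1" read as an IRI reference names a sibling resource, while "#T1" names a fragment inside the same document.

```python
from urllib.parse import urljoin

# Hypothetical location of the XLIFF document (an assumption for this sketch).
BASE = "http://example.com/files/doc.xlf"

def resolve(ref: str) -> str:
    """Resolve an IRI reference against the document's base location."""
    return urljoin(BASE, ref)

# The tool-id value as written in the XLIFF file, read as an IRI reference.
as_written = resolve("T1")
# The same value written as a fragment identifier.
as_fragment = resolve("#T1")

print(as_written)   # http://example.com/files/T1
print(as_fragment)  # http://example.com/files/doc.xlf#T1
```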

Also, I believe it would be difficult to access the <tool> element based only on the value of phase@tool-id from an ITS processor's viewpoint: you have the value of the ID to look for, but tool@tool-id is not defined as an XML ID, and you don't know that tool@tool-id is the attribute that holds the ID or that tool is the element to look for. So I'm not sure how you can look up the element without some XLIFF-specific knowledge.
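
The lookup problem can be sketched as follows. The document is a simplified, namespace-free fragment that I am assuming resembles XLIFF 1.2 (with <phase> inside <phase-group>); the tool name and both function names are made up for illustration. A generic search for an xml:id finds nothing, while a search that already knows about tool@tool-id succeeds.

```python
import xml.etree.ElementTree as ET

# Qualified name ElementTree uses for the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified, namespace-free fragment (an assumption, not a full XLIFF file).
DOC = """<xliff><file><header>
<phase-group>
<phase phase-name="P1" process-name="translation" tool-id="T1"/>
</phase-group>
<tool tool-id="T1" tool-name="ExampleTool"/>
</header></file></xliff>"""

root = ET.fromstring(DOC)

def generic_lookup(root, id_value):
    """What a vocabulary-agnostic processor could do: search for an xml:id."""
    for el in root.iter():
        if el.get(XML_ID) == id_value:
            return el
    return None

def xliff_lookup(root, id_value):
    """Requires knowing that <tool> carries its ID in the tool-id attribute."""
    for el in root.iter("tool"):
        if el.get("tool-id") == id_value:
            return el
    return None

generic_result = generic_lookup(root, "T1")
xliff_result = xliff_lookup(root, "T1")
```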

So if we can't map toolRef, I'm not sure we can use pointers at all with XLIFF since we would miss one type of information.

(incidentally: that toolRef looks a lot like its:toolsRef, shouldn't we use toolsRef instead of a data category-specific attribute?)

If we can live without a toolRef-equivalent in XLIFF then we could use pointers. But then I'd like to see if there is a generic way to map the <phase> element. To me using global rules specific to a given document instance (like in example 62) is not really a good practice.

In my opinion pointers should be used to set up a general mapping between an existing vocabulary and ITS. As soon as we have ID values, or any part of the selector's XPath that comes from a document instance, it's a red flag. It means the rule is written specifically for that document rather than for a vocabulary.
I know it technically works, and it can be very handy, and we have examples showing that. But I think it's not a good practice and in many cases it's an abuse of the mechanism. Any use of global rules (with pointers or not) that is geared for a given document instance is probably not quite right, except if it is to address attributes.

To illustrate this you can look at the Localization Note data category: Examples 33 and 35 are (IMO) "good" uses of the global rules, and examples 32 and 34 are "bad" uses. In hindsight defining <its:locNote> was probably not a good idea. More than 5 years later I still have not seen real cases like examples 32 or 34. Which is a good thing.

So ideally the selector should be ignorant of the IDs but still somehow link the <target> element with the given <phase-group> element. Maybe our XPath gurus can find a way to do this in XPath 1.0?

By the way the expressions like:
in the examples are incorrect it should be:

I don't think something like:

selector = "//target[@phase-name=//phase-group/@phase-name]"

would be valid. But that's the general idea: the two elements have each an attribute with the same value, we "just" need to link them in the XPath expression.
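
The linking idea itself, independent of whether XPath 1.0 can express it in a selector, can be sketched in Python. This is only an illustration under assumptions: the document is a simplified, namespace-free fragment that I am assuming resembles XLIFF 1.2, and link_targets_to_phases is a made-up helper name. No ID value is hard-coded; the pairing relies only on the two attributes sharing a value.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment (an assumption, not a full XLIFF file).
DOC = """<xliff><file>
<header><phase-group>
<phase phase-name="P1" process-name="translation" tool-id="T1"/>
<phase phase-name="P2" process-name="review" tool-id="T2"/>
</phase-group></header>
<body>
<trans-unit id="1"><source>Hello</source><target phase-name="P1">Bonjour</target></trans-unit>
<trans-unit id="2"><source>Bye</source><target phase-name="P2">Au revoir</target></trans-unit>
<trans-unit id="3"><source>Yes</source><target>Oui</target></trans-unit>
</body></file></xliff>"""

def link_targets_to_phases(root):
    """Pair each <target> with the <phase> that has the same phase-name."""
    # Index the phases by name once, so no ID appears in the matching logic.
    phases = {p.get("phase-name"): p for p in root.iter("phase")}
    links = []
    for target in root.iter("target"):
        name = target.get("phase-name")
        # Targets without a phase-name, or with an unknown one, are skipped.
        if name is not None and name in phases:
            links.append((target.text, phases[name].get("tool-id")))
    return links

links = link_targets_to_phases(ET.fromstring(DOC))
print(links)  # [('Bonjour', 'T1'), ('Au revoir', 'T2')]
```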

If we can't do that, then I would recommend using the standoff markup, like for 2.0.
Note that if there are several revision cycles, standoff may be required anyway...

Received on Friday, 23 November 2012 09:32:27 UTC
