[ISSUE-34] Problem with mandatory attributes for quality from Arle Lommel on 2012-08-07 (public-multilingualweb-lt@w3.org from August 2012)

From: Arle Lommel <arle.lommel@dfki.de>
Date: Tue, 7 Aug 2012 14:09:19 +0200
To: Multilingual Web LT Public List <public-multilingualweb-lt@w3.org>
Message-Id: <881664D0-447A-486C-BA7B-90DE9D1176FC@dfki.de>

Hi all,

As I work on turning my proposal into something closer to ITS prose style, I've realized there is a problem with the current proposal, the one we have been discussing and for which Felix discussed making some attributes mandatory. I'm hoping someone has some clever thoughts.

If we make locQualityProfile mandatory for all instances, we run into the problem that ideally the value is inherited. The most likely scenario is one in which a single quality profile is used for the entire document. So ideally it is declared once (in, e.g., an HTML meta tag, but it could also be in a global rule that selects something like a <body> node), with the possibility of local overrides as needed, and the value is inherited by everything else that uses the qname (if we don't have that, there is no need for the qname).
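
To make the inheritance scenario concrete, here is a minimal sketch in Python. It assumes the profile is carried as a plain locQualityProfile attribute on the element itself; the markup, the id values, and the effective_profile helper are hypothetical and are not taken from the proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: locQualityProfile is declared once on <body>
# and overridden locally on one <div>.
DOC = """
<body locQualityProfile="profile-a">
  <p id="p1">Inherits the document-level profile.</p>
  <div locQualityProfile="profile-b">
    <p id="p2">Picks up the local override.</p>
  </div>
</body>
""".strip()


def effective_profile(root, target_id, attr="locQualityProfile"):
    """Return the nearest declared value of attr, checking the target
    element first and then each ancestor in turn."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = next(el for el in root.iter() if el.get("id") == target_id)
    while node is not None:
        value = node.get(attr)
        if value is not None:
            return value
        node = parents.get(node)
    return None


root = ET.fromstring(DOC)
print(effective_profile(root, "p1"))  # profile-a
print(effective_profile(root, "p2"))  # profile-b
```

If the attribute were mandatory on every instance, this lookup would never need to leave the target element, which is exactly the redundancy described above.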

Related to this, because ITS 2.0 rules overwrite each other, what would we do if I want to embed data from two different quality models in one text? I don't want to redeclare the quality model for every span, but it would seem that our current model does not provide a way to say that BOTH Quality Model A AND Quality Model B apply to <body> and that the qname lets you select which one is in fact relevant to a particular piece of markup within body.
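
The overwrite problem, and one possible way around it, can be sketched as follows. This is only an illustration under stated assumptions: the prefix-to-model table, the resolve helper, and the issue names "omission" and "grammar" are hypothetical and are not part of the current proposal.

```python
# With a single-valued attribute, a later declaration replaces the earlier one.
body = {}
body["locQualityProfile"] = "model-a"
body["locQualityProfile"] = "model-b"  # model-a is no longer visible

# Hypothetical alternative: declare both models once, keyed by prefix,
# and let the prefix of each qname pick the model it belongs to.
declared_profiles = {"a": "model-a", "b": "model-b"}


def resolve(qname, profiles):
    """Split 'prefix:name' and return (model, name) for a declared prefix."""
    prefix, sep, local_name = qname.partition(":")
    if not sep or prefix not in profiles:
        raise ValueError(f"no quality model declared for {qname!r}")
    return profiles[prefix], local_name


print(body["locQualityProfile"])                 # model-b
print(resolve("a:omission", declared_profiles))  # ('model-a', 'omission')
print(resolve("b:grammar", declared_profiles))   # ('model-b', 'grammar')
```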

There is a similar issue for the locQualityScore attribute in that the most general case will be for marking results for large pieces of text, not spans and so forth. In fact, for this case, the value makes no sense outside of its full context and inheritance makes no sense. It is not as if saying that a 1000-word document has a quality score of 89% tells me anything at all about any particular span in the document. The particular span, if considered in isolation, might have a score of 100% or a score of 3%. So when used in most places the locQualityScore attribute is meaningless. Instead, it corresponds to the sum output of the application of whatever is declared in locQualityProfile to wherever locQualityProfile applies. So to make it mandatory is problematic, regardless of default value, because there are scopes within ITS 2.0 where the very concept doesn't make real sense.
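
A small worked example shows why the aggregate says nothing about a given span. It assumes, purely for illustration, that the document score is a word-weighted average of per-segment scores; the actual scoring method would be whatever locQualityProfile declares, and the segment figures below are invented.

```python
# Hypothetical segments of a 1000-word document: (word count, score in percent).
segments = [(800, 100), (100, 3), (100, 87)]


def document_score(segments):
    """Word-weighted average of the per-segment scores."""
    total_words = sum(words for words, _ in segments)
    weighted = sum(words * score for words, score in segments)
    return weighted / total_words


print(document_score(segments))  # 89.0
```

The document as a whole scores 89, yet one segment scores 100 and another scores 3, so the aggregate value cannot sensibly be inherited by either.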

I think the solution here is to make locQualityScore optional rather than mandatory, as it seems that that would solve the problem.

Going further, I am not sure what it would mean if a locQualityScore value were declared inside the context of another one. Under our normal precedence rules it would be as if the "outer" one did not exist, but it could be that someone is expressing partial scores (e.g., the overall score is 85, but this <div> has a score of 98 and that <div> has a score of 42). In this case it is not like translate, where the meaning is clear: a translate="yes" bit nested in a translate="no" bit completely overrides the translate="no" intent. But here the semantics are not so clear. Maybe it would solve the problem to state that the attribute does NOT apply to daughter elements. It's clear that semantically it does, but syntactically we cannot see this as normal inheritance if the value from the whole cannot be seen as applying to the parts.
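
The two readings of a nested score can be compared with a short sketch. It assumes locQualityScore appears as a plain attribute; the markup, the id values, and the score_for helper are hypothetical. With inherit=True the usual nearest-ancestor precedence applies; with inherit=False the score belongs only to the element that declares it.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with an overall score and two partial scores.
DOC = """
<body id="doc" locQualityScore="85">
  <p id="intro">No score of its own.</p>
  <div id="good" locQualityScore="98"><p id="g1">Text.</p></div>
  <div id="bad" locQualityScore="42"><p id="b1">Text.</p></div>
</body>
""".strip()


def score_for(root, target_id, inherit, attr="locQualityScore"):
    """Return the score that applies to the target element.

    inherit=True:  nearest declaration on the element or an ancestor.
    inherit=False: only a declaration on the element itself counts.
    """
    parents = {child: parent for parent in root.iter() for child in parent}
    node = next(el for el in root.iter() if el.get("id") == target_id)
    if not inherit:
        return node.get(attr)
    while node is not None:
        value = node.get(attr)
        if value is not None:
            return value
        node = parents.get(node)
    return None


root = ET.fromstring(DOC)
print(score_for(root, "intro", inherit=True))   # 85
print(score_for(root, "g1", inherit=True))      # 98
print(score_for(root, "intro", inherit=False))  # None
print(score_for(root, "good", inherit=False))   # 98
```

Under the inheriting reading the paragraph with id "intro" is reported as scoring 85, which is the claim the message argues is meaningless; under the non-inheriting reading it simply has no score.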

As I'm looking at this, it seems that locQualityProfile and locQualityScore would most often be used together to describe a larger unit than the other attributes, which can be very fine-grained. I wonder if we really don't have two data categories: one for fine-grained quality markup and one for aggregate data. Granted, the boundaries are not super clear, but it seems to be mixing levels to treat a quality score as being like a note that says that there is a grammar error at a particular point in the text.

I'm going to work more with my text and see where it goes, and I may have to make some examples to highlight the problems I see.

Best,

Arle

Received on Tuesday, 7 August 2012 12:09:53 UTC