Re: Fwd: [ISSUE 34] Potential problem with high-level quality issues

<quote>My point is just: what useful thing can a tool do when all it knows 
is that something is e.g. a grammar error? See the workflow I tried to 
explain at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html

</quote>

Let's not forget that having tools "understand" and act upon very specific 
definitions is not the only goal here. There are definite justifications 
in my mind for having this metadata available so that it can be aggregated 
and used in metrics at a much higher level.
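
As a rough sketch of what I mean by that kind of aggregation (the note 
fields and the per-type weights below are just placeholders, not anything 
we have agreed on):

# Rough sketch: aggregate typed quality issues into a per-document
# report. The note fields and the per-type weights are placeholders.
from collections import Counter

WEIGHTS = {"omission": 5, "grammar": 1, "terminology": 3}  # assumed weights

def quality_report(notes, word_count):
    """Count issues per top-level type and compute a simple penalty score."""
    counts = Counter(note["type"] for note in notes)
    penalty = sum(WEIGHTS.get(t, 1) * n for t, n in counts.items())
    return {"counts": dict(counts),
            "score_per_1000_words": 1000.0 * penalty / word_count}

report = quality_report(
    [{"type": "omission"}, {"type": "grammar"}, {"type": "grammar"}],
    word_count=2000)
print(report)  # counts: 1 omission, 2 grammar; score: 3.5 per 1000 words

Even a tool that cannot do anything clever with a single "grammar error" 
annotation can still roll the typed annotations up into a report like this.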

Phil.





From:   Felix Sasaki <fsasaki@w3.org>
To:     Yves Savourel <ysavourel@enlaso.com>, Arle Lommel 
<arle.lommel@dfki.de>, public-multilingualweb-lt@w3.org
Date:   02/08/2012 06:24
Subject:        Fwd: [ISSUE 34] Potential problem with high-level quality 
issues



2012/8/1 Yves Savourel <ysavourel@enlaso.com>
Ok, sorry, I missed the distinction in Arle’s note and read your email too 
fast.

So this is a requirement that we put upon ourselves.

Yes.
 

> The test cases must be more robust than simply seeing
> that a tool identifies an issue and passes it on:
> we also need to see that they do this consistently with
> each other, which is hard since the sets of issues
> from the various tools only partially overlap.

I’m not sure I get "we also need to see that they do this consistently 
with each other". Each tool has its own set of issues. The only exchange 
between tools is when a tool A generates a list of QA notes and those are 
then read into a tool B, which does something with them.

My point is just: what useful thing can a tool do when all it knows is 
that something is e.g. a grammar error? See the workflow I tried to 
explain at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html

 

The interoperability I can see is that, for example, when tools A and B 
filter the same list of QA notes on the 'omission' type, we get the same 
sub-list.
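
Roughly, a sketch of what I mean (the note structure and the "type" field 
name below are just placeholders, since the actual markup for quality 
issues is still being discussed):

# Minimal sketch: filter a list of QA notes by issue type.
# The dict keys ("type", "span", "comment") are illustrative only.
def filter_notes(notes, issue_type):
    """Return the sub-list of QA notes whose type matches issue_type."""
    return [note for note in notes if note.get("type") == issue_type]

# If tool A and tool B both read the same list of notes, filtering on
# 'omission' should yield the same sub-list in both tools.
qa_notes = [
    {"type": "omission", "span": "seg-12", "comment": "missing clause"},
    {"type": "grammar",  "span": "seg-14", "comment": "agreement error"},
    {"type": "omission", "span": "seg-20", "comment": "dropped sentence"},
]
print(filter_notes(qa_notes, "omission"))  # -> the two 'omission' notes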

If you mean that we must make sure that tool A maps the issues that we see 
as omissions to the 'omission' top-level type, that seems to be out of our 
purview. Or am I missing something?

I am probably asking for mapping in the sense of
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html


For other data categories, we have a small set of allowed values, like 
"yes" or "no". So even if we don't test that tools do the same things with 
these values, the value set is so small that the interpretation becomes 
very clear. I just don't understand what useful and testable things (even 
one or two) tools can do with high-level information like "this is a 
grammar error". Maybe you or others can draft an example, filling in 
points 1-4 at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html
? That would help me a lot.

Best, 

Felix 
 

Cheers,
-ys




From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, August 01, 2012 7:07 PM
To: Yves Savourel
Cc: Arle Lommel; Multilingual Web LT Public List
Subject: Re: [ISSUE 34] Potential problem with high-level quality issues


2012/8/1 Yves Savourel <ysavourel@enlaso.com>
I’m not sure I completely understand the requirement. For each value we 
need two applications that use it?

Did we have such a requirement for 1.0?

No, we didn't, since - see below - the number of values was very small and 
easy to understand.

With the need (more on that later) to convince people working on the 
content production side of the usefulness of our metadata, I think we have 
a higher bar than for locNoteType.

Best,

Felix


For example, we have locNoteType with ‘alert’ or ‘description’. Do we 
have two applications that generate those two values?

Just wondering.
-ys

From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, August 01, 2012 5:22 PM
To: Arle Lommel
Cc: Multilingual Web LT Public List

Subject: Re: [ISSUE 34] Potential problem with high-level quality issues

Hi Arle, all,

let me just add that for other data categories, we have only a small set 
of predefined values - e.g. for "Translate" only "yes" or "no", or for 
localization note type "alert" or "description". Also, these values are 
distinct - you have either "yes" or "no" - so there is no danger of doing 
the wrong thing when an application produces or consumes the values. 
Finally, the categorization of an error seems to be difficult, with so 
many categories being proposed.
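
To illustrate why a small, closed value set is easy to interpret, here is 
a rough sketch (the function and the simplified handling are just my own 
illustration - real ITS processing also involves defaults, inheritance and 
global rules):

# Rough sketch: a consumer of a closed two-value set such as
# Translate ("yes"/"no") only has to branch on the value, so its
# behaviour is unambiguous. (Illustrative only.)
def should_translate(translate_value):
    if translate_value == "no":
        return False
    return True  # treat "yes" (the usual default) as translatable

# Contrast: with roughly two dozen proposed quality-issue types, a
# consumer cannot derive its behaviour from the value alone in the
# same way.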

This situation led me to the thinking that we should set a high bar for 
the normative values - otherwise there won't be any interoperability of 
what implementations produce or consume, as Arle described. I don't see a 
clear way out, and I'm looking very much forward to feedback from 
implementors - Yves, Phil etc.

Best,

Felix

2012/8/1 Arle Lommel <arle.lommel@dfki.de>
Hello all,

I was discussing the high-level quality issues with Felix this morning, 
and we have an issue. If they are to be normative, then we will need to 
find at least two interoperable implementations for each value, not just 
for the mechanism as a whole, and to test those implementations against 
test cases. While that would not be hard for some values, like 
terminology, it would be difficult for others, like legal, because, while 
they are used in metrics, they are not particularly embedded in tools that 
would produce or consume ITS 2.0 markup.

One solution is to put the issue names in an informative annex and very 
strongly recommend that they be used. That approach is, I realize, 
unlikely to satisfy Yves, for good reason: if we cannot know what values 
are allowed in that slot, then we cannot reliably expect interoperability. 
At the same time, if we only go with those values for which we can find 
two or more interoperable implementations, that list of 26 issues will 
probably become something like six or eight, thereby leaving future tools 
that might address the other issues out in the cold.

I have to confess that I do not see a solution to this issue right now, 
since we really need the values to be normative, but if we cannot test 
them in fairly short order they cannot be normative. The test cases must 
be more robust than simply seeing that a tool identifies an issue and 
passes it on: we also need to see that they do this consistently with each 
other, which is hard since the sets of issues from the various tools only 
partially overlap.

If anyone has any brilliant ideas on how to solve the issue, please feel 
free to chime in. We're still working on this and hope to find a way to 
move forward with normative values.

Best,

Arle




--
Felix Sasaki
DFKI / W3C Fellow








Received on Thursday, 2 August 2012 09:38:14 UTC