- From: Phil Ritchie <philr@vistatec.ie>
- Date: Thu, 2 Aug 2012 10:37:42 +0100
- To: Felix Sasaki <fsasaki@w3.org>
- Cc: Arle Lommel <arle.lommel@dfki.de>, public-multilingualweb-lt@w3.org, Yves Savourel <ysavourel@enlaso.com>
- Message-ID: <OF9E1F53E7.53A59794-ON80257A4E.00349B1A-80257A4E.0034E447@vistatec.ie>
> My point is just: what useful thing can a tool do when all it knows is that something is e.g. a grammar error? See the workflow I tried to explain at http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html

Let's not forget that having tools "understand" and act upon very specific definitions is not the only goal here. There are, to my mind, definite justifications for having this metadata available so that it can be aggregated and used in metrics at a much higher level.

Phil.

From: Felix Sasaki <fsasaki@w3.org>
To: Yves Savourel <ysavourel@enlaso.com>, Arle Lommel <arle.lommel@dfki.de>, public-multilingualweb-lt@w3.org
Date: 02/08/2012 06:24
Subject: Fwd: [ISSUE 34] Potential problem with high-level quality issues

2012/8/1 Yves Savourel <ysavourel@enlaso.com>

> Ok, sorry I missed the distinction in Arle's note and read your email too fast. So this is a requirement that we put upon ourselves.

Yes.

> > The test cases must be more robust than simply seeing that a tool identifies an issue and passes it on: we also need to see that they do this consistently with each other, which is hard since the sets of issues from the various tools only partially overlap.
>
> I'm not sure I get "we also need to see that they do this consistently with each other". Each tool has its own set of issues. The only exchange between tools is when a tool A generates a list of QA notes and those are then read into a tool B, which does something with them.

My point is just: what useful thing can a tool do when all it knows is that something is e.g. a grammar error? See the workflow I tried to explain at http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html

> The interoperability I can see is that, for example, when tools A and B filter the same list of QA notes on the 'omission' type, we get the same sub-list. If you mean that we must make sure that tool A maps the issues that we see as omissions to the 'omission' top-level type, that seems to be out of our purview. Or am I missing something?

I am probably asking for mapping in the sense of http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html

For other data categories, we have a small set of allowed values like "yes" or "no". So even if we don't test that tools do the same stuff with these values, the value set is so small that the interpretation becomes very clear. I just don't understand what useful and testable thing one or two tools can do with high-level information like "this is a grammar error". Maybe you or others can draft an example, filling in 1-4 at http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0032.html? That would help me a lot.

Best,

Felix

> Cheers,
> -ys
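A minimal sketch of the two uses discussed in this exchange - Yves's like-for-like filtering and Phil's aggregation into higher-level metrics - assuming the locQualityIssueType attribute being drafted for ITS 2.0; the element names and sample content here are invented for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ISSUE_TYPE_ATTR = "{%s}locQualityIssueType" % ITS_NS

# Invented sample content carrying draft locQualityIssueType values.
SAMPLE = """\
<doc xmlns:its="http://www.w3.org/2005/11/its">
  <seg its:locQualityIssueType="omission">Second sentence missing.</seg>
  <seg its:locQualityIssueType="grammar">Subject-verb agreement off.</seg>
  <seg its:locQualityIssueType="omission">Footnote left untranslated.</seg>
</doc>"""

def qa_notes(xml_text):
    """Yield (issue_type, text) pairs for every flagged segment."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        issue_type = el.get(ISSUE_TYPE_ATTR)
        if issue_type is not None:
            yield issue_type, el.text

# Yves's interoperability test: tools A and B filtering on 'omission'
# must arrive at the same sub-list.
omissions = [text for t, text in qa_notes(SAMPLE) if t == "omission"]
print(omissions)  # ['Second sentence missing.', 'Footnote left untranslated.']

# Phil's point: even without type-specific behaviour, the types can be
# aggregated into higher-level quality metrics.
print(Counter(t for t, _ in qa_notes(SAMPLE)))  # Counter({'omission': 2, 'grammar': 1})
```

Note that this exercises only the exchanged markup: whether tool A maps its internal categories to 'omission' correctly in the first place is, as Yves says above, outside the format's purview.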
From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, August 01, 2012 7:07 PM
To: Yves Savourel
Cc: Arle Lommel; Multilingual Web LT Public List
Subject: Re: [ISSUE 34] Potential problem with high-level quality issues

2012/8/1 Yves Savourel <ysavourel@enlaso.com>

> I'm not sure I completely understand the requirement. For each value we need two applications that use it? Did we have such a requirement for 1.0?

No, we didn't, since - see below - the number of values was very small and easy to understand. With the need (more on that later) to convince people working on the content production side of the usefulness of our metadata, I think we have a higher bar than for locNoteType.

Best,

Felix

> For example we have a locNoteType with 'alert' or 'description'. Do we have two applications that generate those two values? Just wondering.
>
> -ys

From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, August 01, 2012 5:22 PM
To: Arle Lommel
Cc: Multilingual Web LT Public List
Subject: Re: [ISSUE 34] Potential problem with high-level quality issues

Hi Arle, all,

let me just add that for other data categories we have only a small set of predefined values - e.g. for "Translate" only "yes" or "no", or for localization note type "alert" or "description". Also, these values are distinct - you have either "yes" or "no" - so there is no danger of doing the wrong thing when an application produces or consumes the values (a small consumer sketch at the end of this thread illustrates this). Finally, the categorization of an error seems to be difficult, with so many categories being proposed.

This situation led me to the thinking that we should set a high bar for the normative values - otherwise there won't be any interoperability in what implementations produce or consume, as Arle described. I don't see a clear way out, and I'm looking very much forward to feedback from implementors - Yves, Phil etc.

Best,

Felix

2012/8/1 Arle Lommel <arle.lommel@dfki.de>

> Hello all,
>
> I was discussing the high-level quality issues with Felix this morning and we have an issue. If they are to be normative, then we will need to find at least two interoperable implementations for each value, not just for the mechanism as a whole, and to test those implementations against test cases. While that would not be hard for some, like terminology, it would be difficult for others, like legal, because, while they are used in metrics, they are not particularly embedded in tools that would produce or consume ITS 2.0 markup.
>
> One solution is to put the issue names in an informative annex and very strongly recommend that they be used. That approach is, I realize, unlikely to satisfy Yves, for good reason: if we cannot know what values are allowed in that slot, then we cannot reliably expect interoperability. At the same time, if we go only with those values for which we can find two or more interoperable implementations, that list of 26 issues will probably shrink to something like six or eight, thereby leaving future tools that might address the other issues out in the cold.
>
> I have to confess that I do not see a solution to this issue right now, since we really need the values to be normative, but if we cannot test them in fairly short order they cannot be normative. The test cases must be more robust than simply seeing that a tool identifies an issue and passes it on: we also need to see that they do this consistently with each other, which is hard since the sets of issues from the various tools only partially overlap.
>
> If anyone has any brilliant ideas on how to solve the issue, please feel free to chime in. We're still working on this and hope to find a way to move forward with normative values.
>
> Best,
>
> Arle

--
Felix Sasaki
DFKI / W3C Fellow
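To illustrate Felix's point about small, closed value sets, a hedged sketch of a consumer acting on the two locNoteType values; how an 'alert' versus a 'description' is surfaced to the translator is invented for illustration, not prescribed by the draft.

```python
def handle_loc_note(note_text, note_type):
    """Act on an ITS localization note whose type is 'alert' or 'description'."""
    if note_type == "alert":
        # An alert is information the translator must see before translating.
        return "ALERT (show before editing): " + note_text
    if note_type == "description":
        # A description is background information about the content.
        return "Reference note: " + note_text
    # With only two allowed values, anything else is simply invalid;
    # there is no ambiguity for the consumer to resolve.
    raise ValueError("locNoteType must be 'alert' or 'description', got %r" % note_type)

print(handle_loc_note("Do not translate the product name.", "alert"))
```

With two distinct values the consumer's behaviour is fully determined - the bar the thread contrasts with the 26 proposed quality-issue values, where what a tool should do with, say, 'grammar' is far less obvious.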
Received on Thursday, 2 August 2012 09:38:14 UTC