
Question on qi types (Fwd: Re: ITS 2.0 Acks)

From: Felix Sasaki <fsasaki@w3.org>
Date: Mon, 26 Nov 2012 06:17:32 +0100
Message-ID: <50B2FB6C.2040501@w3.org>
To: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>
See below. I hope that we can discuss these on the call today.

- Felix


-------- Original Message --------
Subject: 	Re: ITS 2.0 Acks
Date: 	Sun, 25 Nov 2012 23:18:46 +0100
From: 	Daniel Naber <naber@danielnaber.de>
To: 	Felix Sasaki <fsasaki@w3.org>
CC: 	Arle Lommel <arle.lommel@dfki.de>



On 22.11.2012, 07:13:36 Felix Sasaki wrote:

Hi Felix, Hi Arle,

> Once you have any pointers about the support in language tool, please
> let us know. We'd be more than happy to make people aware of this!

The quality types have now been implemented as a prototype for English in
the latest LT snapshot [1]. That means that the XML we return with our
results has been extended with a "locqualityissuetype" attribute.

For example:

<error fromy="-1" fromx="-2" toy="-1" tox="-2" ruleId="TRANSLATION_LENGTH"
msg="Source and target translation lengths are very different!"
replacements="" context="My foo blah foo " contextoffset="1" offset="1"
errorlength="15" locqualityissuetype="length"/>

You can only see this if you're using our API or the XML output.
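
As a rough sketch of how a client might read the new attribute from the XML
output (Python, standard library only): the function name issue_types is
made up for illustration, and the sketch assumes the <error> elements are
either the document root or nested under some wrapper element.

```python
import xml.etree.ElementTree as ET

# The example <error> element from above, used as sample input.
SAMPLE = '''<error fromy="-1" fromx="-2" toy="-1" tox="-2" ruleId="TRANSLATION_LENGTH"
msg="Source and target translation lengths are very different!"
replacements="" context="My foo blah foo " contextoffset="1" offset="1"
errorlength="15" locqualityissuetype="length"/>'''


def issue_types(xml_text):
    """Return (ruleId, locqualityissuetype) pairs for every <error> element.

    Hypothetical helper, not part of LanguageTool. The type is None for
    errors that carry no "locqualityissuetype" attribute.
    """
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself when the root is an <error>.
    return [
        (error.get("ruleId"), error.get("locqualityissuetype"))
        for error in root.iter("error")
    ]


if __name__ == "__main__":
    for rule_id, issue_type in issue_types(SAMPLE):
        print(rule_id, issue_type)
```

Rules that have not been mapped to a quality type yet would simply show up
with None as their type in this sketch.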

Some questions came up while implementing this:

* whitespace is described as "There is a mismatch in whitespace between
source and target content." -> we use this type when there's a whitespace
problem in the text itself. We do have some rules which compare source and
target text, but our whitespace rule is not one of them. I assume it still
makes sense to use this type?

* Typos like "way" instead of "was", i.e. where both are legal words, are
considered to be in "terminology". Is that correct? When we first talked
about this I think it was mentioned that the first value from the table that
fits should be selected (going from top to bottom). I cannot find that in
the appendix now; maybe this should be mentioned explicitly?

* register is described as "The text is written in the wrong linguistic
register or uses slang or other language variants inappropriate to the
text" -> does this also refer to variants like British English vs. American
English? If so, it should maybe be added as an example, as this might be
quite common.
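
To make the "first value from the table that fits" idea from the second
question concrete, here is a minimal sketch (Python). The list below is only
an illustrative subset of the type table, in an assumed order, and the names
ORDERED_TYPES and first_matching_type are made up for this example.

```python
# Illustrative subset of the quality type table, in an ASSUMED top-to-bottom
# order. The real table is longer and its order should be taken from the spec.
ORDERED_TYPES = ["terminology", "register", "whitespace", "length"]


def first_matching_type(candidates, ordered_types=ORDERED_TYPES, fallback=None):
    """Pick the first type in table order that is among the candidate types.

    candidates: the set of types a rule could plausibly be filed under.
    fallback: returned when none of the candidates is in the table.
    """
    for issue_type in ordered_types:
        if issue_type in candidates:
            return issue_type
    return fallback


if __name__ == "__main__":
    # A rule that fits both "whitespace" and "terminology" gets the type
    # that comes first in the table.
    print(first_matching_type({"whitespace", "terminology"}))
```

With such a rule the result depends only on the order of the table, so two
implementations would pick the same type for the same issue.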

Please let me know if you have any feedback.

Regards
  Daniel

[1] http://www.languagetool.org/download/snapshots/?C=M;O=D

-- 
http://www.danielnaber.de
Received on Monday, 26 November 2012 05:17:54 UTC
