- From: Mārcis Pinnis <marcis.pinnis@Tilde.lv>
- Date: Wed, 21 Nov 2012 10:23:52 +0200
- To: Felix Sasaki <fsasaki@w3.org>
- CC: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, dave lewis <dave.lewis@cs.tcd.ie>, "tadej.stajner@ijs.si" <tadej.stajner@ijs.si>
- Message-ID: <AC6FD4BB9BB02540AC7322091A6C3B5472B025A62D@postal.Tilde.lv>
Hi Felix,
If the base is xs:decimal, then - yes.
xs:decimal requires the decimal separator to be a point, not a comma as used in some locales (including Latvian). If that is not explicitly required, users could (mis)use locale-specific formats, which would cause compatibility problems. So, in my opinion, it is important that the format is a restricted variant of "xs:decimal" (the inclusive interval [0;1]).
Best regards,
Mārcis ;o)
From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, November 21, 2012 10:15 AM
To: Mārcis Pinnis
Cc: public-multilingualweb-lt@w3.org; dave lewis; tadej.stajner@ijs.si
Subject: Re: Atb.: [action 265] data category specific confidence scores
Thanks, Marcis. Just to be sure: you are saying that the mtConfidence, disambigConfidence and termConfidence should follow a definition like this:
<simpleType name='confidence'>
<restriction base='decimal'>
<minInclusive value='0'/>
<maxInclusive value='1'/>
</restriction>
</simpleType>
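As a rough illustration of what the restriction above means for a consumer (this is not part of the ITS draft), the following Python sketch checks a confidence value against the xs:decimal lexical form and the inclusive interval [0, 1]. The function name parse_confidence and the regular expression are assumptions made for this sketch; the pattern reflects the xs:decimal lexical form as I understand it (optional sign, decimal digits, a point as separator, no exponent).

```python
import re
from decimal import Decimal

# Assumed rendering of the xs:decimal lexical form: optional sign, ASCII
# digits, an optional fractional part introduced by a point (never a comma),
# and no exponent notation.
_XS_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def parse_confidence(text: str) -> Decimal:
    """Return the value if `text` is an xs:decimal in [0, 1], else raise ValueError."""
    candidate = text.strip()  # approximates XML Schema whitespace collapsing
    if not _XS_DECIMAL.fullmatch(candidate):
        raise ValueError(f"not an xs:decimal: {text!r}")
    value = Decimal(candidate)
    if not (Decimal(0) <= value <= Decimal(1)):
        raise ValueError(f"outside the interval [0, 1]: {text!r}")
    return value
```

With this check, "0.3" is accepted, while "0,3", "1/3", "0.(3)" and "1.5" are all rejected.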
Best,
Felix
On 21.11.12 08:55, Mārcis Pinnis wrote:
Hi Felix,
I went over the wording; there is only one thing that I find misleading and ambiguous.
The following are all valid mathematical representations of rational numbers: 0.3; 1/3; 0.(3) ... and with locale specifics: 0,3; 0,(3)
If the format specifies "a rational number in the interval 0 to 1", then I would assume (as a mathematician) that all of these are valid. Maybe the wording should be changed to "a finite decimal number" (this would also hardcode the number base...). Maybe it is irrelevant, maybe not, but if there is already a limitation of 0 to 1, then I think there should also be a clear indication of the base and format (maybe it is possible to borrow the definition of xs:decimal http://www.w3.org/TR/xmlschema-2/#decimal).
Best regards,
Mārcis ;o)
-----Original Message-----
From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Tuesday, November 20, 2012 11:07 PM
To: public-multilingualweb-lt@w3.org; dave lewis; tadej.stajner@ijs.si; Mārcis Pinnis
Subject: Re: Atb.: [action 265] data category specific confidence scores
Hi Dave, Marcis, Tadej, all,
On 14.11.12 19:27, Tadej Stajner wrote:
Hi, Dave, Marcis,
(see below)
On 11/14/2012 5:47 PM, Dave Lewis wrote:
Thanks for the feedback; comments inline.
On 13/11/2012 19:56, Mārcis Pinnis wrote:
Hi Dave,
1) I support your suggestion as drafted in the attachment.
2) Although I believe there is a typing mistake:
<p>And he said: you need a new <quote its:term="yes"
its-info-term-ref="http://www.directron.com/motherboards1.html"
its-term-confidence="0.5">motherboard</quote></p>
I believe its-info-term-ref should actually be its-term-info-ref?!
thanks for spotting that, we'll fix it.
This should be fixed now at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-terms-selector-4
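To illustrate why the attribute name matters, here is a minimal Python sketch of a consumer that collects term annotations from markup like the example above. It assumes, as suggested in this thread, that its-term-info-ref is the intended attribute name; the class and function names are hypothetical and only the standard-library html.parser module is used.

```python
from html.parser import HTMLParser


class TermAnnotationCollector(HTMLParser):
    """Collects (tag, info reference, confidence) for annotated elements."""

    def __init__(self):
        super().__init__()
        self.annotations = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "its-term-confidence" in attributes:
            self.annotations.append((
                tag,
                attributes.get("its-term-info-ref"),  # None if misspelled or absent
                attributes["its-term-confidence"],
            ))


def collect_term_annotations(html_text):
    """Hypothetical helper: parse the markup and return the collected tuples."""
    collector = TermAnnotationCollector()
    collector.feed(html_text)
    collector.close()
    return collector.annotations


URL = "http://www.directron.com/motherboards1.html"
corrected = ('<p>And he said: you need a new <quote its:term="yes" '
             f'its-term-info-ref="{URL}" '
             'its-term-confidence="0.5">motherboard</quote></p>')
misspelled = corrected.replace("its-term-info-ref", "its-info-term-ref")
```

A consumer written this way finds the reference in the corrected markup but silently gets no reference from the misspelled variant, so the typo would not surface as an error.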
3) Also, just a comment (our systems won't be affected, but...):
why do you want to restrict the values to the range 0 to 1? In
statistics it is also quite common to use log-scale probabilities
(because the numbers are otherwise very small in some cases). Is it
necessary to restrict users to a 0 to 1 interval? I would suggest
leaving the decision up to the users. Also, the tools will have to be
identified anyway. This means that users will be able to find out
(if needed) from the systems how to parse (understand) the
confidence scores. This is a general question that applies to other
confidence scores as well.
In general, we do not attach inter-tool significance to the
confidence scores, hence the requirement to specify the tool using
its-tools-ref. Normalising the score to 0-1 is therefore not intended
to support inter-tool comparisons, but rather to give the presenting
software a stable range of values to display.
On that note, I'd suggest explicitly adding a sentence that the scores
are comparable only in the context of the same tool. It might be
obvious to us, but it's an important point.
I tried to add that to the 1st paragraph at http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation
for terminology, mt confidence and disambiguation.
Best,
Felix
-- Tadej
For the MT confidence score, the implementers concerned suggested 0..1;
the use of log-scale didn't come up. So, for no deeper reason than
consistency, I'd suggest we keep the same for the term and disambig
confidence scores, unless there is a pressing reason to do otherwise.
cheers,
Dave
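For tools that work with log-scale probabilities internally, one possible way to fit the 0..1 range discussed above is to convert before writing the attribute. The Python sketch below assumes natural-log probabilities; the function names and the number of fractional digits are illustrative choices, not something specified in this thread.

```python
import math


def log_prob_to_confidence(log_prob: float) -> float:
    """Map a natural-log probability (<= 0) to a confidence in [0, 1]."""
    if math.isnan(log_prob) or log_prob > 0:
        raise ValueError("a log-probability must be a number <= 0")
    return math.exp(log_prob)


def format_confidence(value: float, digits: int = 4) -> str:
    """Serialise with a point as decimal separator and no exponent."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    # Fixed-point formatting is locale-independent in Python.
    return f"{value:.{digits}f}"
```

Note the trade-off raised in point 3: with a fixed number of digits, very small probabilities all serialise as zero, which is exactly the resolution that log-scale values would preserve.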
4) I agree that in the current proposal it would not be reasonable
to add a confidence score, as in a multiple-domain scenario it would
be misleading/wrong and would require a different solution (for
instance, similar to how domains can be marked).
Best regards,
Mārcis ;o)
________________________________________
From: Dave Lewis [dave.lewis@cs.tcd.ie]
Sent: Tuesday, 13 November 2012, 19:30
To: Multilingual Web LT Public List
Subject: Fwd: [action 265] data category specific confidence scores
Hi all,
To try and wrap up this point:
Summary of Discussion so far:
1) Text analytics annotation was proposed as a way of offering a
confidence score for text analytics results. As with the mtConfidence
score, the tool annotation is now covered by the itsTool feature,
but the proposal for confidence scores remains.
2) Marcis pointed out, using real world terminology use cases, that
we may have several annotations operating on the same fragment, so
applying a confidence score to different text analytics annotations
with a single data category won't work in these cases because of
complete override.
Also, if we used text analytics annotation with annotation from
other data categories, we would be breaking our 'no dependencies
between data categories' rule.
3) We could overcome the complete override problems using standoff
markup, as in loc quality issue and provenance. But as the confidence
score would be different for each annotated fragment, that would
result in very big standoff records, and we would still be breaking
the data category dependencies rule. So this doesn't seem a realistic
option.
4) so the suggestion discussed in Lyon was to drop text analytics
annotation altogether as a separate data category and focus on
adding confidence attributes to the existing data categories that
would benefit from it.
so.....
Proposal:
I therefore suggest the following, and we need your feedback by
Friday 16th Nov so we can wrap this up on the Monday call!
For those extended with a confidence score (terminology,
disambiguation), please express your support and any comments by
Friday; if we don't receive any, we will definitely drop these
suggestions. Marcis, Tadej in particular, please consider reviewing these.
For exclusions (domain, localizationQualityIssue), this is your last
chance to counter-argue in favour of including them; otherwise assume
these are dropped also.
i) confidence for terminology: as suggested by Marcis
(http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0028.html),
revised data category as word revisions attached
(addition to local definition, note on its-tools and example 38)
ii) confidence for disambiguation: revised data category as word
revisions attached (addition to local definition, note on its-tools
and ex 52)
iii) domain: I suggest excluding this as an annotation to which we
attach a confidence score. It's not clear that the use of text
analytics to identify domain, while feasible, actually represents a
real use case for interoperability markup. If used, it would probably
be internalized by the MT engine. Also, since there are multiple
domain values, the semantics of a single confidence score are unclear.
iv) localizationQualityIssue: I suggest also excluding this as an
annotation to which we attach confidence scores. The use of
statistical text analytics doesn't seem common for QA tasks. One
exception is the recent innovation by Digital Linguistics, whose
Review Sentinel product ranks translations by a TA assessment for QA
purposes, but this is innovative and not current practice, so it's
probably not yet a concrete use case.
cheers,
Dave
Received on Wednesday, 21 November 2012 08:24:22 UTC