Re: Solution via standoff markup ? (Re: Markup for quality)

I'm undecided on this. I think Felix’s idea makes sense as a way to handle this need with existing markup. But I see both sides of this issue, and I think we need more discussion because we don't want to go down the wrong road just to meet a deadline.

As I see it we have two options:

We use Felix’s concatenation method and specify that if two spanning elements share an ID, their content should be concatenated. (Where do we do this? In the spec? That would be a fairly substantive addition.)
We use an external namespace for the carrier elements and use empty elements.
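
For comparison, here is a rough Python sketch of how a consumer might read the second option. It is only an illustration under assumptions: it uses the un-namespaced mqm-startIssue/mqm-endIssue markers from the internal MQM example quoted further down (written as well-formed empty elements), and the function name text_by_milestone is made up for this sketch.

```python
import xml.etree.ElementTree as ET

def text_by_milestone(xml_string):
    """Collect the text between each start/end marker pair (hypothetical helper)."""
    root = ET.fromstring(xml_string)
    open_ids = set()   # issues whose start marker has been seen but not closed
    collected = {}     # issue id -> text seen while that issue was open

    def add(text):
        if text:
            for issue_id in open_ids:
                collected[issue_id] = collected.get(issue_id, "") + text

    def walk(elem):
        if elem.tag == "mqm-startIssue":
            open_ids.add(elem.get("id"))
            collected.setdefault(elem.get("id"), "")
        elif elem.tag == "mqm-endIssue":
            open_ids.discard(elem.get("id"))
        add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)  # text following the child, still in document order

    walk(root)
    return collected

EXAMPLE = (
    '<p>Fifteen <mqm-startIssue type="markup, misplaced" id="1" />'
    '<em>relays <mqm-startIssue type="agreement" id="2" />is</em>'
    '<mqm-endIssue id="1" /> involved<mqm-endIssue id="2" />'
    ' in the operation.</p>'
)

print(text_by_milestone(EXAMPLE))
```

With the relays example this gives "relays is" for issue 1 and "is involved" for issue 2, so overlap is recoverable, but every consumer has to track the open markers itself.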

My initial reaction to Felix’s idea was much like Yves’, but as I consider it, I think it actually has one advantage that might not be bad at all: it allows non-contiguous selection, which would be nice for a case like this:

The man, whom I had seen walking out of Mr. Jones’ store on a last Friday when I had gone to visit my friend who lives by the dock, were big.

There are two issues in this (aside from the general awkwardness of the high level of embedding):

agreement: man… were
grammar: a last Friday

While spans could be nested here (so these aren't overlapping spans), the problem is that the first one doesn't contain the second one and in fact it doesn't actually “contain” any of the embedded clause, so this version is not right:

The <span its-loc-quality-issues-ref="#lqi1">man, whom I had seen walking out of Mr. Jones’ store on <span its-loc-quality-issues-ref="#lqi2">a last Friday</span> when I had gone to visit my friend who lives by the dock, were</span> big.

This seriously overstates the scope of the first error. Color coded, it might appear as:

The [red: man, whom I had seen walking out of Mr. Jones’ store on ][green: a last Friday][red:  when I had gone to visit my friend who lives by the dock, were] big.

This shows the problem: it actually says that error 1 relates to everything in red, but, by overriding, somehow doesn't relate to the bit in green, even though it does to everything around it. That isn't the intended semantics at all.

But if we take Felix’s approach, we can end up with this:

The <span its-loc-quality-issues-ref="#lqi1">man</span>, whom I had seen walking out of Mr. Jones’ store on <span its-loc-quality-issues-ref="#lqi2">a last Friday</span> when I had gone to visit my friend who lives by the dock, <span its-loc-quality-issues-ref="#lqi1">were</span> big.

If we were to color code this for display, it might look like this: 

The [red: man], whom I had seen walking out of Mr. Jones’ store on [green: a last Friday] when I had gone to visit my friend who lives by the dock, [red: were] big.

Which is the precise intended meaning. While most users won't be marking non-contiguous selections like that, it is the most accurate way to represent this issue.
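
To make the difference between the two markups concrete, here is a minimal Python sketch of the concatenation reading. It assumes that spans sharing the same its-loc-quality-issues-ref value are treated as fragments of one issue, which ITS 2.0 does not currently specify; the function name fragments_by_issue and the wrapping <p> element are only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

REF_ATTR = "its-loc-quality-issues-ref"

def fragments_by_issue(xml_string):
    """Map each issue reference to the text of every span that carries it."""
    root = ET.fromstring(xml_string)
    fragments = defaultdict(list)
    # iter() visits elements in document order, so fragments stay in reading order.
    for elem in root.iter():
        ref = elem.get(REF_ATTR)
        if ref is not None:
            # itertext() also includes the text of any nested spans.
            fragments[ref].append("".join(elem.itertext()))
    return dict(fragments)

NESTED = (
    '<p>The <span its-loc-quality-issues-ref="#lqi1">man, whom I had seen '
    'walking out of Mr. Jones’ store on '
    '<span its-loc-quality-issues-ref="#lqi2">a last Friday</span> when I had '
    'gone to visit my friend who lives by the dock, were</span> big.</p>'
)

SPLIT = (
    '<p>The <span its-loc-quality-issues-ref="#lqi1">man</span>, whom I had '
    'seen walking out of Mr. Jones’ store on '
    '<span its-loc-quality-issues-ref="#lqi2">a last Friday</span> when I had '
    'gone to visit my friend who lives by the dock, '
    '<span its-loc-quality-issues-ref="#lqi1">were</span> big.</p>'
)

for label, markup in (("nested", NESTED), ("split", SPLIT)):
    print(label, fragments_by_issue(markup))
```

For the split markup, issue #lqi1 yields the two fragments "man" and "were". For the nested markup it yields the whole clause from "man" through "were", including "a last Friday".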

So I think it's worth considering Felix’s solution, because we actually gain something besides being able to represent overlapping spans: we gain precise scoping control.

-Arle

On 2013 Jun 17, at 11:27 , "Yves Savourel" <yves.savourel@gmail.com> wrote:

> Hum… it’s basically duplicating the info.
> But how do you know two spans with that info need to be merged? Nothing currently prevents the re-use of reference IDs if the issue is the same. So how do we distinguish the two cases?
>  
> Overall that solution feels like a hack, compared to having two empty elements. I understand that introducing 2 empty elements is not really possible because of the timeline, but aren’t we introducing a bad way to do one thing just because of a deadline? In 5 years from now that may look like a bad idea.
>  
> Just thinking aloud
> -ys
>  
>  
> From: Aljoscha Burchardt [mailto:aljoscha.burchardt@dfki.de] 
> Sent: Monday, June 17, 2013 9:48 AM
> To: Felix Sasaki
> Cc: Arle Lommel; public-i18n-its-ig@w3.org; kim_harris@textform.com; Hans Uszkoreit
> Subject: Re: Solution via standoff markup ? (Re: Markup for quality)
>  
> Hi Felix,
>  
> this sounds good. Let's see whether Arle sees any issues.
>  
> Best,
> Al
>  
> On 16.06.2013, at 20:50, Felix Sasaki <fsasaki@w3.org> wrote:
> 
> 
> Hi Arle, all,
> 
> I have given this another thought, and maybe ITS 2.0 already has the solution to the overlap problem.
> 
> This is what you proposed for mqm:
> 
> <p>Fifteen <mqm-startIssue type="markup, misplaced" id="1" /><em>relays <mqm-startIssue type="agreement" id="2" />is</em><mqm-endIssue id="1" /> involved<mqm-endIssue id="2" /> in the operation.</p>
> 
> Now, in ITS 2.0 we have standoff markup. So far we haven't used it for representing overlap, but it seems to be straightforward:
> 
> <p>Fifteen <span its-loc-quality-issues-ref="#lqi1"><em>relays <span its-loc-quality-issues-ref="#lqi2">is</span></em></span><span its-loc-quality-issues-ref="#lqi2"> involved</span> in the operation.</p>
> 
> Here are the targets of the its-loc-quality-issues-ref attributes:
> 
> <its:locQualityIssues xml:id="lqi1" xmlns:its="http://www.w3.org/2005/11/its">
>         <its:locQualityIssue itsx:mqmType="markup, misplaced"/>
> </its:locQualityIssues>
> 
> <its:locQualityIssues xml:id="lqi2" xmlns:its="http://www.w3.org/2005/11/its">
>         <its:locQualityIssue itsx:mqmType="agreement"/>
> </its:locQualityIssues>
> 
> A query via e.g. XPath concatenating all content that has the standoff markup with xml:id="lqi1" will give you this content (markup stripped out):
> "relays is"
> For xml:id="lqi2" you get this:
> "is involved"
> And that is what you want, no?
> 
> We don't say what an ITS 2.0 application should do with identical "its-loc-quality-issues-ref" values. Concatenating them like above seems like a reasonable interpretation for MQM. Thoughts?
> 
> Also, would you be available to dial in for the f2f Monday afternoon or Tuesday afternoon to move this forward?
> 
> Best,
> 
> Felix
> 
> Am 10.06.13 11:26, schrieb Arle Lommel:
> Hi all,
>  
> One of the issues Felix and I discussed for improving compatibility between Multidimensional Quality Metrics (MQM) (the QTLaunchPad quality system originally derived from ITS 2.0) and ITS 2.0 is the following:
>  
> We need a way to mark up overlapping spans. For example, if you have the following HTML5 segment:
>  
> <p>Fifteen <em>relays is</em> involved in the operation.</p>
>  
> Which should be
>  
> <p><em>Fifteen relays</em> are involved in the operation.</p>
>  
> You have two issues:
>  
> The markup is misplaced (ITS 2.0 markup and MQM markup, misplaced, which is a subtype of markup)
> There is an agreement error (ITS 2.0 grammar and MQM agreement, which is a subtype of grammar)
>  
> The mapping from MQM to ITS 2.0 is clear here, but we need a way to mark up the overlapping spans. So far we have internally used something like this:
>  
> <p>Fifteen <mqm-startIssue type="markup, misplaced" id="1" /><em>relays <mqm-startIssue type="agreement" id="2" />is</em><mqm-endIssue id="1" /> involved<mqm-endIssue id="2" /> in the operation.</p>
>  
> We want a good path to interoperability with ITS. So we need a way to put the following information in the document on overlapping spans using local markup:
>  
> its-loc-quality-issue-type="grammar" itsx-mqm-issue-type="agreement" its-loc-quality-comment="should be &quot;relays are&quot;" (etc…)
>  
> Any suggestions for how to handle this use case? We want to make it as easy as possible to use MQM and ITS together, where MQM provides mechanisms for greater granularity while still retaining compatibility with ITS and ITS provides a way to share MQM data at a common granularity with other systems.
>  
> Right now we are working to ensure that ITS 2.0 will be fully conformant to MQM (with a few simple mappings for things like issue type names) and that MQM will have a clean mapping to ITS 2.0. (Note as well that MQM will provide ways to define quality profiles and handle some things not covered by ITS, like sharing scoring methods, possible data category selections, etc., so MQM adds significant capability to ITS 2.0 and isn't just an alternative, but rather a larger way of handling some details out of scope for ITS 2.0.)
>  
> I'll write more up later, but if anyone has good ideas for how to handle the overlapping spans in an ITS 2.0-friendly way, please make suggestions.
>  
> Best,
>  
> Arle
>  

Received on Monday, 17 June 2013 11:40:05 UTC