Re: [ISSUE 34] Quality error category proposal

Hi Yves, all,

"Overlap" solutions for XML or SGML (see an overview at
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html
) always have drawbacks, for example: XPath expressions get complicated
("give me all pieces of content that are not translatable, that are a term",
etc.), you cannot use the various kinds of metadata that rely on the tree
structure (in addition to ITS, e.g. the "dir" attribute or lang and xml:lang,
including the XPath "lang()" function), and validation with schema languages
becomes useless, since the references are not part of the content models.
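
To illustrate the first drawback, here is a minimal Python sketch (my own illustration, not part of any spec). It uses the standard xml.etree.ElementTree module; the helper names text_in_tree and text_in_milestones are hypothetical, and the milestone helper assumes the <sm/>/<em/> markers are direct children of <target>. In the tree form one path expression finds the annotated text; in the milestone form the content has to be walked by hand.

```python
import xml.etree.ElementTree as ET

# Tree form: the annotated text is simply the content of the element.
tree_form = ET.fromstring(
    "<target>Insert the <mrk id='m1'>Pen Drive</mrk> in the USB port</target>"
)
# Milestone form: <sm/> and <em/> only mark where the span starts and ends.
milestone_form = ET.fromstring(
    "<target>Insert the <sm id='m1'/>Pen Drive<em rid='m1'/> in the USB port</target>"
)


def text_in_tree(root, span_id):
    """Return the text of the <mrk> with the given id (one path expression)."""
    node = root.find(f".//mrk[@id='{span_id}']")
    return "".join(node.itertext())


def text_in_milestones(root, span_id):
    """Collect the text between <sm id=...> and <em rid=...>.

    Assumption: both markers are direct children of root.
    """
    collected = []
    inside = False
    for child in root:
        if child.tag == "sm" and child.get("id") == span_id:
            inside = True
        elif child.tag == "em" and child.get("rid") == span_id:
            inside = False
        # ElementTree stores the text that follows an element in its .tail
        if inside and child.tail:
            collected.append(child.tail)
    return "".join(collected)


print(text_in_tree(tree_form, "m1"))
print(text_in_milestones(milestone_form, "m1"))
```

Both calls select the same text, but only the first can be expressed as a simple path over the tree.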

ITS 2.0 has XLIFF round-tripping as one scenario, but also other scenarios
(general XML and HTML5) that rely on the tree structure. From my
understanding, the standoff approach of XLIFF creates problems for various
kinds of metadata (see above). So one could define a round-tripping
approach that moves the metadata to referenceable sections, e.g. from HTML5
<span its-term="yes" its-loc-note-type="alert"
its-loc-note-description="...">...</span>
to
<mrk id='m1' type='x-itsinfo' ref="#itsinfo1">...</mrk>
<metadata id="itsinfo1" its-term="yes" its-loc-note-type="alert"
its-loc-note-description="...">
...
</metadata>

I think round tripping wouldn't be an issue here.
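
As a rough sketch of why I think so (an illustration under assumptions, not a proposed implementation): the Python below uses the standard xml.etree.ElementTree module. The function names to_standoff and to_inline are hypothetical, the <metadata> element and the 'x-itsinfo' type are the placeholders from the example above, and only spans with plain text content are handled.

```python
import xml.etree.ElementTree as ET


def to_standoff(span, index):
    """Split an annotated HTML5 <span> into a <mrk> plus a referenced <metadata>."""
    meta_id = f"itsinfo{index}"
    mrk = ET.Element(
        "mrk", {"id": f"m{index}", "type": "x-itsinfo", "ref": f"#{meta_id}"}
    )
    mrk.text = span.text  # assumption: the span holds plain text only
    metadata = ET.Element("metadata", {"id": meta_id})
    for name, value in span.attrib.items():
        if name.startswith("its-"):
            metadata.set(name, value)
    return mrk, metadata


def to_inline(mrk, metadata_by_id):
    """Rebuild the <span> by following the 'ref' of the <mrk>."""
    meta = metadata_by_id[mrk.get("ref").lstrip("#")]
    span = ET.Element("span")
    span.text = mrk.text
    for name, value in meta.attrib.items():
        if name.startswith("its-"):
            span.set(name, value)
    return span


original = ET.fromstring(
    '<span its-term="yes" its-loc-note-type="alert">Pen Drive</span>'
)
mrk, metadata = to_standoff(original, 1)
restored = to_inline(mrk, {metadata.get("id"): metadata})
print(ET.tostring(mrk, encoding="unicode"))
print(ET.tostring(restored, encoding="unicode"))
```

The ITS attributes come back unchanged on the span, which is all the round trip needs in this simple case.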

Best,

Felix

2012/7/11 Yves Savourel <ysavourel@enlaso.com>

> Hi Felix, Arle, Phil, all,
>
> There is something that doesn't feel right about that simple solution.
> I still need to think more about it.
>
> But meanwhile, I wanted to point out something else:
>
> Obviously having multiple attributes on a span-like element is the least
> intrusive way to annotate in HTML5, and possibly in many XML formats. But
> this poses a challenge in XLIFF 2.0.
>
>
> In XLIFF 1.2 we would possibly use the ITS attribute directly in <mrk> and
> get something like this:
>
> <target>Insert the <mrk mtype='x-itsqa' its:error="yes"
> its:errorInfo="Should be USB key" its:error????="URI to machine readable
> information">Pen Drive</mrk> in the USB port</target>
>
>
> In XLIFF 2.0 the pattern to annotate content is quite different.
>
> Using attributes for annotation does not work very well when the data
> reaches a certain level of complexity.
>
> - if the information is very short we would use the 'type' and 'value'
> attributes of <mrk> to hold the information.
>
> - if the information is complex we would point to an outside element. So
> something like this:
>
> <unit>
>  <segment>
>   <source>...</source>
>   <target>Insert the <mrk id='m1' type='x-itsqa' ref="#qa1">Pen
> Drive</mrk> in the USB port</target>
>  </segment>
>  <its:qaEntry id="qa1">
>   <its:qaNote>Should be USB key</its:qaNote>
>   ... any extra info
>  </its:qaEntry>
> </unit>
>
> In addition one thing that was not provided in 1.2 but is in 2.0: how to
> deal with overlapping spans.
>
> Some background:
>
> <mrk> can be used for many things, including annotating things that
> overlap: for example a QA annotation and a user comment:
>
> <target>Insert <mrk id='m1' type="comment" value="Not sure about 'the
> pen'" >the <mrk id='m2' type='x-itsqa' ref="#qa1">Pen</mrk> Drive</mrk> in
> the USB port</target>
>
> But such notation obviously doesn't work because it's not well-formed.
> (Remember that we can't control how annotations are created: XLIFF is just
> an input/output, we can only provide a way to represent these cases. And
> even if we had some control on the creation, <mrk> can be broken by
> segmentation, so there is no easy way to always use <mrk>...</mrk>).
>
> To represent non-well-formed annotations, we use the same principle as for
> inline spanning codes: we change the spanning element to two empty elements
> that indicate the start and end of the span:
>
> <target>Insert <sm id='m1' type="comment" value="Not sure about 'the
> pen'"/>the <sm id='m2' type='x-itsqa' ref="#qa1"/>Pen<em rid='m1'/>
> Drive<em rid='m2'/> in the USB port</target>
>
> So what's the problem? We can still use the ITS attributes, no?
> No, because the scope of the ITS local attributes is the content of the
> element where the attribute is. So when we use <sm/>...<em/> instead of
> <mrk>...</mrk>, the scope on <sm/> is empty content.
>
> This is not an issue when we use type and value (the simple annotation)
> because the semantics of those attributes take into account the <sm>/<em>
> notation.
>
> In any case, since we do have a way to point to an element somewhere else
> in the <unit>, it would be the natural way to annotate XLIFF 2.0 for
> 'complex' annotations, like the QA info.
>
> Cheers,
> -yves
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Wednesday, 11 July 2012 09:49:30 UTC