Re: Quality markup sample

Thanks for this, Arle. Good catch on the multiple errors. I'm not familiar with how additive markup is achieved. It sounds like you would end up with some sort of external file that holds multiple pointers to the same element?
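
To make the question concrete, here is a minimal sketch of one possible reading of "additive" markup, not a description of what the working group has decided. The first part shows why the inline approach runs out of room: a start tag cannot carry the same attribute twice. The second part assumes a hypothetical standoff list (the `STANDOFF` records and their `ref`, `type`, and `comment` keys are invented for illustration) in which each record points at an element id, so several records can target the same element. The second issue recorded for `s0004` is illustrative only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Inline approach: the same attribute written twice on one start tag.
# An XML parser rejects this as not well-formed.
DUPLICATED = (
    '<span id="s0004" its-qualitytype="characters" '
    'its-qualitytype="untranslated">text</span>'
)


def parses(markup: str) -> bool:
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical standoff records (names and structure are assumptions).
# Each record points at an element id, so any number of records may
# refer to the same element.
STANDOFF = [
    {"ref": "s0004", "type": "characters",
     "comment": "CJK characters appear in an English segment"},
    {"ref": "s0004", "type": "untranslated",
     "comment": "Illustrative second issue on the same segment"},
    {"ref": "h0001", "type": "untranslated",
     "comment": "Heading is identical to the source"},
]


def issues_by_element(records):
    """Group standoff issue types by the id of the element they point to."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["ref"]].append(record["type"])
    return dict(grouped)


if __name__ == "__main__":
    print(parses(DUPLICATED))
    print(issues_by_element(STANDOFF))
```

Whether such records would live in a separate file or in a block inside the same document is left open here; the point is only that pointers, unlike attributes, can be repeated for one element.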

Phil



On 30 Jul 2012, at 13:38, "Arle Lommel" <arle.lommel@dfki.de> wrote:

> Hi all
> 
> I have a sample to follow up on the list of categories from earlier today. I've marked up a sample using the top-level categories plus some codes from Okapi (Yves may want to refine it) using local, inline markup.
> 
> We'll follow up with the formal write-up of the categories later, but I did want to get a piece of marked-up text out there for examination. Note that some of these would not be actual errors (like the "untranslated" title).
> 
> I've bolded all of the added markup and made it red. Note that the its-qualitycomment bit is a prose explanation (either manually or automatically generated) of the annotation.
> 
> I have already seen one issue. If we do not use additive markup, we have no way to handle multiple "errors" that apply to the same piece of content, since attributes cannot be repeated. Does anyone have any thoughts?
> 
> Arle
> 
> <!DOCTYPE html>
> <html lang="en">
>  <head>
>   <meta charset="utf-8" />
>   <!-- NOTE: The following line would point to the online description of the error categories utilized by the tool -->
>    <meta its-qualityprofile="http://someURI.org/somestring/" />
>   <meta its-qualityscore="93" />
>    <title its-qualitytype="untranslated;okapi:TARGET_SAME_AS_SOURCE">Telharmonium 1897</title>
>  </head>
>  <body>
>   <!-- NOTE: This translation is an unreviewed and unrevised draft with a number of intentionally introduced errors used for testing purposes -->
>   <h1 id="h0001" its-qualitytype="untranslated;okapi:TARGET_SAME_AS_SOURCE">Telharmonium (1897)</h1>
>   <p id="p0001">
>    <span class="segment" id="s0001">Thaddeus Cahill (1867–1934) conceived of an instrument that could transmit
>     its sound from a power plant for hundreds of miles to listeners over telegraph wiring.</span>
>    <span class="segment" id="s0002">Beginning in 1889 the sound quality of regular telephone concerts was very
>     poor on account of the buzzing generated by carbon-granule microphones. As a result Cahill decided to
>     set a new standard in perfection of sound quality with his instrument, a standard that would not only
>     satisfy listeners but that would overcome all the flaws of traditional instruments.</span>
>   </p>
>   <img src="http://estrip.org/content/users/paul/0104/telharmonium260.jpg" alt="A telharmonium képe"
>    style="float:left;margin:6px 12px 6px 6px;" height="181" width="260"
>    its-qualitytype="omission;okapi:EMPTY_SOURCE_SEGMENT"
>    its-qualitycomment="It appears that the graphic was moved in the English version to this location" />
>   <p id="p0002" its-qualitytype="markup;okapi:MISSING_TAG_IN_TARGET"
>    its-qualitycomment="The tag for the previous image was moved from this &lt;p> tag to the one above">
>    <span class="segment" id="s0003">He experimented for twelve years so that it would not only be perfect
>     instrument but would also be able to produce the emotional range of a piano or a violin, and even be
>     superior in the possibilities of its sound to church organs.</span>
>    <span class="segment" id="s0004" its-qualitytype="characters;okapi:ALLOWED_CHARACTERS"
>     its-qualitycomment="The expected language is English but CJK characters appear in this segment.">He
>     produced a particularly detailed patent description, but in 1895 his petition was rejected on the
>     grounds that it contained a number of discoveries patented by earlier 発明者.</span>
>    <span class="segment" id="s0005">Cahill struggled with the patent őffice to establish the unique
>     contributions of his instrument, so it was only on April 6, 1897 that they granted his patent.</span>
>   </p>
>  </body>
> </html>
> 
> 
> 


Received on Tuesday, 31 July 2012 18:44:40 UTC