- From: Phil Ritchie <philr@vistatec.ie>
- Date: Tue, 2 Oct 2012 12:22:02 +0100
- To: Felix Sasaki <fsasaki@w3.org>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <OFF7F947C4.2D10AFF8-ON80257A8B.003E4BE5-80257A8B.003E71C6@vistatec.ie>
OK, understood. Hmm. I think use of the script element will break my
implementation. I'll have to check.
Phil.
From: Felix Sasaki <fsasaki@w3.org>
To: Phil Ritchie <philr@vistatec.ie>,
Cc: public-multilingualweb-lt@w3.org
Date: 02/10/2012 11:04
Subject: Re: ACTION-233: Update quality issue example to use the
solution (XML in "script" tag) for standoff markup
2012/10/2 Phil Ritchie <philr@vistatec.ie>
Felix
Before I can answer the question can you tell me what the motivation for
using the script tags is?
There are two motivations. One is based on
https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
where ITS rules files sit inside HTML5. This seems to be a
requirement from Linguaserve: the rules are not linked, but embedded in
the HTML5 file. So far Linguaserve has put the rules files "just
somewhere", which makes the document invalid HTML5. With the rules in
the "script" element, it becomes valid again.
The other motivation is that the standoff we had so far for HTML5 looked
like this:
<span its-loc-quality-issues-ref=#lq1>c'es</span> le contenu</p>
<span id=lq1
      its-loc-quality-issues=its-loc-quality-issues>
  <span
      its-loc-quality-issue=its-loc-quality-issue
      its-loc-quality-issue-comment="Sentence without capitalization"
      its-loc-quality-issue-severity=30
      its-loc-quality-issue-type=typographical></span>
  <span
      its-loc-quality-issue=its-loc-quality-issue
      its-loc-quality-issue-comment="'c'es' is unknown. Could be 'c'est'"
      its-loc-quality-issue-severity=50
      its-loc-quality-issue-type=misspelling></span>
</span>
"span" is misused to "transport" standoff metadata in the "body"
element. It works, but is not very clean. Hence "script", which is defined
for that purpose; see
http://dev.w3.org/html5/spec/the-script-element.html
about "application/xml" and other types:
"These types are explicitly listed here because they are poorly-defined
types that are nonetheless likely to be used as formats for data blocks,
and it would be problematic if they were suddenly to be interpreted as
script by a user agent."
Jirka had mentioned this solution on the afternoon of 26 September:
http://www.w3.org/2012/09/26-mlw-lt-minutes.html
search for "current recommendation is to put the tool info xml into script
in html"
and pointed us to the related DOM methods
https://developer.mozilla.org/en-US/docs/DOM/DOMParser
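For readers working outside the browser, here is a rough sketch of the step that follows parsing: matching a local reference such as "#lq1" against the xml:id of the standoff element. It is only an illustration in Python with the standard library. The function name resolve_issues_ref and the STANDOFF sample string are hypothetical, and references to other documents are simply not handled.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the xml:id attribute with the full XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: the text content of the "script" data block.
STANDOFF = """
<its:locQualityIssues xml:id="lq1" xmlns:its="http://www.w3.org/2005/11/its">
  <its:locQualityIssue locQualityIssueType="misspelling"
      locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
      locQualityIssueSeverity="50"/>
  <its:locQualityIssue locQualityIssueType="typographical"
      locQualityIssueComment="Sentence without capitalization"
      locQualityIssueSeverity="30"/>
</its:locQualityIssues>
"""


def resolve_issues_ref(ref, standoff_xml):
    """Return the element whose xml:id matches a local '#id' reference.

    Returns None for references that are not local or that match nothing.
    """
    if not ref.startswith("#"):
        return None  # assumption: only same-document references are handled
    wanted = ref[1:]
    root = ET.fromstring(standoff_xml.strip())
    for element in root.iter():  # root itself is included in the walk
        if element.get(XML_ID) == wanted:
            return element
    return None


if __name__ == "__main__":
    target = resolve_issues_ref("#lq1", STANDOFF)
    if target is not None:
        for issue in target:
            print(issue.get("locQualityIssueType"),
                  issue.get("locQualityIssueSeverity"))
```

In the browser the same lookup would be done on the document that DOMParser returns; the Python version is only meant to show that nothing browser-specific is needed once the XML text is available.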
Felix
My demo in Prague used standoff markup without needing to wrap it in
script tags.
Phil.
From: Felix Sasaki <fsasaki@w3.org>
To: public-multilingualweb-lt@w3.org,
Date: 02/10/2012 09:17
Subject: ACTION-233: Update quality issue example to use the
solution (XML in "script" tag) for standoff markup
Hi all,
I updated the qaissue example to use XML in the script element, see
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-locQualityIssue-html5-local-2
The standoff metadata is now in a dedicated "script" element. See also
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js
So this works, but I have a question for implementors who use HTML5 as
input for processing outside the browser.
If you process
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
with the validator.nu HTML5 parser, the content of "script" is not "seen"
as XML. The output then is
<html xmlns="http://www.w3.org/1999/xhtml">...
<script type="application/xml" id="its-standoff-1">
<its:locQualityIssues xml:id="lq1" xmlns:its="http://www.w3.org/2005/11/its">
<its:locQualityIssue
locQualityIssueType="misspelling"
locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
locQualityIssueSeverity="50"/>
<its:locQualityIssue
locQualityIssueType="typographical"
locQualityIssueComment="Sentence without capitalization"
locQualityIssueSeverity="30"/>
</its:locQualityIssues>
</script>...</html>
So if we had an XML-based tool that wants to pick up the ITS standoff
information directly from this output, it would not work.
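One possible way around this, sketched under assumptions: treat the "script" content as plain text, pull it out with an HTML parser, and only then hand it to an XML parser. The sketch below uses the Python standard library only. The names XmlScriptCollector and extract_quality_issues and the inline SAMPLE document are hypothetical, and html.parser is not a conforming HTML5 parser like validator.nu, so this shows the two-step idea rather than a recommended toolchain.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


class XmlScriptCollector(HTMLParser):
    """Collect the raw text of every <script type="application/xml">."""

    def __init__(self):
        super().__init__()
        self.blocks = []     # one string per matching script element
        self._chunks = None  # a list while inside a matching script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/xml":
            self._chunks = []

    def handle_data(self, data):
        # Script content arrives as text, possibly in several pieces.
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._chunks is not None:
            self.blocks.append("".join(self._chunks))
            self._chunks = None


def extract_quality_issues(html_text):
    """Step 1: find the data blocks. Step 2: parse each one as XML."""
    collector = XmlScriptCollector()
    collector.feed(html_text)
    collector.close()
    issues = []
    for block in collector.blocks:
        root = ET.fromstring(block.strip())
        if root.tag != "{%s}locQualityIssues" % ITS_NS:
            continue  # some other XML data block, not ITS standoff
        for issue in root.findall("{%s}locQualityIssue" % ITS_NS):
            issues.append({
                "set": root.get(XML_ID),
                "type": issue.get("locQualityIssueType"),
                "comment": issue.get("locQualityIssueComment"),
                "severity": issue.get("locQualityIssueSeverity"),
            })
    return issues


# Hypothetical sample, modelled on the example in this mail.
SAMPLE = """<!DOCTYPE html>
<html><head><meta charset=utf-8><title>Test</title>
<script type="application/xml" id="its-standoff-1">
<its:locQualityIssues xml:id="lq1" xmlns:its="http://www.w3.org/2005/11/its">
<its:locQualityIssue
  locQualityIssueType="misspelling"
  locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
  locQualityIssueSeverity="50"/>
<its:locQualityIssue
  locQualityIssueType="typographical"
  locQualityIssueComment="Sentence without capitalization"
  locQualityIssueSeverity="30"/>
</its:locQualityIssues>
</script></head>
<body><p><span its-loc-quality-issues-ref=#lq1>c'es</span> le contenu</p>
</body></html>"""

if __name__ == "__main__":
    for issue in extract_quality_issues(SAMPLE):
        print(issue)
```

The point of the design is that the HTML parser never needs to understand the XML: it only has to return the script content unchanged, and the XML-based tool takes over from there.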
Currently, Linguaserve is using this approach
https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
to embed ITS rules into an HTML file. I had hoped that the "script"
element would have been an alternative - is it?
I'm sure this is not a difficult problem, but we probably need some
guidance for implementors who are not used to processing HTML5.
Felix
--
Felix Sasaki
DFKI / W3C Fellow
************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the sender immediately by e-mail.
www.vistatec.com
************************************************************
Received on Tuesday, 2 October 2012 11:22:38 UTC