- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 2 Oct 2012 13:33:13 +0200
- To: Phil Ritchie <philr@vistatec.ie>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czo41UozTdAoGmyq=NhjvRyNVQuoAFtpzkG6Z5t03STFVw@mail.gmail.com>
2012/10/2 Phil Ritchie <philr@vistatec.ie>
> OK, understood. Hmm. I think use of the script element will break my
> implementation.
Just to be sure: does your implementation rely on JavaScript processing
with the span-based standoff approach, like this?
<span its-loc-quality-issue=its-loc-quality-issue
its-loc-quality-issue-comment="Sentence without capitalization"
its-loc-quality-issue-severity=30
its-loc-quality-issue-type=typographical></span>
FYI, the change in the toy example
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js
basically meant adding a call to the XML parser and using different
attribute names, e.g. its:locQualityIssueSeverity instead of
its-loc-quality-issue-severity. See the diff here:
[
- var qielem = document.getElementById(qiref.substr(1));
- var issues = qielem.childNodes;
var issueslist = new String;
- for(i=0; i<issues.length; i++) {
- if(issues[i].nodeType==1) { issueslist = issueslist +
- issues[i].getAttribute('its-loc-quality-issue-type') + " "; } }
+ var parser = new DOMParser();
+ var standoffits = document.getElementById('its-standoff-1').textContent;
+ var doc = parser.parseFromString(standoffits,'application/xml');
+ var locqualityissues =
+ doc.getElementsByTagNameNS('http://www.w3.org/2005/11/its','locQualityIssues');
+ // Find the standoff block whose xml:id matches the referenced id.
+ for(var i=0; i<locqualityissues.length; i++)
+ {
+ if (locqualityissues[i].getAttribute('xml:id') == qiref.substr(1))
+ {
+ var issues = locqualityissues[i].childNodes;
+ // Separate counter so that the outer loop index is not overwritten.
+ for(var j=0; j<issues.length; j++) {
+ if(issues[j].nodeType==1) { issueslist = issueslist +
+ issues[j].getAttribute('locQualityIssueType') + " "; } }
+ }
+ }
]
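For reference, the lookup in the diff could also be written as a small
standalone function. This is only a sketch: the name collectIssueTypes and
its parameters are made up for illustration and are not part of qaissues.js.
The element id 'its-standoff-1', the ITS namespace and the attribute names
are taken from the example. The document and the parser are passed in as
arguments so the function can be tried outside a browser too.

```javascript
// Sketch only: collectIssueTypes and its parameters are hypothetical names,
// not part of qaissues.js.
var ITS_NS = 'http://www.w3.org/2005/11/its';

function collectIssueTypes(htmlDoc, xmlParser, qiref, scriptId) {
  // The standoff XML is stored as plain text inside the "script" element.
  var holder = htmlDoc.getElementById(scriptId || 'its-standoff-1');
  if (!holder) { return []; }
  var xmlDoc = xmlParser.parseFromString(holder.textContent, 'application/xml');
  var blocks = xmlDoc.getElementsByTagNameNS(ITS_NS, 'locQualityIssues');
  // Accept both "#lq1" and "lq1" as the reference.
  var wanted = qiref.charAt(0) === '#' ? qiref.substr(1) : qiref;
  var types = [];
  for (var i = 0; i < blocks.length; i++) {
    // Only the block referenced by its-loc-quality-issues-ref is of interest.
    if (blocks[i].getAttribute('xml:id') !== wanted) { continue; }
    var issues = blocks[i].childNodes;
    for (var j = 0; j < issues.length; j++) {
      if (issues[j].nodeType === 1) { // element nodes only, skip whitespace text
        types.push(issues[j].getAttribute('locQualityIssueType'));
      }
    }
  }
  return types;
}
```

In a browser this would be called as
collectIssueTypes(document, new DOMParser(), '#lq1') and returns an array
with the locQualityIssueType values of the referenced block.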
Felix
> I'll have to check.
>
> Phil.
>
>
>
>
>
> From: Felix Sasaki <fsasaki@w3.org>
> To: Phil Ritchie <philr@vistatec.ie>,
> Cc: public-multilingualweb-lt@w3.org
> Date: 02/10/2012 11:04
> Subject: Re: ACTION-233: Update quality issue example to use the
> solution (XML in "script" tag) for standoff markup
> ------------------------------
>
>
>
>
>
> 2012/10/2 Phil Ritchie <philr@vistatec.ie>
> Felix
>
> Before I can answer the question can you tell me what the motivation for
> using the script tags is?
>
> There are two motivations. One is based on
> https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
> where ITS rules files are embedded inside HTML5. This seems to be a
> requirement from Linguaserve: rules not linked, but inside the HTML5
> document. So far Linguaserve has put the rules files "just somewhere",
> which makes the HTML5 invalid. With the rules in the "script" element, it
> becomes valid again.
> The other motivation is that the standoff we had so far for HTML5 looked
> like this:
>
> <span its-loc-quality-issues-ref=#lq1>c'es</span> le contenu</p>
>
> <span id=lq1 its-loc-quality-issues=its-loc-quality-issues>
>   <span
>     its-loc-quality-issue=its-loc-quality-issue
>     its-loc-quality-issue-comment="Sentence without capitalization"
>     its-loc-quality-issue-severity=30
>     its-loc-quality-issue-type=typographical></span>
>   <span
>     its-loc-quality-issue=its-loc-quality-issue
>     its-loc-quality-issue-comment="'c'es' is unknown. Could be 'c'est'"
>     its-loc-quality-issue-severity=50
>     its-loc-quality-issue-type=misspelling></span>
> </span>
>
>
>
> "span" is misused here to "transport" standoff metadata in the "body"
> element. It works, but is not very clean. Hence "script", which is defined
> for that purpose; see
> http://dev.w3.org/html5/spec/the-script-element.html
>
> which says about "application/xml" and other types:
> "These types are explicitly listed here because they are poorly-defined
> types that are nonetheless likely to be used as formats for data blocks,
> and it would be problematic if they were suddenly to be interpreted as
> script by a user agent."
> Jirka mentioned this solution on the afternoon of the 26th, see
> http://www.w3.org/2012/09/26-mlw-lt-minutes.html
> (search for "current recommendation is to put the tool info xml into script
> in html"), and pointed us to the related DOM methods:
> https://developer.mozilla.org/en-US/docs/DOM/DOMParser
>
> Felix
>
>
> My demo in Prague used standoff markup without needing to wrap it in
> script tags.
>
> Phil.
>
>
>
>
>
> From: Felix Sasaki <fsasaki@w3.org>
> To: public-multilingualweb-lt@w3.org
>
> Date: 02/10/2012 09:17
> Subject: ACTION-233: Update quality issue example to use the
> solution (XML in "script" tag) for standoff markup
> ------------------------------
>
>
>
>
> Hi all,
>
> I updated the qaissue example to use XML in the script element, see
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-locQualityIssue-html5-local-2
> The standoff metadata is now in a dedicated "script" element. See also
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js
>
> So this works, but I have a question for the implementors using HTML5 as
> input for processing outside the browser.
> If you process
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
> with the validator.nu HTML5 parser (http://validator.nu/), the content
> of "script" is not "seen" as XML. The output then is
>
> <html xmlns="http://www.w3.org/1999/xhtml">...
>
> <script type="application/xml" id="its-standoff-1">
> <its:locQualityIssues xml:id="lq1" xmlns:its="http://www.w3.org/2005/11/its">
> <its:locQualityIssue
> locQualityIssueType="misspelling"
> locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
> locQualityIssueSeverity="50"/>
> <its:locQualityIssue
> locQualityIssueType="typographical"
> locQualityIssueComment="Sentence without capitalization"
> locQualityIssueSeverity="30"/>
> </its:locQualityIssues>
> </script>...</html>
>
> So if we had an XML-based tool that wants to pick up the ITS standoff
> information, it would not work.
> Currently, Linguaserve is using this approach
> https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
> to embed ITS rules into an HTML file. I had hoped that the "script"
> element would be an alternative. Is it?
> I'm sure this is not a difficult problem, but we probably need some
> guidance for implementors who are not used to processing HTML5.
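
One possible piece of such guidance: treat the content of the "script"
element as plain text, pull it out, and hand it to an XML parser in a second
step. A rough sketch of the first step follows. The function name
extractXmlDataBlocks is made up for illustration, and the regular expressions
are a shortcut that assumes well-behaved input; a real tool would more likely
take the text from the tree built by its HTML5 parser.

```javascript
// Sketch only: extractXmlDataBlocks is a hypothetical helper, not an ITS API.
// It returns the raw text of every <script type="application/xml"> element
// found in an HTML source string, together with the element's id (if any).
function extractXmlDataBlocks(html) {
  var blocks = [];
  // "script" content is raw text in HTML, so it ends at the first </script>.
  var scriptRe = /<script\b([^>]*)>([\s\S]*?)<\/script\s*>/gi;
  var match;
  while ((match = scriptRe.exec(html)) !== null) {
    var attrs = match[1];
    // Attribute values may be quoted or unquoted in HTML5.
    var type = /(?:^|\s)type\s*=\s*["']?([^"'\s>]+)/i.exec(attrs);
    if (!type || type[1].toLowerCase() !== 'application/xml') { continue; }
    var id = /(?:^|\s)id\s*=\s*["']?([^"'\s>]+)/i.exec(attrs);
    blocks.push({ id: id ? id[1] : null, xml: match[2].trim() });
  }
  return blocks;
}
```

The xml string of each block can then be passed to whatever XML parser the
tool already uses, and the ITS standoff markup is picked up from there.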
>
> Felix
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
>
> ************************************************************
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they
> are addressed. If you have received this email in error please notify
> the sender immediately by e-mail.
>
> www.vistatec.com
> ************************************************************
>
--
Felix Sasaki
DFKI / W3C Fellow
Received on Tuesday, 2 October 2012 11:33:44 UTC