- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 2 Oct 2012 13:33:13 +0200
- To: Phil Ritchie <philr@vistatec.ie>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czo41UozTdAoGmyq=NhjvRyNVQuoAFtpzkG6Z5t03STFVw@mail.gmail.com>
2012/10/2 Phil Ritchie <philr@vistatec.ie>

> OK, understood. Hmm. I think use of the script element will break my
> implementation.

Just to be sure - does your implementation rely on javascript processing
with this standoff approach:

<span
  its-loc-quality-issue=its-loc-quality-issue
  its-loc-quality-issue-coment="Sentence without capitalization"
  its-loc-quality-issue-severity=30
  its-loc-quality-issue-type=typographical></span>

FYI, the change in the toy example
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js
basically meant: adding a call of the XML parser and using different names
for getting attributes, e.g. its:locQualityIssueSeverity instead of
its-loc-quality-issue-severity. See the diff here:

[
- var qielem = document.getElementById(qiref.substr(1));
- var issues = qielem.childNodes;
  var issueslist = new String;
- for(i=0; i<issues.length; i++) {
- if(issues[i].nodeType==1) { issueslist = issueslist +
- issues[i].getAttribute('its-loc-quality-issue-type') + " "; } }
+ var parser = new DOMParser();
+ var standoffits = document.getElementById('its-standoff-1').textContent;
+ var doc = parser.parseFromString(standoffits,'application/xml');
+ var locqualityissues = doc.getElementsByTagNameNS('http://www.w3.org/2005/11/its','locQualityIssues');
+ for(i=0; i<locqualityissues.length; i++)
+ {
+ if (locqualityissues[i].getAttribute('xml:id') == qiref.substr(1));
+ {
+ var issues = locqualityissues[i].childNodes;}
+ var issueslist = new String;
+ for(i=0; i<issues.length; i++) {
+ if(issues[i].nodeType==1) { issueslist = issueslist +
+ issues[i].getAttribute('locQualityIssueType') + " "; } }
+ }
]

Felix

> I'll have to check.
>
> Phil.
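The "+" side of the diff above can be read as the following lookup. This is only a sketch, not the contents of qaissues.js: the names ITS_NS, issueTypesFor and issueTypesFromPage are hypothetical, while the element id 'its-standoff-1', the ITS namespace and the attribute names are taken from the diff. Unlike the diff as quoted, the sketch uses separate counters for the two loops, has no semicolon directly after the "if" condition, and joins the types without a trailing space.

```javascript
// Sketch only: ITS_NS, issueTypesFor and issueTypesFromPage are
// illustrative names, not functions from qaissues.js.
const ITS_NS = "http://www.w3.org/2005/11/its";

// Given an already parsed standoff XML document and a reference such as
// "#lq1", return the issue types of the matching its:locQualityIssues
// element as one space-separated string.
function issueTypesFor(standoffDoc, qiref) {
  const wantedId = qiref.charAt(0) === "#" ? qiref.slice(1) : qiref;
  const lists = standoffDoc.getElementsByTagNameNS(ITS_NS, "locQualityIssues");
  const types = [];
  for (let i = 0; i < lists.length; i++) {
    if (lists[i].getAttribute("xml:id") !== wantedId) continue;
    const issues = lists[i].childNodes;
    for (let j = 0; j < issues.length; j++) {
      // nodeType 1 is an element node; whitespace text nodes are skipped.
      if (issues[j].nodeType === 1) {
        types.push(issues[j].getAttribute("locQualityIssueType"));
      }
    }
  }
  return types.join(" ");
}

// Browser-side wrapper: read the XML text from the "script" data block
// and parse it with DOMParser, as the diff does. Only runs in a browser.
function issueTypesFromPage(qiref) {
  const xmlText = document.getElementById("its-standoff-1").textContent;
  const doc = new DOMParser().parseFromString(xmlText, "application/xml");
  return issueTypesFor(doc, qiref);
}
```

Keeping the lookup separate from the DOMParser call is a design choice of this sketch: the same function could then be handed a document produced by any other XML parser.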
>
> From: Felix Sasaki <fsasaki@w3.org>
> To: Phil Ritchie <philr@vistatec.ie>
> Cc: public-multilingualweb-lt@w3.org
> Date: 02/10/2012 11:04
> Subject: Re: ACTION-233: Update quality issue example to use the
> solution (XML in "script" tag) for standoff markup
> ------------------------------
>
> 2012/10/2 Phil Ritchie <philr@vistatec.ie>
> Felix
>
> Before I can answer the question can you tell me what the motivation for
> using the script tags is?
>
> There are two motivations. One is based on
> https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
> here you have ITS rules files inside HTML5. It seems that this is a
> requirement from Linguaserve: not rules linked, but inside HTML5. So far
> Linguaserve has put the rules files "just somewhere". That makes it invalid
> HTML5. With the rules in the "script" element, it becomes valid again.
> The other motivation is that the standoff we had so far for HTML5 looked
> like this:
>
> <span its-loc-quality-issues-ref=#lq1>c'es</span> le contenu</p>
> <span id=lq1 its-loc-quality-issues=its-loc-quality-issues>
>   <span
>     its-loc-quality-issue=its-loc-quality-issue
>     its-loc-quality-issue-coment="Sentence without capitalization"
>     its-loc-quality-issue-severity=30
>     its-loc-quality-issue-type=typographical></span>
>   <span
>     its-loc-quality-issue=its-loc-quality-issue
>     its-loc-quality-issue-coment="'c'es' is unknown. Could be 'c'est'"
>     its-loc-quality-issue-severity=50
>     its-loc-quality-issue-type=misspelling></span>
> </span>
>
> "span" is mis-used to "transport" standoff metadata in the "body"
> element. It works, but is not very clean.
> Hence "script", which is defined for that purpose, see
> http://dev.w3.org/html5/spec/the-script-element.html
> about "application/xml" and other types:
> "These types are explicitly listed here because they are poorly-defined
> types that are nonetheless likely to be used as formats for data blocks,
> and it would be problematic if they were suddenly to be interpreted as
> script by a user agent."
> Jirka had mentioned this solution in the afternoon on the 26th, see
> http://www.w3.org/2012/09/26-mlw-lt-minutes.html
> search for "current recommendation is to put the tool info xml into script
> in html"
> and pointed us to the related DOM methods
> https://developer.mozilla.org/en-US/docs/DOM/DOMParser
>
> Felix
>
> My demo in Prague used standoff without needing to wrap them in script
> tags.
>
> Phil.
>
> From: Felix Sasaki <fsasaki@w3.org>
> To: public-multilingualweb-lt@w3.org
> Date: 02/10/2012 09:17
> Subject: ACTION-233: Update quality issue example to use the
> solution (XML in "script" tag) for standoff markup
> ------------------------------
>
> Hi all,
>
> I updated the qaissue example to use XML in the script element, see
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-locQualityIssue-html5-local-2
> the standoff metadata is now in a dedicated "script" element.
> See also
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js
>
> So this works, but I have a question for the implementors using HTML5 as
> an input for processing outside the browser.
> If you process
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
> with the validator.nu <http://validator.nu/> HTML5 parser, the content
> of "script" is not "seen" as XML. The output then is
>
> <html xmlns="http://www.w3.org/1999/xhtml">...
>
> <script type="application/xml" id="its-standoff-1">
> <its:locQualityIssues xml:id="lq1" xmlns:its="http://www.w3.org/2005/11/its">
> <its:locQualityIssue
>   locQualityIssueType="misspelling"
>   locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
>   locQualityIssueSeverity="50"/>
> <its:locQualityIssue
>   locQualityIssueType="typographical"
>   locQualityIssueComment="Sentence without capitalization"
>   locQualityIssueSeverity="30"/>
> </its:locQualityIssues>
> </script>...</html>
>
> So an XML-based tool that wants to pick up the ITS standoff information
> from this output won't work.
> Currently, Linguaserve is using this approach
> https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
> to embed ITS rules into an HTML file. I had hoped that the "script"
> element would have been an alternative - is it?
> I'm sure this is not a difficult problem, but we probably need some
> guidance for implementors who are not used to processing HTML5.
>
> Felix
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
> ************************************************************
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they
> are addressed. If you have received this email in error please notify
> the sender immediately by e-mail.
>
> www.vistatec.com <http://www.vistatec.com/>
> ************************************************************

--
Felix Sasaki
DFKI / W3C Fellow
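On the question in the quoted thread about processing outside the browser: one possible approach, sketched here as an assumption rather than as guidance from the working group, is a second parsing step. The HTML5 parser delivers the content of the "script" element as text, so a tool could collect that text and hand it to an XML parser of its choice, which is what the browser example does with DOMParser. The function name collectXmlDataBlocks and the parseXml callback are hypothetical; only standard DOM calls (getElementsByTagName, getAttribute, textContent) are used on the HTML tree.

```javascript
// Sketch only: collectXmlDataBlocks and the parseXml callback are
// illustrative names. htmlDoc is whatever DOM-like tree the HTML5 parser
// produced; parseXml is supplied by the caller (any XML parser).
function collectXmlDataBlocks(htmlDoc, parseXml) {
  const scripts = htmlDoc.getElementsByTagName("script");
  const blocks = [];
  for (let i = 0; i < scripts.length; i++) {
    const type = (scripts[i].getAttribute("type") || "").trim().toLowerCase();
    // Only data blocks declared as XML are of interest; real scripts are skipped.
    if (type !== "application/xml") continue;
    blocks.push({
      id: scripts[i].getAttribute("id"),
      // Second parse: the script content is plain text after HTML5 parsing.
      doc: parseXml(scripts[i].textContent)
    });
  }
  return blocks;
}
```

In a browser, parseXml could be `(text) => new DOMParser().parseFromString(text, "application/xml")`; outside the browser it would be whichever XML parser the tool chain already uses.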
Received on Tuesday, 2 October 2012 11:33:44 UTC