- From: Phil Ritchie <philr@vistatec.ie>
- Date: Tue, 2 Oct 2012 13:03:20 +0100
- To: Felix Sasaki <fsasaki@w3.org>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <OF1D2B611F.24089B05-ON80257A8B.0041D35C-80257A8B.00423983@vistatec.ie>
My approach relies on JavaScript manipulating the DOM and constructing standoff, but from a quick hack it looks as though I can construct it within the script tag.

Phil.

From: Felix Sasaki <fsasaki@w3.org>
To: Phil Ritchie <philr@vistatec.ie>
Cc: public-multilingualweb-lt@w3.org
Date: 02/10/2012 12:33
Subject: Re: ACTION-233: Update quality issue example to use the solution (XML in "script" tag) for standoff markup

2012/10/2 Phil Ritchie <philr@vistatec.ie>

OK, understood. Hmm. I think use of the script element will break my implementation.

Just to be sure - does your implementation rely on JavaScript processing with this standoff approach:

<span its-loc-quality-issue=its-loc-quality-issue its-loc-quality-issue-comment="Sentence without capitalization" its-loc-quality-issue-severity=30 its-loc-quality-issue-type=typographical></span>

FYI, the change in the toy example

http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js

basically meant: adding a call to the XML parser and using different names for getting attributes, e.g. its:locQualityIssueSeverity instead of its-loc-quality-issue-severity.
See the diff here:

[
- var qielem = document.getElementById(qiref.substr(1));
- var issues = qielem.childNodes;
  var issueslist = new String;
- for(i=0; i<issues.length; i++) {
- if(issues[i].nodeType==1) { issueslist = issueslist +
- issues[i].getAttribute('its-loc-quality-issue-type') + " "; } }
+ var parser = new DOMParser();
+ var standoffits = document.getElementById('its-standoff-1').textContent;
+ var doc = parser.parseFromString(standoffits,'application/xml');
+ var locqualityissues = doc.getElementsByTagNameNS('http://www.w3.org/2005/11/its','locQualityIssues');
+ for(i=0; i<locqualityissues.length; i++)
+ {
+ if (locqualityissues[i].getAttribute('xml:id') == qiref.substr(1));
+ {
+ var issues = locqualityissues[i].childNodes;}
+ var issueslist = new String;
+ for(i=0; i<issues.length; i++) {
+ if(issues[i].nodeType==1) { issueslist = issueslist +
+ issues[i].getAttribute('locQualityIssueType') + " "; } }
+ }
]

Felix

I'll have to check.

Phil.

From: Felix Sasaki <fsasaki@w3.org>
To: Phil Ritchie <philr@vistatec.ie>
Cc: public-multilingualweb-lt@w3.org
Date: 02/10/2012 11:04
Subject: Re: ACTION-233: Update quality issue example to use the solution (XML in "script" tag) for standoff markup

2012/10/2 Phil Ritchie <philr@vistatec.ie>

Felix

Before I can answer the question can you tell me what the motivation for using the script tags is?

There are two motivations. One is based on

https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration

Here you have ITS rules files inside HTML5. It seems that this is a requirement from Linguaserve: not rules linked, but inside HTML5. So far Linguaserve has put the rules files "just somewhere". That makes it invalid HTML5. With the rules in the "script" element, it becomes valid again.
The other motivation is that the standoff we had so far for HTML5 looked like this:

<span its-loc-quality-issues-ref=#lq1>c'es</span> le contenu</p>
<span id=lq1 its-loc-quality-issues=its-loc-quality-issues>
<span its-loc-quality-issue=its-loc-quality-issue its-loc-quality-issue-comment="Sentence without capitalization" its-loc-quality-issue-severity=30 its-loc-quality-issue-type=typographical></span>
<span its-loc-quality-issue=its-loc-quality-issue its-loc-quality-issue-comment="'c'es' is unknown. Could be 'c'est'" its-loc-quality-issue-severity=50 its-loc-quality-issue-type=misspelling></span>
</span>

"span" is misused to "transport" standoff metadata in the "body" element. It works, but is not very clean. Hence "script", which is defined for that purpose; see

http://dev.w3.org/html5/spec/the-script-element.html

about "application/xml" and other types:

"These types are explicitly listed here because they are poorly-defined types that are nonetheless likely to be used as formats for data blocks, and it would be problematic if they were suddenly to be interpreted as script by a user agent."

Jirka had mentioned this solution on the afternoon of the 26th

http://www.w3.org/2012/09/26-mlw-lt-minutes.html

(search for "current recommendation is to put the tool info xml into script in html") and pointed us to the related DOM methods

https://developer.mozilla.org/en-US/docs/DOM/DOMParser

Felix

My demo in Prague used standoff without needing to wrap them in script tags.

Phil.

From: Felix Sasaki <fsasaki@w3.org>
To: public-multilingualweb-lt@w3.org
Date: 02/10/2012 09:17
Subject: ACTION-233: Update quality issue example to use the solution (XML in "script" tag) for standoff markup

Hi all,

I updated the qaissue example to use XML in the script element, see

http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-locQualityIssue-html5-local-2

The standoff metadata is now in a dedicated "script" element.
See also

http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js

So this works, but I have a question for the implementors using HTML5 as an input for processing outside the browser. If you process

http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html

with the validator.nu HTML5 parser, the content of "script" is not "seen" as XML. The output then is

<html xmlns="http://www.w3.org/1999/xhtml">...
<script type="application/xml" id="its-standoff-1">
<its:locQualityIssues xml:id="lq1" xmlns:its="http://www.w3.org/2005/11/its">
<its:locQualityIssue locQualityIssueType="misspelling" locQualityIssueComment="'c'es' is unknown. Could be 'c'est'" locQualityIssueSeverity="50"/>
<its:locQualityIssue locQualityIssueType="typographical" locQualityIssueComment="Sentence without capitalization" locQualityIssueSeverity="30"/>
</its:locQualityIssues>
</script>...</html>

So if we had an XML-based tool that wants to pick up the ITS standoff information, it won't work. Currently, Linguaserve is using this approach

https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration

to embed ITS rules into an HTML file. I had hoped that the "script" element would be an alternative - is it?

I'm sure this is not a difficult problem, but we probably need some guidance for implementors who are not used to processing HTML5.

Felix

--
Felix Sasaki
DFKI / W3C Fellow

************************************************************
This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the sender immediately by e-mail.
www.vistatec.com
************************************************************
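
Editorial sketch (not part of the original message): the lookup that the quoted qaissues.js diff performs can be written as a small function over an already-parsed XML document. This is only an illustration under stated assumptions. The helper name `issueTypesFor` is hypothetical, the namespace URI is written without the leading space seen in the archived copy, and the result is joined with single spaces rather than carrying the trailing space the diff would produce. It uses a separate counter for the inner loop and no semicolon after the `if`, two details that look unintended in the quoted diff.

```javascript
// ITS namespace URI as quoted in the thread. Assumption: the leading space in
// the archived copy is an artifact, so it is omitted here.
const ITS_NS = 'http://www.w3.org/2005/11/its';

// Hypothetical helper (not in qaissues.js): given a DOM Document parsed from
// the standoff XML and a reference such as "#lq1", return the issue types of
// the matching its:locQualityIssues container as a space-separated string.
function issueTypesFor(standoffDoc, qiref) {
  // Strip the leading "#" of the reference, if present.
  const id = qiref.charAt(0) === '#' ? qiref.slice(1) : qiref;
  const containers = standoffDoc.getElementsByTagNameNS(ITS_NS, 'locQualityIssues');
  const types = [];
  for (let i = 0; i < containers.length; i++) {
    // Only the container whose xml:id matches the reference is of interest.
    if (containers[i].getAttribute('xml:id') !== id) {
      continue;
    }
    const children = containers[i].childNodes;
    // A separate counter (j) keeps the outer loop index intact.
    for (let j = 0; j < children.length; j++) {
      // nodeType 1 is an element node; whitespace text nodes are skipped.
      if (children[j].nodeType === 1) {
        types.push(children[j].getAttribute('locQualityIssueType'));
      }
    }
  }
  return types.join(' ');
}
```

In a browser, the document argument would come from the calls shown in the diff: read `textContent` of the `script` element with id `its-standoff-1` and pass it to `new DOMParser().parseFromString(text, 'application/xml')`. A tool working outside the browser would need the same two steps with its own HTML and XML parsers, which is the guidance question raised in the thread.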
Received on Tuesday, 2 October 2012 12:03:50 UTC