- From: Felix Sasaki <fsasaki@w3.org>
- Date: Fri, 22 Feb 2013 14:48:47 +0100
- To: "Lieske, Christian" <christian.lieske@sap.com>
- CC: "public-multilingualweb-lt-comments@w3.org" <public-multilingualweb-lt-comments@w3.org>
- Message-ID: <5127773F.2010704@w3.org>
Hi Christian, all,

I have spoken to Sebastian Hellmann, see below. The text below also includes some additional replies I had sent earlier, for which I haven't received feedback from Christian yet (or I may have missed it).

On 10.01.13 10:50, Lieske, Christian wrote:
>
> Hi,
>
> Please find below comments/observations/questions/ideas concerning the
> ITS 2.0 working draft dated December 6, 2012
> (http://www.w3.org/TR/2012/WD-its20-20121206/). Please feel free to
> contact me for clarifications if anything is unclear.
>
> The objectives of the NLP Interchange Format (NIF) -- such as
> interoperability between Natural Language Processing (NLP) tools,
> language resources and annotations, and easy conversion to Resource
> Description Format (RDF) -- from my point of view are important ones.
> Accordingly, relating ITS 2.0 - with its direction to move ITS 1.0
> closer to Natural Language Processing (NLP) - to NIF may help to
> realize synergies.
>
> While looking at the relation between ITS 2.0 and NIF in the current
> Working Draft (WD), I have come up with the observations/questions
> below. I apologize in advance if a reply to this comment may require
> that discussions which presumably already took place may have to be
> summarized.
>
> 1. Does the WD refer to NIF 1.0, or 2.0? NIF 2.0 already seems to be
> under development.
>
> 2. I am a bit unsure about the approval procedure, the official
> status, and the organizational home of NIF 1.0 (and NIF 2.0). My
> assumption is that the LOD2 Consortium declared NIF 1.0 as finished,
> and hasn't handed it over to an accredited standardization
> organization such as ISO.
>

Sebastian will provide NIF 2.0 under a stable CC BY license and will have it hosted with a persistence policy by the University of Leipzig. This will not be hosting by a standards body, but the licensing will allow re-use of NIF, and the hosting will provide stability.

Not 100% related, but FYI: Sebastian will provide, in addition to my implementation of http://www.w3.org/TR/2012/WD-its20-20121206/#conversion-to-nif, a second implementation of the conversion. So we won't have to declare this as a feature at risk.

> 3. Wouldn't the ITS2NIF mapping benefit from/need the following as
> prerequisites?
>
> a. Input and output have to be Canonical XML (for XML-based formats)
>
> b. Input and output have to consider Unicode Normalization
> Forms/Unicode Equivalence (e.g. so that the algorithm does produce
> identical results for sentences that contain "Äffin" and "A\u0308ffin")
>

A few weeks ago I had provided an answer on normalization (which I would like to extend to Canonical XML), taken from http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0210.html

[
<trackbot> Created ACTION-430 - Draft text explaining importance of Unicode normalization and best practices on ISSUE-67 [on Shaun McCance - due 2013-02-04].

>FS: Christian, would such a BP note also help with your concerns about the NIF conversion? See your comment 3a at http://lists.w3.org/Archives/Public/public-multilingualweb-lt-comments/2013Jan/0101.html in that mail you say "have to consider". But you don't say "require", "make a testable assertion", "provide tests" etc. Can you clarify whether a note would be sufficient? Also as a reply to issue-85, 3b: if your answer to my question is "require", "make a testable assertion": why? We of course want to be good citizens with regards to normalization, but why require more than XQuery, XPath, HTML5, SPARQL ...?
]

One additional thought on this: from my implementation experience, normalization and canonicalization are not the problem. The problem is white space handling. And for this we already have a note at http://www.w3.org/TR/2012/WD-its20-20121206/#conversion-to-nif: "It is recommended to normalize whitespace in the input XML/HTML/DOM in order to minimize such phantom predicates."

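To make the normalization and white space points concrete, here is a minimal sketch in Python. It is only an illustration and is not taken from the ITS 2.0 draft or from any NIF implementation. The helper name `normalize_text` is hypothetical, and the choice of NFC plus collapsing whitespace runs to a single space is an assumption about what "normalizing" could mean here. It shows that the two spellings of "Äffin" from comment 3b differ in length before normalization (which would shift any offset computed from the text) and become identical afterwards.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Hypothetical helper: apply NFC, then collapse whitespace runs."""
    # NFC composes "A" + U+0308 (combining diaeresis) into U+00C4.
    composed = unicodedata.normalize("NFC", text)
    # Collapse any run of whitespace to one space and trim both ends.
    return re.sub(r"\s+", " ", composed).strip()


# The two spellings from comment 3b.
precomposed = "\u00c4ffin"   # U+00C4 followed by "ffin"
decomposed = "A\u0308ffin"   # "A" + combining diaeresis + "ffin"

# Different code point sequences, hence different lengths and offsets.
print(precomposed == decomposed)
print(len(precomposed), len(decomposed))

# After normalization both spellings are the same string.
print(normalize_text(precomposed) == normalize_text(decomposed))

# Whitespace runs (spaces, newline, tab) are reduced to single spaces.
print(normalize_text("  Die  A\u0308ffin\n\tschl\u00e4ft "))
```

Whether a converter should apply such a step itself, or only recommend it to the producer of the input, is exactly the open question above. The sketch takes no position on that.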
Christian, would the above resolve the three comments?

Best,

Felix

Received on Friday, 22 February 2013 13:49:16 UTC