- From: Felix Sasaki <fsasaki@w3.org>
- Date: Fri, 26 Apr 2013 20:57:05 +0200
- To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- CC: Denny Vrandečić <denny.vrandecic@wikimedia.de>, John Flynn <jflynn12@verizon.net>, semantic-web at W3C <semantic-web@w3c.org>
- Message-ID: <517ACE01.6000601@w3.org>
On 26.04.13 17:15, Sebastian Hellmann wrote:
> Hi Denny,
> they are just several months away from becoming a Recommendation, so it
> will happen soon. They are starting implementation within some weeks.
> For exact details you would have to ask the mailing list or just wait
> for a while ;)
>
> There should be an XSLT stylesheet somewhere that retrieves NIF RDF
> from ITS within HTML.

Thanks for the ping, Sebastian - you encouraged me to finally put that
online. See

http://www.w3.org/People/fsasaki/its20-general-processor/tools/its-ta-2-nif.xsl

with some mini documentation in the stylesheet and a sample
transformation of an HTML document

http://www.w3.org/People/fsasaki/its20-general-processor/sample/nif-conversion/inputfile.html

here:

http://tinyurl.com/clwd64n

I think it provides the right triples

http://tinyurl.com/btkvkvy

Let me know if you need more. I saw that in this thread there was also
discussion about "term annotation" - this table

http://www.w3.org/TR/its20/#textAnalysis-info-pieces

and the note below the table might be helpful for you as well.

Felix

>
> All the best,
> Sebastian
>
>
> On 26.04.2013 16:05, Denny Vrandečić wrote:
>> Sebastian,
>>
>> thanks! its-ta-ident-ref is perfect! That's exactly what I have been
>> looking for.
>>
>> The only drawbacks are that it is not a Recommendation yet (what's the
>> timeline here?), but that's not so terrible, and that this is
>> possibly the worst attribute name I have seen so far in HTML.
>>
>> Still, that's what I am going to use! Thanks,
>> Cheers,
>> Denny
>>
>>
>> 2013/4/26 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>>
>>     Hi John and Denny,
>>     the problem is well known and RDFa has its limits. Please see the
>>     new ITS 2.0 spec [1], which provides a solution for this. ITS 2.0
>>     will likely be widely adopted by CMS and translation industry and
>>     it has an RDF transition using NIF [2].
>>
>>     @Denny: For your request RDFa should be fine, if you just want to
>>     include:
>>     <http://sws.geonames.org/4951788> a owl:Thing .
>>
>>     Note that the resulting RDF does not contain any provenance
>>     information, so I am unsure whether calling it an "annotation"
>>     is appropriate. It is rather an inclusion of extra triples in HTML.
>>     You are losing any reference to "Springfield", as RDFa parsers
>>     don't support this.
>>     Turtle in HTML would also be an easy option:
>>     http://www.w3.org/TR/turtle/#xhtml
>>
>>     ITS 2.0 example:
>>     <p>It is well known, that <span
>>     its-ta-ident-ref="http://sws.geonames.org/4951788">Springfield</span>
>>     has mild summers and short, but hard winters.</p>
>>     NIF:
>>     ...
>>     <http://example.com/doc.html#xpath(/p[1]/span[1]/text()[1])>
>>         itsrdf:xpath2nif <http://example.com/doc.html#char=23,34> .
>>     <http://example.com/doc.html#char=23,34>
>>         rdf:type nif:RFC5147String ;
>>         itsrdf:taIdentRef <http://sws.geonames.org/4951788> ;
>>     ...
>>
>>     Well, NIF is more for natural language processing tools and
>>     middleware, so it's overkill for just including the occasional
>>     triple now and then ...
>>
>>     All the best,
>>     Sebastian
>>
>>
>>     [1] http://www.w3.org/TR/its20/
>>     [2] http://www.w3.org/TR/its20/#conversion-to-nif
>>
>>     On 24.04.2013 22:08, John Flynn wrote:
>>>
>>>     I have long thought that a clean and simple method for
>>>     identifying terms in HTML that are instances of a specific
>>>     ontology would be a very valuable adjunct to the growth of the
>>>     Semantic Web. A number of years ago I proposed an approach to a
>>>     solution I called Instance Markup Language (1) which gained no
>>>     traction. The consensus at the time was that RDFa would provide
>>>     the solution for this need and also that it wasn't really
>>>     important because the great bulk of instance data would come
>>>     from large databases and not from HTML. I don't think RDFa has
>>>     in fact provided a "clean and simple" way to identify specific
>>>     terms in HTML text and link those terms to classes or properties
>>>     in a specific ontology. I never thought my proposed approach was
>>>     exactly right, but I did have hope it would inspire someone to
>>>     come forward with a similar, but cleaner, way to do this. Even
>>>     though the subject still occasionally comes up, after all these
>>>     years it's pretty clear I was wrong about this being an important
>>>     component of Semantic Web technology.
>>>
>>>     (1) http://mysite.verizon.net/jflynn12/IML.htm
>>>
>>>     *From:* Denny Vrandečić [mailto:denny.vrandecic@wikimedia.de]
>>>     *Sent:* Wednesday, April 24, 2013 1:59 PM
>>>     *To:* semantic-web at W3C
>>>     *Subject:* How to put an annotation in HTML?
>>>
>>>     Sorry, probably a stupid question:
>>>
>>>     Let us say, I have some HTML like this...
>>>
>>>     <p>It is well known, that Springfield has mild summers and
>>>     short, but hard winters.</p>
>>>
>>>     And now, for example in order to simplify extraction, I want to
>>>     annotate Springfield with a URI, maybe like this, to make sure
>>>     that the computer understands I mean the Springfield
>>>     in Massachusetts:
>>>
>>>     <p>It is well known, that <span
>>>     about="http://sws.geonames.org/4951788/">Springfield</span> has
>>>     mild summers and short, but hard winters.</p>
>>>
>>>     How do I actually do that?
>>>
>>>     Mind you, I don't want to add whole triples, but just annotate
>>>     the HTML and say "this element refers to the following URI".
>>>
>>>     Cheers,
>>>
>>>     Denny
>>>
>>>     --
>>>     Project director Wikidata
>>>     Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
>>>     Tel. +49-30-219 158 26-0 | http://wikimedia.de
>>>
>>>     Wikimedia Deutschland - Gesellschaft zur Förderung Freien
>>>     Wissens e.V. Registered in the register of associations at the
>>>     Amtsgericht Berlin-Charlottenburg under number 23855 B.
>>>     Recognized as charitable by the Finanzamt für Körperschaften I
>>>     Berlin, tax number 27/681/51985.
>>>
>>
>>
>>     --
>>     Dipl. Inf. Sebastian Hellmann
>>     Department of Computer Science, University of Leipzig
>>     Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
>>     http://dbpedia.org/Wiktionary , http://dbpedia.org
>>     Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>     Research Group: http://aksw.org
>>
>
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Events: NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org,
> Deadline: *July 8th*)
> Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
> http://dbpedia.org/Wiktionary , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
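
For readers who want to try the its-ta-ident-ref idea from the thread without running the XSLT, here is a minimal Python sketch. It is not the stylesheet linked above and makes no claim to match its output. It assumes that character offsets are counted over the concatenated text content of the HTML fragment passed in, and that the rdf:, nif: and itsrdf: prefixes are declared elsewhere (as in the thread's NIF snippet). The names TaIdentRefParser and annotations_to_turtle are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class TaIdentRefParser(HTMLParser):
    """Collect (start, end, text, uri) for elements carrying its-ta-ident-ref.

    Assumption: offsets are positions in the concatenated text content of
    the fragment fed to the parser, not offsets in the raw markup.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = ""          # plain text accumulated so far
        self.stack = []         # open elements: (tag, uri or None, start offset)
        self.annotations = []   # finished annotations

    def handle_starttag(self, tag, attrs):
        uri = dict(attrs).get("its-ta-ident-ref")
        self.stack.append((tag, uri, len(self.text)))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; entries for unclosed elements
        # (for example <br>) that sit above it are simply discarded.
        while self.stack:
            open_tag, uri, start = self.stack.pop()
            if open_tag == tag:
                if uri is not None:
                    end = len(self.text)
                    self.annotations.append((start, end, self.text[start:end], uri))
                break

    def handle_data(self, data):
        self.text += data


def annotations_to_turtle(html, doc_uri):
    """Render each annotation as a NIF-style Turtle snippet (prefixes omitted)."""
    parser = TaIdentRefParser()
    parser.feed(html)
    parser.close()
    blocks = []
    for start, end, _text, uri in parser.annotations:
        blocks.append(
            f"<{doc_uri}#char={start},{end}>\n"
            f"    rdf:type nif:RFC5147String ;\n"
            f"    itsrdf:taIdentRef <{uri}> ."
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = (
        '<p>It is well known, that <span '
        'its-ta-ident-ref="http://sws.geonames.org/4951788">Springfield</span> '
        'has mild summers and short, but hard winters.</p>'
    )
    print(annotations_to_turtle(sample, "http://example.com/doc.html"))
```

Run on the Springfield paragraph from the thread, the sketch yields the same char=23,34 fragment identifier that appears in the NIF example. It does not emit the itsrdf:xpath2nif link, and malformed HTML with stray end tags is not handled.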
Received on Friday, 26 April 2013 18:57:37 UTC