Re: How to put an annotation in HTML? from Hugh Glaser on 2013-04-27 (semantic-web@w3.org from April 2013)

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Sat, 27 Apr 2013 12:22:26 +0000
To: Denny Vrandečić <denny.vrandecic@wikimedia.de>, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
CC: semantic-web at W3C <semantic-web@w3c.org>
Message-ID: <EF8B744A-DDE4-49EE-A6AD-9169DF6C07E3@soton.ac.uk>
Great discussion, finding out about ITS in this context - thanks Sebastian.
I would never have found section 5.4 otherwise, or even thought that ITS had much direct relevance to RDF - the abstract certainly doesn't fire me with RDF enthusiasm :-)

That's some serious algorithm to do for its-ta-ident-ref (is that a Eurovision Song Contest entry?).
But it starts from a seriously simple annotation (which Denny asked for), which is exactly what we should be providing.
I can imagine (in some universe!) lots of documents getting things like
<span its-ta-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
all over the place, to great advantage for us consumers.

They could be added by hand, or with really simple tools.
I am guessing there are such tools of which I am ignorant?

Am I right in thinking I can have things like this?
Am <span its-ta-ident-ref="http://id.ecs.soton.ac.uk/person/21">I</span> right in thinking that <span its-ta-ident-ref="http://www.w3.org/People/Berners-Lee/card#i">Tim</span> is <span its-ta-ident-ref="http://usefulinc.com/ns/doap#developer">someone who works on</span> the <span its-ta-ident-ref="http://dig.csail.mit.edu/2005/ajar/ajaw/data#Tabulator">Tabulator</span>? <span its-ta-ident-ref="http://www.w3.org/People/Berners-Lee/card#i">He</span> says so in his <span its-ta-ident-ref="http://www.w3.org/People/Berners-Lee/card">personal profile</span>.
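
On the consumer side, here is a minimal sketch of pulling such annotations out of a page, using only the Python standard library. It is not the ITS 2.0 processing algorithm: it only reads the its-ta-ident-ref attribute locally on each element and ignores everything else the spec's algorithm does. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class IdentRefCollector(HTMLParser):
    """Collect (text, URI) pairs from elements carrying its-ta-ident-ref."""

    ATTR = "its-ta-ident-ref"

    def __init__(self):
        super().__init__()
        self.pairs = []   # finished (text, uri) annotations
        self._stack = []  # open elements as (tag, uri or None, text parts)

    def handle_starttag(self, tag, attrs):
        uri = dict(attrs).get(self.ATTR)
        self._stack.append((tag, uri, []))

    def handle_endtag(self, tag):
        if tag not in [open_tag for open_tag, _, _ in self._stack]:
            return  # stray end tag, ignore it
        # Pop until the matching start tag, so unclosed void elements
        # such as <br> do not leave stale entries on the stack.
        while self._stack:
            open_tag, uri, parts = self._stack.pop()
            if uri is not None:
                self.pairs.append(("".join(parts), uri))
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Text belongs to every annotated element that is currently open.
        for _, uri, parts in self._stack:
            if uri is not None:
                parts.append(data)


def extract_ident_refs(html):
    """Return a list of (annotated text, URI) pairs found in html."""
    collector = IdentRefCollector()
    collector.feed(html)
    collector.close()
    return collector.pairs


if __name__ == "__main__":
    sample = ('<p>I flew to <span its-ta-ident-ref='
              '"http://dbpedia.org/resource/Dublin">Dublin</span> today.</p>')
    for text, uri in extract_ident_refs(sample):
        print(text, "->", uri)
```

Run on the Dublin span above, this prints the annotated text next to its URI, which is about all a simple consumer needs.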

Cheers
On 26 Apr 2013, at 15:05, Denny Vrandečić <denny.vrandecic@wikimedia.de> wrote:

> Sebastian,
> 
> thanks! its-ta-ident-ref is perfect! That's exactly what I have been looking for.
> 
> The only drawbacks are that it is not a Recommendation yet (what's the timeline here?), which is not so terrible, and that this is possibly the worst attribute name I have seen so far in HTML.
> 
> Still, that's what I am going to use! Thanks,
> Cheers,
> Denny
> 
> 
> 
> 
> 
> 2013/4/26 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
> Hi John and Denny,
> The problem is well known, and RDFa has its limits. Please see the new ITS 2.0 spec [1], which provides a solution for this. ITS 2.0 will likely be widely adopted by CMS and the translation industry, and it has an RDF conversion using NIF [2].
> 
> @Denny: For your request RDFa should be fine, if you just want to include:
> <http://sws.geonames.org/4951788> a owl:Thing .
> 
> Note that the resulting RDF does not contain any provenance information, so I am unsure whether calling it an "annotation" is appropriate. It is rather an inclusion of extra triples in HTML.
> You are losing any reference to "Springfield", as RDFa parsers don't support this.
> Turtle in HTML would also be an easy option: http://www.w3.org/TR/turtle/#xhtml
> 
> ITS 2.0 example:
> <p>It is well known, that <span its-ta-ident-ref="http://sws.geonames.org/4951788">Springfield</span> has mild summers and short, but hard winters.</p>
> NIF:
> ...
> <http://example.com/doc.html#xpath(/p[1]/span[1]/text()[1])> 
>    itsrdf:xpath2nif <http://example.com/doc.html#char=23,34> .
> <http://example.com/doc.html#char=23,34>
>    rdf:type              nif:RFC5147String ;
>    itsrdf:taIdentRef  <http://sws.geonames.org/4951788> ;
> ...
> 
> Well, NIF is more for natural language processing tools and middleware, so it's overkill for just including the occasional triple now and then ...
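
The character offsets in the NIF fragment above can be checked with a little arithmetic. This is a minimal sketch, assuming the offsets are counted over the paragraph's plain text, starting at 0, with an exclusive end. The helper name char_fragment is made up for illustration, and this is not the conversion algorithm from the spec.

```python
def char_fragment(base_uri, text, mention):
    """Return a '#char=begin,end' URI for the first occurrence of mention."""
    begin = text.find(mention)
    if begin < 0:
        raise ValueError(f"{mention!r} not found in text")
    end = begin + len(mention)  # end offset is exclusive
    return f"{base_uri}#char={begin},{end}"


if __name__ == "__main__":
    text = ("It is well known, that Springfield has mild summers "
            "and short, but hard winters.")
    print(char_fragment("http://example.com/doc.html", text, "Springfield"))
```

For the Springfield sentence this yields http://example.com/doc.html#char=23,34, matching the fragment in the example.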
> 
> All the best,
> Sebastian
> 
> 
> 
> [1] http://www.w3.org/TR/its20/
> [2] http://www.w3.org/TR/its20/#conversion-to-nif
> 
> On 24.04.2013 22:08, John Flynn wrote:
>> I have long thought that a clean and simple method for identifying terms in HTML that are instances of a specific ontology would be a very valuable adjunct to the growth of the Semantic Web. A number of years ago I proposed an approach to a solution I called Instance Markup Language (1), which gained no traction. The consensus at the time was that RDFa would provide the solution for this need and also that it wasn't really important because the great bulk of instance data would come from large databases and not from HTML. I don't think RDFa has in fact provided a "clean and simple" way to identify specific terms in HTML text and link those terms to classes or properties in a specific ontology. I never thought my proposed approach was exactly right, but I did have hope it would inspire someone to come forward with a similar, but cleaner, way to do this. Even though the subject still occasionally comes up, after all these years it's pretty clear I was wrong about this being an important component of Semantic Web technology.
>> 
>> 
>> 
>> (1) http://mysite.verizon.net/jflynn12/IML.htm
>> 
>> 
>> 
>> From: Denny Vrandečić [mailto:denny.vrandecic@wikimedia.de] 
>> Sent: Wednesday, April 24, 2013 1:59 PM
>> To: semantic-web at W3C
>> Subject: How to put an annotation in HTML?
>> 
>> 
>> 
>> Sorry, probably a stupid question:
>> 
>> 
>> 
>> Let us say, I have some HTML like this...
>> 
>> 
>> 
>> <p>It is well known, that Springfield has mild summers and short, but hard winters.</p>
>> 
>> 
>> 
>> And now, for example in order to simplify extraction, I want to annotate Springfield with a URI, maybe like this, to make sure that the computer understands I mean the Springfield in Massachusetts:
>> 
>> 
>> 
>> <p>It is well known, that <span about="http://sws.geonames.org/4951788/">Springfield</span> has mild summers and short, but hard winters.</p>
>> 
>> 
>> 
>> How do I actually do that?
>> 
>> 
>> 
>> Mind you, I don't want to add whole triples, but just annotate the HTML and say "this element refers to the following URI".
>> 
>> 
>> 
>> Cheers,
>> 
>> Denny
>> 
>> 
>> 
>> 
>> 
>> -- 
>> Project director Wikidata
>> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
>> Tel. +49-30-219 158 26-0 | http://wikimedia.de
>> 
>> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. (Society for the Promotion of Free Knowledge). Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg (district court) under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin (tax office), tax number 27/681/51985.
>> 
> 
> 
> -- 
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig 
> Projects: http://nlp2rdf.org , http://linguistics.okfn.org , http://dbpedia.org/Wiktionary , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
> 
> 
> 

-- 
Hugh Glaser
  20 Portchester Rise
  Eastleigh
  SO50 4QS
Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
Received on Saturday, 27 April 2013 12:23:06 UTC