Re: How to put an annotation in HTML?

Hi Hugh,

On 27.04.2013 18:47, Hugh Glaser wrote:
> Actually, your example <span 
> its-ta-ident-ref="http://usefulinc.com/ns/doap#developer">someone who 
> works on</span> is quite interesting as #developer is an rdf:Property. 
> This might actually be problematic later in RDF as it causes OWL Full, 
> when used as an object.
> Ah - I think that is why I put it in - to see what happened :-)
> I was thinking of putting a Class in as well, but I guess that makes less difference.

Classes are handled with its-ta-class-ref. Named Entity Recognition and 
Linking (i.e. assigning a class such as Person plus an entity link) is a 
much more common use case than relation extraction, which is why we 
included it from the start. This separation was already given by the 
language tools anyhow. Distinguishing between instances, properties 
(object, datatype), classes and annotations is OWL-specific, so the 
motivation and rationale come from a different domain.
-- Sebastian

>> <http://example.com/doc.html#char=x,y>
>>    rdf:type              nif:RFC5147String ;
>>    itsrdf:taIdentRef     <http://usefulinc.com/ns/doap#developer> ;
>>
>> One solution would be to make http://www.w3.org/2005/11/its/rdf#taIdentRef an rdf:Property instead of an ObjectProperty, leaving it underspecified.
>> Clearly this is not ideal for reasoners, but quite acceptable.
>>
>> The other solution would be to add one more attribute to ITS for properties.
>>
>> Any other ideas?
> Sorry, I wouldn't dare - I am more of a parasite, feeding on the hard work of people who do the standards service.
> And my apologies to people who are having severe indigestion at the example I used, as opposed to encoding it properly in RDF.
> Best
>> All the best,
>> Sebastian
>>
>>
>> On 27.04.2013 14:22, Hugh Glaser wrote:
>>> Great discussion, finding out about ITS in this context - thanks Sebastian.
>>> I would never have found section 5.4 otherwise, or even thought that ITS had much direct relevance to RDF - the abstract certainly doesn't fire me with RDF enthusiasm :-)
>>>
>>> That's some serious algorithm to do for its-ta-ident-ref (is that a Eurovision Song Contest entry?).
>>> But it starts from a seriously simple annotation (which Denny asked for), which is exactly what we should be providing.
>>> I can imagine (in some universe!) lots of documents getting things like
>>> <span its-ta-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
>>> all over the place, to great advantage for us consumers.
>>>
>>> By hand, or really simple tools to use.
>>> I am guessing there are such tools of which I am ignorant?
>>>
>>> Am I right in thinking I can have things like?:
>>> Am <span its-ta-ident-ref="http://id.ecs.soton.ac.uk/person/21">I</span> right in thinking that <span its-ta-ident-ref="http://www.w3.org/People/Berners-Lee/card#i">Tim</span> is <span its-ta-ident-ref="http://usefulinc.com/ns/doap#developer">someone who works on</span> the <span its-ta-ident-ref="http://dig.csail.mit.edu/2005/ajar/ajaw/data#Tabulator">Tabulator</span>? <span its-ta-ident-ref="http://www.w3.org/People/Berners-Lee/card#i">He</span> says so in his <span its-ta-ident-ref="http://www.w3.org/People/Berners-Lee/card">personal profile</span>.
>>>
>>> Cheers
>>> On 26 Apr 2013, at 15:05, Denny Vrandečić<denny.vrandecic@wikimedia.de>  wrote:
>>>
>>>> Sebastian,
>>>>
>>>> thanks! its-ta-ident-ref is perfect! That's exactly what I have been looking for.
>>>>
>>>> The only drawbacks are that it is not a Recommendation yet (what's the timeline here?), which is not so terrible, and that this is possibly the worst attribute name I have seen so far in HTML.
>>>>
>>>> Still, that's what I am going to use! Thanks,
>>>> Cheers,
>>>> Denny
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2013/4/26 Sebastian Hellmann<hellmann@informatik.uni-leipzig.de>
>>>> Hi John and Denny,
>>>> the problem is well known and RDFa has its limits. Please see the new ITS 2.0 spec [1], which provides a solution for this. ITS 2.0 will likely be widely adopted by the CMS and translation industries, and it has an RDF conversion using NIF [2].
>>>>
>>>> @Denny: For your request RDFa should be fine, if you just want to include:
>>>> <http://sws.geonames.org/4951788>  a owl:Thing .
>>>>
>>>> Note that the resulting RDF does not contain any provenance information, so I am unsure whether calling it an "annotation" is appropriate. It is rather an inclusion of extra triples in HTML.
>>>> You are losing any reference to "Springfield", as RDFa parsers don't support this.
>>>> Turtle in HTML would also be an easy option: http://www.w3.org/TR/turtle/#xhtml
>>>>
>>>> ITS 2.0 example:
>>>> <p>It is well known, that <span its-ta-ident-ref="http://sws.geonames.org/4951788">Springfield</span> has mild summers and short, but hard winters.</p>
>>>> NIF:
>>>> ...
>>>> <http://example.com/doc.html#xpath(/p[1]/span[1]/text()[1])>      itsrdf:xpath2nif <http://example.com/doc.html#char=23,34> .
>>>> <http://example.com/doc.html#char=23,34>
>>>>     rdf:type              nif:RFC5147String ;
>>>>     itsrdf:taIdentRef     <http://sws.geonames.org/4951788> ;
>>>> ...
>>>>
>>>> Well, NIF is more for natural language processing tools and middleware, so it's overkill for just including the occasional triple now and then ...
>>>>
>>>> All the best,
>>>> Sebastian
>>>>
>>>>
>>>>
>>>> [1] http://www.w3.org/TR/its20/
>>>> [2] http://www.w3.org/TR/its20/#conversion-to-nif
>>>>
>>>> On 24.04.2013 22:08, John Flynn wrote:
>>>>> I have long thought that a clean and simple method for identifying terms in HTML that are instances of a specific ontology would be a very valuable adjunct to the growth of the Semantic Web. A number of years ago I proposed an approach to a solution I called Instance Markup Language (1), which gained no traction. The consensus at the time was that RDFa would provide the solution for this need, and also that it wasn't really important because the great bulk of instance data would come from large databases and not from HTML. I don't think RDFa has in fact provided a "clean and simple" way to identify specific terms in HTML text and link those terms to classes or properties in a specific ontology. I never thought my proposed approach was exactly right, but I did hope it would inspire someone to come forward with a similar, but cleaner, way to do this. Even though the subject still occasionally comes up, after all these years it's pretty clear I was wrong about this being an important component of Semantic Web technology.
>>>>>
>>>>>
>>>>>
>>>>> (1) http://mysite.verizon.net/jflynn12/IML.htm
>>>>>
>>>>>
>>>>>
>>>>> From: Denny Vrandečić [mailto:denny.vrandecic@wikimedia.de]
>>>>> Sent: Wednesday, April 24, 2013 1:59 PM
>>>>> To: semantic-web at W3C
>>>>> Subject: How to put an annotation in HTML?
>>>>>
>>>>>
>>>>>
>>>>> Sorry, probably a stupid question:
>>>>>
>>>>>
>>>>>
>>>>> Let us say, I have some HTML like this...
>>>>>
>>>>>
>>>>>
>>>>> <p>It is well known, that Springfield has mild summers and short, but hard winters.</p>
>>>>>
>>>>>
>>>>>
>>>>> And now, for example in order to simplify extraction, I want to annotate Springfield with a URI, maybe like this, to make sure that the computer understands I mean the Springfield in Massachusetts:
>>>>>
>>>>>
>>>>>
>>>>> <p>It is well known, that <span about="http://sws.geonames.org/4951788/">Springfield</span> has mild summers and short, but hard winters.</p>
>>>>>
>>>>>
>>>>>
>>>>> How do I actually do that?
>>>>>
>>>>>
>>>>>
>>>>> Mind you, I don't want to add whole triples, but just annotate the HTML and say "this element refers to the following URI".
>>>>>
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Denny
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Project director Wikidata
>>>>> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
>>>>> Tel. +49-30-219 158 26-0 |http://wikimedia.de
>>>>>
>>>>> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the district court (Amtsgericht) Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit organization by the tax office for corporations (Finanzamt für Körperschaften) I Berlin, tax number 27/681/51985.
>>>>>
>>>> -- 
>>>> Dipl. Inf. Sebastian Hellmann
>>>> Department of Computer Science, University of Leipzig
>>>> Projects: http://nlp2rdf.org , http://linguistics.okfn.org , http://dbpedia.org/Wiktionary , http://dbpedia.org
>>>> Homepage:http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>>> Research Group:http://aksw.org
>>>>
>>>>
>>>>
>>
>>
>


-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events: NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, 
Deadline: *July 8th*)
Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf 
Projects: http://nlp2rdf.org , http://linguistics.okfn.org , 
http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

Received on Saturday, 27 April 2013 18:13:37 UTC