
Re: Question on annotation of HTML content

From: Felix Sasaki <fsasaki@w3.org>
Date: Fri, 6 Nov 2015 21:53:26 +0100
Cc: Robert Sanderson <azaroth42@gmail.com>, Ivan Herman <ivan@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Message-Id: <4D884772-2035-496E-B08D-F1AEA304918C@w3.org>
To: Benjamin Young <bigbluehat@hypothes.is>
Hi all,

thanks a lot for your feedback on this, and thanks to Ivan for pointing out the github issue related to the thread.

I should have given some background: so far, a linked data approach to annotating HTML documents with ITS information has used NIF <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html>, which several of you are aware of. To see how the ITS inline approach (which mostly uses attributes) and the NIF approach are related, we created an XML document <https://github.com/w3c/itsrdf/blob/master/its-in-markup.xml> and its NIF counterpart <https://github.com/w3c/itsrdf/blob/master/its-in-linked-data.json>. During the TPAC f2f we had a session in which I took an action to produce several ITS examples using web annotation. I had hoped to convert the XML document to web annotation, so that web annotation and NIF can also be compared easily.

By the way, you will see that the NIF example uses a character offset approach in the identifier. That is not actually needed here, since the offsets are made explicit via the nif:beginIndex and nif:endIndex predicates.
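
To make the redundancy concrete, here is a minimal Python sketch; it is not taken from the NIF example itself. It assumes the plain text "Welcome to Berlin!", a hypothetical document URI `http://example.com/doc`, zero-based end-exclusive offsets, and a `#char=begin,end` fragment style for the identifier. The helper name `nif_style_string` is made up for illustration.

```python
def nif_style_string(doc_uri: str, text: str, exact: str) -> dict:
    """Locate `exact` in `text` and describe it with explicit offsets."""
    begin = text.index(exact)   # zero-based start offset; raises ValueError if absent
    end = begin + len(exact)    # end offset, exclusive
    return {
        # Assumption: offsets are repeated in the identifier, as in the NIF example.
        # They carry no information beyond the two index properties below.
        "@id": f"{doc_uri}#char={begin},{end}",
        "nif:beginIndex": begin,
        "nif:endIndex": end,
        "nif:anchorOf": exact,
    }


resource = nif_style_string("http://example.com/doc", "Welcome to Berlin!", "Berlin")
print(resource)
```

Because the two index properties already pin down the string, any other stable identifier would work equally well in this sketch.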

- Felix
 
> On 06.11.2015 at 20:32, Benjamin Young <bigbluehat@hypothes.is> wrote:
> 
> On Fri, Nov 6, 2015 at 2:26 PM, Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> That's also quite possible, but the Annotation is a bit irrelevant :) Not a bad thing, per se. 
> 
> This is the issue of whether, and if so how, to make assertions with Annotations. We have tagging, but potential implementers and adopters should consider whether the annotation machinery is actually required, or whether using the SpecificResource pattern is sufficient.
> 
> Right. That. :)
> 
> So. How/Where should we call that out?
> 
> It's been one of the primary topics on this list for nearly a week now, so it's likely to come up again. :)
> 
> Maybe we define the core things we're adding to the world's vocabulary (SpecificResource, a few selectors, etc) and *then* explain what we're adding to those for various use cases (body, creator, etc)?
> 
> Maybe a "Specific Resource Data Model" and then a "Web Annotation Data Model" based on those bits?
> 
> Or does that cut too deeply?
> 
> Just an idea. ^_^
> 
> 
> 
> 
> On Fri, Nov 6, 2015 at 11:16 AM, Benjamin Young <bigbluehat@hypothes.is> wrote:
> Would this be better served by making an optionally body-less annotation (or "just RDF") that uses the target, SpecificResource, and selector system we've defined to add triples to that?
> 
> So that Ivan's example becomes:
> ```
> {
>   "@context" :  [
>     "http://www.w3.org/ns/anno.jsonld",
>     {
>       "itsrdf" : "http://www.w3.org/2005/11/its/rdf#"
>     }
>   ],
>   "target" : {
>     "source": "A URI TO THE TARGET",
>     "selector": {
>       "type": "TextQuoteSelector",
>       "prefix": "...", "exact": "...", "suffix": "..."
>     },
>     "itsrdf:translate" : "no"
>   }
> }
> ```
> 
> So that the resulting triples would shake out to:
> _:t0 itsrdf:translate "no"
> 
> Where `_:t0` is the auto-generated blank node identifier for the SpecificResource classed Target.
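
How the triple falls out of the JSON can be sketched by hand in Python. This is a minimal illustration, not a JSON-LD processor: it only expands prefixes defined inline in the `@context` array, ignores the remote anno.jsonld context, and simply assumes `_:t0` as the blank node label. The helper names `local_prefixes` and `target_triples` are hypothetical.

```python
ANNOTATION = {
    "@context": [
        "http://www.w3.org/ns/anno.jsonld",
        {"itsrdf": "http://www.w3.org/2005/11/its/rdf#"},
    ],
    "target": {
        "source": "A URI TO THE TARGET",
        "selector": {
            "type": "TextQuoteSelector",
            "prefix": "...", "exact": "...", "suffix": "...",
        },
        "itsrdf:translate": "no",
    },
}


def local_prefixes(context: list) -> dict:
    """Collect prefix definitions from inline context objects only."""
    prefixes = {}
    for entry in context:
        if isinstance(entry, dict):
            prefixes.update(entry)
    return prefixes


def target_triples(annotation: dict, node_id: str = "_:t0") -> list:
    """Expand prefixed properties that sit directly on the target."""
    prefixes = local_prefixes(annotation["@context"])
    triples = []
    for key, value in annotation["target"].items():
        prefix, sep, local = key.partition(":")
        # Keys without a known prefix (source, selector) are skipped here;
        # a real processor would map them through the remote context.
        if sep and prefix in prefixes:
            triples.append((node_id, prefixes[prefix] + local, value))
    return triples


triples = target_triples(ANNOTATION)
print(triples)
```

The point of the sketch is only that the subject of the `itsrdf:translate` statement is the target node, not the annotation or a body.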
> 
> I don't think at any point you'd want to say that the body shouldn't be translated...it's the target you care about translating or not...though once you've determined that you might use the body to convey the translation (but that's a separate set of examples, I'd reckon).
> 
> Thoughts?
> 
> On Fri, Nov 6, 2015 at 1:56 PM, Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> I don't *disagree* but I'm not sure that it's the best way either, as the interpretation is ambiguous as to what should not be translated.
> 
> Let's add an explicit id and another property to the body, and anonymize the itsrdf assertion:
> 
> {
>   "body": {
>     "id": "_:b0",
>     "format": "text/plain",
>     "some:property": "some value"
>   }
> }
> 
> The property and value are about the body, not about the target, just like format is.  Now if you put back the translate: no ... you would be saying not to translate the body. However, you want to say that the *target* should not be translated. 
> 
> In natural language you would do:
> 
> {
>   "body": {
>     "format": "text/plain",
>     "content": "This string should not be translated"
>   }
> }
> 
> So in machine readable form, you could say:
> 
> {
>   "body": {
>     "format": "text/turtle",
>     "content": "<uri-of-specific-resource> itsrdf:translate \"no\" . "
>   }
> }
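
One way to produce such a body programmatically is sketched below in Python, using only the standard `json` module. The function name `turtle_body` is hypothetical, and `uri-of-specific-resource` stays a placeholder as in the example. Like the example, the sketch omits the `@prefix itsrdf:` declaration that a standalone Turtle document would need.

```python
import json


def turtle_body(resource_uri: str, predicate: str, literal: str) -> dict:
    """Build an annotation body whose content is a single Turtle statement."""
    # Escape backslashes and double quotes inside the Turtle string literal.
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    statement = f'<{resource_uri}> {predicate} "{escaped}" .'
    return {"body": {"format": "text/turtle", "content": statement}}


annotation = turtle_body("uri-of-specific-resource", "itsrdf:translate", "no")
# json.dumps takes care of the JSON-level escaping of the embedded quotes.
serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Letting the JSON serializer handle the second layer of escaping avoids hand-writing sequences like `\"no\"`.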
> 
> Does that help?
> 
> Rob
> 
> 
> On Fri, Nov 6, 2015 at 8:32 AM, Ivan Herman <ivan@w3.org> wrote:
> Hm. 
> 
> I believe that, in fact, what you wrote is almost correct as it is, provided that you add an additional context for that namespace. I.e., in terms of JSON-LD, what you would do is:
> 
> {
>   "@context" :  [
>     "http://www.w3.org/ns/anno.jsonld",
>     {
>       "itsrdf" : "http://www.w3.org/2005/11/its/rdf#"
>     }
>   ],
>   "target" : "A URI TO THE TARGET",
>   "body" : {
>     "itsrdf:translate" : "no"
>   }
> }
> 
> The trick is that JSON-LD allows multiple contexts to be mixed in. I believe that should be a bona fide (albeit unusual) annotation in the model, but maybe Rob will disagree.
> 
> However, if it actually *is* a correct annotation, we may want to call out this type of example somewhere in the document… Annotations may want to use terms from other vocabularies after all…
> 
> Ivan
> 
> 
> 
>> On 6 Nov 2015, at 17:07, Felix Sasaki <fsasaki@w3.org> wrote:
>> 
>> 
>>> On 06.11.2015 at 16:31, Ivan Herman <ivan@w3.org> wrote:
>>> 
>>>> 
>>>> On 6 Nov 2015, at 15:35, Felix Sasaki <fsasaki@w3.org> wrote:
>>>> 
>>>> Hello all,
>>>> 
>>>> apologies for this newbie question. I am looking for an example of annotating HTML content. Imagine I have the following document:
>>>> 
>>>> <!DOCTYPE html>
>>>> <html lang="en">
>>>> <head>
>>>>   <meta charset="utf-8">
>>>>   <title>some html doc</title>
>>>> 
>>>> </head>
>>>> <body>
>>>>  <p>Welcome to <strong>Berlin</strong>!</p>
>>>> </body>
>>>> </html>
>>>> 
>>>> I want to create an annotation that uses the web annotation model, uses a text selector for the string "Berlin" and adds an annotation body containing a triple with the "translate" predicate from the ITS 2.0 ontology, see
>>>> http://www.essepuntato.it/lode/https://raw.githubusercontent.com/w3c/itsrdf/master/its-rdf.rdf#d4e52
>>>> expressing that the string should not be translated. What would this look like?
>>> 
>>> I am not sure what you intend to do. Do you mean that the target should be a graph containing a specific triple?
>> 
>> 
>> The target should be a selector selecting the string "Berlin". The annotation body should contain a triple like
>> 
>> "body": {
>>     
>> "itsrdf:translate" : "no",
>> 
>> … }
>> 
>> So I am wondering how to express this target and what the body should look like.
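
For the target half of the question, a text quote selector for "Berlin" could be built roughly as follows. This is a minimal Python sketch under a few assumptions: the selector is matched against the text content of the paragraph ("Welcome to Berlin!", markup stripped), the amount of surrounding context is an arbitrary choice, and `text_quote_selector` and `http://example.com/doc.html` are hypothetical names.

```python
def text_quote_selector(text: str, exact: str, context_chars: int = 11) -> dict:
    """Describe the first occurrence of `exact` with some surrounding context."""
    start = text.index(exact)   # raises ValueError if the string is absent
    end = start + len(exact)
    return {
        "type": "TextQuoteSelector",
        "prefix": text[max(0, start - context_chars):start],
        "exact": exact,
        "suffix": text[end:end + context_chars],
    }


selector = text_quote_selector("Welcome to Berlin!", "Berlin")
target = {"source": "http://example.com/doc.html", "selector": selector}
print(target)
```

The prefix and suffix only serve to disambiguate repeated occurrences of the quoted string, so their length is a tuning choice rather than something fixed by the example.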
>> 
>> - Felix
>> 
>> 
>>> 
>>> Ivan
>>> 
>>> 
>>>> 
>>>> Thanks for the feedback in advance,
>>>> 
>>>> Felix
>>> 
>>> 
>>> ----
>>> Ivan Herman, W3C
>>> Digital Publishing Lead
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
> 
> 
> -- 
> Rob Sanderson
> Information Standards Advocate
> Digital Library Systems and Services
> Stanford, CA 94305


Received on Friday, 6 November 2015 20:53:39 UTC
