Re: Web annotations for physical texts from Benjamin Young on 2018-10-15 (public-openannotation@w3.org from October 2018)

From: Benjamin Young <byoung@bigbluehat.com>
Date: Mon, 15 Oct 2018 20:33:02 +0000
To: Christopher Blackwell <cwblackwell@gmail.com>, "public-openannotation@w3.org" <public-openannotation@w3.org>
CC: Steven Harms <sgharms@stevengharms.com>
Message-ID: <BN6PR06MB27700364053AEB3FCD25F28CB2FD0@BN6PR06MB2770.namprd06.prod.outlook.com>
It's likely best--given the vast array of options--that one store as many matching target expressions as one is able to generate at the time the annotation is recorded (or perhaps later with machines).

As in...
```
{
  "target": [
    "urn:cts:....",
    {
      "source": "urn:isbn:...",
      "selector": {
        ...some nifty new selector for physical dimensions, pages, etc...
      }
    }
  ]
}
```
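
As a rough illustration of the suggestion above, here is a minimal Python sketch that assembles one annotation carrying several target expressions for the same passage. The URNs are placeholders, and the `PageSelector` type with its `page` field is invented for this example; it is not defined by the Web Annotation Data Model. `build_annotation` is likewise a hypothetical helper name.

```python
import json


def build_annotation(body_text, targets):
    """Return a Web Annotation dict that lists every known target expression."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": body_text},
        # Each entry is either a bare IRI string or a source + selector object.
        "target": list(targets),
    }


# Placeholder identifiers for illustration; substitute real ones.
cts_target = "urn:cts:exampleNamespace:exampleWork:1.1"
page_target = {
    "source": "urn:isbn:0000000000",
    # "PageSelector" is an invented selector type, not part of the
    # Web Annotation Data Model.
    "selector": {"type": "PageSelector", "page": 42},
}

annotation = build_annotation("A marginal note.", [cts_target, page_target])
print(json.dumps(annotation, indent=2))
```

Keeping every expression in the `target` array means a consumer that understands only one of the schemes can still resolve the annotation.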

There are also EPUB CFIs, of course...and likely many more we've missed... >_>

As this exploration goes along, if anyone wants to write these findings up on the wiki, that'd be super amazing:
https://www.w3.org/community/openannotation/wiki/Main_Page

Cheers!
Benjamin


--

http://bigbluehat.com/

http://linkedin.com/in/benjaminyoung

________________________________
From: Christopher Blackwell <cwblackwell@gmail.com>
Sent: Saturday, October 13, 2018 5:39 PM
To: public-openannotation@w3.org
Cc: Steven Harms
Subject: Re: Web annotations for physical texts

Hi Steven,

Some thoughts on your questions…

CTS URNs are for machine-actionable identification and retrieval of passages of text, so their job really is different from that of a human-readable label. In our projects we use the plain text CEX format ( https://cite-architecture.github.io/citedx/CEX-spec-3.0.1/ ) for capturing data and loading it into services, and it is at that level that we can attach human-readable labels to works and editions.

Here’s a link that will (after a short delay, the server seems a little slow today) deliver a passage of text, with a label attached (and linked commentaries and some other stuff):

http://www.homermultitext.org/hmt-digital/index.html?urn=urn:cts:greekLit:tlg0012.tlg001.msA:1.1-1.5

As for citing a page of a book… CTS really is about _texts_ rather than _books_. A CTS-URN captures the semantics of a “text” defined as “an ordered hierarchy of citation objects”.
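
To make the "ordered hierarchy" concrete, here is a minimal Python sketch that splits the CTS URN from the link above into its colon-delimited components and its dot-separated citation levels. It is a simplified reading of the URN syntax, not a full parser: `parse_cts_urn` and `CtsUrn` are hypothetical names, and subreferences (the part after `@`) are simply discarded.

```python
from typing import List, NamedTuple, Optional


class CtsUrn(NamedTuple):
    namespace: str
    work: List[str]                     # work hierarchy, outermost level first
    passage_start: Optional[List[str]]  # citation levels of the (first) passage
    passage_end: Optional[List[str]]    # citation levels of the range end, if any


def _levels(reference: str) -> List[str]:
    """Drop any @subreference, then split the citation levels on dots."""
    return reference.split("@")[0].split(".")


def parse_cts_urn(urn: str) -> CtsUrn:
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace = parts[2]
    work = parts[3].split(".")
    passage = parts[4] if len(parts) > 4 and parts[4] else None
    if passage is None:
        return CtsUrn(namespace, work, None, None)
    start, _, end = passage.partition("-")
    return CtsUrn(namespace, work, _levels(start), _levels(end) if end else None)


parsed = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.msA:1.1-1.5")
print(parsed)
```

Nothing in the parsed result refers to a page; the passage component only names positions in the citation hierarchy of the text.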

For our texts, at least, pages in a physical edition constitute a structure orthogonal to the citation hierarchy of a work.

So I don’t think there is a low-friction way to bend CTS away from canonically citable (= citations independent of any particular expression of a text) texts to texts citable only by pages in a particular printed edition.

We associate CTS texts with “pages”, but it involves quite a bit of integration. This might be way more than you want to get into, but to give an example…

http://www.homermultitext.org/hmt-digital/index.html?urn=urn:cite2:hmt:msA.v1:12r

The above is a URL that will display an object in an ordered collection of manuscript folios; “urn:cite2:hmt:msA.v1:12r” identifies folio 12-recto of a physical manuscript.

And this is a record that identifies a graph of (a) a passage of text (CtsUrn), (b) a physical folio (Cite2Urn), and (c) a digital image mapping the passage on the folio:

http://www.homermultitext.org/hmt-digital/index.html?urn=urn:cite2:hmt:va_dse.v1:il10
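
A record of this kind can be pictured as a small three-way association. The Python sketch below is only an illustration of that shape: the class name, field names, and the `example` URNs are hypothetical, and the pairing of passage and folio reuses URNs quoted in this thread for illustration rather than copying the contents of the actual record.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TextOnSurfaceRecord:
    record: str   # identifier of the record itself
    passage: str  # CTS URN of the text passage
    surface: str  # CITE2 URN of the physical folio
    image: str    # identifier of the image showing the passage on the folio


def passages_on(surface: str, records: Iterable[TextOnSurfaceRecord]) -> List[str]:
    """Return the passages recorded as appearing on the given folio."""
    return [r.passage for r in records if r.surface == surface]


records = [
    TextOnSurfaceRecord(
        record="urn:cite2:example:records.v1:r1",   # made-up identifier
        passage="urn:cts:greekLit:tlg0012.tlg001.msA:1.1-1.5",
        surface="urn:cite2:hmt:msA.v1:12r",
        image="urn:cite2:example:images.v1:img1",   # made-up identifier
    ),
]

print(passages_on("urn:cite2:hmt:msA.v1:12r", records))
```

With records like these, "which text is on this page?" becomes a lookup over the associations rather than something encoded in the CTS URN itself.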

Cheers,
Chris B.


--
Christopher W. Blackwell
The Louis G. Forgione University Professor
Department of Classics
Furman University

On Oct 13, 2018, at 4:14 AM, Steven Harms <sgharms@stevengharms.com> wrote:

Given two endorsements for CTS in short order, I read the description; it seemed intuitive and seemed to cover the required specificity easily. As such:

urn:cts:CTSNAMESPACE:WORK:PASSAGE@SUBREFERENCE

Would become

urn:cts:isbn:###:<PASSAGE>

Pros:

1. Intuitive!

Cons:

1. With ISBN we lose the human friendliness of, say, “JK Rowling wrote HP & the Philosopher’s Stone.” This can be remedied, of course, by a higher container holding human-friendly data, but it seems like an obvious nit to address. MLA and other citation schemes preserve this visibly in the citation.

Question:

1. How to handle <PASSAGE> in a book?

Pasting the full text seems onerous. To annotate passage p, I don’t want to have to type in passage p *and* my annotation. This would also set one afoul of copyright holders.

Further, range offsets, while completely reasonable, are generally not given outside of epic poetry or other classics.

Certainly many e-readers make this calculation possible and that will surely be the correct scheme for annotations from that medium. However, my focus remains real books ;)

The most common scheme for a popular book would be the page. The docs state, failing an offset:

> A reference to an individual passage is formatted as dot-separated components representing one or more levels of the citation hierarchy defined in a CTS TextInventory for that work.

Now for most popular works, there is no CTS TextInventory — to the best of my knowledge.

So: is there a low-friction way to refer to a page?

Thanks for the suggestions to now,

Steven


(Typos and blunders my own as I’m on vacation without access to a keyboard ;))





On Thu, Oct 11, 2018 at 3:54 AM Christopher Blackwell <cwblackwell@gmail.com> wrote:
Dear Steven,

The CTS URN might be helpful:

http://cite-architecture.github.io/ctsurn/

Part of the CITE Architecture: http://cite-architecture.github.io/

(Disclosure: This is a thing I’ve worked on over the years.)

This blog post points to some live examples of real data integrated with CTS URNs:

http://homermultitext.blogspot.com/2018/07/the-homer-multitext-microservice-homer.html

If this looks at all interesting, please don’t hesitate to send along further questions.

Cheers,
Chris B.


--
Christopher W. Blackwell
The Louis G. Forgione University Professor
Department of Classics
Furman University

On Oct 10, 2018, at 1:57 PM, Steven Harms <sgharms@stevengharms.com> wrote:

Greetings,

I am interested in creating annotations on physical books [1].

As the name "web annotations" suggests, the default goal of the Web Annotation Working Group would be, of course, to annotate IRI-referable targets with IRI-identifiable Annotations.

1. Is there a model whereby we could point to a physical resource in a URI / IRI format (and thus join the existing Web Annotation universe), *or*
2. Is there a framework that might support referring to physical books that I've simply not found, *or*
3. Should I plan to use JSON-LD to "forge my own path"?

I hope to post an example of what #3 might look like, but I'd like to double check my understanding before engaging in such an effort, tabula rasa.

Regards,

Steven


[1]: https://stevengharms.com/research/semweb-topic/problem_statement/

--
Steven G. Harms
PGP: E6052DAF (https://pgp.mit.edu/pks/lookup?op=get&search=0x337AF45BE6052DAF)

Received on Monday, 15 October 2018 20:33:30 UTC