Review of future/core.html

This is a review of http://www.openannotation.org/spec/future/
(which calls itself
http://www.openannotation.org/spec/core/20130128.html) and of the first
section, section 2, http://www.openannotation.org/spec/future/core.html

I am sorry this comes a bit late, but these email drafts just hang
around and take forever to write, so I'm just sending what I have.


BTW - which of the annotation tools could be good for doing exactly
this kind of email and review? :)


Summary:

This reads very well! The specification is beautiful. I am getting
prouder and prouder of the high quality of this specification, and I
am getting such feedback from others as well.

Below I suggest a few terminology clarifications, some technical tweaks
to the examples, a relaxation of the rules on fragment URIs for semantic
resources, and a clarification of which resources the provenance terms
are to be used on.



First of all - I think the splitting into levels and getting rid of
the OAX spec is a great improvement. Good job!

> http://www.openannotation.org/spec/future/

1) The "This Version" and "Previous Version" links and their text are
wrong. (Can we please try to get these right? Those links are most
important to exactly this group.)


2) The document is split into several HTML pages, but there is no
obvious link to section 2 etc. from the bottom of the front page, so
it's not obvious where to go next. I propose "previous  contents  next"
links at the top *and* bottom of every page; the index page only
needs them at the bottom.


> http://www.openannotation.org/spec/future/core.html#BodyTarget

3) "The Body and Target MAY be of any media type" - I would change
this to lower-case "may" - or are you suggesting there are cases when
they are not of any media type?


4) "See Further Examples" links don't work.


> http://www.openannotation.org/spec/future/core.html#BodyEmbed

>    dc:format "mimetype1" .

5) "mimetype1" is not a valid media type. Change the example to an actual media type, like:

   dc:format "text/plain" .


> If known, the MIME type of the text SHOULD be given using the dc:format property

6) The 'correct' term is "media type", and the link should rather go
to http://www.iana.org/assignments/media-types - "mime type" is also
mentioned later on this page; search-and-replace it with "media type".

(I know we should 'really' be using dct:format to formally say this is
a media type. dc:format also allows physical formats like "brochure"
and "political poster" -
http://dublincore.org/documents/1998/10/23/format-element/  -- However
dct:format becomes a bit more verbose: https://gist.github.com/4635250
- I use the second form, but I would not push for it here.)
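
For anyone not following the gist link, the more verbose dct:format
shape looks roughly like the sketch below. This is only an illustration
of the usual DCMI pattern; the blank node, the dcam:memberOf statement
and the :body1 identifier are assumptions for the example, not a copy of
the gist and not a proposal for the spec.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dcam: <http://purl.org/dc/dcam/> .
@prefix :     <http://www.example.com/anno1#> .   # hypothetical document

# Sketch only: the format is a resource rather than a plain literal.
:body1 dct:format [
        a dct:MediaType ;          # says the value is a media type
        dcam:memberOf dct:IMT ;    # drawn from the IANA media type scheme
        rdf:value "text/plain"
    ] .
```

Compare that with the one-line dc:format "text/plain" and it is easy to
see why the short form is attractive for the examples.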



>  oa:hasBody <body1> ;

>  <body1> a cnt:ContentAsText, dctypes:Text ;
>    cnt:chars "content1" ;

7) Could I suggest :body1 as the identifier here instead of <body1>, to
indicate that the URIs for embedded bodies would typically be
non-resolvable?



> Query: Find all of the annotations with embedded, textual comments.

8) Could I suggest changing it to "find all textual comments"? It is
slightly more realistic, and should make it easier to see that this is
not a particularly tricky model.


 SELECT ?comment WHERE {
    ?anno oa:hasBody ?body .
    ?body a dctypes:Text ;
        cnt:chars ?comment }



> Most fragments are defined with respect to individual MIME types, and not every MIME type has a fragment specification.
> Even if a MIME type does have a fragment definition, it is often not possible to describe the segment of interest sufficiently precisely. For example, fragments for HTML cannot be used to describe an arbitrary range of text.

9) As above, "MIME type" -> "media type"


> Fragment URIs Identifying Body or Target
> It is not possible to determine with certainty what is being identified, as the same fragment string might be possible in different specifications. For example, the same fragment could identify either a semantic resource in RDFa or a section of the HTML document.

10) RDF 1.1 will however clarify this:
http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/#section-fragID

In cases where other specifications constrain the semantics of
fragment identifiers in RDF-bearing representations, the encoded RDF
graph should use fragment identifiers in a way that is consistent with
these constraints. For example, in an HTML+RDFa document [HTML-RDFA],
the fragment chapter1 may identify a document section via the
semantics of HTML's @name or @id attributes. The IRI <#chapter1>
should then be taken to denote that same section in any RDFa-encoded
triples within the same document. Similarly, if the @xml:id attribute
[XML-ID] is used in an RDF/XML document, then the corresponding IRI
should be taken to denote an XML element.


11) This section should clarify that semantic terms, such as semantic
tags with oa:Tag, will often be in the form of fragment URIs. As the
fragment here does not select a part of a resource but identifies a
concept, such URIs are perfectly OK and SHOULD NOT be specified using a
Selector.
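
To make this concrete, here is a sketch of a semantic tag whose URI
happens to contain a fragment. The annotation and vocabulary URIs are
made up for illustration, the oa: namespace is what I assume the spec
uses, and typing the tag as oa:Tag simply follows the wording above.

```turtle
@prefix oa: <http://www.w3.org/ns/oa#> .   # assumed OA namespace

# Hypothetical URIs throughout.
<http://www.example.com/anno2> a oa:Annotation ;
    # The fragment identifies a concept, not a part of the vocabulary document.
    oa:hasBody <http://www.example.org/vocab#Evolution> ;
    oa:hasTarget <target1> .

<http://www.example.org/vocab#Evolution> a oa:Tag .
```

The point is that nothing here needs to be split into a source resource
plus a fragment Selector; the URI is used as-is.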

In addition, a resource containing oa:Annotations might use such
fragment URIs instead of bnodes to identify embedded textual bodies
and other elements of OA such as agents:

@prefix : <http://www.example.com/anno1#> .

<http://www.example.com/anno1> a oa:Annotation ;
    oa:hasBody :body1 ;
    oa:hasTarget <target1> .

# :body1 above expands to this fragment URI
<http://www.example.com/anno1#body1> a cnt:ContentAsText, dctypes:Text ;
    cnt:chars "content1" ;
    dc:format "text/plain" .



> 2.2 Annotation provenance
> It is important to note that the provenance information applies only to the Annotation, and not necessarily the Body, Target or any other resource in the Annotation graph. Provenance information may also be attached to those resources separately.

12) This sounds contradictory: the provenance information applies only
to the Annotation, but can be attached to the Body and Target separately?


I think we need to clarify the two things separately - what can we
attach Provenance information to (Annotation, Body, Target, and other
resources), and what is the scope of the provenance model we make
here. I suggest:

> Provenance information can be attached to the Annotation, Body, Target or any other resource in the Annotation graph. Thus, the provenance information attached to an Annotation is not necessarily true for the body or the target. For instance, a PhD student in 2013 could be formalizing Charles Darwin's notebooks from 1836 as Annotations with textual comments, and so the student would be the author of the Annotation, while Darwin would be the author of the Body.


13) As the model below only works on oa:Annotation, I would clarify
this by adding something like:

> It is considered out of scope for this specification to model provenance at such an abstraction level, as existing vocabularies such as [DCTerms] and [PAV2] give sufficient coverage. However for convenience a minimal model for specifying provenance of the Annotation is provided below:

(I think we should provide a similar best-practice on how to record
such provenance)

Re PAV - Paolo and I are preparing to release PAV 2.1 at
http://purl.org/pav before the end of the month (I'll try to squeeze it
in today!) - it includes PROV bindings and an HTML view of the ontology,
and would easily do the Darwin example.
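
As a rough sketch of how the Darwin example could look (not normative):
the agent URIs and the example.com document are invented, the oa: and
pav: namespaces are what I assume the spec and PAV 2.1 use, and I am
taking oa:annotatedBy from the spec's provenance section for the
Annotation and pav:authoredBy for the Body.

```turtle
@prefix oa:      <http://www.w3.org/ns/oa#> .        # assumed OA namespace
@prefix pav:     <http://purl.org/pav/> .            # assumed PAV 2.1 namespace
@prefix cnt:     <http://www.w3.org/2011/content#> .
@prefix dctypes: <http://purl.org/dc/dcmitype/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix :        <http://www.example.com/anno1#> .   # hypothetical document

# The PhD student made the Annotation in 2013 ...
<http://www.example.com/anno1> a oa:Annotation ;
    oa:annotatedBy :student ;
    oa:annotatedAt "2013-01-28T02:24:56Z"^^xsd:dateTime ;
    oa:hasBody :body1 ;
    oa:hasTarget <target1> .

# ... but the text of the Body was authored by Darwin.
:body1 a cnt:ContentAsText, dctypes:Text ;
    cnt:chars "content1" ;
    pav:authoredBy :darwin .
```

The two provenance statements sit on different resources, which is
exactly the distinction the suggested wording in 12) tries to make.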



> The datetime MUST be expressed in ISO 8601 format.

14) This is very vague, as you will understand if you read ISO 8601. Is
"2009-W53-7" OK? I think this (it occurs twice) should be:


> The datetime MUST be expressed in xsd:dateTime (ISO 8601 extended date time) format [http://www.w3.org/TR/xmlschema-2/#dateTime] and SHOULD have a time zone specified.


>  <anno1> a oa:Annotation ;

>     oa:annotatedAt "datetime1" ;
>     oa:serializedAt "datetime2" .

15) Well, let's at least practice what we preach!

This should be:

>     oa:annotatedAt "2005-12-24T03:18:56-05:00"^^xsd:dateTime ;
>     oa:serializedAt "2013-01-28T02:24:56Z"^^xsd:dateTime .
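
To spell out the difference from plain ISO 8601, here is a small sketch
of which lexical forms xsd:dateTime accepts. The annotation identifiers
are placeholders and the oa: namespace is what I assume the spec uses.

```turtle
@prefix oa:  <http://www.w3.org/ns/oa#> .   # assumed OA namespace
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Valid xsd:dateTime: the time zone is either Z or an offset written
# with a colon (+hh:mm or -hh:mm). Both values below are the same instant.
<anno1> oa:annotatedAt "2013-01-28T02:24:56Z"^^xsd:dateTime .
<anno2> oa:annotatedAt "2013-01-28T03:24:56+01:00"^^xsd:dateTime .

# Allowed by ISO 8601 but NOT valid xsd:dateTime lexical forms:
#   "2009-W53-7"          (week date)
#   "20130128T022456Z"    (basic format, no separators)
#   "2013-01-28"          (date only; that would be xsd:date)
```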


===

I have not reviewed section 3, etc.


-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Monday, 28 January 2013 02:36:45 UTC