- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Mon, 28 Jan 2013 02:35:57 +0000
- To: public-openannotation <public-openannotation@w3.org>
This is a review of http://www.openannotation.org/spec/future/ (which calls itself http://www.openannotation.org/spec/core/20130128.html) and its first section (section 2), http://www.openannotation.org/spec/future/core.html

I am sorry this comes a bit late, but these email drafts just hang around and take forever to write, so I'm just sending what I have.

BTW - which of the annotation tools could be good for doing exactly this kind of email and review? :)

Summary:

This reads very well! The specification is beautiful. I am getting prouder and prouder of the high quality of this specification, and I am getting such feedback from others as well.

Below I clarify some terminology, suggest technical tweaks to the examples, propose relaxing the guidance on fragment URIs for semantic resources, and ask for clarification of what the provenance terms are to be used on.

First of all - I think the splitting into levels and getting rid of the OAX spec is a great improvement. Good job!

> http://www.openannotation.org/spec/future/

1) The "This Version" and "Previous Version" links and text are wrong. (Can we please try to get these right? Those links are most important to exactly this group.)

2) The document is split into several HTML pages, but there is no link to section 2 etc. from the bottom of the front page, so it is not obvious where to go next. I propose "previous contents next" links at the top *and* bottom of every page - the index page only needs them at the bottom.

> http://www.openannotation.org/spec/future/core.html#BodyTarget

3) "The Body and Target MAY be of any media type" - I would change this to lower-case "may" - or are you suggesting there are cases when they are not of any media type?

4) The "See Further Examples" links don't work.

> http://www.openannotation.org/spec/future/core.html#BodyEmbed

5) dc:format "mimetype1" .

"mimetype1" is not a valid type. Change the example to an actual media type, like:

    dc:format "text/plain" .

> If known, the MIME type of the text SHOULD be given using the dc:format property

6) The 'correct' term is "media type", and the link should rather go to http://www.iana.org/assignments/media-types - "mime type" is also mentioned later in this page; search-replace it with "media type".

(I know we should 'really' be using dct:format to formally say this is a media type. dc:format also allows physical formats like "brochure" and "political poster" - http://dublincore.org/documents/1998/10/23/format-element/ -- However, dct:format becomes a bit more verbose: https://gist.github.com/4635250 - I use the second form, but would not push for it here.)

> oa:hasBody <body1> ;
> <body1> a cnt:ContentAsText, dctypes:Text ;
> cnt:chars "content1" ;

7) Could I suggest :body1 as the identifier here instead of <body1>, to indicate that the URIs for embedded bodies would typically be non-resolvable?

> Query: Find all of the annotations with embedded, textual comments.

8) Could I suggest changing it to "find all textual comments"? It is slightly more realistic, and should make it easier to see that this is not a particularly tricky model.

    SELECT ?comment WHERE {
      ?anno oa:hasBody ?body .
      ?body a dctypes:Text ;
            cnt:chars ?comment
    }

> Most fragments are defined with respect to individual MIME types, and not every MIME type has a fragment specification.
> Even if a MIME type does have a fragment definition, it is often not possible to describe the segment of interest sufficiently precisely. For example, fragments for HTML cannot be used to describe an arbitrary range of text.

9) As above, "MIME type" -> "media type"

> Fragment URIs Identifying Body or Target
> It is not possible to determine with certainty what is being identified, as the same fragment string might be possible in different specifications. For example, the same fragment could identify either a semantic resource in RDFa or a section of the HTML document.

10) RDF 1.1 will however clarify this: http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/#section-fragID

    In cases where other specifications constrain the semantics of fragment identifiers in RDF-bearing representations, the encoded RDF graph should use fragment identifiers in a way that is consistent with these constraints. For example, in an HTML+RDFa document [HTML-RDFA], the fragment chapter1 may identify a document section via the semantics of HTML's @name or @id attributes. The IRI <#chapter1> should then be taken to denote that same section in any RDFa-encoded triples within the same document. Similarly, if the @xml:id attribute [XML-ID] is used in an RDF/XML document, then the corresponding IRI should be taken to denote an XML element.

11) This section should clarify that semantic terms, such as semantic tags with oa:Tag, would often be in the form of fragment URIs. As such a URI is not used to select a part of a resource but to identify a concept, it is perfectly OK and SHOULD NOT be specified using a Selector.

In addition, a resource containing oa:Annotation instances might use such fragment URIs instead of bnodes to identify embedded textual bodies and other elements of OA such as agents.

    <http://www.example.com/anno1> a oa:Annotation ;
        oa:hasBody :body1 ;
        oa:hasTarget <target1> .

    <http://www.example.com/anno1#body1> a cnt:ContentAsText, dctypes:Text ;
        cnt:chars "content1" ;
        dc:format "mimetype1" .

> 2.2 Annotation provenance
> It is important to note that the provenance information applies only to the Annotation, and not necessarily the Body, Target or any other resource in the Annotation graph. Provenance information may also be attached to those resources separately.

12) This sounds contradictory: the provenance information applies only to the Annotation, but can be attached to the Body and Target separately?
I think we need to clarify the two things separately - what we can attach provenance information to (Annotation, Body, Target, and other resources), and what the scope is of the provenance model we make here. I suggest:

> Provenance information can be attached to the Annotation, Body, Target or any other resource in the Annotation graph. Thus, the provenance information attached to an Annotation is not necessarily true for the body or the target. For instance, a PhD student in 2013 could be formalizing Charles Darwin's notebooks from 1836 as Annotations with textual comments, and so the student would be the author of the Annotation, while Darwin would be the author of the Body.

13) As the model below only works on oa:Annotation, I would clarify this by adding something like:

> It is considered out of scope for this specification to model provenance at such an abstraction level, as existing vocabularies such as [DCTerms] and [PAV2] give sufficient coverage. However for convenience a minimal model for specifying provenance of the Annotation is provided below:

(I think we should provide a similar best-practice on how to record such provenance)

Re PAV - Paolo and I are preparing to release PAV 2.1 at http://purl.org/pav before end of month (I'll try to squeeze it in today!) - it includes PROV bindings and an HTML view of the ontology, and would easily do the Darwin example.

> "The datetime MUST be expressed in ISO 8601 format."

This is very vague, as you will understand if you read ISO 8601. Is "2009-W53-7" ok?

14) I think this (occurs twice) should be:

> The datetime MUST be expressed in xsd:dateTime (ISO 8601 extended date time) format [http://www.w3.org/TR/xmlschema-2/#dateTime] and SHOULD have a time zone specified.

> <anno1> a oa:Annotation ;
> oa:annotatedAt "datetime1" ;
> oa:serializedAt "datetime2" .

15) Well, let's at least practice what we preach!
This should be:

> oa:annotatedAt "2005-12-24T03:18:56-05:00"^^xsd:dateTime ;
> oa:serializedAt "2013-01-28T02:24:56Z"^^xsd:dateTime .

===

I have not reviewed section 3, etc.

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Monday, 28 January 2013 02:36:45 UTC
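
As a rough illustration of point 14, here is a minimal Python sketch of the lexical check that "expressed in xsd:dateTime" implies. The pattern is a simplified reading of the xsd:dateTime lexical form, and the name check_datetime is hypothetical, made up for this example; neither comes from the specification or from any library. It checks the shape only, not value ranges, and it reports separately whether a time zone is present, since the proposed wording makes that a SHOULD rather than a MUST.

```python
import re
from typing import Tuple

# Simplified pattern for the xsd:dateTime lexical form (an assumption for
# illustration): [-]YYYY-MM-DDThh:mm:ss[.fraction][Z|(+|-)hh:mm]
# Shape check only: value ranges are not validated (month 13 would pass).
XSD_DATETIME = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?"
    r"(?P<tz>Z|[+-]\d{2}:\d{2})?",
    re.ASCII,
)


def check_datetime(literal: str) -> Tuple[bool, bool]:
    """Return (has_valid_shape, has_time_zone) for a candidate literal."""
    match = XSD_DATETIME.fullmatch(literal)
    if match is None:
        return (False, False)
    return (True, match.group("tz") is not None)


if __name__ == "__main__":
    candidates = [
        "2013-01-28T02:24:56Z",       # UTC designator
        "2005-12-24T03:18:56-05:00",  # numeric offset, colon required
        "2005-12-24T03:18:56",        # no time zone given
        "2009-W53-7",                 # ISO 8601 week date
    ]
    for candidate in candidates:
        print(candidate, check_datetime(candidate))
```

Under this sketch an ISO 8601 week date such as "2009-W53-7" is rejected, as is a numeric offset written without the colon, while a literal with no time zone passes the shape check but is flagged as lacking one.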