Review of Ontology for Media Resource 1.0

Hi,

this review is personal (i.e. neither as Vodafone nor DAP chair), and is based on the http://www.w3.org/TR/2010/WD-mediaont-10-20100309/ draft.

• Your CSS reduces the margins between <p> elements throughout the document, which makes it harder to read. In general, please don't override very generic core styles from W3C as it defeats the purpose of having a common style.

• You state several times, in varying forms, that the document is "mostly targeted to media resources on the Web". As opposed to what?

• Typos: "propertydc:creator", "propertyexif:Artist", "Dublin Corecreator"

• Typo: "exif" instead of EXIF.

• Typo: "properties. we have".

• "beyond generic Dublin Core specification", add "the".

• Sometimes you say "the Ontology" and sometimes "the ontology", it would be nice to be consistent.

• You have definitions but aren't marking them up with <dfn>; it would be nice to do so.

• You don't define what an ontology is, I wouldn't assume that people actually know.

• Typos: "gorup", "participant's expertise" -> "participants' expertise"

• The "Purpose of this specification" section seems redundant: that information should be in the abstract. What's more, it says that the specification is for video metadata while also saying that it is for media resources. I would simply cut it.

• "defined in this Working Group" The draft shouldn't talk about the WG except in the SotD and possibly in notes and issues; this should probably say "defined in this specification".

• Section 3 doesn't say whether it is normative or not. I would recommend having a blanket statement somewhere saying that everything is normative with a list of exceptions, like in http://dev.w3.org/2009/dap/system-info/#conformance

• Note that as per the Manual of Style it is usually recommended to use title case for titles.

• Have you considered not using XML Schema for the types and instead relying on the more commonly used and less scary definitions provided by HTML5? I believe that given the target audience (as I understand it) it would be more useful. See http://dev.w3.org/html5/spec/infrastructure.html#common-microsyntaxes. Also note that you say your URI type is RFC3986/7 but reference XML Schema 1.0 part 2 which says anyURI is from RFC2396/2732. If you plan on keeping the reference to XML Schema, I would recommend upgrading to 1.1, even though it is still in LC.

• Shouldn't section 4 be called "Property Definition*s*"?

• Section 4.1.1 really gives the impression that the WG just picked whichever properties it happened to like that day. I would suggest that you either not describe the process, or at least make it look like there was a method of some sort. I don't think that you need to justify your choice; it simply represents the industry's consensus.

• Typo: "member's opinion" -> "members' opinions"

• "Rough description of purpose" -> "Description of purpose"

• The table in 4.1.2 is really hard to read. Is it necessary to cram all that information so closely together? I think that it would be clearer if the table subsections became simple subsections, and if the content were unfolded for each property, perhaps with a <dl> or something similar.

• I don't understand the datatype column in the table. Why are there what look like tuples in there, even for properties that seem to have an atomic value (and are described as such, like identifier)? What does { identifier:URI, type:String } mean? Shouldn't it just be "URI"?

• In general the descriptions of the types need to be a lot clearer, for most properties I am completely lost.

• Where does the "ma" prefix come from? It is described later as the "namespace" but it seems to be a prefix instead (and should be defined before being used). I may have missed it but the document does not seem to define a namespace for the vocabulary. Or maybe it's using a different meaning of "namespace", in which case that should be clarified.

• "such as the EBU vocabulary" -> reference needed

• "ma:rating": "The scale of the rating should be provided." If the rating represents a vote you also need its amplitude (number of votes) to make it sensible (five stars by one person isn't the same as four by several thousand). Maybe the "context" thing is intended to do that, but it's impossible to guess.

• ma:fragments is defined as "a list of pairs" but the type system you claim to use does not contain lists.

• "ma:frameSize (...) It is required to use a pixel unit." Why? How would I define the ma:frameSize for an animation contained in this document: <svg width='21cm' height='29cm'...>...</svg>?

• "ma:compression" Have you considered calling it ma:coding instead? One may wish to use a coding of a resource for purposes other than compression (e.g. fast random access, low memory footprint, minimal CPU usage, etc.) and in some cases the coding might cause the representation to be bigger than the source.

• The previous point leads me to a more general comment: you gloss over the resource/representation distinction. It seems to be intentional, but the design decision could probably be better documented.

• ma:format: if it's always a media type (as the description states) then it should be called ma:mediaType.

• Why do you have namedFragments and numTracks but samplingrate, framerate, bitrate?

• ma:numTracks: "Number of tracks." Of what tracks? Audio? Video? Subtitles? Data? All of the above?

• Typo: "metatdata"

• The stated purpose of the relations tables is to get feedback — isn't the intention that they stay in the document so that people implementing converters or exposing other metadata through the API will know how to build the correspondence?

• Also note that you shouldn't have "Candidate Other Elements" in an LC draft.
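
To make the ma:rating comment above more concrete, here is a minimal sketch in Python of a rating value that carries both its scale and the number of votes behind it. Everything in it is an assumption made for illustration: the class name, the field names and the sample figures are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """Hypothetical rating record; none of these field names come from the draft."""

    value: float      # the (average) rating itself
    scale_min: float  # lowest value on the rating scale
    scale_max: float  # highest value on the rating scale
    votes: int        # number of votes behind the value

    def normalized(self) -> float:
        """Map the value onto 0..1 so ratings on different scales can be compared."""
        return (self.value - self.scale_min) / (self.scale_max - self.scale_min)


# Illustrative figures only: five stars from one person, four from several thousand.
one_fan = Rating(value=5.0, scale_min=0.0, scale_max=5.0, votes=1)
crowd = Rating(value=4.0, scale_min=0.0, scale_max=5.0, votes=3000)
```

Looking at the value alone, one_fan outranks crowd; it is only the votes field that lets a consumer judge how much weight each rating deserves.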

Overall the document seems to be producing a useful vocabulary and the work of sifting through large amounts of existing material was well worth doing. That being said, I don't believe that the document as presented is ready for last call. Most importantly, the section that defines properties needs to be reworked so as to be far better defined — everything in it needs to be nailed to the floor and crystal clear before LC. As it currently stands, I don't feel that I was able to review it because too much of it is completely unspecified or lacking description.

Another major issue is that the specification as currently written has no discussion of conformance. What does it mean to implement this specification? What does it mean to conform to it? Some parts of it are described as normative, but there isn't a single normative assertion in it (no "must", a few "should", a couple of "recommended", but none of them with a clear product to apply to). I understand that it may be difficult to write concrete tests for an ontology since it doesn't "do" anything, but I still think that there could be clearly defined conformance requirements on products that expose this data model, e.g. that they must expose this or that property as a list of pairs of foo,bar, etc. The SotD states that this document is on Recommendation track, but it reads much like a Note (which might be fine as well, that's up to you).
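
As a rough illustration of what a testable requirement on a product could look like, the following Python sketch checks that an exposed property value is a list of pairs. The requirement itself, the function name and the sample values are all hypothetical; the draft defines none of them, and the choice of strings for both members of a pair is an assumption made only to keep the example short.

```python
def is_list_of_pairs(value) -> bool:
    """Return True if value is a list whose items are all 2-tuples of strings.

    Hypothetical conformance check: the draft does not state this requirement.
    """
    return isinstance(value, list) and all(
        isinstance(item, tuple)
        and len(item) == 2
        and all(isinstance(part, str) for part in item)
        for item in value
    )


# Made-up sample values a product might expose for a list-of-pairs property.
good_value = [("foo", "bar"), ("baz", "qux")]
bad_value = [("foo", "bar"), "baz"]  # second item is not a pair
```

With a requirement phrased this way, a test suite can state precisely which class of product passes (is_list_of_pairs(good_value) is true) and which fails (is_list_of_pairs(bad_value) is false).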

Thanks for the hard work and best of luck!

--
Robin Berjon
  robineko — hired gun, higher standards
  http://robineko.com/

Received on Tuesday, 23 March 2010 14:31:52 UTC