Re: [AudioTF] Some helpful comments from an audio publisher from Ivan Herman on 2018-12-20 (public-publ-wg@w3.org from December 2018)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 20 Dec 2018 09:54:41 +0100
To: Daniel Weck <daniel.weck@gmail.com>
Cc: Wendy Reid <wendy.reid@rakuten.com>, W3C Publishing Working Group <public-publ-wg@w3.org>
Message-Id: <534A457D-BB5B-4BCF-98DD-E5BC13824110@w3.org>
I will look at the rest of the (helpful!) comments later, but here are my comments on this specific issue.

YAML is great and I actually really like it. I would love to use it instead of JSON, and, in theory, that could be done: JSON is 1-1 compatible with YAML, i.e., it is only a syntactic difference. In fact, the JSON-LD 1.1 work explicitly refers to YAML(-LD) as an alternative syntax for the same model and plans a separate note on the subject.

However… I have the impression that this boat has sailed (alas!). At a time when JSON is, essentially, part of the core infrastructure (JavaScript environments have a JSON parser/serializer facility built in, i.e., it can be used, and is used, out of the box in a browser, in node.js, etc.; the same holds for Python), YAML is only used on the sidelines. Schema.org <http://schema.org/> does not use it either. In other words, if we used YAML as a syntax for our manifest, we would lose some of the advantages that we have. As Daniel said, user agents would be, essentially, obliged to convert YAML into JSON anyway...
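
To make the "out of the box" point concrete, here is a minimal sketch in Python that reads a manifest using only the standard library. The manifest content and its field names (name, readingOrder, url, duration) are made up for illustration and are not taken from any draft of our documents. The same file written in YAML would need a third-party parser, since the Python standard library does not ship one.

```python
import json

# Hypothetical manifest; the field names are illustrative assumptions,
# not taken from any draft of the specification.
MANIFEST_JSON = """
{
  "name": "Example Audiobook",
  "readingOrder": [
    {"url": "chapter1.mp3", "duration": 1800},
    {"url": "chapter2.mp3", "duration": 2100}
  ]
}
"""


def load_manifest(text: str) -> dict:
    """Parse a JSON manifest using only the standard library."""
    return json.loads(text)


manifest = load_manifest(MANIFEST_JSON)

# Sum the per-file durations (in seconds) listed in the manifest.
total_seconds = sum(item["duration"] for item in manifest["readingOrder"])
print(manifest["name"], total_seconds)
```
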

I say that with regrets, but I believe we should forget about YAML…

Ivan

> On 20 Dec 2018, at 00:37, Daniel Weck <daniel.weck@gmail.com <mailto:daniel.weck@gmail.com>> wrote:
> 
> Useful feedback.
> YAML is great. I also use CSON in some projects where JSON would otherwise suck as a human-authored format (because of the lack of comments, multiline strings, etc.). That being said, this then needs to be converted to JSON so that web browsers / JavaScript can consume it directly.
> Daniel 
> 
> On Wed, 19 Dec 2018, 19:32 Reid, Wendy <wendy.reid@rakuten.com <mailto:wendy.reid@rakuten.com>> wrote:
> Hi everyone,
> 
>  
> 
> I had the opportunity to speak to someone from Blackstone (for those who are not familiar with the company, they are a producer and distributor of audiobooks). I outlined the work we have done so far and shared links to our GitHub, and in reply he provided some really good feedback I wanted to share with everyone else. There’s a lot, and please remember he is not a member so this is a completely fresh take on our work, but I think his feedback is very interesting and I hope others find it of use. 
> 
>  
> 
> Some thoughts right off the bat are:
> 
>  
> 
> 1.  Audiobooks are VERY simple. Try not to let anyone over-complicate them.
> 
> a.  Audio files
> 
> b.  Maybe a PDF (or other simple file type – jpg, txt etc)
> 
> c.  What order to play/display them
> 
> d.  How to display them
> 
> 2.  Do not mix the packaging of the “book” with the packaging of the “audio”. This is the mistake they made with m4b’s.
> 
> a.  Store files AS files
> 
> b.  AAC audio should be preferred IMO
> 
> c.  Store metadata in a manifest file, not in the audio
> 
> d.  The packaging file defines the contents, not the files it contains.
> 
> 3.  Use a standard format for the container. But don’t let the container drive the specification.
> 
> a.  The topic of DRM (whilst out of scope) gets *much* simpler if it is possible to read the container like a filesystem.
> 
> b.  Zips are universal. So are ISOs
> 
> 4.  Take a look at YAML as the document spec. It is a superset of JSON (and so people can use JSON too) but it is a lot more readable
> 
> a.  YAML is growing rapidly in its popularity
> 
> b.  YAML serializes to/from JSON easily
> 
> c.  It is amazingly human-readable
> 
> 5.  Avoid HTML. Seriously. Same for XML. Let the applications (and there will need to be application support) read and render the JSON data
> 
> a.  If you want formatting, you can add to the spec to add formatting to the nodes
> 
> 6.  Use lists/arrays to define the implicit order.
> 
> a.  We made a mistake when we wrote our internal spec, and went with hashes. Those are not guaranteed to be ordered, and we had to rely on an additional ‘order’ datapoint
> 
> b.  Arrays/lists are. The order is implicit, and does not rely on any additional definition
> 
> c.  Array items can be hashes, containing *anything*. This is where I would include file hashes, runtimes, and anything specific to the display of the file (font, format, label, etc etc)
> 
> 7.  Avoid embedding data (tags, anything display oriented, artwork).
> 
> a.  Again, M4B. Yuck.
> 
> 8.  If you have a container (zip/iso) and files (jpg, pdf, audio) and a single ‘manifest.json’ (or manifest.yaml 😉) then the “specification” comes down to that single JSON/YAML file. Nothing else matters really.
> 
> 9.  If you focus the standard on that Manifest file, then the actual container becomes flexible. As long as an application can read that file. But it would lend itself to a self-contained package (in a Zip/iso) as well as direct filesystem storage (object/POSIX, whatever)
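
Point 6 above (lists rather than hashes for the playing order) can be sketched roughly as follows. This is only an illustration: the file names and the "file" and "order" keys are hypothetical. JSON defines an array as an ordered sequence and an object as an unordered collection, so the hash-based layout has to carry an explicit order value and be sorted by the consumer, even if a given parser happens to preserve key order.

```python
import json

# Hypothetical layouts; keys and file names are illustrative only.
AS_LIST = """
[
  {"file": "intro.mp3"},
  {"file": "chapter1.mp3"},
  {"file": "credits.mp3"}
]
"""

AS_HASH = """
{
  "credits.mp3":  {"order": 3},
  "intro.mp3":    {"order": 1},
  "chapter1.mp3": {"order": 2}
}
"""


def play_order_from_list(text: str) -> list:
    """The array position is the playing order; nothing else is needed."""
    return [item["file"] for item in json.loads(text)]


def play_order_from_hash(text: str) -> list:
    """Object members carry no guaranteed order, so sort on the extra field."""
    entries = json.loads(text)
    return sorted(entries, key=lambda name: entries[name]["order"])


print(play_order_from_list(AS_LIST))
print(play_order_from_hash(AS_HASH))
```

Both functions yield the same sequence, but only the list version gets it without an additional datapoint.
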
> 
>   
> 
> There is a tendency to over-complicate things – especially when there are lots of opinions involved. I would be a strong advocate for a SIMPLE specification, and preferably one that defines the *contents* of the container, rather than the container itself. Having a single document define the contents, makes it much easier to build applications IMHO.
> 
>  
> 
> If you have any questions of them or comments, let me know and I can pass things along!
> 
>  
> 
> Happy holidays,
> 
> Wendy
> 
>  
> 
>   Wendy Reid
> 
>   Senior Quality Analyst
> 
>   Toronto, Canada (GMT-5)
>   www.kobo.com <http://www.kobo.com/> 


----
Ivan Herman, W3C 
Publishing@W3C Technical Lead
Home: http://www.w3.org/People/Ivan/ <http://www.w3.org/People/Ivan/>
mobile: +31-641044153
ORCID ID: https://orcid.org/0000-0003-0782-2704 <https://orcid.org/0000-0003-0782-2704>
Received on Thursday, 20 December 2018 08:54:46 UTC