Re: extensions and expected values from Matt Garrish on 2013-08-01 (public-vocabs@w3.org from August 2013)

From: Matt Garrish <matt.garrish@bell.net>
Date: Wed, 31 Jul 2013 21:20:33 -0400
To: <kcoyle@kcoyle.net>
CC: <public-vocabs@w3.org>
Message-ID: <BLU0-SMTP120AA9053FEBB205146D29FA500@phx.gbl>

Hi Karen,

We're not specifically working on defining formats, only looking to be able 
to show producers how the information they're encoding can be expressed. The 
accessibility information may be one part of enhancing an online catalogue 
page, for example, so we've been looking at how to encode the format itself. 
That's why the work you're doing is so interesting, as it fills in many 
missing pieces.

And yes, complicating access is both national legislation (DRM requirements 
in the US and controlled circulation) and national boundaries, although the 
recent WIPO Marrakesh treaty has the potential to address the latter (albeit 
still through trusted intermediaries). A user will still have to navigate 
that problem if their search takes them to such a page where the content is 
not widely distributable. We have had some discussions about indicating DRM 
restrictions, but that may or may not be within our actual mandate to 
tackle.

I completely agree with including the narrator. I know from experience at 
CNIB, for example, that readers become attached to listening to certain 
narrators. It's an art in itself. Both DAISY and EPUB 3 accommodate this 
information in the metadata, but as that metadata is in the package, a means 
of exposing it on a web page for discovery is beneficial.

But no, we're not looking at a new class. The trail we've been following was 
blazed by the LRMI initiative, which contributed educational metadata to the 
CreativeWork class. My focus is primarily on ebooks, but the project itself 
seeks to enhance all creative works.

(And sorry for the slow responses, I've been touring my eight-week-old 
little girl around to see family and not online.)

Matt

-----Original Message----- 
From: Karen Coyle
Sent: Monday, July 29, 2013 4:47 PM
To: Matt Garrish
Cc: public-vocabs@w3.org
Subject: Re: extensions and expected values

On 7/29/13 12:59 PM, Matt Garrish wrote:
> Hi Karen,
>
> Thanks for sharing that proposal; finding everything that is going on in
> schema.org is also not easy.

No kidding. I had no idea your group existed, nor that you were working
on DAISY.

> It does raise another question we had,
> which was whether to extend bookFormatType for audio books, braille
> books, etc. (e.g., is Paperback/Braille an accurate presentation of a
> print braille book, given the three existing types). As you likely know,
> the flexibility of the DAISY formats means that publications can take
> the form of text-only ebooks, structured audiobooks, or some combination
> of both (and now EPUB 3 allows similar production), so the formats could
> be extensions of both ebooks and audiobooks. These are some of the grey
> areas of metadata application, of course.

The BibEx group came to the conclusion that audiobooks aren't just books
+ bookFormatType because we wanted to add at least one new property:
readBy.[1] We also noted that in the commercial audiobook sector the
abridged- and unabridged-ness are prominently displayed and are
considered different products. We were going to associate this with
audiobook, but thought that it might be useful for print and ebooks. (A
few of us are old enough to remember the Reader's Digest Condensed Books
series.)

DAISY *is* tricky because it is a combination, although note that
Amazon's "WhisperSync" also combines ebook and performed audio. I see
your dilemma in terms of where to put it, and wonder if it couldn't be
a type with two formats: ebook and audiobook. (How to do that in schema,
though, is not obvious to me.) Also, doesn't DAISY sometimes get some
specific DRM/rights issues applied, like being certified as a visually
impaired person (in the US) to get access to materials in copyright?
That complicates matters.

>
> I don't speak for the group, but my concern with going by the player is
> that there can be many players that can play open formats like DAISY and
> EPUB (excluding the DRMed walled gardens, of course). When searching for
> accessible formats, there are fewer of them than there are players that
> can play them, which is why people often look by format.


Are you intending for accessibility to be a schema.org class? I can't
tell from your proposal where it would fit in.[2] It brings to mind the
potential faceted nature of some classes - I think of them as "other
characteristics" rather than "essences." It would be great to be able to
use accessibility anywhere, either to describe an online resource or to
describe a product. Then some of the accessibility properties could be
related to file types, and others to actual devices. (It's not an easy
area to tease apart.)

In terms of faceting, to me aspects like color, size, weight, location
(of the thing), rights (complex, of course)... those all lend themselves
to be free-floating characteristics that could be applied wherever
appropriate. This is one of the design goals of library classifications
-- to classify things by their "essence" but to have the ability to add
other characteristics that are useful either as qualifiers or even to be
used on their own. I often long for a good set of qualifiers to use in
semantic web ontologies. It shouldn't be all that difficult to add a
dozen or so key ones. I'd be tempted to add them to /Thing, although I
haven't thought it through thoroughly.

kc

[1] Some of the performers of these audiobooks have become so popular
that libraries that carry them had to go back and make the name of the
performer searchable. There are audiobook listeners who would listen to
almost anything read by their favorite performers. Myself, if Simon
Prebble read a grocery list, I'd listen gladly.

[2] http://www.w3.org/wiki/WebSchemas/Accessibility
>
> We're trying to make the most effective use of the existing properties
> in conjunction with defining access modes, accessible media
> features, etc. to ideally enable users to search for content more
> suitable to their needs, and avoid the current problem of having to
> investigate all search results to find appropriate content.
>
> But when it comes to making effective use, the existing documentation
> doesn't fill in all the gaps. Some examples in schema.org don't use the
> expected types to express information, for example (author names as
> strings instead of Person types), but is it more or less bad from a
> search perspective to work around unknowns?
>
> Matt
>
> -----Original Message----- From: Karen Coyle
> Sent: Friday, July 26, 2013 9:51 AM
> To: public-vocabs@w3.org
> Subject: Re: extensions and expected values
>
> Matt, the Bibliographic group is close to a proposal on audiobook [1].
> We were thinking of using "playerType" from audioObject. Whatever
> solution you select (because it is more important for accessibility) we
> should probably incorporate by sub-classing.
>
> kc
> [1] http://www.w3.org/community/schemabibex/wiki/Audiobook
>
>
> On 7/26/13 5:44 AM, Matt Garrish wrote:
>> Hello folks,
>> A couple of questions which I hope aren’t too naive, but between reading
>> this list and what documentation exists, it’s not always clear what best
>> practices are for schema.org.
>> First, should we (the accessibility metadata group) be recommending that
>> people use the “/” extension mechanism for accessible book formats like
>> EPUB and DAISY? The extension page notes
>> “http://schema.org/EBook/KindleFormat” as an extension of the
>> bookFormatType enumeration, but is that URI syntax expected to last?
>> Should we instead recommend using a more basic string like “DAISY3” or
>> “EPUB3” to avoid future problems, or is the use of string values with
>> bookFormat problematic in itself? Recent discussions on this list have
>> cast some doubts.
>> And that leads to the other question we have, which is what to do when a
>> needed data type doesn’t have an exact match in schema.org? If you have
>> adapted a work to make it accessible and want to note that it is an
>> adaptation of another, should we indicate an expected value of URL even
>> though a URI is wanted, since URNs may be the only usable identifiers?
>> In other words, is usage context more important than the expected value?
>> For example, if I use link/@href for URLs and
>> meta/@content for URNs, does it matter that the
>> expected value is URL because it’s expected that most adaptations will
>> have a referenceable source on a publisher’s site?
>> Thanks in advance for any insights that can be provided,
>> Matt
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Received on Thursday, 1 August 2013 01:21:01 UTC
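
The original question in this thread, whether to carry a value in link/@href
or meta/@content depending on the kind of identifier, can be sketched in a
few lines. This is only an illustration under stated assumptions, not
schema.org guidance: the helper name microdata_value, the property name
isAdaptationOf, and the urn:example identifier are hypothetical placeholders,
and the rule "http(s) goes in link, everything else goes in meta" is one
possible reading of the approach Matt describes.

```python
from html import escape
from urllib.parse import urlparse


def microdata_value(prop: str, value: str) -> str:
    """Return a microdata element for `prop` (hypothetical helper).

    Assumption: http(s) URLs are carried in link/@href, while URNs and
    plain strings such as "EPUB3" are carried in meta/@content.
    """
    scheme = urlparse(value).scheme.lower()
    if scheme in ("http", "https"):
        # Dereferenceable URL: expose it as a link.
        return f'<link itemprop="{escape(prop)}" href="{escape(value)}">'
    # URN or bare string: expose it as text content.
    return f'<meta itemprop="{escape(prop)}" content="{escape(value)}">'


# "isAdaptationOf" and the urn:example value are placeholders for illustration.
print(microdata_value("isAdaptationOf", "urn:example:original-work"))
# The extension URI and the plain-string alternative quoted in the thread.
print(microdata_value("bookFormat", "http://schema.org/EBook/KindleFormat"))
print(microdata_value("bookFormat", "EPUB3"))
```

Whether a consumer treats the meta/@content form as equivalent to the
expected URL type is exactly the open question raised above; the sketch only
shows how the two encodings differ on the page.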