RE: WCAG WG discussion of "authored unit" from Rhys Lewis on 2004-07-30 (www-di@w3.org from July 2004)

From: Rhys Lewis <rhys.lewis@volantis.com>
Date: Fri, 30 Jul 2004 09:20:52 +0100
To: "Wendy A Chisholm" <wendy@w3.org>
Cc: "Rotan Hanrahan" <Rotan.Hanrahan@MobileAware.com>, <www-di@w3.org>, <w3c-wai-gl@w3.org>
Message-ID: <D18551A6C7EA6241B960D4909EB75DF501CDCEB1@squid.volantis-uk>
Hello Wendy, 

Thanks for the pointer to the discussions on authored units.

Rotan has, in his usual eloquent style, described the thinking behind DIWG's views on resources, URIs and authored units. 

Perhaps I could add a few comments on the origins of the term within DIWG which
may help explain our thinking. After this description, I'll attempt to
define what I see as the difference between the DIWG definition and your 
proposal.

The needs of device independence have an influence on the way people think about
the relationship between the material that is authored, the material that is
delivered to the user agent and the resulting user experience. In the
'traditional' web, there is rough equivalence between these sets of material. 
An author creates a 'web page' which is transferred, in its entirety, to a browser and is then rendered. Images and other media complicate this model slightly, but I'll return to them later.

Device independence requires us to distinguish between
the chunks of material that an author creates (authored units), the chunks that are delivered to the device (delivery units) and the material that the end user
perceives (perceivable units). 

Why should this be? Well, it's all to do with size. Again, I need to use an example. Apologies for favouring visual modalities, but they are the ones with which I'm most familiar. Suppose a site presents a large, complex page with lots
of data and function when used on a typical desktop machine. Let's also assume
that this entire page is authored as a single entity. The page is the authored
unit. How can such a page be adapted for use on a small mobile device with a much smaller screen and much lower storage capability than a desktop machine?

One approach is to transform the complete page into a set of smaller pages that
can be delivered sequentially to the device. You could think of this as 
transforming the page into an equivalent mini web site. By the way, the nature
of the transformations involved in doing this is the subject of specific work within the DIWG.

The individual portions of the original material are the delivery units. These
are the units in which material is transferred to the device. The final step
involves how the browser interprets the material in the delivery unit. Many 
browsers will, of course, render the entire material as a single entity. In this
case, the perceivable unit is the same as the delivery unit. The user can
experience the entire contents of the delivery unit without further navigation.

However, there are also browsers that can interpret the material in the 
delivery unit and present it as a series of individual user experiences that 
are themselves linked via navigation. These individual user experiences are the perceivable units. In this case, of course, the navigation is entirely
within the device. Mobile phones that support WML are good examples of this behaviour. For these devices, the WML 'deck' corresponds to the delivery unit,
and individual 'cards' correspond to the perceivable units.
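
The three kinds of unit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DeliveryUnit class, the paginate function and the five-section page are hypothetical names invented here, and splitting by a fixed number of cards stands in for whatever transformation a real adaptation system would apply.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeliveryUnit:
    """One chunk transferred to the device, e.g. a WML deck."""
    perceivable_units: List[str]  # e.g. the cards within the deck


def paginate(authored_unit: List[str], cards_per_deck: int) -> List[DeliveryUnit]:
    """Split one authored unit (a list of sections) into delivery units."""
    if cards_per_deck < 1:
        raise ValueError("cards_per_deck must be at least 1")
    return [
        DeliveryUnit(authored_unit[i:i + cards_per_deck])
        for i in range(0, len(authored_unit), cards_per_deck)
    ]


# Hypothetical page authored as a single entity with five sections.
page = ["news", "weather", "sport", "travel", "contact"]

# Small device: several delivery units, each holding up to two cards.
decks = paginate(page, cards_per_deck=2)
for number, deck in enumerate(decks, start=1):
    print(number, deck.perceivable_units)

# Desktop browser: one delivery unit carries the whole page.
whole = paginate(page, cards_per_deck=len(page))
```

On the small device the user navigates between decks over the network and between cards within the device. On the desktop the single delivery unit would normally be rendered as one perceivable unit.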

Ok, so where do images and other media fit? Well, these are just other examples
of authored material, as Rotan pointed out. So once again, these are authored
units. My previous example is a bit unrealistic in that large, complex pages are
rarely authored as single units in practice. They tend to be assembled from 
a number of authored units, some of which might be shared across many pages.
Toolbars, menus, icons, backgrounds, copyright statements and disclaimers are all
examples of shared entities that might be created as authored units and are
then used by inclusion. Everything else on a particular page or referenced by
it is also composed of one or more authored units.

So just to summarise, DIWG needed a really general term for the bits that
authors create. We decided to call these things authored units. We don't 
require them to have any particular structure, though we do believe that they
are examples of web resources. We've used the term fairly extensively in our documents.

I think there is a real distinction between your proposed definition and our
published one that makes them incompatible. I think the distinction is that for
DIWG, an authored unit is a single resource, whereas for WCAG it could be a group of
resources. Like Rotan, I'm using the term resource in the sense of the DI 
Glossary (http://www.w3.org/TR/di-gloss/) and HTTP definitions, as something
that is referenced by a URI.

We don't currently have a term for anything that appears to match your proposal.
However, I can see real value in such a definition for DI. For example, your 
definition would provide a term for a group of authored units which includes
some markup that defines the rules for selecting different variants of a media
resource together with each of those variants. The variants might include 
images of different sizes and types, audio clips and text for use in different
delivery contexts. I can see a real need for a term that names this kind of
coherent set of authored units within DI. Presumably you see a need within WCAG.
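
As a rough sketch of such a group, the following Python models a set of variants plus a selection rule. Everything here is an assumption made for illustration: the Variant class, the select_variant function, the example.org URIs and the rule itself (take the widest acceptable variant that fits) are hypothetical and are not taken from any DIWG document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    uri: str          # each variant is itself an authored unit
    media_type: str   # e.g. "image/png" or "text/plain"
    width: int = 0    # in pixels; 0 for non-visual variants


def select_variant(variants: List[Variant], accepted: List[str],
                   max_width: int) -> Optional[Variant]:
    """Pick the widest acceptable variant that fits the delivery context."""
    candidates = [
        v for v in variants
        if v.media_type in accepted and v.width <= max_width
    ]
    return max(candidates, key=lambda v: v.width, default=None)


# Hypothetical variants of one media resource.
logo = [
    Variant("http://example.org/logo-large.png", "image/png", 640),
    Variant("http://example.org/logo-small.gif", "image/gif", 96),
    Variant("http://example.org/logo.txt", "text/plain"),
]

desktop = select_variant(logo, ["image/png", "image/gif"], max_width=1024)
phone = select_variant(logo, ["image/gif", "text/plain"], max_width=128)
text_only = select_variant(logo, ["text/plain"], max_width=0)
```

The list of variants together with the selection rule is the kind of coherent set described above. Each member remains an authored unit in the DIWG sense.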

So, in summary, I think this is a great definition, but I would urge you to try and find a different term for it. I think it will cause confusion if we use the
same term for different meanings. I'm sure DIWG would be only too happy to help try and find another term for this concept. 

Very best wishes

Rhys Lewis, chair DIWG

-----Original Message-----
From: www-di-request@w3.org [mailto:www-di-request@w3.org] On Behalf Of
Rotan Hanrahan
Sent: 29 July 2004 23:31
To: Wendy A Chisholm; www-di@w3.org; w3c-wai-gl@w3.org
Subject: RE: WCAG WG discussion of "authored unit"


I am not fully familiar with the details of the discussions you 
have undertaken recently that led to the debate [1] on the 
definition of “authored unit”, but I am sufficiently familiar with 
the concept to (perhaps) offer some insight.
 
You said:
<propose>
An "authored unit" includes a resource identified by a URI, 
together with any other resources directly or indirectly 
associated with it, that may be retrieved in the course of 
rendering or otherwise processing it.
</propose>
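
Read literally, the proposed definition describes a transitive closure over the associations between resources. A minimal Python sketch follows, assuming a hypothetical links table that maps each URI to the URIs it references directly. The table, the relative URIs in it and the function name authored_unit are invented for illustration.

```python
from typing import Dict, List, Set


def authored_unit(root: str, links: Dict[str, List[str]]) -> Set[str]:
    """Return root plus every resource directly or indirectly associated with it."""
    seen: Set[str] = set()
    pending = [root]
    while pending:
        uri = pending.pop()
        if uri in seen:
            continue  # already collected; also guards against cycles
        seen.add(uri)
        pending.extend(links.get(uri, []))
    return seen


# Hypothetical association table: URI -> directly referenced URIs.
links = {
    "forecast.html": ["style.css", "snow.png"],
    "style.css": ["background.png"],
    "other.html": ["style.css"],
}

unit = authored_unit("forecast.html", links)
```

Under this reading a shared resource such as style.css belongs to the authored unit of every page that reaches it.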
 
This proposal echoes the general viewpoint held in DIWG. One must, 
of course, interpret this definition within the context of the 
concept of “resource”. A resource, for our purposes, is an 
identifiable entity that has at least one perceptible expression. 
Thus a piece of text would be a resource on the basis of being 
perceptible in visual form (or perhaps as synthesised speech).

However, this is a particularly narrow example. In the cases 
considered by DIWG, the concept of resource is expanded, yet still 
within the reasonable understanding of what we mean by the Web. 
Thus a better example would be “Today’s Weather Forecast”. This is 
also a resource. However, it has some interesting features:

Firstly, it is temporally dependent. The perception you obtain 
today may not be the same perception you obtain 24 hours hence. 
(Unless perhaps you are at the North Pole.)
 
Secondly, it may have multiple modes of perception, according to 
delivery context.

It is the second property that I will explore. If the delivery 
context mandates that the only mode of perception is through text 
(roughly assumed to be a form of “print”) then today’s weather 
should be delivered as text. Wind directions will be expressed 
textually (bearing West-North-West), as will be the form of 
precipitation (snow).

If, however, some graphical form is permitted (or preferred?) then 
directional arrows and images of fluffy white ice crystals may be 
appropriate.
 
So, “today’s weather forecast” is a resource comprising several 
other resources (text, images, sound, styles, etc.). The 
sub-resources (if you will permit me to use such a term without a 
formal definition, on the basis that it should be obvious) are 
also identifiable entities. Thus a URI can be associated with 
“today’s weather forecast” and other URIs can be associated with 
the sub-resources. Of course, referring to an individual 
sub-resource may not be meaningful, as such a reference may lose 
information when taken out of the context of the information 
provided by the other sub-resources. (Example: the image of the 
snowflake is meaningless unless considered in the context of the 
surrounding text, which may say “starting cold in the morning”, as 
opposed to “continuing cold this evening”).

Where are the “authored units” in all this? Firstly, consider the 
concept of authoring. It is necessary that someone create the text 
to describe the blizzard. Thus this text is authored. I also 
assume that some artist created the snowflake. Human expression is 
involved here. This is a traditional understanding of the 
authoring process. Yet the text author may have produced several 
versions of text describing the blizzard. There’s the one-liner 
for headlines and instant messaging devices. There’s the short 
paragraph for small screens, and there’s the twelve-verse poem 
produced while trapped indoors. Only one of these (if any) will be 
used. Some might also be used in conjunction with the snowflake 
image.

Which brings me to another point. The image may look good in 
shades of blue and silver, but perhaps the device is limited to 
monochrome. In this case, an adaptation mechanism may 
automatically transform the image to monochrome. The artist did 
not produce the resulting image. Is this still something that was 
authored? I think so. The adaptation process is acting on behalf 
of the artist (though possibly not with the artist’s approval!). 
Thus the adapted image is also authored.

To me, and I believe this also applies to (most of) my colleagues 
in DIWG, something that is authored is something that was created 
by someone, or created by a mechanism that was in turn created by 
someone, or selected from a set of creations, or created through 
an adaptation of existing creations.

Why have I avoided the term “unit”? This has been a recurring 
problem. The idea of a unit suggests indivisibility. Yet we would 
generally think of a document comprising text and images as being 
a unit. We can then go on to talk about the authoring of the text 
independent of the authoring of the images. Thus “unit” is 
possibly a bad term. Yet the term is familiar and persists in 
usage.

So an Authored Unit is a set of one or more entities obtained 
through initial authoring (typically involving human effort), or 
processes (including adaptation) such that collectively they 
provide a perceptible expression (within the delivery context) of 
a resource (identified by a URI).
 
Thus the AU must be associated with a specific URI and a delivery 
context, and it may comprise several other entities, potentially 
themselves AUs. It is not necessary for the specific combinations 
of entities to be known in advance, but it should be possible to 
demonstrate or prove that any AU obtained from creations 
associated with a resource is deterministically derivable via the 
delivery context. You can then go on to impose requirements on any 
AUs so derived. For example, you can require that an image within 
the AU will not exceed the visual presentation capabilities 
indicated by the delivery context.
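
The image requirement can be sketched as follows. This is a hedged illustration only: DeliveryContext, Image, adapt_image and within_capabilities are hypothetical names, the images are modelled by their dimensions and a colour flag rather than by pixel data, and the scaling rule is one simple choice among many.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryContext:
    max_width: int    # widest image the device can present, in pixels
    max_height: int   # tallest image the device can present, in pixels
    colour: bool      # False for a monochrome display


@dataclass(frozen=True)
class Image:
    width: int
    height: int
    colour: bool


def adapt_image(image: Image, ctx: DeliveryContext) -> Image:
    """Scale down (never up) and drop colour if the context requires it."""
    scale = min(1.0, ctx.max_width / image.width, ctx.max_height / image.height)
    return Image(
        width=max(1, int(image.width * scale)),
        height=max(1, int(image.height * scale)),
        colour=image.colour and ctx.colour,
    )


def within_capabilities(image: Image, ctx: DeliveryContext) -> bool:
    """The requirement imposed on any image in a derived AU."""
    return (
        image.width <= ctx.max_width
        and image.height <= ctx.max_height
        and (ctx.colour or not image.colour)
    )


# Hypothetical authored image and delivery context.
snowflake = Image(width=400, height=300, colour=True)
phone = DeliveryContext(max_width=100, max_height=80, colour=False)

adapted = adapt_image(snowflake, phone)
```

Because adapt_image depends only on the authored image and the delivery context, the same inputs always yield the same adapted image, which is what makes it possible to check the requirement for every AU derived this way.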

In that last example, it is not necessary that the author of an 
image be able to offer such guarantees, but anyone who provides an 
image transformation solution will have to offer such a guarantee 
if the presentation of the resource is to be acceptable.

The current DIWG Glossary [2] defines an Authored Unit as:

<defn>
Some set of material created as a single entity by an author. 
Examples include a collection of markup, a style sheet, and a 
media resource, such as an image or audio clip.
</defn>
 
The “set” in the definition suggests the set I mentioned earlier. 
The “single entity” alludes to the fact that the set itself can be 
referenced. The definition does not require that the members of 
the set can themselves be referenced, though the example supplied 
would suggest this may be the case.

I am sure that if your deliberations reach a superior definition 
of Authored Unit then it would likely find its way into a 
revision of our glossary.
 
I wish you luck.
 
--- Rotan Hanrahan
Chief Innovations Architect
MobileAware
(Member DIWG)
 
[1] http://lists.w3.org/Archives/Public/w3c-wai-gl/2004JulSep/0203.html

[2] http://www.w3.org/TR/di-gloss/



 -----Original Message----- 
 From: Wendy A Chisholm [mailto:wendy@w3.org] 
 Sent: Thu 29/07/2004 20:00 
 To: www-di@w3.org 
 Cc: 
 Subject: WCAG WG discussion of "authored unit"
 
 


 Hello, 

 I would like to raise your awareness about a discussion in the WCAG WG 
 about the definition of "authored unit."  We plan to use the term in our 
 next public Working Draft of WCAG 2.0. 

 Thread begins at: 
 <http://lists.w3.org/Archives/Public/w3c-wai-gl/2004JulSep/0203.html> 

 Advice and comments encouraged.  We plan to publish our next draft within 
 the next week. 

 Best, 
 --wendy 

 -- 
 wendy a chisholm 
 world wide web consortium 
 web accessibility initiative 
 http://www.w3.org/WAI/ 
Received on Friday, 30 July 2004 04:42:33 UTC