Re: Proposal: HTML use of remote alt text; matching extension/formalization for HTTP Content-Disposition & MIME from Sai on 2020-10-17 (public-apa@w3.org from October 2020)

From: Sai <sai@fiatfiendum.org>
Date: Sat, 17 Oct 2020 16:34:55 +0100
To: David Singer <singer@apple.com>
Cc: public-html@w3.org, public-apa@w3.org, public-aria@w3.org, Rens Troost <rens@century.com>, Keith Moore <moore@network-heretics.com>, Julian Reschke <julian.reschke@gmx.de>, Steve Dorner <sdorner@qualcomm.com>, Ned Freed <ned.freed@mrochek.com>, Nathaniel Borenstein <nsb@guppylake.com>, Ed Levinson <XIson@cnj.digex.net>, Leonard Rosenthal <lrosenth@adobe.com>
Message-ID: <CAHs-R5ztz8wW=uXSpLFM9OBgrt1vU0ORSBwfZODbHJZmUHeAvw@mail.gmail.com>
Thanks for the highlight.

See also this proposal re XMP (another image metadata embedding),
highlighted on WHATWG's GitHub by Leonard Rosenthal (CCed):

Expose XMP information to the browser (and potentially JavaScript) #5890
https://github.com/whatwg/html/issues/5890

Cross-referencing:
https://www.w3.org/html/wg/wiki/Metadata
https://www.w3.org/wiki/ImageDescription
https://www.w3.org/TR/mediaont-api-1.0/

Sincerely,
Sai
President, Fiat Fiendum, Inc., a 501(c)(3)

PS Non-gendered pronouns please. I'm a US citizen.


On Thu, Oct 15, 2020 at 11:06 PM David Singer <singer@apple.com> wrote:

> - some lists
>
> You should probably be aware that we recently amended the base HEIF format
> (which is not tied to HEVC, and indeed underlies AVIF) to allow for
> intrinsic alt text(s) (possibly plural in multiple languages), as this
> enables the image creator to make that text at the time of file creation,
> and for it to travel automatically with the image.
>
> > On 14 Oct 2020, at 11:23, Sai <sai@fiatfiendum.org> wrote:
> >
> > # Recipient list
> >
> > WHATWG: html, html-aam, html-aria
> > W3C WGs: html, apa, aria, webapps
> > IETF WGs: httpbis, 822ext
> >
> > CC authors of (current) prior RFCs: Nathaniel Borenstein, Steve Dorner,
> Ned Freed, Ed Levinson, Keith Moore, Julian Reschke, Rens Troost
> >
> > Cross-posted by email to W3C & IETF groups, and by GitHub to WHATWG at:
> > * main: https://github.com/w3c/html-aam/issues/309
> > * crossposts by reference:
> >   - https://github.com/whatwg/html/issues/6061
> >   - https://github.com/w3c/html-aria/issues/248
> >
> > The content is equivalent, modulo small formatting changes for Markdown
> vs email, and addition of section deeplinks in Markdown version.
> >
> >
> >
> > # Background
> >
> > ## Objective
> >
> > Humans with disabilities, and machines, should have fully equal access
> to the textual content of image and other files.
> >
> >
> > ## Problems with the current specs
> >
> > 1. In current practice, the embedder of content often fails to add alt
> text, making that content inaccessible to people with disabilities and to computers.
> > 2. It is literally impossible for the embedder to describe some content,
> e.g. dynamic images; in such situations, the current specs cannot fulfill
> the goal of accessibility.
> > 3. An image's embedders must describe its content, even though its
> source is better able to do so, both practically and authoritatively.
> > 4. Human effort is wasted by requiring many end users to write content
> descriptions for a single source file.
> > 5. Updates to the HTTP Content-Disposition header spec failed to include
> Content-Description in the spec.
> > 6. MIME/HTTP Content-Description is equivalent to HTML LONGDESC
> (narrative description). There's no current field equivalent to ALT
> (verbatim content in text form).
> >
> >
> >
> > ## Relevant prior RFCs
> >
> > ### HTTP/1.1 Content-Disposition header & Content-Description field
> >
> > * RFC 2616 [obsolete] Hypertext Transfer Protocol — HTTP/1.1
> >   - https://tools.ietf.org/html/rfc2616
> >   - § 15.5 Content-Disposition Issues (security)
> >   - § 19.5.1 Content-Disposition
> > * RFC 7231 [current, no updates] Hypertext Transfer Protocol (HTTP/1.1):
> Semantics and Content
> >   - https://tools.ietf.org/html/rfc7231
> >   - Appendix B Changes from RFC 2616
> >     "The Content-Disposition header field has been removed since it is
> now defined by [RFC6266]."
> >
> > * RFC 1806 [obsolete] Communicating Presentation Information in Internet
> Messages: The Content-Disposition Header
> >   - https://tools.ietf.org/html/rfc1806
> >   - § 3 (Content-Description only in examples)
> > * RFC 2183 [current, no relevant updates] Communicating Presentation
> Information in Internet Messages: The Content-Disposition Header Field
> >   - https://tools.ietf.org/html/rfc2183
> >   - § 2 The Content-Disposition Header Field
> >   - § 2.8 Future Extensions and Unrecognized Disposition Types
> >   - § 3 Examples (only section mentioning Content-Description)
> > * RFC 6266 [current, no updates] Use of the Content-Disposition Header
> Field in the Hypertext Transfer Protocol (HTTP)
> >   - https://tools.ietf.org/html/rfc6266
> >   - Note: has no mention of Content-Description
> >
> >
> > ### HTML
> >
> > * RFC 1866 [obsolete] Hypertext Markup Language - 2.0
> >   - https://tools.ietf.org/html/rfc1866
> >   - § 5.10 Image: IMG (ALT attribute)
> > * RFC 2854 [current, informational] The 'text/html' Media Type
> >   - https://tools.ietf.org/html/rfc2854
> >   - (standard transferred from IETF to W3C)
> >
> > * HTML 4.01
> >   - https://www.w3.org/TR/html401/struct/objects.html
> >   - § 13 Objects, Images, and Applets
> >   - § 13.2 Including an image: the IMG element (longdesc URI)
> >   - § 13.8 How to specify alternate text (alt text)
> >
> > * HTML 5
> >   - https://html.spec.whatwg.org/multipage/embedded-content.html#the-img-element
> >   - https://html.spec.whatwg.org/multipage/images.html
> >   - https://html.spec.whatwg.org/multipage/input.html
> >   - https://html.spec.whatwg.org/multipage/rendering.html
> >   - § 4.8.3 The img element
> >   - § 4.8.4 Images
> >   - § 4.8.4.4 Requirements for providing text to act as an alternative
> for images
> >   - § 4.10.5 The input element
> >   - § 4.10.5.1.19 Image Button state (type=image)
> >   - § 14 Rendering
> >   - § 14.4.2 Images
> >
> > * HTML Accessibility API Mappings (AAM)
> >   - https://w3c.github.io/html-aam/
> >   - § img Element Accessible Name Computation
> >   - § input type="image" Accessible Name Computation
> >
> >
> > ### MIME Content-Description header
> >
> > * RFC 1341 [obsolete] MIME (Multipurpose Internet Mail Extensions)
> >   - https://tools.ietf.org/html/rfc1341
> >   - § 6.2 Optional Content-Description Header Field
> > * RFC 1521 [obsolete] MIME (Multipurpose Internet Mail Extensions) Part
> One: Mechanisms for Specifying and Describing the Format of Internet
> Message Bodies
> >   - https://tools.ietf.org/html/rfc1521
> >   - § 6.2 Optional Content-Description Header Field
> > * RFC 2045 [current, no relevant updates] Multipurpose Internet Mail
> Extensions (MIME) Part One: Format of Internet Message Bodies
> >   - https://tools.ietf.org/html/rfc2045
> >   - § 8 Content-Description Header Field
> >   - § 9 Additional MIME Header Fields
> >
> > * RFC 1872 [obsolete] The MIME Multipart/Related Content-type
> >   - https://tools.ietf.org/html/rfc1872
> >   - § 4 Examples (mentions Content-Description)
> > * RFC 2112 [obsolete] The MIME Multipart/Related Content-type
> >   - https://tools.ietf.org/html/rfc2112
> >   - § 4 Handling Content-Disposition Headers
> >   - § 5 Examples (mentions Content-Description)
> >   - § 5.3 Content-Disposition
> > * RFC 2387 [current] The MIME Multipart/Related Content-type
> >   - https://tools.ietf.org/html/rfc2387
> >   - § 4 Handling Content-Disposition Headers
> >   - § 5 Examples (mentions Content-Description)
> >   - § 5.3 Content-Disposition
> >
> >
> > ### EXIF
> >
> > * CIPA DC-008-2012 Exchangeable image file format for digital still
> cameras: Exif Version 2.3
> >   - http://www.cipa.jp/std/documents/e/DC-008-2012_E.pdf
> >   - § 4.6.4 TIFF Rev. 6.0 Attribute Information (ImageDescription tag)
> >
> >
> >
> > ## Discussion
> >
> > 1. In current practice, the image's source server will often have all
> the necessary metadata to describe the image.
> >
> > It could put this metadata in its HTTP headers. However, this
> information is either not transmitted, or not used.
> >
> > Some image formats provide for the necessary metadata. However, these
> are rarely used — typically, it's stored separately — and not all image
> formats have this support.
> >
> >
> > 2. Some images used in HTML are deliberately dynamic.
> >
> > Consider the various dependency/test status images used on GitHub.
> >
> > Only the remote image server knows, at display time, what the image
> represents. This is because it runs test suites on the most recent version
> of the codebase, checks the current status of servers, monitors third-party
> published vulnerabilities or library updates, etc.
> >
> > Example: https://github.com/atom/atom
> >
> > The first 3 images in the README section are:
> > a. Azure Pipelines build/test/integration status
> > b. David Dependency Manager dependencies update status
> > c. Heroku/Slack server status [this image currently doesn't load]
> >
> > The correct alt text for these, at time of writing, should be:
> > a. Azure Pipelines succeeded
> > b. dependencies up to date
> > c. Heroku is offline for maintenance
> >
> > It's impossible for the author of README.md, or GitHub itself, to know
> any of this before the user agent actually fetches the image.
> >
> > As a result, people using a screen reader get zero information from
> these images, whereas sighted users know the live statuses.
> >
> > (Pedantic caveat: actually, GitHub runs a caching proxy server on such
> images; they aren't fetched by the user agent directly from the
> authoritative server. However, this is functionally transparent.)
> >
> >
> > 3. Content-Description is not defined equivalently to ALT text.
> >
> > Content-Description is defined as "some descriptive information" (RFC
> 2045 § 8). All examples in the RFCs are either narrative, e.g. "just a
> small picture of me" (RFC 2183 § 3), or useless, e.g. "jpeg-1" (id.).
> >
> > By contrast, ALT text is meant to be the nearest equivalent — which, in
> the case of simple images of short text, is the verbatim text.
> >
> > HTML 4.01 (§ 13.8) describes it as "alternate text to serve as content
> when the element cannot be rendered normally".
> >
> > HTML 5 describes it as "equivalent content for those who cannot process
> images or who have image loading disabled (i.e. it is the img element's
> fallback content)" (§ 4.8.3). It "should never contain text that could be
> considered the image's caption, title, or legend. It is supposed to contain
> replacement text that could be used by users instead of the image; it is
> not meant to supplement the image" (§ 4.8.4.4.1).
> >
> > AFAICT, there is no equivalent field in either HTTP or MIME. There could
> and should be.
> >
> >
> >
> > # Proposals
> >
> > ## Mime — Content-Text
> >
> > Update RFC 2045 to add the header Content-Text, defined as follows.
> >
> > Content-Text should contain the content's text equivalent, following the
> specifications in WHATWG HTML 5 § 4.8.4.4.
> >
> > All files should include this header if:
> > 1. the file is not a text/* MIME type, and
> > 2. the file semantically (if not digitally)
> >   a. contains text, or
> >   b. has a text equivalent
> >
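The two inclusion conditions above can be sketched as a small predicate. This is only an illustration, not part of the proposal: the function name and its boolean parameters are hypothetical, and deciding whether a file "semantically" contains text is left to the caller.

```python
def should_include_content_text(mime_type: str,
                                contains_text: bool,
                                has_text_equivalent: bool) -> bool:
    """Return True when both proposed conditions for Content-Text hold.

    Hypothetical helper: the two boolean flags stand in for a judgement
    the proposal leaves to whoever produces the file.
    """
    # Condition 1: the file is not a text/* MIME type.
    top_level_type = mime_type.split("/", 1)[0].strip().lower()
    if top_level_type == "text":
        return False
    # Condition 2: the file contains text or has a text equivalent.
    return contains_text or has_text_equivalent
```

For example, a status badge served as `image/svg+xml` that shows the words "dependencies up to date" would satisfy both conditions, while a `text/plain` file would not.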
> >
> > ## HTTP — Content-Disposition
> >
> > Update RFC 2183 and RFC 6266 to change the Content-Disposition header as
> follows:
> >
> > 1. formalization of Content-Description
> >    Re-add the Content-Description field, as defined in RFC 1521 § 6.2.
> > 2. addition of Content-Text
> >    Add the field Content-Text, defined identically to the RFC 2045
> update above, by reference.
> >
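The proposal does not fix a wire syntax for these two fields. As a rough sketch, assume they are carried as ordinary Content-Disposition parameters named `content-text` and `content-description`; both parameter names, the function name, and the sample header value are assumptions made for illustration. Under that assumption, a user agent could read them with Python's standard `email.message.Message` parameter parser:

```python
from email.message import Message


def read_description_fields(content_disposition: str) -> dict:
    """Extract the proposed fields from a Content-Disposition value.

    Assumes (hypothetically) that the fields are plain header parameters
    named "content-text" and "content-description".
    """
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    fields = {}
    for name in ("content-text", "content-description"):
        value = msg.get_param(name, header="content-disposition")
        # get_param returns None when the parameter is absent and a tuple
        # for RFC 2231 encoded values; this sketch keeps plain strings only.
        if isinstance(value, str) and value:
            fields[name] = value
    return fields


# Hypothetical header for the build-status badge from the Discussion section.
example = ('inline; content-text="Azure Pipelines succeeded"; '
           'content-description="Build status badge"')
fields = read_description_fields(example)
```

A header without either parameter yields an empty dictionary, so callers can fall through to other sources of alternative text.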
> >
> > ## HTML-AAM — IMG and INPUT type=image
> >
> > Insert the following before the "none of the above" option in the
> HTML-AAM accessible name computation instructions:
> >
> > When an ALT or TITLE attribute is not available, use the first available
> of the following:
> > 1. equivalent metadata in the image file, e.g. the
> Exif.Image.ImageDescription field
> > 2. image's HTTP Content-Disposition header's Content-Text field
> > 3. image's HTTP Content-Disposition header's Content-Description field
> >
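The fallback order above can be sketched as a first-available lookup. This is an illustration under stated assumptions, not HTML-AAM text: the function and parameter names are hypothetical, it is meant to run only after the ALT and TITLE steps have produced nothing, and treating empty or whitespace-only values as "not available" is an assumption.

```python
from typing import Optional


def remote_fallback_name(embedded_description: Optional[str] = None,
                         content_text: Optional[str] = None,
                         content_description: Optional[str] = None) -> Optional[str]:
    """Return the first available fallback name, in the proposed order.

    Order: 1. metadata embedded in the image file (e.g. Exif ImageDescription),
    2. the Content-Text field, 3. the Content-Description field.
    """
    for candidate in (embedded_description, content_text, content_description):
        # Assumption: blank values count as unavailable.
        if candidate is not None and candidate.strip():
            return candidate.strip()
    # Nothing available: the caller continues with "none of the above".
    return None
```

With no embedded metadata, a Content-Text of "Azure Pipelines succeeded" wins over any Content-Description, which matches the intent that verbatim text is preferred to narrative description.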
> >
> > ## HTML — no change
> >
> > There is deliberately no change proposed to the HTML spec itself.
> >
> > The purpose of this proposal is to address situations where the HTML
> author does not, or cannot, add the relevant information. Therefore, the
> changes are to user agent behavior, and to the data accessible to user
> agents from sources other than the HTML, i.e. server and file headers.
> >
> >
> >
> > # Intellectual property release
> >
> > All original IP in this proposal is owned jointly by Sai and Fiat
> Fiendum.
> >
> > We freely license it as follows:
> > 1. Copyright: CC-by (attribution-only)
> https://creativecommons.org/licenses/by/4.0/
> > 2. Patentable material: public domain where possible, otherwise CC
> Public Patent License
> https://wiki.creativecommons.org/wiki/CC_Public_Patent_License
> >
> > Sincerely,
> > Sai
> > President, Fiat Fiendum, Inc., a 501(c)(3)
> >
> > PS Non-gendered pronouns please. I'm a US citizen.
>
> Dave Singer
> Multimedia and Software Standards, Apple
>
> singer@apple.com
>
>
>
>
Received on Saturday, 17 October 2020 15:36:07 UTC