Re: Proposal: HTML use of remote alt text; matching extension/formalization for HTTP Content-Disposition & MIME

- some lists

You should probably be aware that we recently amended the base HEIF format (which is not tied to HEVC, and indeed underlies AVIF) to allow for intrinsic alt text (possibly several, in multiple languages). This lets the image creator write that text when the file is created, and the text then travels automatically with the image.

> On 14 Oct 2020, at 11:23, Sai <sai@fiatfiendum.org> wrote:
> 
> # Recipient list
> 
> WHATWG: html, html-aam, html-aria
> W3C WGs: html, apa, aria, webapps
> IETF WGs: httpbis, 822ext
> 
> CC authors of (current) prior RFCs: Nathaniel Borenstein, Steve Dorner, Ned Freed, Ed Levinson, Keith Moore, Julian Reschke, Rens Troost
> 
> Cross-posted by email to W3C & IETF groups, and by GitHub to WHATWG at:
> * main: https://github.com/w3c/html-aam/issues/309
> * crossposts by reference:
>   - https://github.com/whatwg/html/issues/6061
>   - https://github.com/w3c/html-aria/issues/248
> 
> The content is equivalent, modulo small formatting changes for Markdown vs email, and addition of section deeplinks in Markdown version.
> 
> 
> 
> # Background
> 
> ## Objective
> 
> Humans with disabilities, and machines, should have fully equal access to the textual content of image and other files.
> 
> 
> ## Problems with the current specs
> 
> 1. In current practice, the embedder of content often fails to add alt text, making the content inaccessible to people with disabilities and to computers.
> 2. It is literally impossible for the embedder to describe some content, e.g. dynamic images; in such situations, the current specs cannot fulfill the goal of accessibility.
> 3. An image's embedders must describe its content, even though its source is better able to do so, both practically and authoritatively.
> 4. Human effort is wasted by requiring many end users to write content descriptions for a single source file.
> 5. Updates to the HTTP Content-Disposition header spec failed to include Content-Description.
> 6. MIME/HTTP Content-Description is equivalent to HTML LONGDESC (narrative description). There's no current field equivalent to ALT (verbatim content in text form).
> 
> 
> 
> ## Relevant prior RFCs
> 
> ### HTTP/1.1 Content-Disposition header & Content-Description field
> 
> * RFC 2616 [obsolete] Hypertext Transfer Protocol — HTTP/1.1
>   - https://tools.ietf.org/html/rfc2616 
>   - § 15.5 Content-Disposition Issues (security)
>   - § 19.5.1 Content-Disposition
> * RFC 7231 [current, no updates] Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content 
>   - https://tools.ietf.org/html/rfc7231
>   - Appendix B Changes from RFC 2616
>     "The Content-Disposition header field has been removed since it is now defined by [RFC6266]."
> 
> * RFC 1806 [obsolete] Communicating Presentation Information in Internet Messages: The Content-Disposition Header
>   - https://tools.ietf.org/html/rfc1806
>   - § 3 (Content-Description only in examples)
> * RFC 2183 [current, no relevant updates] Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field
>   - https://tools.ietf.org/html/rfc2183 
>   - § 2 The Content-Disposition Header Field
>   - § 2.8 Future Extensions and Unrecognized Disposition Types
>   - § 3 Examples (only section mentioning Content-Description)
> * RFC 6266 [current, no updates] Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)
>   - https://tools.ietf.org/html/rfc6266 
>   - Note: has no mention of Content-Description
> 
> 
> ### HTML
> 
> * RFC 1866 [obsolete] Hypertext Markup Language - 2.0
>   - https://tools.ietf.org/html/rfc1866
>   - § 5.10 Image: IMG (ALT tag)
> * RFC 2854 [current, informational] The 'text/html' Media Type
>   - https://tools.ietf.org/html/rfc2854 
>   - (standard transferred from IETF to W3C)
> 
> * HTML 4.01
>   - https://www.w3.org/TR/html401/struct/objects.html 
>   - § 13 Objects, Images, and Applets
>   - § 13.2 Including an image: the IMG element (longdesc URI)
>   - § 13.8 How to specify alternate text (alt text)
> 
> * HTML 5
>   - https://html.spec.whatwg.org/multipage/embedded-content.html#the-img-element
>   - https://html.spec.whatwg.org/multipage/images.html
>   - https://html.spec.whatwg.org/multipage/input.html
>   - https://html.spec.whatwg.org/multipage/rendering.html
>   - § 4.8.3 The img element
>   - § 4.8.4 Images
>   - § 4.8.4.4 Requirements for providing text to act as an alternative for images
>   - § 4.10.5 The input element
>   - § 4.10.5.1.19 Image Button state (type=image)
>   - § 14 Rendering
>   - § 14.4.2 Images
> 
> * HTML Accessibility API Mappings (AAM)
>   - https://w3c.github.io/html-aam/
>   - § img Element Accessible Name Computation
>   - § input type="image" Accessible Name Computation
> 
> 
> ### MIME Content-Description header
> 
> * RFC 1341 [obsolete] MIME (Multipurpose Internet Mail Extensions)
>   - https://tools.ietf.org/html/rfc1341 
>   - § 6.2 Optional Content-Description Header Field
> * RFC 1521 [obsolete] MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
>   - https://tools.ietf.org/html/rfc1521
>   - § 6.2 Optional Content-Description Header Field
> * RFC 2045 [current, no relevant updates] Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
>   - https://tools.ietf.org/html/rfc2045 
>   - § 8 Content-Description Header Field
>   - § 9 Additional MIME Header Fields
> 
> * RFC 1872 [obsolete] The MIME Multipart/Related Content-type
>   - https://tools.ietf.org/html/rfc1872 
>   - § 4 Examples (mentions Content-Description)
> * RFC 2112 [obsolete] The MIME Multipart/Related Content-type
>   - https://tools.ietf.org/html/rfc2112 
>   - § 4 Handling Content-Disposition Headers
>   - § 5 Examples (mentions Content-Description)
>   - § 5.3 Content-Disposition
> * RFC 2387 [current] The MIME Multipart/Related Content-type
>   - https://tools.ietf.org/html/rfc2387 
>   - § 4 Handling Content-Disposition Headers
>   - § 5 Examples (mentions Content-Description)
>   - § 5.3 Content-Disposition
> 
> 
> ### EXIF
> 
> * CIPA DC-008-2012 Exchangeable image file format for digital still cameras: Exif Version 2.3
>   - http://www.cipa.jp/std/documents/e/DC-008-2012_E.pdf
>   - § 4.6.4 TIFF Rev. 6.0 Attribute Information (ImageDescription tag)
> 
> 
> 
> ## Discussion
> 
> 1. In current practice, the image's source server will often have all the necessary metadata to describe the image.
> 
> It could put this metadata in its HTTP headers. However, this information is either not transmitted, or not used.
> 
> Some image formats provide for the necessary metadata. However, these fields are rarely used (typically, the metadata is stored separately), and not all image formats have this support.
> 
> 
> 2. Some images used in HTML are deliberately dynamic.
> 
> Consider the various dependency/test status images used on GitHub.
> 
> Only the remote image server knows, at display time, what the image represents. This is because it runs test suites on the most recent version of the codebase, checks the current status of servers, monitors third-party published vulnerabilities or library updates, etc.
> 
> Example: https://github.com/atom/atom
> 
> The first 3 images in the README section are:
> a. Azure Pipelines build/test/integration status
> b. David Dependency Manager dependencies update status
> c. Heroku/Slack server status [this image currently doesn't load]
> 
> The correct alt text for these, at time of writing, should be:
> a. Azure Pipelines succeeded
> b. dependencies up to date
> c. Heroku is offline for maintenance
> 
> It's impossible for the author of README.md, or GitHub itself, to know any of this before the user agent actually fetches the image.
> 
> As a result, people using a screen reader get zero information from these images, whereas sighted users know the live statuses.
> 
> (Pedantic caveat: actually, GitHub runs a caching proxy server on such images; they aren't fetched by the user agent directly from the authoritative server. However, this is functionally transparent.)
> 
> 
> 3. Content-Description is not defined equivalently to ALT text.
> 
> Content-Description is defined as "some descriptive information" (RFC 2045 § 8). All examples in the RFCs are either narrative, e.g. "just a small picture of me" (RFC 2183 § 3), or useless, e.g. "jpeg-1" (id.).
> 
> By contrast, ALT text is meant to be the nearest equivalent — which, in the case of simple images of short text, is the verbatim text.
> 
> HTML 4.01 (§ 13.8) describes it as "alternate text to serve as content when the element cannot be rendered normally".
> 
> HTML 5 describes it as "equivalent content for those who cannot process images or who have image loading disabled (i.e. it is the img element's fallback content)" (§ 4.8.3). It "should never contain text that could be considered the image's caption, title, or legend. It is supposed to contain replacement text that could be used by users instead of the image; it is not meant to supplement the image" (§ 4.8.4.4.1).
> 
> AFAICT, there is no equivalent field in either HTTP or MIME. There could and should be.
> 
> 
> 
> # Proposals
> 
> ## MIME — Content-Text
> 
> Update RFC 2045 to add the header Content-Text, defined as follows.
> 
> Content-Text should contain the text equivalent of the content, following the requirements in WHATWG HTML 5 § 4.8.4.4.
> 
> All files should include this header if:
> 1. the file is not a text/* MIME type, and
> 2. the file semantically (if not digitally)
>   a. contains text, or
>   b. has a text equivalent
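
For illustration only, here is a minimal Python sketch of the inclusion rule above. It assumes that condition 1 means "not a text/* type". Content-Text is the header proposed in this thread, not a registered MIME header, and `needs_content_text` and `build_image_part` are hypothetical names.

```python
from email.message import EmailMessage


def needs_content_text(mime_type: str, has_text_equivalent: bool) -> bool:
    """Conditions 1 and 2 of the proposal: a non-text type that contains,
    or has an equivalent in, text."""
    major_type = mime_type.split("/", 1)[0].strip().lower()
    return major_type != "text" and has_text_equivalent


def build_image_part(data: bytes, subtype: str, text: str) -> EmailMessage:
    """Build an image/* MIME part, adding the proposed header when the rule applies."""
    part = EmailMessage()
    part.set_content(data, maintype="image", subtype=subtype)
    if needs_content_text(part.get_content_type(), bool(text)):
        # "Content-Text" is the header proposed in this thread, not a registered one.
        part["Content-Text"] = text
    return part


part = build_image_part(b"placeholder image bytes", "png", "Azure Pipelines succeeded")
print(part["Content-Text"])
```

The proposal does not say how a sender decides that a file "semantically" contains text, so the sketch takes that as a caller-supplied flag.
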
> 
> 
> ## HTTP — Content-Disposition
> 
> Update RFC 2183 and RFC 6266 to change the Content-Disposition header as follows:
> 
> 1. formalization of Content-Description
>    Re-add the Content-Description field, as defined in RFC 1521 § 6.2.
> 2. addition of Content-Text
>    Add the field Content-Text, defined identically to the RFC 2045 update above, by reference.
> 
> 
> ## HTML-AAM — IMG and INPUT type=image
> 
> Insert the following before the "none of the above" option in the HTML-AAM accessible name computation instructions:
> 
> When an ALT or TITLE attribute is not available, use the first available of the following:
> 1. equivalent metadata in the image file, e.g. the Exif.Image.ImageDescription field
> 2. image's HTTP Content-Disposition header's Content-Text field
> 3. image's HTTP Content-Disposition header's Content-Description field
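
The fallback order above could be sketched roughly as follows in Python. This assumes the three values have already been extracted by the user agent (the proposal does not define a wire syntax for the Content-Disposition fields) and that a missing or blank value counts as "not available". `fallback_name` is a hypothetical helper, not part of HTML-AAM.

```python
from typing import Optional


def fallback_name(
    exif_description: Optional[str],
    content_text: Optional[str],
    content_description: Optional[str],
) -> Optional[str]:
    """Return the first non-blank candidate, in the proposed order:
    file metadata, then Content-Text, then Content-Description."""
    for candidate in (exif_description, content_text, content_description):
        if candidate is not None and candidate.strip():
            return candidate.strip()
    # Nothing available: continue to the existing "none of the above" step.
    return None


print(fallback_name(None, "dependencies up to date", "status badge"))
```

This step would only run once ALT and TITLE have been found unavailable, so the existing HTML-AAM steps stay ahead of it.
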
> 
> 
> ## HTML — no change
> 
> There is deliberately no change proposed to the HTML spec itself.
> 
> The purpose of this proposal is to address situations where the HTML author does not, or cannot, add the relevant information. Therefore, the changes are to user agent behavior, and to the data accessible to user agents from sources other than the HTML, i.e. server and file headers.
> 
> 
> 
> # Intellectual property release
> 
> All original IP in this proposal is owned jointly by Sai and Fiat Fiendum.
> 
> We freely license it as follows:
> 1. Copyright: CC-by (attribution-only) https://creativecommons.org/licenses/by/4.0/
> 2. Patentable material: public domain where possible, otherwise CC Public Patent License https://wiki.creativecommons.org/wiki/CC_Public_Patent_License
> 
> Sincerely,
> Sai
> President, Fiat Fiendum, Inc., a 501(c)(3)
> 
> PS Non-gendered pronouns please. I'm a US citizen.

Dave Singer
Multimedia and Software Standards, Apple

singer@apple.com

Received on Thursday, 15 October 2020 22:07:28 UTC