- From: Claus Thøgersen <thoeg@get2net.dk>
- Date: Thu, 19 Oct 2000 07:24:02 -0700
- To: "Loretta Guarino Reid" <lguarino@Adobe.COM>
- Cc: <w3c-wai-gl@w3.org>
Hi,

I like this note; it is much better than the notes I supplied! I hope that
when this version, or any other document dealing with PDF, is posted to the
web, we can link to HTML versions of the reference manuals and technical
notes. I guess my question is whether these documents are available as HTML
files somewhere on the Adobe web site?

Claus

----- Original Message -----
From: Loretta Guarino Reid <lguarino@Adobe.COM>
To: Wendy A Chisholm <wendy@w3.org>
Cc: <w3c-wai-gl@w3.org>; <lguarino@Adobe.COM>
Sent: Wednesday, October 18, 2000 8:19 PM
Subject: Re: Summary of action items, resolutions, and open issues from the F2F

> > Action Katie/Loretta: Send PDF techniques to the list.
>
> ********************
>
> Summary from the group discussing PDF issues at the Face-2-Face
> meeting Friday morning, based on notes by Claus Thoegerson and Katie
> Haritos-Shea.
>
> Participants:
> Loretta Guarino Reid
> Sally Hadland
> Katie Haritos-Shea
> William Loughborough
> Tom Pereira
> Claus Thoegerson
>
> Although the morning's topic was to look at the mapping of techniques
> for PDF and see how they mapped into the proposed Guidelines and
> Checkpoints, most of our time was spent just discussing techniques for
> PDF. PDF is a page description language, not a markup language. For
> each technique, we attempted to identify the most likely checkpoint
> that applies, and the version of PDF in which the language support is
> first available. Some of these items refer to language features in PDF 1.4,
> which has not yet been released.
>
> All references to the PDF Reference Manual are to the PDF Reference
> Second Edition, Version 1.3.
>
>
> 1. [Guideline 1?] Within a PDF page, there may be a sequence of show string
> operations, each with a sequence of Character Codes with associated
> fonts. Every such sequence of character codes must map unambiguously
> into a sequence of Unicode code points.
> Mapping is done as follows:
>   1a) If the Font contains a ToUnicode entry, convert the Character Code
>       to Unicode via the ToUnicode CMap.
>   1b) If the Font uses one of the PDF predefined encodings
>       MacRomanEncoding, MacExpertEncoding, or WinAnsiEncoding
>       (perhaps as modified by a DIFFERENCES array in the font's
>       encoding resource), use the DIFFERENCES array or Appendix D
>       to convert the Character Code to an Adobe glyph name. Then use
>       the Adobe glyph name to look up the corresponding Unicode value.
>   1c) If the Font uses one of the predefined CMaps listed in Table 5.14
>       on page 320 of the PDF Reference Manual except Identity-H and
>       Identity-V, convert the Character Code to a Unicode value via
>       the following steps.
>         1) Obtain the Registry and Ordering of the predefined CMap
>            from the CIDSystemInfo of the appropriate CMap.
>         2) Concatenate the Registry and the Ordering according to the
>            format "<registry>-<ordering>-UCS2" to obtain a second
>            CMap name, e.g. "Adobe-Japan1-UCS2". Obtain that CMap.
>         3) Index into the predefined CMap, using the Character Code,
>            and obtain an Intermediate Value.
>         4) Index into the CMap obtained in step 2), using the
>            Intermediate Value, and obtain a Unicode Value.
>       If any of these four steps fails, e.g. there is no CMap of that
>       name or the indexing value is missing or undefined in the CMap,
>       then there is no mapping of the character code to Unicode.
>   1d) If the font is a Type 0 font whose descendant CIDFont uses
>       the Adobe-Japan1, Adobe-Korea1, Adobe-CNS1, or Adobe-GB1 character
>       collection, as specified in the CIDSystemInfo dictionary, follow
>       the same steps as in 1c) to obtain the character code mapping.
>   1e) If the Font is a Type 1 font whose character names are
>       taken from the Adobe standard Latin character set and the set
>       of named characters in the Symbol font, documented in Appendix C,
>       use the corresponding Unicode value found by looking up the glyph
>       name.
>
> 2. Separate words explicitly with spacing characters. Do not rely on
> the location of the characters or the division of characters into
> show string operations to indicate word breaks. Note that this implies
> that lines of text for western languages usually end with a trailing
> space character.
>
> 3. [Guideline 1.1] If characters are not rendered using the show string
> operation, they must be marked in the page as a Span element with an
> ActualText value reflecting the desired Unicode value. (PDF 1.4)
>
> 4. [Guideline 1.1] All images and other non-text content must have an
> Alt property to provide a textual equivalent. (PDF 1.3)
>
> 5. [Guideline 1.1] Multimedia annotations such as Sounds and
> Movies must be accessible.
>
> 6. [Guideline 2.5] Provide logical structure (PDF Reference Manual
> Section 8.4.3) for the document. Map structure types to the standard
> structure types described in Adobe Technical Note #5401. (PDF 1.3)
>
> 7. Set the data access restrictions on the document to permit the
> contents to be accessed. In PDF 1.3 and earlier, permit the text and
> graphics in the document to be copied. In PDF 1.4, set the accessibility
> permission for the document.
>
> 8. [Guideline 4.1] Use bookmarks and links within a document to
> provide navigation aids.
>
> 9. [Guideline 3.8] Mark tables appropriately with the structure types
> described in Adobe Technical Note #5401. (PDF 1.3)
>
> 10. [Guideline 3.9] Provide expansion attributes for abbreviations and
> acronyms. (PDF 1.4)
>
> 11. [Guideline 2.5] Use the language tagging facilities (Lang) to
> specify the natural language of all text in the document.
>
> 12. Tag Artifacts in the page contents, so that users can control how
> and whether they are included in the contents of the
> document. (PDF 1.4) Artifacts are either
>   12a) Artifacts of the printing process, like crop-box markings
>        and the document file name printed outside the crop box.
>   12b) Artifacts of the pagination of the document, that is,
>        elements that would be absent or present in a much different
>        form if a document were always one big page, like running headers
>        and page numbers.
>   12c) Artifacts of the layout process and typographic style,
>        like a horizontal rule above a footnote.
>
> 13. Use a soft hyphen, identified by a character that maps to the
> Unicode value U+00AD (173 decimal), when a line-break hyphen is
> introduced into the middle of a word.
>
> We also discussed whether search commands should search Alt text; this
> is not a Contents question but a User Agent question.
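
To make the lookup order in item 1 concrete, here is a rough Python sketch of cases 1a) through 1c) only. It is not taken from the note or from any Adobe code. The function name `to_unicode`, the dict-based font representation, and the four small tables are hypothetical stand-ins, and the table values are illustrative rather than copied from the real encodings and CMaps. It also assumes that a font with a ToUnicode entry is resolved through that entry alone.

```python
from typing import Dict, Optional, Tuple

# Hypothetical miniature tables; the real ones are far larger.
GLYPH_TO_UNICODE: Dict[str, str] = {
    "A": "\u0041", "eacute": "\u00e9", "bullet": "\u2022",
}
# Predefined simple encodings: character code -> Adobe glyph name.
BASE_ENCODINGS: Dict[str, Dict[int, str]] = {
    "WinAnsiEncoding": {0x41: "A", 0xE9: "eacute"},
}
# Predefined CMaps: name -> (registry, ordering, {code: intermediate value}).
PREDEFINED_CMAPS: Dict[str, Tuple[str, str, Dict[int, int]]] = {
    "90ms-RKSJ-H": ("Adobe", "Japan1", {0x82A0: 843}),
}
# "<registry>-<ordering>-UCS2" CMaps: intermediate value -> Unicode.
UCS2_CMAPS: Dict[str, Dict[int, str]] = {
    "Adobe-Japan1-UCS2": {843: "\u3042"},
}


def to_unicode(font: dict, code: int) -> Optional[str]:
    """Return the Unicode string for a character code, or None if unmapped."""
    # 1a) A ToUnicode CMap, when present, is used directly (assumption:
    #     no fallback to the later cases if the code is missing from it).
    to_uni = font.get("ToUnicode")
    if to_uni is not None:
        return to_uni.get(code)

    encoding = font.get("Encoding")

    # 1b) Predefined encoding, optionally patched by a Differences array
    #     (modelled here as a plain {code: glyph name} dict).
    if encoding in BASE_ENCODINGS:
        names = dict(BASE_ENCODINGS[encoding])
        names.update(font.get("Differences", {}))
        glyph = names.get(code)
        return GLYPH_TO_UNICODE.get(glyph) if glyph is not None else None

    # 1c) Predefined CMap: code -> intermediate value -> Unicode,
    #     via the CMap named "<registry>-<ordering>-UCS2".
    if encoding in PREDEFINED_CMAPS:
        registry, ordering, code_to_intermediate = PREDEFINED_CMAPS[encoding]
        ucs2 = UCS2_CMAPS.get(f"{registry}-{ordering}-UCS2")
        intermediate = code_to_intermediate.get(code)
        if ucs2 is None or intermediate is None:
            return None  # any failed step means there is no mapping
        return ucs2.get(intermediate)

    # Identity-H, Identity-V, and cases 1d) and 1e) are not covered here.
    return None
```

For example, `to_unicode({"Encoding": "WinAnsiEncoding", "Differences": {0x41: "bullet"}}, 0x41)` goes through the patched encoding and the glyph-name table, while a font whose encoding is not in any of the tables yields `None`, which is the "no mapping" outcome described in 1c).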
Received on Thursday, 19 October 2000 01:23:47 UTC