- From: David Woolley <david@djwhome.demon.co.uk>
- Date: Mon, 2 Dec 2002 20:52:45 +0000 (GMT)
- To: w3c-wai-ig@w3.org
> > > Maybe we should define what me mean by pdf.
> Pdf is not one but several file formats all having the one extension.

It's one flexible file format that can convey different forms of information in different combinations, although the tagging feature is a new addition. It's basically a combination of graphic primitives for drawing the pages and structuring information (which in earlier versions included the outline structure of the document, and the sequence of an article in the sort of magazine where an article skips a few pages and may have its final column in a convenient gap).

Pure scanned PDFs have basically one graphic primitive, an image that fills the whole page, but in terms of the primitives used that is no different from the smaller graphics on a page with real text.

Different profiles of the contents have more to do with the authoring tool than with the file format, so, if you wanted to distinguish them, you would have to refer to a scan with an OCRed underlay as an Acrobat Capture document, although Acrobat Capture can, presumably, be configured to create one without the underlay. Many graphics applications may produce the latter. If GIMP currently doesn't, it could easily be made to, so naming those by tool is more difficult.
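
To make the point about profiles concrete, here is a rough Python sketch. It is not a PDF parser: the classes Image, Text and Path, the function classify_page and the 0.95 coverage threshold are all hypothetical names and values made up for illustration. It also assumes that an OCR underlay shows up as text marked invisible. The idea is only that every profile is built from the same few kinds of primitive and differs in which of them are present.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Image:
    coverage: float  # fraction of the page area covered, 0.0 to 1.0


@dataclass
class Text:
    content: str
    visible: bool = True  # assumption: an OCR underlay is invisible text


@dataclass
class Path:
    """Stand-in for line and shape drawing; carries no data here."""


Primitive = Union[Image, Text, Path]


def classify_page(primitives: List[Primitive], full_page: float = 0.95) -> str:
    """Name the profile of a page from the primitives it contains."""
    has_full_image = any(
        isinstance(p, Image) and p.coverage >= full_page for p in primitives
    )
    visible_text = any(isinstance(p, Text) and p.visible for p in primitives)
    hidden_text = any(isinstance(p, Text) and not p.visible for p in primitives)

    if has_full_image and not visible_text:
        # Same image primitive either way; only the underlay differs.
        return "scan with OCR underlay" if hidden_text else "pure scan"
    return "mixed text and graphics"


if __name__ == "__main__":
    print(classify_page([Image(1.0)]))
    print(classify_page([Image(1.0), Text("recognised words", visible=False)]))
    print(classify_page([Text("real text"), Image(0.2), Path()]))
```

The labels are decided purely by inspecting the contents of each page, which is why they say more about what the producing tool chose to emit than about the file format.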
Received on Monday, 2 December 2002 16:18:47 UTC