Re: versioning and version indicators in PDF

On Thursday 2009-08-20 17:04 -0700, Larry Masinter wrote:
> Although there are differences between PDF and HTML
> in deployment, the compatibility goals (new versions of
> software handles old files, old software gracefully handles
> new files) are quite similar.

I think there's actually a pretty fundamental difference.  With PDF,
my understanding (although I could be wrong) is that there's one
implementation (Adobe's) that defines the correct behavior in case
of ambiguities in the standard, and that other implementations have
to be compatible with if they want to read the PDF content out in
the world.  According to , versions
of PDF have nearly perfect correspondence to major releases of Adobe

With HTML, we'd at least *like* for all implementations to have to
base their work on the standard, rather than having to reverse
engineer the market leader (often a significantly more expensive
process).  But the reality is that content will depend on the
behavior of the market-leading implementation or implementations.
We can make competition easier (and are doing so in the HTML5
effort) by documenting the behavior that users depend on in a freely
available standard as such behavior is discovered.  Having such a
thorough standard also increases the long term value and portability
of data stored in a format, since it makes it easier to write new
tools to handle such a format in the future.

Adding version-indicator-dependent behavior to the market-leading
implementations significantly increases the cost of
reverse-engineering their behavior well enough to meet users'
compatibility requirements.  That makes it much harder for
competitors to enter the market, and potentially for users to
access their own data with new tools.  And the cost of standards
maintenance makes it much less likely that the compatibility
requirements of users, which tend to increase over time as
implementations converge, will be reflected in maintenance of all
versions of the specification.

I think version-indicator-dependent behavior is a good idea for
formats whose versions correspond to versions of a single software
product, but bad for formats whose versions should not be tied to a
particular product.  In the former case, version-indicator-dependent
behavior simply reduces the chance of breaking compatibility, but in
the latter case it dramatically increases the costs of behavior
convergence for increased compatibility, to the point where the cost
of a new version is a significant portion of the cost of an entirely
new format.
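
As a rough illustration of that cost argument, here is a minimal
sketch in Python.  Everything in it is hypothetical: the quirk names,
the version labels, and the behaviors_to_match helper are invented
for the example and do not describe any real implementation.  It
only counts how many distinct behaviors a new implementation would
have to reproduce when handling may differ per declared version,
assuming the worst case where every version handles every quirk
differently.

```python
# Hypothetical illustration only: the quirks and versions are invented.

def behaviors_to_match(quirks, versions, version_dependent):
    """Return the set of (version, quirk) behaviors a competing
    implementation would need to reproduce."""
    if version_dependent:
        # Worst case: each declared version selects its own handling
        # of each quirk, so every pair must be matched separately.
        return {(v, q) for v in versions for q in quirks}
    # Without version switching there is one behavior per quirk,
    # whatever version the content declares.
    return {(None, q) for q in quirks}


quirks = ["unclosed-element", "misnested-tags", "bad-charset-label"]
versions = ["4", "5", "6"]

single = behaviors_to_match(quirks, versions, version_dependent=False)
switched = behaviors_to_match(quirks, versions, version_dependent=True)

print(len(single), len(switched))  # 3 9
```

In this toy model the work grows with the number of versions times
the number of quirks, which is the sense in which each new version
starts to cost a significant portion of what a new format would.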

I wrote about this at (slightly) greater length at


L. David Baron                       
Mozilla Corporation             

Received on Friday, 21 August 2009 03:09:16 UTC