Re: Petitioning ISWC to allow Web friendly formats

From: Austin William Wright <aaa@bzfx.net>
Date: Tue, 7 May 2013 01:08:17 -0700
Message-ID: <CANkuk-V3o2Dyj3e5sSHE3FJu3M3nNk3w00WRerrsf85Bfw8ULA@mail.gmail.com>
To: Leonard Rosenthol <lrosenth@adobe.com>
Cc: Linking Open Data <public-lod@w3.org>, SW-forum <semantic-web@w3.org>, "beyond-the-pdf@googlegroups.com" <beyond-the-pdf@googlegroups.com>
On Mon, May 6, 2013 at 6:33 AM, Leonard Rosenthol <lrosenth@adobe.com> wrote:

> On 5/6/13 9:24 AM, "Sarven Capadisli" <info@csarven.ca> wrote:
>
> That is a definition that YOU have chosen. It is not one that is used by
> any official standards body, government regulation, etc. As such, its use
> creates confusion amongst the uninformed user and that's certainly
> something none of us want.
>

Government regulation? If Congress ruled that HTML is the One Superior
Format to use, would that make it so? Why hasn't anyone actually discussed
the merits of the proposal?

PDF is blatantly not web-friendly. If you're going to use PDF, you may as
well use SVG or <font> tags for everything instead. There's very little use
you can get out of this, except to ensure that there's a canonical drawing
of the text onto a fixed-size sheet of paper. How terribly (not)
impressive. Maybe you have a printer that prints pages the same size as the
PDF file and you'd like a hard copy, or maybe you want to count pages that
you've written. That's about the extent of the functionality that PDF can
give users (while the format does support such features as code-on-demand
and metadata, this isn't part of what is being asked when PDF is asked
for). Otherwise there's not very much use for this format, and it is not
very portable, contrary to the principles of the Web. What if you want to
read the paper on a mobile device, like my e-reader? I take great pains to
convert stuff to HTML (specifically, EPUB) so I can read it (see e.g.
<https://github.com/Acubed/rfc-html>).

Being able to view a PDF in a web browser doesn't imply anything. You can
use the Web to describe _anything_, so I don't find this argument
convincing.

I assume most authors don't actually format their documents by selecting a
font size for every single heading and so on. They work in a format that
utilizes semantically meaningful information about the work: to identify a
title, headings, math blocks, illustrations, plots, etc. Why should all of
this information get stripped away for the sake of seeing what it'll look
like on paper... I'm not sure, really, what the goal is. If I wanted to see
what it'll look like, or get a print copy, I'd navigate over to the "Print"
menu option.

While I would find HTML the best because of its ubiquitous support across
devices, even posting TeX sources would be an improvement. And then from
that you can offer PDF versions alongside the sources -- but it's not a
substitute for the real thing.

The point of HTML is that I can grok it on anything: web browsers, robots,
e-readers, printers, and so on, with hyperlinks that keep working in each.
With PDF, a ton of semantically useful information is (probably, usually)
being discarded.
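
As a rough illustration of the "robots" point, here is a minimal sketch of
how a program could recover a paper's outline from HTML using only the
Python standard library. The sample document and the names OutlineParser
and extract_outline are made up for this example; they are assumptions, not
taken from any real paper or tool. The point is that the headings are still
labelled as headings in the markup, so no guessing from font sizes is
needed.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (tag, text) pairs for h1-h6 headings (hypothetical helper)."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []      # collected (tag, text) pairs, in document order
        self._current = None   # heading tag we are currently inside, if any
        self._buffer = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            self.outline.append((tag, text))
            self._current = None


def extract_outline(html):
    """Return the list of (tag, text) headings found in an HTML string."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Made-up sample document, for illustration only.
SAMPLE = """<html><body>
<h1>A Sample Paper</h1>
<h2>1. Introduction</h2><p>Some text.</p>
<h2>2. <em>Related</em> Work</h2><p>More text.</p>
</body></html>"""

if __name__ == "__main__":
    for tag, text in extract_outline(SAMPLE):
        print(tag, text)
```

Doing the same for a typical PDF would mean inferring headings from glyph
positions and font sizes, unless the file happens to be tagged.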

Austin Wright.
Received on Tuesday, 7 May 2013 08:08:45 UTC
