Re: storing info in XSL-FO: new issue? [was: Draft TAG Finding:...]

Also sprach Dan Connolly:

 > > Indeed, I believe it is an architectural principle, that the TAG
 > > hopefully one day puts in writing, that XSL-FO elements are at a
 > > different semantic level than (X)HTML, SVG and MathML, and that it is
 > > possible to transform the latter three into the former, but that
 > > putting them in the same document would be counter to the goals of the
 > > semantic Web. In particular, W3C promotes (X)HTML, SVG and MathML as
 > > permanent repositories of information and we expect them to be found
 > > on Web servers, but we don't expect XSL-FO's to be used for anything
 > > else than as a volatile format, that only exists in the milliseconds
 > > between the formatting and the printing of the results.
 > > 
 > > I suggest to fix that sentence by omitting XSL-FOs, or to find a
 > > different example.
 > I'll leave it to the editor(s) to address your suggestion about
 > the text of the finding, but it seems to me that just omitting
 > XSL-FO from that phrase won't address the issue you raise
 > about whether it's appropriate to use XSL-FO to store
 > information.
 > It looks like a new issue.

New to the TAG, but fundamentally it's the old principle of separating
presentation from content that needs to be confirmed.

The principle has been one of the foundations of the web. For example,
in 1992 Tim Berners-Lee described the design constraints of HTML [1]:

  Logical Markup

  It is required that HTML be a common language between all platforms.
  This implies no device-specific markup, or anything which requires
  control over fonts or colors, for example. This is in keeping with
  the SGML ideal.

Further, the "SGML ideal" is described [2]:

  High level markup

  An SGML document is marked up in a way which says nothing about the
  representation of the document on paper or a screen. A presentation
  program must merge the document with style information in order to
  produce a printed copy. This is invaluable when it comes to
  interchange of documents between different systems, providing
  different views of a document, extracting information about it, and
  for machine processing in general.

In addition to SGML, the same set of views guided the design of Scribe
and LaTeX before the web.

After the introduction of the Web, W3C's Style activity has also been
based on the principle of separating presentation from content. 

For example, CSS2 states:

  Style sheets complement structured documents (e.g., HTML and XML
  applications), providing stylistic information for the marked-up
  text.

And XSL states:

  An XSL stylesheet processor accepts a document or data in XML and an
  XSL stylesheet and produces the presentation of that XML source
  content that was intended by the designer of that stylesheet.

Thus, it seems like the principle of separating presentation from
content in web documents is well-understood and well-established. 

The recent TAG finding, which suggests that XSL FOs are just another
XML vocabulary which can/should be stored/transferred on the web,
breaks with this principle, since FOs don't separate content from
presentation -- it's all mixed up, and one can barely extract the text
in machine-readable form.
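
As a rough illustration of the difference, here is a minimal sketch in
Python. The two document snippets and the function names are
hypothetical, and the FO snippet is an abbreviated fragment (no
layout-master-set), so treat it as a sketch under those assumptions
rather than a complete example. It shows the structural side of the
problem: the XHTML heading can be found by its element name, while
the FO version only says "a block in 24pt bold".

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML source: the heading is marked up by its role.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Annual report</h1>
    <p>Sales rose.</p>
  </body>
</html>"""

# Hypothetical, abbreviated FO result for the same content: the role is
# gone and only formatting properties remain.
FO = """<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="24pt" font-weight="bold">Annual report</fo:block>
      <fo:block font-size="12pt">Sales rose.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>"""

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
FO_NS = "{http://www.w3.org/1999/XSL/Format}"


def xhtml_headings(source):
    """Return the text of every h1 element, selected by element name."""
    root = ET.fromstring(source)
    return [h.text for h in root.iter(XHTML_NS + "h1")]


def fo_blocks(source):
    """Return (font-size, text) pairs; FO offers no element named 'heading'."""
    root = ET.fromstring(source)
    return [(b.get("font-size"), b.text) for b in root.iter(FO_NS + "block")]


print(xhtml_headings(XHTML))
print(fo_blocks(FO))
```

With the FO version, a program that wants the heading has to guess
from font sizes and weights, which is exactly the kind of inference
that logical markup was meant to make unnecessary.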

I have long argued [5][6] against representing FOs in a syntax, since
it opens the door to W3C-blessed semantic firewalls and all sorts of
accessibility problems. My concerns have often been rejected with "but
no one plans to use FOs on the web". The recent TAG finding [7]
suggests otherwise, but I'll spare you too many I-told-you-so's.

Rather, I'd like to encourage the TAG to take a deep breath and take
this on as an issue.



              Håkon Wium Lie                          cto °þe®ª        

Received on Thursday, 15 August 2002 18:53:02 UTC