
Re: Need Help: How to put big documents on the Web

From: John Udall <jsu1@cornell.edu>
Date: Wed, 15 Jan 1997 16:45:19 -0500
To: Vai Roberto <r.vai@fstsf.it>
Cc: www-talk@www10.w3.org

At 07:13 PM 1/14/97 +0100, you wrote:
>I need to put on Internet / Intranet big documents (even more than 100
>pages long) that will be read and also printed by the users.

        A lot depends on what kind of documents these are.  How many are
there (2-10, 50-100+, or somewhere in between)?   Are the documents
relatively static, or do they change a lot?  What is the lifespan of the
documents, and how long do you expect them to need to be available? These
questions are significant because the answers help you decide how much
up-front work you are willing to put into the project.

[-Donning flame proof underwear-]
        This sounds like a text-book application for SGML.  

        Assuming that the information in the documents is relatively static,
and assuming that you are willing to put some effort in up front during the
development process (it takes some time and effort to learn about SGML),
SGML could be the right tool.  Just taking a look at your requirements ...

>I'm looking for some tools (or rules) that can be useful in structuring
>and editing those documents to reach the following objectives:
>*	Be able to split the document in more Web pages.

        SGML can do this.  The SGML document(s) can be run through a filter
to generate HTML and to (according to rules that you create) automatically
split the document into multiple web-pages.
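
        To give a rough idea of what such a filter does, here is a minimal
sketch in Python. The markup is hypothetical and much simpler than real SGML:
I am assuming each chapter is wrapped in <chapter> tags and starts with a
<title>. The function name, the file-naming scheme, and the sample text are
made up for illustration. A real filter would use a DTD and a proper SGML
parser rather than a regular expression.

```python
import re

# Assumption: simplified, hypothetical markup where every chapter looks like
# <chapter><title>...</title>...</chapter>. Real SGML needs a real parser.
CHAPTER_RE = re.compile(
    r"<chapter>\s*<title>(.*?)</title>(.*?)</chapter>", re.DOTALL
)


def split_into_pages(document):
    """Return a list of (filename, html) pairs, one per chapter."""
    pages = []
    for number, match in enumerate(CHAPTER_RE.finditer(document), start=1):
        title = match.group(1).strip()
        body = match.group(2).strip()
        html = (
            "<html><head><title>{0}</title></head>\n"
            "<body>\n<h1>{0}</h1>\n<p>{1}</p>\n</body></html>\n"
        ).format(title, body)
        # Hypothetical naming rule: chapter01.html, chapter02.html, ...
        pages.append(("chapter{:02d}.html".format(number), html))
    return pages


SAMPLE = """
<chapter><title>Introduction</title>Why we wrote this manual.</chapter>
<chapter><title>Installation</title>How to set things up.</chapter>
"""

if __name__ == "__main__":
    for filename, html in split_into_pages(SAMPLE):
        print(filename)
```

        The "rules that you create" live in the splitting pattern and the
naming scheme; splitting on sections instead of chapters would only mean
changing those.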

>*	Automatic generation of table of contents and related links to
>all the pages and to each intra-page heading.

        SGML is an excellent tool for this sort of thing.
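
        Once the structure is tagged, the table of contents is just a walk
over the chapter titles and headings. A small sketch in Python, assuming the
filter has already collected them into a list (the tuple layout, the
slugify() helper, and the anchor naming are my own illustration, not part of
any particular SGML tool):

```python
def slugify(text):
    """Turn a heading into a fragment identifier (lowercase, hyphenated)."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return "-".join(cleaned.split())


def build_toc(chapters):
    """Build a nested HTML list from (filename, title, [headings]) tuples."""
    lines = ["<ul>"]
    for filename, title, headings in chapters:
        lines.append('  <li><a href="{}">{}</a>'.format(filename, title))
        if headings:
            lines.append("    <ul>")
            for heading in headings:
                # Link to the heading's anchor inside the chapter page.
                lines.append(
                    '      <li><a href="{}#{}">{}</a></li>'.format(
                        filename, slugify(heading), heading
                    )
                )
            lines.append("    </ul>")
        lines.append("  </li>")
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_toc([
        ("chapter01.html", "Introduction", ["Getting Started"]),
        ("chapter02.html", "Installation", []),
    ]))
```

        The sketch does not escape special characters in titles, and the
chapter pages would need matching anchors on each heading for the intra-page
links to work.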

>*	Automatic insertion in the document of useful intra-page links
>to give the possibility to the reader to go directly to the next or
>previous chapter avoiding to scroll many pages.

      SGML can do this, too. Just tag each chapter, and tell the filter to
put links to previous and next chapter on each page.
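
        The previous/next logic is simple once the chapters are in an
ordered list. A sketch in Python, assuming the filter already knows each
page's file name and title (the function name and link wording are
hypothetical):

```python
def navigation_links(pages):
    """Map each filename to an HTML snippet with previous/next links.

    pages is an ordered list of (filename, title) pairs.
    """
    nav = {}
    for i, (filename, _title) in enumerate(pages):
        parts = []
        if i > 0:  # every page except the first has a predecessor
            prev_file, prev_title = pages[i - 1]
            parts.append(
                '<a href="{}">Previous: {}</a>'.format(prev_file, prev_title)
            )
        if i < len(pages) - 1:  # every page except the last has a successor
            next_file, next_title = pages[i + 1]
            parts.append(
                '<a href="{}">Next: {}</a>'.format(next_file, next_title)
            )
        nav[filename] = " | ".join(parts)
    return nav


if __name__ == "__main__":
    links = navigation_links([
        ("chapter01.html", "Introduction"),
        ("chapter02.html", "Installation"),
        ("chapter03.html", "Usage"),
    ])
    print(links["chapter02.html"])
```

        The filter would paste the snippet at the top and bottom of each
generated page.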

>*	Easy updating of the fixed parts that are present in all the Web
>pages  belonging to the same document (Updating of Logos, Copyright
>string, etc..).

        Style sheets are good here. (See
<URL:http://www.htmlhelp.com/reference/css/> and
<URL:http://www.w3.org/pub/WWW/Style/> for information about CSS1 style
sheets to use with HTML.)

>*	Easy downloading (by a single request) of all the pages that are
>members of the same document.

        Just run your SGML document through an output filter that specifies
that you want the output in the form of a single document.
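
        In other words, the same source feeds a second output rule that
concatenates everything instead of splitting it. A sketch in Python, assuming
the chapters have already been converted to HTML fragments (the function name
and the "ch1", "ch2" anchor names are made up for illustration):

```python
def single_page(doc_title, chapters):
    """Join (title, body_html) pairs, in order, into one HTML document."""
    parts = [
        "<html><head><title>{}</title></head>".format(doc_title),
        "<body>",
    ]
    for number, (title, body) in enumerate(chapters, start=1):
        # A named anchor per chapter keeps in-document links possible.
        parts.append('<h1><a name="ch{}">{}</a></h1>'.format(number, title))
        parts.append(body)
    parts.append("</body></html>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(single_page("Manual", [
        ("Introduction", "<p>Why we wrote this manual.</p>"),
        ("Installation", "<p>How to set things up.</p>"),
    ]))
```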

>*	Pretty document print-out, as close as possible to the quality
>we can achieve by means of a word processor.

        Filters are available to convert SGML documents into many common
formats, such as HTML, PostScript, RTF (the official document transfer
format for MS Word, although not many people use it for that purpose),
and PDF (I think; I haven't actually seen the SGML-to-PDF filter, though).

So what are the advantages of SGML?
        * standard, non-proprietary format.
        * very stable.
        * seems to fulfill your requirements, as you stated them.

Disadvantages of SGML:
        * Steep learning curve.
        * Tools can be expensive (some freeware utilities are available, but
they can be difficult to use).
        * If you are going to be working with documents that have a short
life-span, SGML probably isn't worth the effort (except in some special
cases).

If you would like to learn more about SGML, check out:
        The W3 Organization's SGML Activity page --

        The SGML Web Page --

        SoftQuad's SGML information page --
        (SoftQuad makes a neat SGML browser. However, I am not specifically
endorsing their products.  There are a number of commercial SGML products
available from a variety of vendors.)

[-Removing flame proof underwear-]
        It seems like anytime anyone says anything about SGML (either
positive or negative) around here, they have to worry about getting flamed
into the dirt. :-)  Sometimes you've just got to pick the right tool for the
job, whatever that might be.

        I hope this information is useful for you. Good luck with your project.


>Thanks in advance for your interest in my questions.
>Roberto Vai
Standard Disclaimer -- The opinions expressed here are my own. They do not
represent official advice or opinions of Cornell Cooperative Extension 
or Cornell University.

John Udall,                                       
      Programmer/Systems Administrator            40 Warren Hall
Extension Electronic Technologies Group           Cornell University
Cornell Cooperative Extension                     Ithaca, NY 14853
email: jsu1@cornell.edu                           Phone: (607) 255-8127
Received on Wednesday, 15 January 1997 16:52:46 UTC
