Re: Using htlm editors to produce clean code?

I'm not a subscriber to this list, but I found this thread on the
archive and felt impelled to comment.  I've copied the typo in the
subject line, in the hope that it keeps the thread together.

Bruce Bailey <bbailey@clark.net> wrote:

> The consenses so far:
> 1) FrontPage is quirky and you still have to know html to use it.
> 2) It doesn't take much work to correct Communicator pages.
> 3) Other packages MIGHT do the trick.

Part of the problem here is that you haven't really stated in a global
sense what problem you are aiming to solve.  Perhaps if I draft the
problem out in a crass fashion you can correct it: "we have
secretarial staff, we need web pages, therefore secretarial staff
shall create web pages" - is that roughly it? 

If you want to employ secretarial staff to write letters for you, then
you'd expect them to be trained in whatever software tools you aim to
use for writing letters (let's say for the sake of argument MS Word),
surely?  If you aim to have them create your brochures for you, then a
certain aptitude and training in graphic design and associated
software tools would hardly be optional, would it? 

If you aim to have them create web pages, then I think your apparent
aim of having them do so without knowing what a web site is, or what
HTML is and can do, might be somewhat over-ambitious.

The problem with these would-be "create a web without knowing anything
about it" packages is that they roll a lot of unrelated functions into
one (HTML composition, graphic design, web server file management
etc.) and tend to do each task in their own quirky fashion, locking
you into just one way of doing things (which it then becomes hard to
change) and leaving you with a legacy of quirky documents.

If I had a clear solution to offer you, I would offer it.  Frankly, I
don't.  But what I do note is that many of these tools create more
problems than they solve, and they still don't produce what I would
call good, accessible web pages.

Netscape Page Composer (which is what I'm assuming you're referring to
when you speak of Communicator's Editor) overtly offers the author a
selection of fonts sized in points, and visual page markups, things
which HTML doesn't even aim to support, yet gives little or no support
for the really interesting bits of HTML - those by means of which a
web page can be made to fit itself comfortably across a wide range of
browsing situations.  This latter is the major strength of HTML, and
the thing that gives the most prospect of achieving accessibility.  In
short, this product (like so many others, I am afraid) _deceives_ the
user (I use this term quite deliberately) into believing that what she
is doing is visual page design, and (in effect) throws aside all the
best parts of HTML in the process. 

You write

> I need to find tools that 
> our secretaries can use without learning html.

I'm afraid that is analogous to saying "I need our secretaries to
write letters on the computer without having to learn to use a
wordprocessor". 

Here's one approach.  I don't claim that it's ideal for every
situation, and I don't claim to have the ideal tools for implementing
the approach, but it seems to me to fit well into the skill sets of
the people that I deal with.

Let's take it as axiomatic that secretarial staff would already have
skills on MS Word.  They might need a little refresher on exploiting
Word style templates, depending on their background. Then we can have
them concentrating on styling their documents by means of style
templates instead of by means of direct style-setting.  This is an
excellent launch point for what we need.  Even when considered as a
pure word-processor technique without reference to the WWW, the use of
content-based named styles is very beneficial, as becomes apparent
when a document originally authored for one situation (e.g. a flyer)
has to be reused in another (e.g. incorporated into a brochure or
handbook).  It pains me to have seen in earlier times the wasted
effort that went into visually re-styling whole collections of
documents when this happened. 

We then persuade them to author their documents using content-based
named styles (things like aims-objectives, course-outline,
lab-details, whatever is appropriate to the document).  You then have
a portable, content-marked-up master document, created based on the
existing skill-set of your staff.  It can incidentally be used to
create quality printed versions of the documents, and different
presentations for different situations, by attaching different style
templates to the same original named-styles content.

Then you convert that (Word or RTF document) into HTML with optional
style sheet support, using whatever customisation seems appropriate
for the current crop of browsers.  This is a job for a person with the
necessary technical expertise and knowledge of current browser
capabilities (your webmaster, perhaps, in conjunction with someone
responsible for house style).  Look at the benefits of this approach: 
if a reader complains that something in the HTML isn't working, then
instead of just fixing that one document, you embody the new knowledge
into your conversion mapping, such that the new knowledge is leveraged
across your entire web.  (That's the theory, anyhow ;-).  And you
don't have to wait for your vendor to bring out a new release of the
authoring tool before you can solve the problem.
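
As a rough illustration of what such a conversion mapping might look
like, here is a minimal sketch in Python.  The style names are the
examples mentioned above; the mapping table, the function name and the
choice of HTML elements are assumptions made purely for illustration,
and do not describe the behaviour of any particular converter.

```python
from html import escape

# Hypothetical mapping from Word named styles to (HTML element, class).
# The style names come from the examples above; the element choices
# are assumptions for illustration only.
STYLE_MAP = {
    "aims-objectives": ("p", "aims-objectives"),
    "course-outline": ("p", "course-outline"),
    "lab-details": ("div", "lab-details"),
}

# Fallback for any style the mapping does not know about yet.
DEFAULT = ("p", None)


def convert(paragraphs):
    """Turn (style_name, text) pairs into a list of HTML lines."""
    lines = []
    for style, text in paragraphs:
        tag, css_class = STYLE_MAP.get(style, DEFAULT)
        attr = f' class="{css_class}"' if css_class else ""
        lines.append(f"<{tag}{attr}>{escape(text)}</{tag}>")
    return lines
```

The point of keeping the table separate from the documents is that a
fix prompted by one reader's complaint is made once, in STYLE_MAP, and
takes effect on every document the next time the conversion is run.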

Now, a good designer of HTML-generating tools would be doing that for
you when they made the tool; unfortunately the widely-available tools
often come from browser makers, and may well be based on the strengths
of their own browser (i.e. without taking into account the weaknesses
of the other browsers that your readers will be using).  Hence the futile
bleating about "best viewed with nerdscrape exploder, download it now"
that gets plastered across the web nowadays, sigh. 

That's the theory.  I'm afraid the practical realisation still leaves
quite a bit to be desired.  I personally use "rtftohtml" as the
converter, which is basically good, but has plenty of rough edges and
shortcomings.  If you can find a better way then feel free to use it,
I'm only sketching out suggestions based on my own experience.  But
I'm not going to abdicate the duty of creating real HTML by caving in
to the widespread superstition that HTML is just a new way to do
WYSIWYG visual page design - that idea was injected by force back with
HTML3.2, but I'm glad to see it being pushed out again at HTML4.0. 

With a different kind of author (e.g. our academics), LaTeX would be
the markup of choice, and a LaTeX-to-HTML converter would do the job.
LaTeX, after all, is also conceived as a _structural_ markup (to be
used in conjunction with appropriate styles so that the same source
document can be used to create different presentation formats), so it
all seems to hang together, in my mind. 

> I hate that Communicator favors appearance tags vice 
> logical ones (<B> and <I> instead of <STRONG> and <EM>), but only
> fails me on two points:
>        1)  It won't include the <!DOCTYPE... statement
>        2)  It puts nbsp; after graphics in table cells which causes
>             a Bobby error 

"Bobby" is a splendid tool, indeed; but it does call attention not
only to real problems but also to potential ones.  Quite a number of
Bobby's concerns can be addressed in a practical way that avoids the
potential problem, even though Bobby, unaware that the problem has been
solved, will continue to point it out.  This seems entirely reasonable
to me.

But, on your specific points, and assuming now that your authors are
meant to be creating HTML documents (rather than having HTML derived
from some other format), I would be inclined to design your
"publish" operation such that pages were fed through an arbitrary
filter, that you can design locally, on their way to the web server.
Initially it would do nothing, beyond checking for and if necessary
adding your DOCTYPE.  As you gain expertise, you could add other
"purifying" actions to the filter, for example correcting the
undefined &#nnn; references in the range 128-159 that some authoring
tools are so keen to insert.  And all without the authoring staff
needing to know or care what the details of the cleanup were.
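
A minimal sketch of such a filter, in Python, might look like the
following.  It rests on two assumptions that are not stated above: that
the house DOCTYPE is the HTML 4.0 Transitional one, and that the
authoring tool meant the 128-159 numbers as Windows-1252 code
positions.  The function and constant names are invented for the
illustration.

```python
import re

# Assumed house DOCTYPE; substitute whichever one your pages should declare.
DOCTYPE = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">'

# Any decimal numeric character reference; the range check is done below.
NUM_REF = re.compile(r"&#([0-9]+);")


def fix_ref(match):
    """Rewrite a 128-159 reference as the Unicode character it was meant to be."""
    n = int(match.group(1))
    if not 128 <= n <= 159:
        return match.group(0)  # outside the problem range: leave alone
    try:
        # Assumption: the tool emitted a Windows-1252 code position.
        ch = bytes([n]).decode("cp1252")
    except UnicodeDecodeError:
        return match.group(0)  # no Windows-1252 meaning: leave for a human
    return f"&#{ord(ch)};"


def publish_filter(page):
    """Apply the local clean-up steps to one page on its way to the server."""
    page = NUM_REF.sub(fix_ref, page)
    if not page.lstrip().upper().startswith("<!DOCTYPE"):
        page = DOCTYPE + "\n" + page
    return page
```

Further "purifying" steps would be added as extra lines in
publish_filter as you gain expertise; the filter is written so that
running it twice on the same page changes nothing the second time.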

Just exactly how you implement this action depends, again, on the
available skill set, OS etc.  Some responsible webmasters use a form
of the unix "make" command to proceed from the author's output to the
server's document, applying local conventions for standard head parts,
navigation footers etc.
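
The "make" idea can be sketched in a few lines of Python, for those
without a unix make to hand.  This is only an illustration of the
principle (rebuild the served page when any of its ingredients is
newer than it); the function names and the three-file layout of head
part, author's body and footer are assumptions, not a description of
how any particular site does it.

```python
import os


def is_stale(target, sources):
    """True if target is missing or older than any source, as make would decide."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(s) > target_time for s in sources)


def build_page(head_path, body_path, footer_path, target):
    """Write head + author's body + footer to target, if target is out of date.

    Returns True if the page was rebuilt, False if it was already current.
    """
    sources = [head_path, body_path, footer_path]
    if not is_stale(target, sources):
        return False
    parts = []
    for path in sources:
        # Bytes are copied through untouched, whatever the page encoding.
        with open(path, "rb") as f:
            parts.append(f.read())
    with open(target, "wb") as f:
        f.write(b"".join(parts))
    return True
```

Changing the standard footer then causes every page to be rebuilt on
the next run, while untouched pages are otherwise left alone.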

But frankly, I don't believe your answer can be a few cosmetic fixes
to something that is fundamentally designed on a misconceived premise
(i.e. purporting to offer exact visual formatting via a markup language
that was from the start designed to have its strong points in
platform- and presentation- independence).

If, rather than consider the somewhat indirect approach I sketched out
above, you are certain that you want to have actual HTML documents
created by your authors, then I have heard good things said of
"HoTMetaL PRO" by people whose judgment I trust.  I can't speak for it
personally, though - I simply don't have enough experience of it.

Dear colleagues, please accept my apologies for rambling on at some
length on this topic, but I do strongly feel that the WWW in general,
and HTML in particular, were designed with portability and
accessibility among their strengths.  I find it so utterly
frustrating that we are presented with tools that throw away most of
this accessibility and flexibility, and mislead authors into thinking
they are doing nothing more than visual design.  When these authors
discover what their creations look like in unexpected browsing
situations they are justifiably disappointed; meantime, much effort is
being spent (I may say wasted)  on trying to bolt-on additional
accessibility to these misconceived web documents, when, if the
originals had been designed in accordance with the principles and
practices of the original WWW concept, no repair would have been
necessary, albeit a few enhancements (LONGDESC etc) would be useful.

If there is some approach that achieves the aims I have described, but
in a more efficient and effective way, and that attaches naturally to 
the typical skill set of the participants, I would be only too glad to
learn of it.

best regards

Received on Saturday, 16 May 1998 11:54:31 UTC