Re: HTML Pro questions from Drazen Kacar on 1996-11-05 (www-html@w3.org from November 1996)

From: Drazen Kacar <Drazen.Kacar@public.srce.hr>
Date: Tue, 5 Nov 1996 15:57:40 +0100 (MET)
To: pflynn@curia.ucc.ie (Peter Flynn)
Cc: www-html@w3.org
Message-Id: <199611051457.PAA23882@jagor.srce.hr>

Peter Flynn wrote:

>    Namely, this:
> 
> 	<title>Why can't I be the first one?</title>
> 
> Yes, I wanted to avoid missing HTML and HEAD. There was never any
> intention that a HTML 2.0 file should be valid in HTML Pro, but I can
> change this if people feel strongly about it. I would still rather
> produce a robust DTD than cripple it with limitless backward
> compatibility: it's intended for the future, not the past.

It seems I'll be in charge of a search service, and I thought I might
run each page through an SGML validator and display the number of errors
to the innocent user of the service. HTML Pro is just what I need, but
I'll have to make it HTML 2.0 compliant. I suppose I can do that myself.
This is a specific project, so there's no need to cripple the DTD in
general. And I must say I like the name Silmaril very much; I can see the
validator saying "Tears unnumbered ye shall shed..." :)

>    Is there software that makes use of XTML?
> 
> OpenText and similar databases that can store multiple instances in a
> single file.

I think someone once posted the URL of a draft that describes the transfer
of HTML objects in MIME, but I lost it. So, if someone has that URL, I'll
be very grateful...

>    Another thing is that CLEAR attribute is left out for many elements.
>    HTML 3.0 specified CLEAR for virtually everything, and it would be
>    nice to have it that way. In general, Lynx source has more attributes
>    for various elements than HTML Pro. Comments?
> 
> Whoops. Yes, CLEAR should be in there. My fault. I have the complete
> Lynx list from Fote and I'll add them. What specifically was missing?

I don't know exactly; I was just checking which tag has the most attributes.
Since I'll have to parse pages before the validator sees them, I wanted to
see whether I can store information about the presence of attributes in
32 bits. In Lynx, INPUT has 30 or 31; HTML Pro has far fewer.
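
A minimal sketch of the 32-bit idea, in Python. The attribute list here is
an illustrative, abbreviated assumption, not the actual Lynx or HTML Pro
list, and the names `INPUT_ATTRS`, `attr_mask` and `has_attr` are
hypothetical. Each attribute gets one bit, so the presence of up to 32
attributes on an element fits in a single 32-bit word.

```python
# Illustrative, abbreviated attribute list (an assumption, not the real
# Lynx INPUT list, which is said above to have 30 or 31 entries).
INPUT_ATTRS = [
    "ACCEPT", "ALIGN", "CHECKED", "CLEAR", "DISABLED", "MAXLENGTH",
    "NAME", "SIZE", "SRC", "TYPE", "VALUE",
]

# The scheme only works if every attribute can be given its own bit.
assert len(INPUT_ATTRS) <= 32

# Map each attribute name to a distinct bit: ACCEPT -> 1, ALIGN -> 2, ...
BIT = {name: 1 << i for i, name in enumerate(INPUT_ATTRS)}


def attr_mask(present):
    """Return an integer with one bit set per known attribute in `present`."""
    mask = 0
    for name in present:
        bit = BIT.get(name.upper())  # unknown attributes are ignored
        if bit is not None:
            mask |= bit
    return mask


def has_attr(mask, name):
    """Check whether the bit for `name` is set in `mask`."""
    return bool(mask & BIT[name.upper()])
```

With 30 or 31 attributes the mask still stays below 2**32, so it would fit
in an unsigned 32-bit field in whatever storage the service uses.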

Parsing before the validator is needed because I've seen a lot of pages
that use --!> to terminate a comment, and SGML validators don't generate
many errors for them. Most of the document appears as a comment and you
get just one error about an unterminated comment. Besides, it would be
nice to count BLINKs, IMGs without ALT and some other things. I've
already posted an RFW (request for wishes) on lynx-dev, and anyone here
is welcome to add things he/she would like to see in the search service
output. Things like a FRAME alert, a VirusX alert...
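
A rough sketch of such a pre-validation pass, in Python. It is a crude
regular-expression scan under stated assumptions, not a real SGML parser:
it will miscount tags whose attribute values contain ">" and does not
check whether a match sits inside a comment. The function name `prescan`
and the result keys are hypothetical.

```python
import re

# "--!>" used where "-->" was intended as the comment terminator.
BAD_COMMENT_END = re.compile(r"--!>")
# Opening BLINK tags only; "</BLINK>" does not match because of the slash.
BLINK_OPEN = re.compile(r"<blink\b", re.IGNORECASE)
# A whole IMG start tag, up to the first ">".
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# An ALT attribute somewhere inside a tag.
ALT_ATTR = re.compile(r"\balt\s*=", re.IGNORECASE)


def prescan(html):
    """Count a few problems that a validator reports poorly or not at all."""
    img_tags = IMG_TAG.findall(html)
    return {
        "bad_comment_ends": len(BAD_COMMENT_END.findall(html)),
        "blinks": len(BLINK_OPEN.findall(html)),
        "imgs_without_alt": sum(
            1 for tag in img_tags if not ALT_ATTR.search(tag)
        ),
    }
```

The counts could then be shown next to the validator's error total for
each page; further checks, such as a FRAME alert, would be added as more
patterns in the same style.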

Back to the HTML Pro DTD. I think the DTD allows multiple TITLE elements
and, if memory serves me well, some time ago I saw a hack posted that
would allow only one TITLE in HEAD. I call it a hack because my
understanding of SGML was not enough to see what was going on there. :)
But then, my SGML knowledge is very close to zero. The author was, I
believe, Joe English. Perhaps you could incorporate it into the
HTML Pro DTD.

-- 
Life is a sexually transmitted disease.

dave@fly.cc.fer.hr
dave@zemris.fer.hr

Received on Tuesday, 5 November 1996 09:58:29 UTC