Re: Ignoring empty paragraphs from Jan Roland Eriksson on 2000-04-10 (www-html@w3.org from April 2000)

From: Jan Roland Eriksson <jrexon@newsguy.com>
Date: Tue, 11 Apr 2000 00:06:55 +0200
To: Ian Graham <ian.graham@utoronto.ca>
Cc: www-html@w3.org
Message-ID: <rtj4fsknil4477n1bhsp93b5p4670fhbqd@4ax.com>

On Mon, 10 Apr 2000 10:13:50 -0400, Ian Graham
<igraham@smaug.java.utoronto.ca> wrote:

>On Sun, 9 Apr 2000, Braden N. McDaniel wrote:
>> On Sun, 9 Apr 2000, Jan Roland Eriksson wrote:
[...]
>> > >> If there's nothing to mark-up, there's no motivation for markup either.
[...]
>For someone writing DOM code that accesses a document, it is unacceptable
>that the parser/processor can arbitrarily decide to modify the data
>structures by removing data from the document it receives.

We may already have established that view as being correct so far.

>For example, I (or some auto-generation tool pumping out
>valid HTML) could produce a document containing something like:

><p id="part1"> </p> 
><p id="para2"> </p> 

>and then later use script code to appropriately fill the <p>'s.

I read "script code" in your comment as being executed client side;
another interpretation (as in 'server side') does not make sense to me.

Remember then that "client side scripting" _and_ "client side styling"
are both of _optional_ nature only.

I get butterflies in my stomach when I spot ideas of "client side driven
content generation" because that is, IMO, a first hand violation of the
original idea of the www, where it says that _one_full_resource_ is
represented by _one_URL_.

If "web masters" put client side generated content into their docs on
the www, they are also automatically shutting out some percentage of
clients from accessing that content.

My pet argument is to refer to search engines as really important
clients here of course.

>Obviously the code will fail if the parser/processor has decided
>to prune these empty but needed elements from the tree.

It should, of course. But as I agreed to after Braden's input, "pruning a
'correct' tree" may not be the way to go in the first place.
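
A minimal sketch of the failure discussed above, in Python rather than
browser script. The pruning step and the helper names
(prune_empty_paragraphs, fill) are hypothetical and stand in for a
parser that drops whitespace-only <p> elements and for the later script
that fills them; the <body> wrapper is added only to get a single root.

```python
import xml.etree.ElementTree as ET

# The two placeholder paragraphs from the quoted example, wrapped in a root.
SOURCE = '<body><p id="part1"> </p><p id="para2"> </p></body>'


def prune_empty_paragraphs(root):
    """Hypothetical pruning step: drop <p> elements holding only whitespace."""
    for parent in list(root.iter()):
        for child in list(parent):
            is_blank = not (child.text or "").strip() and len(child) == 0
            if child.tag == "p" and is_blank:
                parent.remove(child)
    return root


def fill(root, element_id, text):
    """Stand-in for the later script: find a <p> by id and fill it."""
    target = root.find(f".//p[@id='{element_id}']")
    if target is None:
        raise LookupError(f"no element with id {element_id!r}")
    target.text = text
    return target


# With the tree left intact, the lookup succeeds.
intact = ET.fromstring(SOURCE)
fill(intact, "part1", "Filled in later")

# After pruning, the same lookup has nothing to find.
pruned = prune_empty_paragraphs(ET.fromstring(SOURCE))
try:
    fill(pruned, "part1", "Filled in later")
    pruned_ok = True
except LookupError:
    pruned_ok = False
```

The point is only that the script's assumption (the element exists)
depends on the processor not rewriting the tree behind its back.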

>Moreover, with XML this would simply be illegal -- an XML parser can
>_never_ modify the incoming data, as Tantek pointed out.

True, but another approach is to have the parser signal empty "container
type" elements as invalid markup right from the start.

Why do we need to find 'author errors' late in the chain if we can spot
and act on them early on?
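
As a rough illustration of "spotting it early", here is a sketch of a
checker that reports empty <p> elements instead of removing them. It
uses Python's html.parser, not an SGML or Architectural Forms toolchain,
and the names EmptyParagraphChecker and find_empty_paragraphs are made
up for this example. It treats any child element as content and does
not model every implied end tag, so it is a simplification.

```python
from html.parser import HTMLParser


class EmptyParagraphChecker(HTMLParser):
    """Report <p> elements whose content is empty or whitespace only."""

    def __init__(self):
        super().__init__()
        self.errors = []   # (line, id) for each empty paragraph found
        self._open = None  # [line, id, has_content] for the open <p>

    def _close_paragraph(self):
        if self._open is not None:
            line, element_id, has_content = self._open
            if not has_content:
                self.errors.append((line, element_id))
            self._open = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # </p> is optional in HTML, so a new <p> ends the previous one.
            self._close_paragraph()
            self._open = [self.getpos()[0], dict(attrs).get("id"), False]
        elif self._open is not None:
            self._open[2] = True  # simplification: any child counts as content

    def handle_endtag(self, tag):
        if tag == "p":
            self._close_paragraph()

    def handle_data(self, data):
        if self._open is not None and data.strip():
            self._open[2] = True


def find_empty_paragraphs(markup):
    """Return a list of (line, id) pairs, one per empty <p> in markup."""
    checker = EmptyParagraphChecker()
    checker.feed(markup)
    checker.close()
    checker._close_paragraph()  # a trailing <p> with no end tag
    return checker.errors


sample = '<p id="part1"> </p>\n<p id="para2">text</p>\n<p id="para3"></p>'
report = find_empty_paragraphs(sample)
```

Whether such a report should be an error or only a warning is exactly
the open question in this thread; the sketch only shows that the check
can run without touching the tree.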

I'm pretty sure that SGML has the tools available to let that happen,
but I will have to investigate it a bit further.

The key might be sitting somewhere inside the concept of an
"Architectural Forms" validation, I hope, and that approach would then
be valid for both HTML and xml-type documents as far as I can tell.

I'll try to find out something about that.

-- 
Jan Roland Eriksson <jrexon@newsguy.com>
<URL:http://member.newsguy.com/%7Ejrexon/>

Received on Monday, 10 April 2000 18:00:11 UTC