'strip-presentation' option -- revised

From: Jelks Cabaniss <jelks@jelks.nu>
Date: Mon, 12 Aug 2002 18:32:48 -0400
To: "'Tidy Development List'" <Tidy-develop@lists.sourceforge.net>, <html-tidy@w3.org>
Message-ID: <006001c24250$30925af0$6701a8c0@blackie>

Consider 'strip-presentation' as 'drop-font-tags' on steroids...

When I submitted the "'strip-presentation' option" proposal (below), I
had some incorrect assumptions -- from older versions of Tidy -- about
how current-day Tidy works.  For one, I had forgotten that 'drop-font-tags'
is (thankfully) no longer dependent on 'clean'.  At any rate, I have
narrowed down what I think this option should do.  First draft:

1) Run 'drop-font-tags'.  This does close to 90% of the job.

2) Delete all BACKGROUND and BGCOLOR attributes -- globally.

3) Delete all ALIGN attributes unless they appear in table elements (one
*could* provide 'drop-table-aligns' -- and 'drop-table-widths' -- since
tables are a special case, but that's really a separate issue).
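
As a rough illustration of steps 2 and 3 only (step 1 is left to Tidy's
existing 'drop-font-tags'), here is a minimal Python sketch.  This is not
Tidy code: the class and function names are hypothetical, and the set of
"table elements" is an assumption, since the proposal does not enumerate
them.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: which elements count as "table elements" is not spelled out
# in the proposal; this set is a guess for illustration.
TABLE_ELEMENTS = {"table", "caption", "colgroup", "col",
                  "thead", "tbody", "tfoot", "tr", "th", "td"}
ALWAYS_DROP = {"background", "bgcolor"}  # step 2: removed everywhere


class AttributeStripper(HTMLParser):
    """Re-emits the markup it parses, minus the presentational attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def _keep(self, tag, name):
        if name in ALWAYS_DROP:
            return False
        # Step 3: ALIGN survives only on table elements.
        if name == "align" and tag not in TABLE_ELEMENTS:
            return False
        return True

    def _render(self, tag, attrs, closer=">"):
        parts = [tag]
        for name, value in attrs:
            if not self._keep(tag, name):
                continue
            if value is None:  # valueless attribute such as "nowrap"
                parts.append(name)
            else:
                parts.append(f'{name}="{escape(value, quote=True)}"')
        return "<" + " ".join(parts) + closer

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_presentational_attributes(html):
    parser = AttributeStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Note that Python's parser lowercases tag and attribute names, so the
output is normalized in that respect as a side effect.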


Notes: 

1)  Regarding #3 above, Tidy currently changes '<p align="left">' to '<p
style="text-align: left">' when 'drop-font-tags' is set but 'clean' is
not.  I don't think it should do this *unless* the Strict Doctype is
requested.  Maybe the fix for this should wait until Tidy becomes in
some fashion DTD-aware (perhaps even through lookup tables).

2)  In the same vein, 'drop-font-tags' seems to be doing double-duty.
For example, it also gets rid of <center>.  Shouldn't that action be
moved out of 'drop-font-tags' and into 'strip-presentation'?

3)  What does Tidy currently do with all those vendor-specific BODY
attributes?  (I can't remember them off the top of my head, since I never
use them, but you see them often enough.)  Anyway, if they aren't
already removed (through 'drop-proprietary-attributes'?), I think
'strip-presentation' should remove them as well.

4)  Should inline CSS from the original be removed in this option as
well?  That would suit me fine, but others might disagree.  Should we
then add a 'drop-inline-styles' option to do this?


/Jelks


-----Original Message-----
From: Jelks Cabaniss [mailto:jelks@jelks.nu] 
Sent: Saturday, July 20, 2002 6:25 PM
To: 'html-tidy@w3.org'
Cc: 'Tidy-develop@lists.sourceforge.net'
Subject: Request: a 'strip-presentation' option


There is, I think, currently only one glaring omission in Tidy.  Many
times, when we only want the *structural* markup, we need to get rid of
**ALL** presentational markup leftovers, including the embedded style
sheet.  Right now, the closest we can come to that is by applying
"clean" and "drop-font-tags", but that leaves droppings like this still
hanging around:

    <style type="text/css">
    /*<![CDATA[*/
    p.c1 {font-style: italic; text-align: justify}
    /*]]>*/
    </style>

    ...

    <p class="c1">...</p>

To get rid of this you have to manually delete the style section and
then do a regex search & replace to get rid of all the 'class="[^"]*"'
attributes.  A 

	strip-presentation: yes/[NO]

option (or similar idea) would be extremely useful.  And rather than
have to set *two* options -- like 'clean' and 'drop-font-tags', which
you currently have to set to get even part of the way there -- it should
be a "one size kills all":  get rid of all FONT & CENTER elements and all
ALIGN, BGCOLOR, & BACKGROUND attributes etc., *without* replacing them
with CLASS attributes and *without* creating an embedded style section.
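
For reference, the manual cleanup described above (delete the style
section, then strip the class attributes) could be scripted roughly as
follows.  This is only a regex-based sketch in Python, not anything Tidy
provides; the function name is hypothetical, and it removes *every* class
attribute, including any that were in the original document.

```python
import re

# The embedded style sheet, plus any whitespace that trails it.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style\s*>\s*",
                         re.IGNORECASE | re.DOTALL)
# Anything that looks like a tag; class attributes are only removed inside these.
TAG = re.compile(r"<[^>]+>")
# A class attribute with a double-quoted, single-quoted, or unquoted value.
CLASS_ATTR = re.compile(r"""\s+class\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""",
                        re.IGNORECASE)


def drop_style_and_classes(html):
    """Remove <style> sections and all class attributes from the markup."""
    without_style = STYLE_BLOCK.sub("", html)
    return TAG.sub(lambda m: CLASS_ATTR.sub("", m.group(0)), without_style)
```

Restricting the class replacement to text inside tags keeps ordinary
prose that happens to contain 'class=' untouched.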

Thanks,


/Jelks
Received on Monday, 12 August 2002 18:33:37 GMT
