
Re: [Tidy-dev] What are typical file extensions for X(HT)ML documents?

From: Klaus Johannes Rusch <KlausRusch@atmedia.net>
Date: Tue, 01 Apr 2003 07:44:32 -0100
Message-ID: <3E895170.1671549D@atmedia.net>
To: Terry Teague <terry_teague@users.sourceforge.net>
Cc: html-tidy@w3.org, tidy-develop@lists.sourceforge.net

Terry Teague wrote:

> In developing the next version of a program using Tidy based code, I am
> needing to add support for input X(HT)ML files using specific file
> extensions, to weed out unwanted files, especially when tidying whole
> directories.
>
> i.e. we don't want to Tidy "mylargeofficesuite.exe" or "mymapoftheworld.jpg".
>
> Here is a list of file extensions I am using at the moment :
>
>         /* [1] */ ".html";
>         /* [2] */ ".htm";
>         /* [3] */ ".text";
>         /* [4] */ ".txt";
>         /* [5] */ ".xml";
>         /* [6] */ ".xhtml";
>         /* [7] */ ".asp";
>         /* [8] */ ".jsp";
>         /* [9] */ ".php"

I would add

.shtml
.shtm
.phtml

*.wml (WML 2.0 only)

.?html (maybe, depends on whether or not you want to process cHTML also)
.?htm (maybe, depends on whether or not you want to process cHTML also)

and remove

.txt
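
A minimal sketch in C of how the adjusted list might be checked. The names
kMarkupExtensions, ext_matches and has_markup_extension are made up for
illustration and are not part of Tidy. The sketch assumes that "?" stands for
exactly one character and that matching should ignore case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pattern list built from the extensions discussed above.
 * '?' matches exactly one character; ".txt" is deliberately left out. */
static const char *const kMarkupExtensions[] = {
    ".html", ".htm", ".text", ".xml", ".xhtml", ".asp", ".jsp", ".php",
    ".shtml", ".shtm", ".phtml", ".wml",
    ".?html", ".?htm"   /* optional: also lets cHTML-style names through */
};

/* Case-insensitive comparison of an extension against one pattern. */
static int ext_matches(const char *pattern, const char *ext)
{
    while (*pattern != '\0' && *ext != '\0') {
        if (*pattern != '?' &&
            tolower((unsigned char)*pattern) != tolower((unsigned char)*ext))
            return 0;
        ++pattern;
        ++ext;
    }
    /* Both strings must end together, so lengths have to agree. */
    return *pattern == '\0' && *ext == '\0';
}

/* Returns 1 if the file name ends in one of the listed extensions, else 0. */
int has_markup_extension(const char *filename)
{
    const char *ext;
    size_t i;

    assert(filename != NULL);
    ext = strrchr(filename, '.');
    if (ext == NULL)
        return 0;                       /* no extension at all */
    if (strpbrk(ext, "/\\") != NULL)
        return 0;                       /* last dot belongs to a directory */
    for (i = 0; i < sizeof kMarkupExtensions / sizeof kMarkupExtensions[0]; ++i)
        if (ext_matches(kMarkupExtensions[i], ext))
            return 1;
    return 0;
}
```

With this list, "index.html" and "PAGE.SHTML" would be accepted, while
"mylargeofficesuite.exe", "mymapoftheworld.jpg" and "notes.txt" would be
skipped.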

Microsoft Office products register additional extensions for their HTML
templates. Try assoc on a Win2000/WinXP machine:

.dochtml=wordhtmlfile
.docmhtml=wordmhtmlfile
.dothtml=wordhtmltemplate
.htm=htmlfile
.html=htmlfile
.htw=htmlfile
.htx=htmlfile
.mht=mhtmlfile
.mhtml=mhtmlfile
.pothtml=powerpointhtmltemplate
.ppthtml=powerpointhtmlfile
.pptmhtml=powerpointmhtmlfile
.shtml=NetscapeMarkup
.xhtml=xhtmlfile
.xlshtml=Excelhtmlfile
.xlsmhtml=excelmhtmlfile
.xlthtml=Excelhtmltemplate

Fragments are likely to be found in *.ssi or *.inc also.

Depending on what your program does, you may want to let the user specify
extensions, or guess the file type by looking at the content of the document, or
both.
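
Guessing the file type from the content might look roughly like the following
C sketch. The function names and the list of prefixes are assumptions chosen
for illustration, not anything Tidy provides. The heuristic only looks at the
start of the file, so it would miss fragments (*.ssi, *.inc) and server-side
files that begin with script code, which is one reason to combine it with an
extension list.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Case-insensitive test: do the first bytes of buf equal prefix? */
static int starts_with_ci(const char *buf, size_t len, const char *prefix)
{
    size_t i, n = strlen(prefix);

    if (len < n)
        return 0;
    for (i = 0; i < n; ++i)
        if (tolower((unsigned char)buf[i]) != tolower((unsigned char)prefix[i]))
            return 0;
    return 1;
}

/* Returns 1 if the buffer starts like an X(HT)ML document, else 0. */
int looks_like_markup(const char *buf, size_t len)
{
    /* Hypothetical list of openings that suggest markup. */
    static const char *const prefixes[] = {
        "<?xml", "<!doctype", "<html", "<wml", "<!--"
    };
    size_t i;

    assert(buf != NULL || len == 0);

    /* Skip a UTF-8 byte order mark, if present. */
    if (len >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0) {
        buf += 3;
        len -= 3;
    }
    /* Skip leading whitespace. */
    while (len > 0 && isspace((unsigned char)*buf)) {
        ++buf;
        --len;
    }
    for (i = 0; i < sizeof prefixes / sizeof prefixes[0]; ++i)
        if (starts_with_ci(buf, len, prefixes[i]))
            return 1;
    return 0;
}

/* Reads the first bytes of a file and applies the check above.
 * An unreadable file is treated as "not markup". */
int file_looks_like_markup(const char *path)
{
    char buf[512];
    size_t n;
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return looks_like_markup(buf, n);
}
```

A program could accept a file when either the extension or the content check
passes, and let the user override both.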

--
Klaus Johannes Rusch
KlausRusch@atmedia.net
http://www.atmedia.net/KlausRusch/
Received on Tuesday, 1 April 2003 03:45:55 GMT
