
Re: Investigate expected results to http://www.hixie.ch/tests/adhoc/http/content-type/sniffing/ tests in collaboration with the IETF HTTP WG (ACTION-44)

From: Thomas Broyer <t.broyer@gmail.com>
Date: Thu, 7 Feb 2008 17:47:11 +0100
Message-ID: <a9699fd20802070847u16179c86ka0fbfa9072743c6e@mail.gmail.com>
To: public-html@w3.org

On Feb 7, 2008 5:36 PM, Anne van Kesteren wrote:
>
> On Thu, 07 Feb 2008 15:18:51 +0100, Julian Reschke wrote:
> > Smylers wrote:
> >> That mail points out that none of the characters are > 127, which is
> >> correct.  However, test 16 does contain several characters < 32 (and
> >> which aren't tabs or line-breaks); these are not normally considered to
> >> be plain text.
> >
> > By whom? Is there any spec that disallows them in text types?

From RFC 2046:
   Note that the control characters including DEL (0-31, 127) have no
   defined meaning in apart from the combination CRLF (US-ASCII values
   13 and 10) indicating a new line.  Two of the characters have de
   facto meanings in wide use: FF (12) often means "start subsequent
   text on the beginning of a new page"; and TAB or HT (9) often (though
   not always) means "move the cursor to the next available column after
   the current position where the column number is a multiple of 8
   (counting the first column as column 0)."  Aside from these
   conventions, any use of the control characters or DEL in a body must
   either occur

    (1)   because a subtype of text other than "plain"
          specifically assigns some additional meaning, or

    (2)   within the context of a private agreement between the
          sender and recipient. Such private agreements are
          discouraged and should be replaced by the other
          capabilities of this document.

My understanding is that text/plain doesn't assign additional meanings
to control characters, and we're not in the second case of a private
agreement either; therefore control characters *must not* (should
not?) be used in text/plain on the Web.
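
As a rough illustration of that reading, here is a minimal Python sketch
that lists the control characters in a body that have neither a defined
nor a de facto meaning according to the RFC text quoted above. The names
(CONTROL, TOLERATED, undefined_controls) are hypothetical and come from no
spec. Accepting bare CR or LF as line breaks is my own assumption; the RFC
only defines the CRLF combination.

```python
# Hypothetical helper, not taken from RFC 2046 or any sniffing spec.
# Control characters per the quoted RFC text: 0-31 and DEL (127).
CONTROL = set(range(32)) | {127}

# Assumption: tolerate TAB (9) and FF (12), which have de facto meanings,
# plus LF (10) and CR (13) even when they appear outside a CRLF pair.
TOLERATED = {9, 10, 12, 13}


def undefined_controls(body: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for each control character in body
    that is not in the tolerated set."""
    return [
        (offset, value)
        for offset, value in enumerate(body)
        if value in CONTROL and value not in TOLERATED
    ]
```

For example, undefined_controls(b"a\x00b\x7f") returns [(1, 0), (3, 127)],
while a body containing only printable text, tabs, form feeds and line
breaks returns an empty list.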

> There's a spec that suggests they don't have defined meaning:
>
>    http://www.ietf.org/rfc/rfc2046
>
> (Ironically their server is misconfigured to serve that as text/html so
> you'd have to view source...)

I know you deliberately chose this misconfigured URL, but for others:
append .txt to it and you'll get text/plain ;-)

http://www.ietf.org/rfc/rfc2046.txt

-- 
Thomas Broyer
Received on Thursday, 7 February 2008 16:54:09 GMT
