Thu 2007-05-24 at 10:31 -0700, Eric Lawrence wrote:

> I think the trick is distinguishing between a control character and a byte that's part of a multi-byte international character.
>
> Obviously, we'd need to escape any byte not valid in HTTP headers (e.g. 0x0d, 0x0a) to ensure the integrity of the headers.

The quoting productions in RFC2616 aren't very obvious, but technically the syntax allows any character 0-127 when quoted. 8-bit characters are not allowed anywhere in HTTP, which more or less rules out the use of most multi-byte characters without recoding them first.

HTTP builds on MIME, which builds on RFC822. Its successor, RFC2822, is good reading on these matters, and an example where the BNF has been constructed so that it clearly separates producer rules from parser rules: strict producer rules, and relaxed parser rules that accept many "obsolete" things.

But yes, in HTTP it's a bit of a mess. I do not think many implementations parse HTTP entirely correctly, nor am I sure it's desirable to parse HTTP fully to the specs, as that requires the parser to allow a great deal of crap nobody expects, because nobody is allowed to produce it.

To be honest, I don't think many MIME parsers pass the full RFC2822 requirements either.

Regards
Henrik

Received on Thursday, 24 May 2007 20:15:44 GMT
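
The "strict producer" side of the quoting rules discussed above can be sketched roughly as follows. This is only an illustration, not code from the thread: the function names `quote_header_value` and `recode_then_quote` are hypothetical, rejecting control characters outright is a deliberately conservative choice (stricter than the RFC2616 grammar, which technically allows any character 0-127 in a quoted-pair), and percent-encoding UTF-8 is just one assumed way of "recoding" multi-byte characters before they go into a header.

```python
from urllib.parse import quote


def quote_header_value(value: str) -> str:
    """Return value as an RFC2616-style quoted-string (hypothetical helper).

    Conservative producer: refuses 8-bit characters and control
    characters instead of trying to escape them.
    """
    out = []
    for ch in value:
        code = ord(ch)
        if code > 127:
            # 8-bit / multi-byte characters must be recoded by the caller first.
            raise ValueError(f"non-ASCII character {ch!r} must be recoded first")
        if (code < 32 and ch != "\t") or code == 127:
            # Assumption: refuse CR, LF and other controls rather than emit
            # a quoted-pair that few parsers would expect.
            raise ValueError(f"control character 0x{code:02x} not allowed")
        if ch in '"\\':
            # quoted-pair: a backslash followed by the character itself.
            out.append("\\")
        out.append(ch)
    return '"' + "".join(out) + '"'


def recode_then_quote(value: str) -> str:
    """Percent-encode as UTF-8 (one assumed recoding), then quote."""
    return quote_header_value(quote(value, safe=""))
```

With this sketch, a value containing a CR or LF is refused rather than escaped, which keeps the header framing intact, and a value such as "blåbär" only passes once it has been recoded through `recode_then_quote`.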