- From: Jim Derry <balthisar@gmail.com>
- Date: Sun, 1 Feb 2015 02:06:35 +0800
- To: "To=" <54C79596.4050902@geoffair.info>, public-htacg-contrib@w3.org
- Message-ID: <CABUm+BdDdGmrezLFczMyvp3WMPmMkR4ZBsa0fOpdMjqy8S_djg@mail.gmail.com>
There's not a lot of discussion, which leads me to think this doesn't impact a lot of people.

Using tidy from November 2014 (assuming no work has been done on this bug since then), given the input:

    <!DOCTYPE html>
    <html>
    <head><title></title>
    <body>
    <script>
    var a = '<script';
    </script>
    </body>
    </html>

I get the output:

    <!DOCTYPE html>
    <html>
    <head>
    <meta name="generator" content="HTML Tidy (Balthisar Tidy) for HTML5 for Mac OS X dated 2014/11/22">
    <title></title>
    </head>
    <body>
    <script>
    var a = '<script';
    <\/script>
    <\/body>
    <\/html>
    </script>
    </body>
    </html>

...and so I wonder whether this is something a new configuration option should handle, or whether it's an inherent bug.

I think the question comes down to this: are we trying to identify errors in strings? The current behavior seems to be that Tidy simply does not take into account that something is quoted, and interprets the string contents as markup.

The danger I see in adding a new configuration option is that it expects the user to know the difference. Yes, users SHOULD know the difference, but they don't always. If it's a configuration option, then the default should definitely be on the side of safety: ignore anything that's in legal quotes. Advanced users could turn off the option when required.

I hope we spur some more discussion before making a huge decision. Given that this is a very old bug, I also suggest we move this beyond the 5.0.0 milestone.

--
---
Jim Derry
Clinton Township, MI, USA
Nanjing, Jiangsu, China PRC
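To make the "ignore anything that's in legal quotes" idea concrete, here is a minimal sketch in Python. It is not Tidy's code, and `find_script_end` is a hypothetical helper named only for this illustration. It scans script content for the first `</script` that sits outside a single- or double-quoted string. It assumes such strings end at an unescaped matching quote or at a newline, and it ignores comments, regex literals, and template literals. Note that this is the quote-aware behavior proposed above, not what an HTML parser does: a browser ends the script element at the first `</script>` regardless of quoting.

```python
def find_script_end(text, start=0):
    """Return the index of the first '</script' outside a quoted string, or -1.

    Hypothetical helper for illustration only; not part of Tidy.
    """
    quote = None          # the quote character currently open, or None
    lower = text.lower()  # end-tag matching is case-insensitive
    i = start
    n = len(text)
    while i < n:
        ch = text[i]
        if quote is not None:
            if ch == "\\":
                i += 2    # skip the backslash and the character it escapes
                continue
            if ch == quote or ch == "\n":
                quote = None  # string closed (or left unterminated at end of line)
        elif ch in ("'", '"'):
            quote = ch
        elif lower.startswith("</script", i):
            return i
        i += 1
    return -1


# The script content from the example above, after the opening <script> tag.
content = " var a = '<script'; </script> </body> </html>"
end = find_script_end(content)
print(repr(content[:end]))
```

With this approach the quoted `'<script'` is treated as string data, so the first unquoted `</script` closes the element and the trailing `</body>` and `</html>` stay outside the script.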
Received on Saturday, 31 January 2015 18:07:03 UTC