- From: Maciej Stachowiak <mjs@apple.com>
- Date: Wed, 21 Nov 2007 05:55:05 -0800
- To: Dean Edridge <dean@55.co.nz>
- Cc: Karl Dubost <karl@w3.org>, "public-html@w3.org Tracking WG" <public-html@w3.org>, Roger Johansson <roger@456bereastreet.com>
On Nov 21, 2007, at 5:29 AM, Dean Edridge wrote:

> Maciej Stachowiak wrote:
>>
>>
>> Making a single document that works in both serializations is
>> significantly trickier than just using quotes around attributes.
>
> Really? What about this below? Only the mime type would need to be
> changed:
>
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
> <head>
> <title>Demo</title>
> </head>
> <body>
> <p class="top-paragraph" id="something">
> Hello World
> </p>
> </body>
> </html>

Your document, despite being a trivial example, is not conforming to
either the HTML or XML serializations of HTML5. Even experts can make
mistakes.

And it gets a lot more complicated if you do things like:

- Apply CSS styles to the body
- Reference an external stylesheet via <link>
- Reference an external script via <script>
- Attempt to use document.write

These are just a few of the most obvious pitfalls. Trying to keep them
all in mind while authoring content is a whole lot of complexity.

>> A CMS that wants to generate both HTML and XHTML needs to work at a
>> higher level of abstraction than string pasting and can therefore
>> produce separate documents for each serialization.
>
> Yes I know, thanks.
> And why is this a reason to have discrepancies between the two
> languages/serialisations?
> Surely, if anything, it is a good reason to encourage the reduction
> of discrepancies between the languages.

The languages already have discrepancies. That is not in our power to
change. Both classic HTML syntax and XML syntax were defined years ago
and there are some incompatibilities that will probably never be
resolved. Trying to write in the approximate common subset is really
hard; most people who try get it wrong, even if they are experts.

>> In any case, a CMS that does target producing single chameleon
>> markup documents will need to follow the right conventions.
>
> But that wouldn't be so difficult.

You'd think - a lot get it wrong and I'm not sure any get it totally
right.

>> That doesn't necessarily mean those rules are right for authors
>> writing pure HTML by hand, or for XML-only document processing
>> systems.
>
> Why not?

For one thing, if I'm hand-authoring an HTML document, I shouldn't have
to remember the magic URL talisman. For another, using XML minimized
syntax in HTML documents is confusing, and restricting it only to HTML
void elements in XML documents is needlessly restrictive.

>> Anyway, my point is just that I think both ways of writing it are
>> reasonable in different situations, and should be chosen based on
>> circumstances.
>
> There is a method that is suitable for all circumstances, that's the
> beauty of (X)HTML5:
>
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
> <head>
> <title>Demo</title>
> </head>
> <body>
> <p class="top-paragraph" id="something">
> Hello World
> </p>
> </body>
> </html>
>
> Wouldn't it be better to encourage people to markup their webpages
> like this?
> The less choosing the author has to do the better.

I'd rather encourage people to:

a) Validate, to find the errors in markup like the above.
b) Not include unnecessary cargo cult talismans like the xmlns
   declaration in HTML syntax, unless they actually really need it.

> Unnecessarily having two or more methods of quoting makes it much
> more difficult to have HTML and XHTML in the world at the same time.

That horse is decades out of the barn. We are unlikely to get it back
in. Some people may feel that always using quotes is a better practice,
but I don't think concern about XML syntax is enough reason to declare
those who disagree categorically wrong.

Note that even XML has two ways to quote attribute values, double
quotes and single quotes. It drops the option of leaving simple values
unquoted.

> I don't see what is to gain from having unneeded discrepancies
> between HTML and XHTML.

Me neither, but we've had them since 1998 when XML became a REC and I
don't expect them to go away any time soon.

> My point is this: in regards to the quoting of attributes there
> doesn't need to be two or more different ways to write up a (X)HTML
> document. Of course, I don't have a problem with authors leaving out
> the namespace attribute when intending to author in text/html as
> this is easily altered later if someone wanted to convert the
> document to XHTML.

Just adding an xmlns attribute is not nearly enough to turn a
nontrivial document into conforming XHTML5. Pretending so is a bad
idea.

Anyway, if using XHTML-like syntax is right for you, then you are free
to use it. The current draft actually makes much more XHTML syntax
legal in the HTML serialization than previous versions of HTML.
However, I think it's wrong to try to set it down as some sort of
mandate for all content authors.

Seriously, does it make any real difference to anyone whether, for
instance, the Google homepage double quotes its attributes? Would there
be any benefit to humanity if it was changed?

Regards,
Maciej

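The point in the message above about attribute quoting (XML accepts
double or single quotes but not unquoted values, while HTML syntax
accepts all three) can be checked with a small sketch. This is only an
illustration, not part of the original message: it assumes Python's
standard library, and it uses html.parser as a lenient stand-in for an
HTML tokenizer, which is not an HTML5-conforming parser. The helper
names (AttrCollector, html_attrs, is_well_formed_xml) are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record each start tag and its attributes as a lenient HTML tokenizer sees them."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


def html_attrs(markup):
    """Return [(tag, {attr: value})] for every start tag in the markup."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


def is_well_formed_xml(markup):
    """True if an XML parser accepts the markup, False if it raises a parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The same paragraph with three attribute-quoting styles.
samples = {
    "double": '<p class="top">Hello World</p>',
    "single": "<p class='top'>Hello World</p>",
    "unquoted": "<p class=top>Hello World</p>",
}

for style, markup in samples.items():
    print(style, "| HTML sees:", html_attrs(markup),
          "| well-formed XML:", is_well_formed_xml(markup))
```

All three spellings yield the same class value on the HTML side; only
the unquoted one is rejected by the XML parser. Quoting is thus one
difference among several, which is consistent with the argument that
quoting alone does not make a document work in both serializations.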
Received on Wednesday, 21 November 2007 13:55:54 UTC