- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Fri, 12 Feb 2010 06:17:49 +0200
- To: "Claudia Murialdo" <cmurialdo@gmail.com>, <www-validator@w3.org>
Claudia Murialdo wrote:

> I need to add a few custom attributes in some HTML elements, which is
> the best way to do it in order to keep validating with w3c validation
> service?.

Define a DTD that contains them. This isn't trivial, but it isn't rocket science either. See
http://www.cs.tut.fi/~jkorpela/html/own-dtd.html

> I want my page validates with HTML 4.01 Transitional.

You can't eat your cake and keep it. "Validates with HTML 4.01 Transitional" is a common loose expression for having a document that declares _the_ HTML 4.01 Transitional DTD. If you use any attribute that is not declared in that DTD, the document is not valid. Any attempts at avoiding this simple conclusion are based on misunderstandings of what validity is (in the relevant technical sense, the SGML sense).

> I read that one
> way is to extend de HTML 4.0 Transitional DTD and put this in the
> DOCTYPE declaration of the page (as it says in
> http://htmlhelp.com/tools/validator/customdtd.html),

This is somewhat confusing since the good old htmlhelp.com refers to HTML 4.0, not HTML 4.01, though the difference is small.

_The_ way to keep using the W3C validation service, or any SGML validator or close relative, is to modify the DTD to reflect the markup you want to use. Of course, the only thing this achieves is a check that your document's syntax isn't unintentionally malformed, i.e. that you use markup the way you have declared it. Well, some people might refer to the additional potential benefit of showing off that your page "validates", but there are so many more effective ways of deception. Most people couldn't care less whether someone else's page "validates".

> but I would like
> to know if I can do it in another way so that I can have the original
> and public w3c DTD in the DOCTYPE and add my customs attributes adding
> a new DTD or a namespace. Can I do that?.

Adding a new DTD? No, an SGML document has only one DTD by definition. Namespace?
No, that's not an SGML thing at all.

But by SGML rules, a document's DTD may appear in two parts: as an external subset (as is normal with HTML: the DTD is in a file and just referenced in the document type declaration, the <!DOCTYPE ...> stuff) and as an internal subset (a part of the DTD appearing directly inside the document type declaration).

I wonder why I didn't consider that possibility years ago. It would save copying and would make it easier to keep track of your modifications. And up to a point it works nicely. E.g., if you wanted to use attribute FOO, with any string as value, in P elements, you would just say

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd" [
    <!ATTLIST P foo CDATA #IMPLIED>
  ] >

And your document would validate. The W3C validator would even say "This document was successfully checked as HTML 4.01 Transitional!" in the heading. But that's just misguided technobabble, in a misguided attempt at being understandable and helpful. Later, much less noticeably, the validator says what the babble means:

"This means that the resource in question identified itself as "HTML 4.01 Transitional" and that we successfully performed a formal validation using an SGML, HTML5 and/or XML Parser(s) (depending on the markup language used)."

So it's just the _string_ "-//W3C//DTD HTML 4.01 Transitional//EN" in the DOCTYPE declaration that makes the validator characterize the document as HTML 4.01 Transitional.

However (and now I vaguely remember why I haven't used this nicer approach), web browsers have never followed HTML specifications properly. In particular, they don't understand anything about document type declarations, except in the banal sense of recognizing some forms of them as special in the infamous DOCTYPE sniffing behavior: they use fairly simple string pattern matching to choose between rendering modes (like Quirks and "standard").
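To make the point about the identifier string concrete, here is a minimal Python sketch. It assumes nothing about how the W3C validator or any browser is actually implemented; the helper name public_identifier and the regular expression are hypothetical, made up for illustration only. It shows that pulling the public identifier out of a DOCTYPE declaration by plain pattern matching gives the same string whether or not an internal subset follows it.

```python
import re

# Hypothetical pattern, for illustration only: match the quoted public
# identifier that follows the PUBLIC keyword in a DOCTYPE declaration.
_PUBLIC_ID = re.compile(r'<!DOCTYPE\s+\w+\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE)


def public_identifier(document):
    """Return the public identifier of the first DOCTYPE, or None."""
    match = _PUBLIC_ID.search(document)
    return match.group(1) if match else None


# The standard declaration, with the external subset only.
plain = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
         '"http://www.w3.org/TR/html4/loose.dtd">')

# The same declaration with an internal subset that adds the FOO attribute.
extended = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
            '"http://www.w3.org/TR/html4/loose.dtd" '
            '[ <!ATTLIST P foo CDATA #IMPLIED> ] >')

# The internal subset plays no part in the comparison.
print(public_identifier(plain) == public_identifier(extended))
```

Anything that labels a document from this string alone cannot tell the two declarations apart, even though the effective DTD differs.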

The problem here is that browsers don't even _parse_ DOCTYPE declarations properly: they'll take "] >" as document content and display it at the start of the page. This happens even in IE 8 and Firefox 3.5, so it won't ever change. (Well, unless SGML gets rehabilitated and its great merits as extended XML are recognized and put to use... :-))

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
Received on Friday, 12 February 2010 04:19:20 UTC