- From: Marcos Rubino <contact_marcos@yahoo.es>
- Date: Sun, 17 Jul 2005 23:48:59 -0400
- To: olivier Thereaux <ot@w3.org>
- Cc: www-validator@w3.org
Hi Olivier,

Thanks for your response. I did quite a bit more digging and I now think I understand the situation a little better. See my responses inline.

olivier Thereaux wrote:

> Hi Marc,
>
> Thanks for sending this message, especially after obvious serious
> research.
>
> I think your conclusions are correct (see below for details), but
> please note that I am not as much of an expert as others on this list.
> Hopefully if I say something completely wrong, they'll jump in :).
>
> On 12 Jul 2005, at 00:28, Marc Richards wrote:
>
>> 1) Should the validator be throwing an error instead of a warning
>> whenever it encounters an ampersand or left angle bracket as data for
>> a document served as application/xhtml+xml? i.e. was there a
>> conscious decision made to only throw a warning or is this simply one
>> of the XML parser limitations.
>
>
> As far as I know, it is not legal in XML and authorized in SGML (with
> shorttags). Therefore, in XML mode it should throw an error. Whether it
> should be a warning in SGML mode is source of controversy : you'll get
> an equal number of people asking for it, for the sake of quality, and
> of people complaining that the validator should not dare confuse people
> with warnings for a valid construct.
>
> Instead, what happens is:
> - openSP's XML mode is "limited" (you saw the note)
> - in XML mode, openSP throws a warning for such a construct
> - in SGML mode, openSP accepts such constructs, unless asked to
> - XHTML is always parsed using XML mode (see also Bug 1500)
>
> [Bug 1500] http://www.w3.org/Bugs/Public/show_bug.cgi?id=1500

Isn't bug 1500 misdirected? Correct me if I am wrong here, but even if the XHTML as text/html pages were processed by the validator in SGML mode with an XHTML DTD they would still be "valid" (since XML is a subset of SGML) and as a result, bugs would still be filed against Mozilla, Opera and Safari as long as people weren't taking advantage of the techniques outlined in appendix C.

It may be useful to offer an XHTML 1.0 Appendix C conformance testing service (and it seems there have been some forays in that direction[1]) so that people could get an idea of how well their pages work in HTML4 UAs, but that doesn't mean that the validator is doing anything wrong.

[1] http://qa-dev.w3.org/~bjoern/appendix-c/validator/

A legitimate question still remains: Should the validator be parsing XHTML served as text/html in SGML mode or XML mode?

While I think it makes sense for standard HTML4 user-agents to process text/html documents in SGML mode for backwards compatibility, the majority of the users who test their XHTML pages using the validator are looking for forwards compatibility and the well-formedness that XML brings to the table. In an ideal world, HTML4-only UAs would be served the page as text/html and XHTML UAs (including the validator) would be served the same page as application/xhtml+xml; however, the fact of the matter is that

(a) most people don't have content negotiation set up
(b) serving docs as application/xhtml+xml to current browsers that support it is very tricky/error prone (javascript issues, CSS issues, browser issues, etc)
(c) people have come to expect the validator to test XHTML pages for XML well-formedness

Given the way things stand now I think the best default is for the validator to parse and evaluate the pages as XML. I can't see any value to anyone (end-users, web-developers, UA-developers) in evaluating the pages as SGML instead of as XML while still using the XHTML DTD. If you are testing XML well-formedness, you already have SGML well-formedness covered (right?). There is some value in testing Appendix C conformance, but that is a separate issue.

>> If this *is* one of the XML limitations then I think it would be
>> helpful to compile a short list of common limitations and list them
>> on a w3c page in plain English.
>> I have read the OpenSP page[4] a
>> couple times and I am still not sure whether or not recognizing "<"
>> and "&" as invalid is a limitation of the parser; The language on
>> that page is fairly technical. The validator could link to this
>> internal page directly and that page would then link to the OpenSP
>> page as well.
>
>
> This could be a good idea. How about starting a scratchpad on the wiki,
> e.g somewhere like: http://esw.w3.org/topic/MarkupValidator/XML_Limitations
> and motivate people on the list to contribute?

Done[2]. Everybody feel free to add, subtract, enhance.

[2] http://esw.w3.org/topic/MarkupValidator/XML_Limitations

>> 2) Why are you issuing a warning for the use of ampersands and left
>> angle brackets in xhtml but not html. If the warning is in fact
>> saying "this may be valid in some contexts, but it is recommended to
>> use &amp; or &lt;" then this is an SGML warning and should be shown
>> for both HTML and XHTML as text/html. Ideally with an example like
>> "R & D valid, R&D invalid". Is there a bug open for issuing the
>> warning for html doctypes as well?
>
>
> See above, my remark on the fussy mode. You could search this list for
> "fussy" and get an idea of the discussions that happened a while ago on
> this topic.

Is it technically possible to get the validator to flag & and < as warnings in SGML mode? I couldn't find a bug for this one. If it is technically doable, I think there is less likelihood of backlash from the community (a la fussy mode) if

- users still got the bright green "this page is valid" at the top of the page
- the color of the warnings were made a little more neutral (yellow instead of pale red).
- the warning text is clear and helpful.

I am not sure how much utility this solution would really have, so I am not terribly gung ho about it, but I will file a bug if people think it is likely to help users avoid potential errors.

>> Are there open bugs you can point me to? Are there bugs I should file?
>
>
> I think 798 and 1500 are the relevant ones. If you think they do not
> cover the whole span of the issue, feel free to open others.

Assuming that we are agreed about evaluating XHTML documents served as text/html in XML mode, is it technically possible to get the validator to flag & and < as errors? Unless I am mistaken, this seems to be the most obvious area where UAs choke on the well-formedness test (when parsing as XML), but the validator just lets you off with a warning. Of course it would be very important to make it clear to users why their document doesn't validate using language they can understand, plenty of examples, and links to more detailed information. Is there a bug open for this? Is it likely to be fixed without major architectural changes?

Bug 798 seems to be mislabeled. As far as the solution that was found is concerned it should be titled "warnings are mistakenly suppressed on valid pages".

> Hope this answered your questions.
>

Sure did, which of course led to more questions. Thanks for taking the time to answer.

Marc
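
P.S. For anyone who wants to see the XML-mode behaviour discussed above for themselves, here is a minimal sketch. It does not use OpenSP or the validator at all; it assumes Python's standard-library expat-based parser as a stand-in for "a conforming XML parser", and the helper names (`wrap_in_xhtml`, `is_well_formed`) are made up for illustration. It only checks well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def wrap_in_xhtml(body_text: str) -> str:
    """Hypothetical helper: put a text fragment inside a tiny XHTML-like page."""
    return (
        f'<html xmlns="{XHTML_NS}">'
        "<head><title>test</title></head>"
        f"<body><p>{body_text}</p></body>"
        "</html>"
    )


def is_well_formed(markup: str) -> bool:
    """Return True if the stdlib XML parser accepts the markup.

    This is a well-formedness check only; no DTD validation is done.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "R &amp; D",   # escaped ampersand
        "R & D",       # bare ampersand followed by a space
        "R&D",         # bare ampersand followed by a name character
        "1 &lt; 2",    # escaped left angle bracket
        "1 < 2",       # bare left angle bracket
    ]
    for text in samples:
        verdict = is_well_formed(wrap_in_xhtml(text))
        print(f"{text!r}: well-formed = {verdict}")
```

An XML parser treats every bare "&" or "<" in character data as a fatal error, including the "R & D" form that SGML tolerates, which is why a warning in XML mode looks too lenient to me.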
Received on Monday, 18 July 2005 03:49:29 UTC