- From: Marc Richards <contact_marcos@yahoo.es>
- Date: Mon, 18 Jul 2005 09:59:19 -0400
- To: olivier Thereaux <ot@w3.org>
- Cc: www-validator@w3.org
Oops, looks like I sent that email from a mis-configured account. The
From name should have been "Marc Richards", not "Marcos Rubino".

Marc

Marcos Rubino wrote:
>
> Hi Olivier,
>
> Thanks for your response. I did quite a bit more digging and I now
> think I understand the situation a little better. See my responses
> inline.
>
> olivier Thereaux wrote:
>
>> Hi Marc,
>>
>> Thanks for sending this message, especially after obvious serious
>> research.
>>
>> I think your conclusions are correct (see below for details), but
>> please note that I am not as much of an expert as others on this
>> list. Hopefully if I say something completely wrong, they'll jump in :).
>>
>> On 12 Jul 2005, at 00:28, Marc Richards wrote:
>>
>>> 1) Should the validator be throwing an error instead of a warning
>>> whenever it encounters an ampersand or left angle bracket as data
>>> for a document served as application/xhtml+xml? i.e. was there a
>>> conscious decision made to only throw a warning or is this simply
>>> one of the XML parser limitations?
>>
>> As far as I know, it is not legal in XML and authorized in SGML
>> (with shorttags). Therefore, in XML mode it should throw an error.
>> Whether it should be a warning in SGML mode is a source of
>> controversy: you'll get an equal number of people asking for it, for
>> the sake of quality, and of people complaining that the validator
>> should not dare confuse people with warnings for a valid construct.
>>
>> Instead, what happens is:
>> - openSP's XML mode is "limited" (you saw the note)
>> - in XML mode, openSP throws a warning for such a construct
>> - in SGML mode, openSP accepts such constructs, unless asked to
>> - XHTML is always parsed using XML mode (see also Bug 1500)
>>
>> [Bug 1500] http://www.w3.org/Bugs/Public/show_bug.cgi?id=1500
>
> Isn't bug 1500 misdirected? Correct me if I am wrong here, but even
> if the XHTML as text/html pages were processed by the validator in
> SGML mode with an XHTML DTD they would still be "valid" (since XML is
> a subset of SGML) and as a result, bugs would still be filed against
> Mozilla, Opera and Safari as long as people weren't taking advantage
> of the techniques outlined in appendix C.
>
> It may be useful to offer an XHTML 1.0 Appendix C conformance testing
> service (and it seems there have been some forays in that
> direction[1]) so that people could get an idea of how well their
> pages worked in HTML4 UAs, but that doesn't mean that the validator
> is doing anything wrong.
>
> [1] http://qa-dev.w3.org/~bjoern/appendix-c/validator/
>
> A legitimate question still remains: Should the validator be parsing
> XHTML served as text/html in SGML mode or XML mode?
>
> While I think it makes sense for standard HTML4 user-agents to process
> text/html documents in SGML mode for backwards compatibility, the
> majority of the users who test their XHTML pages using the validator
> are looking for forwards compatibility and the well-formedness that
> XML brings to the table.
>
> In an ideal world, HTML4-only UAs would be served the page as
> text/html and XHTML UAs (including the validator) would be served the
> same page as application/xhtml+xml. However, the fact of the matter
> is that
> (a) most people don't have content negotiation set up
> (b) serving docs as application/xhtml+xml to current browsers that
>     support it is very tricky/error prone (javascript issues, CSS
>     issues, browser issues, etc)
> (c) people have come to expect the validator to test XHTML pages for
>     XML well-formedness
>
> Given the way things stand now I think the best default is for the
> validator to parse and evaluate the pages as XML. I can't see any
> value to anyone (end-users, web-developers, UA-developers) in
> evaluating the pages as SGML instead of as XML while still using the
> XHTML DTD. If you are testing XML well-formedness, you already have
> SGML well-formedness covered (right?). There is some value in testing
> Appendix C conformance, but that is a separate issue.
>
>>> If this *is* one of the XML limitations then I think it would be
>>> helpful to compile a short list of common limitations and list them
>>> on a w3c page in plain English. I have read the OpenSP page[4] a
>>> couple times and I am still not sure whether or not recognizing "<"
>>> and "&" as invalid is a limitation of the parser; the language on
>>> that page is fairly technical. The validator could link to this
>>> internal page directly and that page would then link to the OpenSP
>>> page as well.
>>
>> This could be a good idea. How about starting a scratchpad on the
>> wiki, e.g. somewhere like:
>> http://esw.w3.org/topic/MarkupValidator/XML_Limitations
>> and motivate people on the list to contribute?
>
> Done[2]. Everybody feel free to add, subtract, enhance.
>
> [2] http://esw.w3.org/topic/MarkupValidator/XML_Limitations
>
>>> 2) Why are you issuing a warning for the use of ampersands and left
>>> angle brackets in xhtml but not html? If the warning is in fact
>>> saying "this may be valid in some contexts, but it is recommended
>>> to use &amp; or &lt;" then this is an SGML warning and should be
>>> shown for both HTML and XHTML as text/html. Ideally with an
>>> example like "R & D valid, R&D invalid". Is there a bug open for
>>> issuing the warning for html doctypes as well?
>>
>> See above, my remark on the fussy mode. You could search this list
>> for "fussy" and get an idea of the discussions that happened a while
>> ago on this topic.
>
> Is it technically possible to get the validator to flag & and < as
> warnings in SGML mode? I couldn't find a bug for this one.
>
> If it is technically doable, I think there is less likelihood of
> backlash from the community (a la fussy mode) if
> - users still got the bright green "this page is valid" at the top of
>   the page
> - the color of the warnings was made a little more neutral (yellow
>   instead of pale red).
> - the warning text is clear and helpful.
>
> I am not sure how much utility this solution would really have, so I
> am not terribly gung ho about it, but I will file a bug if people
> think it is likely to help users avoid potential errors.
>
>>> Are there open bugs you can point me to? Are there bugs I should file?
>>
>> I think 798 and 1500 are the relevant ones. If you think they do not
>> cover the whole span of the issue, feel free to open others.
>
> Assuming that we are agreed about evaluating XHTML documents served as
> text/html in XML mode, is it technically possible to get the validator
> to flag & and < as errors? Unless I am mistaken, this seems to be the
> most obvious area where UAs choke on the well-formedness test (when
> parsing as XML), but the validator just lets you off with a warning.
> Of course it would be very important to make it clear to users why
> their document doesn't validate, using language they can understand,
> plenty of examples, and links to more detailed information.
>
> Is there a bug open for this? Is it likely to be fixed without major
> architectural changes? Bug 798 seems to be mislabeled. As far as the
> solution that was found is concerned, it should be titled "warnings
> are mistakenly suppressed on valid pages".
>
>> Hope this answered your questions.
>
> Sure did, which of course led to more questions. Thanks for taking
> the time to answer.
>
> Marc
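
As a small illustration of the XML-mode behaviour discussed in this
thread: a strict XML parser treats a bare "&" or "<" in character data
as a fatal well-formedness error rather than a warning. The sketch below
assumes Python's standard xml.etree.ElementTree (expat) as a stand-in
for such a parser. It is not OpenSP and not the validator's own code,
and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup.

    Hypothetical helper for illustration only; it checks XML
    well-formedness, not validity against an XHTML DTD.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Sample fragments: escaped forms versus bare "&" and "<" as data.
samples = {
    "escaped ampersand": "<p>R &amp; D</p>",
    "bare ampersand, spaced": "<p>R & D</p>",
    "bare ampersand, unspaced": "<p>R&D</p>",
    "escaped left angle bracket": "<p>1 &lt; 2</p>",
    "bare left angle bracket": "<p>1 < 2</p>",
}

for label, markup in samples.items():
    print(f"{label}: {is_well_formed(markup)}")
```

Under XML rules both bare-ampersand forms are rejected, including the
spaced "R & D" form that the thread describes as acceptable in SGML
mode. Only the escaped variants parse.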
Received on Monday, 18 July 2005 14:24:21 UTC