- From: Michael[tm] Smith <mike@w3.org>
- Date: Mon, 27 Jun 2011 18:19:52 +0900
- To: Etan Wexler <ewexler@stickdog.com>
- Cc: pundits on and partisans of the W3C Markup Validation Service <www-validator@w3.org>
Etan Wexler <ewexler@stickdog.com>, 2011-06-26 22:38 -0400:

> The W3C Markup Validation Service issues poor reports on such violations.

Agreed. Specifically, the HTML5 facet of the validation service. (I don't think the non-HTML5 facets of the service do any of this kind of checking at all. At least, I don't think the core DTD-based backend of the service is even capable of performing this kind of check.)

Anyway, that said, this is a known issue for which we have had an open bug for some time now:

  http://bugzilla.validator.nu/show_bug.cgi?id=339

But improving those particular error messages is a challenge. I do plan to do it (if Henri Sivonen doesn't get around to it first), but it's a significant amount of work to code it up, so it's going to be a while yet before it gets done.

In the meantime, if you care to know the details about why it's difficult, here they are:

The HTML5 backend uses Jing as its core component, and relies on a Relax NG schema that can be found here:

  https://bitbucket.org/validator/syntax/src/tip/relaxng/

Jing is a Relax NG-based validation tool, and practically speaking, Jing on its own is not capable of emitting a useful error message for this case. At least not with the current schema. It's imaginable that the schema could be (re)constructed in a way that enabled Jing to emit a useful error message for this, but suffice it to say that trying to do that would not be the right solution for this problem -- because this is one of many cases of error reporting for which a grammar/schema-based general-purpose validator is, on its own, not a good fit.

That said, it is something for which an assertions-based validator like Schematron is a slightly better fit. And we have such an assertions-based validator[1] built into the HTML5 facet already, and it's responsible for emitting error messages for quite a lot of other cases. So we could add it there, but doing that would also complicate that code quite a lot.
And the result would be that we'd then have *two* error messages for each instance of this error case: the first generated by Jing and the second generated by the assertions validator. We could "fix" that by changing the Relax NG schema to allow the attributes even in places where the spec says they are invalid. But one obvious disadvantage of that is, if somebody uses the schema on its own, without the additional assertions validation, then they would not get any error message about this at all.

But there is one other place in the HTML5 backend where we can deal with the content of the error message that gets emitted for this. That's in the Java code that actually emits the message. Among the things that code currently does is, it takes fragments from the HTML5 spec and includes them in error messages where they are useful. In this case -- where the error is for an invalid attribute -- it takes the fragment from the spec that (normally) is a list of all the attributes that are allowed on a particular element, and appends that to the error message from Jing.

For pretty much every other element, that works great. But the input element is a special case due to it having certain attributes that are allowed only for certain (sub)types. The complexity of describing which attributes on input are allowed for which subtypes is way more than can just be included in the simple format normally used in the spec to list out the attributes. And that part of the current message-emitter code is generic in the sense that it behaves the same way regardless of what element it's emitting a message for.

But after talking with Henri about this, it seems that, under the circumstances, the best way to address this is to add some special-casing for the input element in that message-emitter code. So that's what I plan to do when I can make the time.

> The following can serve as the basis for an improvement to the reports of
> the W3C Markup Validation Service.
> “The ‘size’ attribute that appears on this ‘input’ element is invalid
> because the element is in the Number state.

That would be the ideal error message for this case, yeah. But that's not likely to be the error we end up emitting. What we'll likely end up with instead is this:

  "Error: Attribute placeholder not allowed on element input at this point. Attributes that are allowed for type=number input elements: [list of only the attributes that are valid for the type=number case]"

That is, the error message itself would remain exactly the same (because that's coming from Jing and is a general error for this case). But the part that follows the error message (which in the code is called "elaboration" and "spec advice") would be changed so that instead of being the long list of all attributes that can be valid somewhere for the input element, it would be a shorter list of just those attributes that are valid for the particular subtype we're checking.

> The element is in the Number state because the element has a ‘type’
> attribute that has the value ‘number’.”

While that's more accurate in terms of using the actual language of the spec, I'm inclined to just have it say "type=number input elements" instead.

  --Mike

[1] Note: The assertions-based validator in the HTML5 facet is not actually Schematron-based or even XPath-based; it's all custom Java code. But in practice it's a Schematron workalike, and we do also maintain a set of Schematron assertions that provide the same checks:

  https://bitbucket.org/validator/syntax/src/tip/relaxng/assertions.sch

-- 
Michael[tm] Smith http://people.w3.org/mike
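
To make the special-casing idea above a bit more concrete, here is a minimal Java sketch of what type-aware "elaboration" in a message emitter might look like. It is not the validator's actual code: the class name InputElaborationSketch, the methods elaborate and report, and the ALLOWED_BY_TYPE table are all hypothetical, and the attribute list given for type=number is an abbreviated illustration, not the spec's full list.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch only; none of these names come from the validator's code.
final class InputElaborationSketch {

    // Illustrative, deliberately incomplete table of per-type attributes.
    // A real implementation would derive this from the spec.
    private static final Map<String, List<String>> ALLOWED_BY_TYPE = Map.of(
            "number", List.of("max", "min", "readonly", "required", "step"));

    // Returns the text appended after the generic schema error.
    // For input elements with a known type, use the shorter per-type list;
    // for everything else, fall back to the generic per-element advice.
    static String elaborate(String element, String inputType, String genericAdvice) {
        if ("input".equals(element) && inputType != null) {
            String type = inputType.toLowerCase(Locale.ROOT);
            List<String> allowed = ALLOWED_BY_TYPE.get(type);
            if (allowed != null) {
                return "Attributes that are allowed for type=" + type
                        + " input elements: " + String.join(", ", allowed);
            }
        }
        return genericAdvice;
    }

    // The leading error text stays the same as the general schema error;
    // only the elaboration that follows it changes.
    static String report(String attribute, String element, String inputType,
                         String genericAdvice) {
        return "Error: Attribute " + attribute + " not allowed on element "
                + element + " at this point. "
                + elaborate(element, inputType, genericAdvice);
    }
}
```

With this sketch, report("placeholder", "input", "number", ...) keeps the general error sentence unchanged and swaps only the trailing advice, while any other element (or an input type missing from the table) still gets whatever generic advice is passed in.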
Received on Monday, 27 June 2011 09:20:03 UTC