Substantive comments on Proposed XHTML Module: Web Forms 2.0

Ian,

As promised, here are my comments on the current (25th December) WD of 
Web Forms 2.0. I'm not an expert in all the technologies that this 
specification touches, by any means, so this should not be considered a 
comprehensive review. As a result, most of my comments are from the 
perspective of a page author, rather than a UA implementer.

I think most of it is pretty sensible: I'm mainly interested in 
clarifying some of the expectations and edge cases.

I've split these comments into two sections: substantive and editorial. 
Lines beginning with '#' refer to sections of the document, direct 
quotes, or (un-named) parts of a section. I hope it will be clear from 
context what the intended meaning is.

Regards,
Malcolm

# Abstract
# 1. Introduction
# 1.1. Relationship to HTML
After reading the complete document, it's fairly clear to me that you 
intend this document to apply equally to HTML and XHTML user agents. 
Unfortunately, I was initially unsure whether this was the case, even 
after reading the abstract, the introduction, and the 'Relationship to 
HTML' section.

The title of the document begins 'Proposed XHTML Module:' and section 
1.1 begins 'This specification is an extension to [XHTML1].', which 
clearly would not be expected to apply to an HTML UA. I think these two 
areas are the root of my confusion, although it does not help that the 
abstract does not mention HTML4 until the last sentence, and then only 
in a fairly complex fashion.

It might be a good idea to rephrase the abstract along the lines of:
"This specification defines Web Forms 2.0, an extension to the forms 
features found in HTML 4.01's forms chapter. Web Forms 2.0 applies to 
both HTML and XHTML user agents, and provides new strongly-typed input 
fields, new attributes for defining constraints, a repeating model for 
declarative repeating of form sections, new DOM interfaces, new DOM 
events for validation and dependency tracking, and XML submission and 
initialization of forms. This specification also standardises and 
codifies existing practice in areas that have not been previously 
documented.

HTML4, XHTML1.1 and the DOM are thus extended in a manner which has a 
clear migration path from existing HTML forms, leveraging the knowledge 
authors have built up with their experience with HTML so far."

I would then add, as the first 'requirements' bullet point in the 
Introduction, "Applicable to both HTML and XHTML User Agents", and 
change section 1.1 so that it more clearly states that the specification 
extends both HTML 4.01 and XHTML 1.1, for HTML and XHTML UAs respectively.

This may seem like a minor issue to belabour, but it took some time for 
me to fully understand the scope of the specification, and it is a 
fairly crucial point.

# 2. Extensions to form control elements
You mention that empty <form> elements can now be contained within the 
<head> element of XHTML (and presumably, HTML?) documents, though you do 
not later describe the modified content model for <head>.

Are you aware of any benefit in pre-declaring the <form> element in this 
fashion? It seems like it would add quite a bit of complexity (I'm 
thinking particularly of Mozilla's quirks-mode form handling, here).


You also mention nested forms a few times, but you don't describe what 
the expected behaviour (or indeed, the point!) of a nested form would 
be. Are you anticipating that nested forms would inherit behaviour or 
attributes from their parent forms, or that they have particular semantics?

# 2.1 Extensions to the input element

# time input type
Should the time provided by the user be converted into UTC for 
submission? If not, this has the odd side-effect that a single datetime 
input field will submit different values from a pair of controls of 
'date' and 'time' type that are given the same input.
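
As a rough illustration of the discrepancy (a Python sketch, not 
anything taken from the draft; the UTC-5 offset and the sample date are 
arbitrary assumptions): a user five hours behind UTC enters 22:30 on 
26 December. A datetime control that normalises to UTC submits the 27th, 
while separate date and time controls that submit local values report 
the 26th.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the user's local zone is UTC-5 and they enter 22:30 on 2003-12-26.
local_zone = timezone(timedelta(hours=-5))
entered = datetime(2003, 12, 26, 22, 30, tzinfo=local_zone)

# A 'datetime' control that converts to UTC before submission.
as_utc = entered.astimezone(timezone.utc)
datetime_value = as_utc.strftime("%Y-%m-%dT%H:%MZ")

# Separate 'date' and 'time' controls that submit the local values unchanged.
date_value = entered.strftime("%Y-%m-%d")
time_value = entered.strftime("%H:%M")

print(datetime_value)          # 2003-12-27T03:30Z
print(date_value, time_value)  # 2003-12-26 22:30
```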

Why does the 'time' type only contain hours and minutes, and not seconds?

# integer and numeric types
Although the suggestion that UAs not convert numbers from string form to 
binary form is a good one for the reasons described, I do not believe 
that it is reasonable to assume that UAs will be able to enforce 
minimum, maximum, or integer constraints without converting the number 
to binary.
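
To make the trade-off concrete, here is a small Python sketch (an 
illustration only; the function names and sample values are 
hypothetical). A maximum check done on binary doubles silently accepts a 
value just over the limit, whereas an exact decimal comparison rejects 
it, at the cost of the UA carrying a decimal arithmetic routine. Whether 
UAs can reasonably be expected to do the latter is the open question.

```python
from decimal import Decimal

def within_max_binary(value: str, maximum: str) -> bool:
    # Converts both strings to binary doubles before comparing.
    return float(value) <= float(maximum)

def within_max_decimal(value: str, maximum: str) -> bool:
    # Compares the decimal strings exactly, with no binary conversion.
    return Decimal(value) <= Decimal(maximum)

value, maximum = "0.1000000000000000000001", "0.1"
print(within_max_binary(value, maximum))   # True: both round to the same double
print(within_max_decimal(value, maximum))  # False: the value exceeds the maximum
```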

# email type
Strictly speaking, removing the FWS and CFWS tokens from the addr-spec 
grammar excludes odd, but valid, email addresses such as "foo 
bar"@example.org, from what I can see.

# tel type
The 'global-phone-number' type from RFC 2806 is intended to represent 
voice telephone numbers only; separate types are defined for fax 
machines and modems. I don't think that it matters in this case, 
particularly, though I thought it might be worth noting.

# format of min/max/value for date and time types
The format of the content of the min, max, and value attributes is not 
explicitly specified in the specification. I would assume that they 
should be the same format as the submission is expected to be made in, 
but I don't know what would be expected of UAs that are given valid ISO 
8601 values that are not in the submission format, for example <input 
type="date" value="2001">; or valid ISO 8601 values that are not the 
same 'type' as the input field, for example: <input type="week" 
value="2001-01-01"> or <input type="time" min="2001-01-01T08:30:00Z" 
max="2001-01-01T21:30:00Z">.

Should these be considered 'invalid' values per section 2.15, and 
ignored, or parsed per ISO 8601, and the relevant data extracted?
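
The strict reading (treat anything not in the control's own format as 
invalid) can be sketched as follows. This is Python and purely 
illustrative: the regular expressions are an assumption about what the 
submission formats look like, and the names are hypothetical.

```python
import re

# Assumed submission formats per input type; the draft may define them differently.
SUBMISSION_FORMAT = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "week": re.compile(r"\d{4}-W\d{2}"),
    "time": re.compile(r"\d{2}:\d{2}"),
}

def matches_submission_format(input_type: str, value: str) -> bool:
    # Strict reading: the whole value must be in the control's own format.
    return SUBMISSION_FORMAT[input_type].fullmatch(value) is not None

print(matches_submission_format("date", "2001"))                  # False
print(matches_submission_format("week", "2001-01-01"))            # False
print(matches_submission_format("time", "2001-01-01T08:30:00Z"))  # False
print(matches_submission_format("date", "2001-01-01"))            # True
```

Under the lenient reading, the UA would instead parse each value as 
general ISO 8601 and extract whatever fields the control needs.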

# "Note: Servers should still perform type checking on submitted data, 
# as malicious users or rogue user agents might submit data intended to 
# bypass this client-side type checking."
Or, in the example given, user agents with no, or disabled, scripting 
support, which would not be able to ensure that the 'time1' time was 
before the 'time2' time.

# 2.4 Extensions to file upload controls
What security considerations exist if a non-existent file is specified 
via the upload control? Presumably this is something that existing UAs 
do, rather than a new requirement?

# accept attribute
The paragraph describing that UAs may allow the user to override the 
MIME type of a file should be strengthened to clarify that UAs should 
not allow the user to override the MIME type merely to allow the upload 
to proceed, but only to correct the MIME type if it is incorrect.

I assume the accept attribute is not additive. For example, <form 
accept="image/png"><input type="file" accept="image/gif"> would result 
in a file upload control that can only accept GIF images?
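
A sketch of that reading, in Python (this is the assumed rule, not 
something the draft states; the function name is hypothetical):

```python
from typing import Optional

def effective_accept(form_accept: Optional[str],
                     control_accept: Optional[str]) -> Optional[str]:
    # Assumed rule: a control's own accept value replaces the form's rather
    # than being merged with it; the form's value applies only when the
    # control has none.
    return control_accept if control_accept is not None else form_accept

print(effective_accept("image/png", "image/gif"))  # image/gif
print(effective_accept("image/png", None))         # image/png
```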

# 2.5. Extensions to existing attributes
# maxlength attribute
I can see why the decision was made, but it does seem odd to prevent 
maxlength from applying to the integer input type.

Additionally, it is frequently the case now that email input fields (for 
example) have maximum lengths due to restrictions not under the control 
of the author (however much we would like this not to be the case). It 
would be preferable to be able to specify a maximum length declaratively 
rather than enforce it via script or at the server-side.

The same argument essentially holds for all of the email, tel, and uri 
types. Perhaps maxlength should be allowed for these types, but its use 
not recommended?

# readonly attribute
*Why* is it not possible to create a readonly checkbox (or select, or 
radio button)? Any arguments against that situation must surely apply to 
text entry fields as well, so why are they exempt?

# 2.6. The pattern attribute
Does the pattern attribute apply if the field is empty? I assume not, 
but this is not described.
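
The behaviour assumed here would look something like this (a Python 
sketch; Python's regular expressions stand in for whatever syntax the 
attribute uses, whole-value matching is also an assumption, and the 
function name is hypothetical):

```python
import re

def pattern_mismatch(value: str, pattern: str) -> bool:
    # Assumed reading: an empty value is never a pattern mismatch;
    # rejecting empty values is left to the 'required' attribute.
    if value == "":
        return False
    # Assumed: the pattern must match the whole value, not just part of it.
    return re.fullmatch(pattern, value) is None

print(pattern_mismatch("", r"[0-9]{5}"))       # False
print(pattern_mismatch("1234", r"[0-9]{5}"))   # True
print(pattern_mismatch("12345", r"[0-9]{5}"))  # False
```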

# 2.7. The required attribute
Is whitespace considered significant when determining if a control 'has 
a value'?
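
The two possible readings differ only for whitespace-only input, as this 
Python sketch shows (both function names are hypothetical):

```python
def has_value_strict(value: str) -> bool:
    # Reading 1: any non-empty string counts, including whitespace only.
    return value != ""

def has_value_trimmed(value: str) -> bool:
    # Reading 2: whitespace is not significant, so "   " counts as empty.
    return value.strip() != ""

print(has_value_strict("   "))   # True
print(has_value_trimmed("   "))  # False
```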

# 2.12. The output element
Why does the output element have a name attribute? Since it cannot be 
successful, I assume this is only so it can be easily reflected into the 
global script context?

# 2.13. The implied form for form controls with no form element
I think it would be very confusing to have an anonymous form node that 
appears in the 'forms' collection and has the document as a parent, but 
does not 'appear in the document'. If you do decide to allow the 
anonymous form to be present in the DOM (and I don't see why not), the 
position in the DOM should be fully specified, as some applications will 
depend upon the exact position in the DOM.

It is also not clear when the anonymous form is created ("when 
required", but even that isn't thoroughly explained).

I do not entirely follow the comments about the anonymous form and evil 
QA engineers, so I'll leave that to someone else.

# 3.1. Introduction for authors
Great example, by the way!

# 3.5. The repetition model
Should mutation events fire in addition to the repetition events 
described in this section? If so, do they fire before or after the 
repetition events?

# 4.1. Bubbling semantics
How could a bubbling form event get to the #document node without 
passing through a form node? Either the form control has a form 
attribute, or if not, it either has a form as a parent, or it will use 
the anonymous form. Or are you thinking of form events targeted manually 
at non-form controls?

Should redirection via the form attribute only work for form controls? 
In the example below, to which form does the button belong? Will the 
events generated by the button bubble into the <form> element after the 
<p> element, or into the anonymous form?

<html>
...
<body>
<form id="example"/>

<p form="example">
  <button />
</p>
</body></html>

# 4.5. Form validation
When focussing the invalid control during form validation failure, 
should the UA fire focus events?

# 5.3. application/x-www-form+xml: XML submission
# <file> element, type attribute
The type attribute should not be mandatory. What is the correct 
behaviour for a client that does not know the correct MIME type of a 
file: application/octet-stream? This is worse than nothing - if the 
client does not know, it should not provide.

The example shown later indicates that the charset parameter on the MIME 
type is allowed. Are any other parameters allowed? Again, what if the 
charset of the document is unknown - can the parameter be omitted?

# 5.4. text/plain
# values of file upload controls.
The pathname of files should not be sent, for security reasons.

# 5.5.7. For javascript: actions
# For the POST action
Are there security considerations in allowing the form data to add 
variables to global scope? For example, what happens when a form field 
name clashes with a non-replaceable property?

Is the scope that the form data will be added to the same scope as used 
by other scripts on the same page, or a temporary scope that is cloned 
from the default global scope for the duration of the form submission? 
(In other words: a. Can a javascript: action affect the global scope 
after submission, and b. If so, is the form data set removed from the 
global scope after submission?)

# 6.1. Seeding a form with initial values
Accessing a file on the local filesystem would have security 
considerations, even given the constraints on the content of the file 
specified in the second paragraph. Please include a comment to the 
effect that access to file:// content is generally not permitted for 
untrusted content.

Is 'an XML MIME type' a well-defined term? What is the recommended MIME 
type for a 'form data' XML file - application/xml?

Is it allowable to specify that we can process only certain 'subsets' of 
XML? See TAG issue xmlProfiles-29:
http://www.w3.org/2001/tag/issues#xmlProfiles-29

# "If the element cannot be given the value specified, ..."
Can a typed input field be given an invalid value? Should UAs ignore, 
for example, an impossible value, say the value 'today' for a datetime 
input field? Or is this dependent on how the UA presents the control, 
and so implementation-dependent?

# 6.2. Filling select elements
My comments on the opening paragraphs of 6.1 apply equally to the 
opening paragraphs of this section.
Should the UA also ignore PIs and so on, as per 6.1, when reading XML 
for the contents of select elements?

A fairly common situation for web authors is to have two lists, where 
the contents of the second depends on the value of the first. Is there 
any way we could extend the current model to include this functionality?

# 7.1. Additions specific to the HTMLFormElement interface
Is ERROR_USER_DEFINED persistent? That is, once set with setValidity(), 
is there anything that can unset it (resetting the form, for example)?

# willConsiderForSubmission()
What is this expected to be used for? Why is the example function called 
'focussed'?

# 7.7. Resetting form controls
Does resetting form controls also fire formchange events?

# 7.9. Loading remote documents
# "These method is asynchronous, and are guarenteed to not finish 
# loading the document or signal an error before the running script 
# either completes or yields to the user..."
Isn't it more accurate to say that they are not guaranteed to *start* 
loading the document until the script completes or yields?

# References
The reference you quote for ISO8601 is, of course, the normative one, 
although it would be very useful to include the following as an 
informative reference:
http://www.cl.cam.ac.uk/~mgk25/iso-time.html

## general
How should an author determine whether the client UA supports this 
specification? The natural way would be to provide a feature to be 
tested via the hasFeature method of the DOMImplementation interface, 
though that assumes that the client has scripting capability.

Received on Saturday, 27 December 2003 05:22:33 UTC