Re: Substantive comments on Proposed XHTML Module: Web Forms 2.0

On Fri, 26 Dec 2003, Malcolm Rowe wrote:
>
> # Abstract
> # 1. Introduction
> # 1.1. Relationship to HTML
> After reading the complete document, it's fairly clear to me that you
> intend this document to apply equally to HTML and XHTML user agents.
> Unfortunately, I was initially unsure whether this was the case, even
> after reading the abstract, the introduction, and the 'Relationship to
> HTML' section.

Agreed. I have added new sections and expanded the existing ones.  Please
let me know if the new text is clearer or still needs work, in your
opinion.


> It might be a good idea to rephrase the abstract along the lines of
> [...]

I have used your text verbatim! Thanks!


> I would then add, as the first 'requirements' bullet point in the
> Introduction, "Applicable to both HTML and XHTML User Agents",

I disagree that this should be a design requirement. There are some
features in particular which only apply to XHTML (and can't apply to HTML
since the parsing rules there are much less tractable).


> change section 1.1 so that it more clearly states that the specification
> extends both HTML 4.01 and XHTML 1.1, for HTML and XHTML UAs
> respectively.

Hopefully the introduction of a new section there should help with this.


> # 2. Extensions to form control elements
> You mention that empty <form> elements can now be contained within the
> <head> element of XHTML (and presumably, HTML?) documents, though you do
> not later describe the modified content model for <head>.

Fixed.


> Are you aware of any benefit in pre-declaring the <form> element in this
> fashion? It seems like it would add quite a bit of complexity (I'm
> thinking particularly of Mozilla's quirks-mode form handling, here).

Indeed. In HTML this is now disallowed. In XHTML it's a non-issue really,
might as well allow it since people have to code for it anyway.


> You also mention nested forms a few times, but you don't describe what
> the expected behaviour (or indeed, the point!) of a nested form would
> be. Are you anticipating that nested forms would inherit behaviour or
> attributes from their parent forms, or that they have particular
> semantics?

This should be clearer now.


> # 2.1 Extensions to the input element
>
> # time input type
> Should the time provided by the user be converted into UTC for
> submission? If not, this has the odd side-effect that a datetime input
> field will submit different values to two controls of 'date' and 'time'
> type.
>
> Why does the 'time' type only contain hours and minutes, and not seconds?

Because in my experience, almost all sites on the Web that have time
fields only ask for hours and minutes. Indeed the only site I can recall
seeing with a seconds field basically forced it to 00!


> # integer and numeric types
> Although the suggestion that UAs not convert numbers from string form to
> binary form is a good one for the reasons described, I do not believe
> that it is reasonable to assume that UAs will be able to enforce
> minimum, maximum, or integer constraints without converting the number
> to binary.

Yeah, unfortunately that is true, hence the "if you can't do this" part
which defines how to handle that case.


> # email type
> Strictly speaking, removing the FWS and CFWS tokens from the addr-spec
> grammar prevents odd, but valid, email addresses such as "foo
> bar"@example.org from being valid, from what I can see.

Good point. I've made the exclusion slightly more surgical so as to not
exclude any valid addresses but still exclude comments.


> # tel type
> The 'global-phone-number' type from RFC 2806 is intended to represent
> voice telephone numbers only; separate types are defined for fax
> machines and modems. I don't think that it matters in this case,
> particularly, though I thought it might be worth noting.

Yeah. That was my feeling too.


> # format of min/max/value for date and time types
> The format of the content of the min, max, and value attributes is not
> explicitly specified in the specification. I would assume that they
> should be the same format as the submission is expected to be made in,
> but I don't know what would be expected of UAs that are given valid ISO
> 8601 values that are not in the submission format, for example <input
> type="date" value="2001">; or valid ISO 8601 values that are not the
> same 'type' as the input field, for example: <input type="week"
> value="2001-01-01"> or <input type="time" min="2001-01-01T08:30:00Z"
> max="2001-01-01T21:30:00Z">.
>
> Should these be considered 'invalid' values per section 2.15, and
> ignored, or parsed per ISO 8601, and the relevant data extracted?

Ignored, due to the text "if the minimum value is not in exactly the
expected format, there is no minimum restriction". (The format is "the
format specified for the relevant type" which is listed in the section
above this one.)


> # "Note: Servers should still perform type checking on submitted data,
> # as malicious users
> # or rogue user agents might submit data intended to bypass this
> client-side type checking."
> Or, in the example given, user agents with no, or disabled, scripting
> support, which would not be able to ensure that the 'time1' time was
> before the 'time2' time.

Good point.


> # 2.4 Extensions to file upload controls
> What security considerations exist if a non-existent file is specified
> via the upload control? Presumably this is something that existing UAs
> do, rather than a new requirement?

The whole validation thing is a new requirement. I'm just ensuring that
the new validation thing doesn't leak information about what files exist
and what files don't.


> # accept attribute
> The paragraph describing that UAs may allow the user to override the
> MIME type of a file should be strengthened to clarify that UAs should
> not allow the user to override the MIME type merely to allow the upload
> to proceed, but only to correct the MIME type if it is incorrect.

Yeah. Not that we'll ever be able to really enforce this.


> I assume the accept attribute is not additive. For example, <form
> accept="image/png"><input type="file" accept="image/gif"> would result
> in a file upload control that can only accept GIF images?

Yes. Clarified (I hope...).


> # 2.5. Extensions to existing attributes
> # maxlength attribute
> I can see why the decision was made, but it does seem odd to prevent
> maxlength from applying to the integer input type.

Allowing it wouldn't allow anything that "max" and "precision" don't
control, and would be somewhat strange.


> Additionally, it is frequently the case now that email input fields (for
> example) have maximum lengths due to restrictions not under the control
> of the author (however much we would like this not to be the case). It
> would be preferable to be able to specify a maximum length declaratively
> rather than enforce it via script or at the server-side.

Ugh. I guess. Do I have to support that? I don't want people being
"helpful" and adding limits for no reason.


> # readonly attribute
> *Why* is it not possible to create a readonly checkbox? (or select, or
> radio button). Any arguments against that situation must surely apply to
> text entry fields as well, so why are they exempt?

There is no such thing as a readonly check box or radio button. Look in
any GUI manual.

A readonly input text field is pretty dubious on the Web too, in fact.
(It's better to just use a <span>.)


> # 2.6. The pattern attribute
> Does the pattern attribute apply if the field is empty? I assume not,
> but this is not described.

This is now described.


> # 2.7. The required attribute
> Is whitespace considered significant when determining if a control 'has
> a value'?

This is now described.


> # 2.12. The output element
> Why does the output element have a name attribute? Since it cannot be
> successful, I assume this is only so it can be easily reflected into the
> global script context?

Correct.


> # 2.13. The implied form for form controls with no form element
> I think it would be very confusing to have an anonymous form node that
> appears in the 'forms' collection and has the document as a parent, but
> does not 'appear in the document'. If you do decide to allow the
> anonymous form to be present in the DOM (and I don't see why not), the
> position in the DOM should be fully specified, as some applications will
> depend upon the exact position in the DOM.

This entire section has been removed.


> # 3.1. Introduction for authors
> Great example, by the way!

Thanks!


> # 3.5. The repetition model
> Should mutation events fire in addition to the repetition events
> described in this section? If so, do they fire before or after the
> repetition events?

This should now be specified.


> # 4.1. Bubbling semantics
> How could a bubbling form event get to the #document node without
> passing through a form node? Either the form control has a form
> attribute, or if not, it either has a form as a parent, or it will use
> the anonymous form. Or are you thinking of form events targeted manually
> at non-form controls?

I was covering all bases, just in case (your last example is one
possibility). Anyway, having removed the implied form, this is now much
simpler.


> Should redirection via the form attribute only work for form controls?

Yes.


> In the example below, to which form does the button belong?

None (or the implied form, when that was still part of the spec).


> Will the
> events generated by the button bubble into the <form> element after the
> <p> element, or into the anonymous form?

The form attribute on the <p> element has no effect.

> <html>
> ...
> <body>
> <form id="example"/>
>
> <p form="example">
>   <button />
> </p>
> </body></html>


> # 4.5. Form validation
> When focussing the event during form validation failure, should the UA
> fire focus events?

That has been clarified.


> # 5.3. application/x-www-form+xml: XML submission
> # <file> element, type attribute
> The type attribute should not be mandatory. What is the correct
> behaviour for a client that does not know the correct MIME type of a
> file: application/octet-stream? This is worse than nothing - if the
> client does not know, it should not provide.

Fair point. Fixed.


> The example shown later indicates that the charset parameter on the MIME
> type is allowed. Are any other parameters allowed? Again, what if the
> charset of the document is unknown - can the parameter be omitted?

Hopefully clarified.


> # 5.4. text/plain
> # values of file upload controls.
> The pathname of files should not be sent, for security reasons.

Fair enough. (Although I do believe UAs already do this, and have not
heard of any security complaints about it.)


> # 5.5.7. For javascript: actions
> # For the POST action
> Are there security considerations in allowing the form data to add
> variables to global scope? For example, non-replaceable properties.

I am not aware of any.


> Is the scope that the form data will be added to the same scope as used
> by other scripts on the same page, or a temporary scope that is cloned
> from the default global scope for the duration of the form submission?
> (In other words: a. Can a javascript: action affect the global scope
> after submission, and b. If so, is the form data set removed from the
> global scope after submission?)

That should be clarified now. Let me know if you can think of a better
definition. This section is just for completeness really, anyway. I don't
really expect it to ever be used.


> # 6.1. Seeding a form with initial values
> Accessing a file on the local filesystem would have security
> considerations, even given the constraints on the content of the file
> specified in the second paragraph. Include a comment to the effect that
> access to file:// content is generally not permitted for untrusted content.

Agreed.


> Is 'an XML MIME type' a well-defined term? What is the recommended MIME
> type for a 'form data' XML file - application/xml?

Added references to RFC3023.


> Is it allowable to specify that we can process only certain 'subsets' of
> XML? See TAG issue xmlProfiles-29:
> http://www.w3.org/2001/tag/issues#xmlProfiles-29

That isn't specified, is it? The UA is expected to be a fully conformant
XML UA. Or did I accidentally profile XML...?


> # "If the element cannot be given the value specified, ..."
> Can a typed input field be given an invalid value? Should UAs ignore,
> for example, an impossible value, say the value 'today' for a datetime
> input field? Or is this dependent on how the UA presents the control,
> and so implementation-dependent?

I've added some text to make this clearer.


> # 6.2. Filling select elements
> Similar comments on the opening paragraphs as were already mentioned for
> 6.1.

Fixed at the same time.


> Should the UA also ignore PIs and so on, as per 6.1, when reading XML
> for the contents of select elements?

The current algorithm does actually say this. :-)


> A fairly common situation for web authors is to have two lists, where
> the contents of the second depends on the value of the first. Is there
> any way we could extend the current model to include this functionality?

Hmm. I'll think about it, but nothing comes up to mind right away.


> # 7.1. Additions specific to the HTMLFormElement interface
> Is ERROR_USER_DEFINED persistent? That is, once set with setValidity(),
> is there anything that can unset it (resetting the form, for example?).

Clarified.


> # willConsiderForSubmission()
> What is this expected to be used for? Why is the example function called
> 'focussed'?

I had a use case! I cannot remember what it is though. I may remove this
unless I can remember it again.


> # 7.7. Resetting form controls
> Does resetting form controls also fire formchange events?

Good point. Added notes on this in some places (although maybe I need to
refactor all the resetting stuff into one place at some point).


> # 7.9. Loading remote documents
> # "These method is asynchronous, and are guarenteed to not finish
> loading the document
> # or signal an error before the running script either completes or
> yields to the user..."
> Isn't it more accurate to say that they are not guaranteed to *start*
> loading the document until the script completes or yields?

They can start loadnig the documents whenever, just so long as nothing
detectable by script happens until the script is done!


> # References
> The reference you quote for ISO8601 is, of course, the normative one,
> although it would be very useful to include the following as an
> informative reference:
> http://www.cl.cam.ac.uk/~mgk25/iso-time.html

It's the first Google hit for "ISO 8601", I don't think I need mention it
in the spec.


> How should an author determine whether the client UA supports this
> specification? The natural way would be to provide a feature to be
> tested via the hasFeature method of the DOMImplementation interface,
> though that assumes that the client has scripting capability.

This is a good question. I don't know. Ideally a UA needs to be detected
on the server side ("does it support data= or not?"). I have no answer for
this at the moment. Suggestions welcome.

Thanks for your input, it has been very, very useful. I would love to hear
your comments if you have more. In particular, I have made several major
changes and many minor changes over the last few weeks, and would love to
hear your input. When the draft is next stable, I'll mention it on my blog
(and hopefully other, more public places) again.

Cheers,
-- 
Ian Hickson                                      )\._.,--....,'``.    fL
U+1047E                                         /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 7 January 2004 15:08:45 UTC