[Bug 6742] New: pre-encoded form values should be restorable as submitted

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6742

           Summary: pre-encoded form values should be restorable as
                    submitted
           Product: HTML WG
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P3
         Component: Spec bugs
        AssignedTo: dave.null@w3.org
        ReportedBy: Nick_Levinson@yahoo.com
         QAContact: public-html-bugzilla@w3.org
                CC: ian@hixie.ch, mike@w3.org, public-html@w3.org


Suppose a submitting user types into an HTML form field the following sentence:

I am puzzled by %26.

None of this is dangerous, so none of this needs encoding, so an HTML-compliant
user agent will submit the sentence unchanged to the next stage, which might
result in simple storage in a database.

Now suppose the sentence is retrieved and decoded. If the need for decoding is
inferred from the presence of a percent-encoding or numeric entity reference,
rather than from a separate flag or list, then the parser function will
produce the following:

I am puzzled by &.

This may not be what the submitting user intended. If the submitting user is
not very computer-savvy, they won't know about encoding at all and therefore
won't anticipate it.

The human recipient, if also not very computer-savvy, won't know what happened
and will think they are seeing literally what was entered. The resulting
human-to-human conversation via HTML would likely be confusing, if
communications don't break down entirely.
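
The scenario above can be reproduced with a short sketch. It assumes the
retrieving side decides to decode purely because a "%" is present; the
function name naive_restore is hypothetical, and Python's standard
urllib.parse.unquote stands in for whatever parser function a site uses.

```python
from urllib.parse import unquote


def naive_restore(stored: str) -> str:
    """Decode whenever the text merely looks percent-encoded.

    Hypothetical helper: it has no flag or list saying whether the
    value was actually encoded, so it guesses from the "%" character.
    """
    if "%" in stored:
        return unquote(stored)
    return stored


typed = "I am puzzled by %26."    # what the submitting user typed
restored = naive_restore(typed)   # what the human recipient sees
print(restored)                   # prints: I am puzzled by &.
```

The recipient sees an ampersand where the submitter typed a percent sign and
two digits, and nothing in the stored value records that the original text
was literal.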

I propose that an option exist to permit accurate restoration. The option
should be asserted in the HTML markup for the form or in the document head. The
user agent should then be responsible either for suspending or delaying all
encoding, or for delimiting or listing passages that the user agent did not
encode even though they already appear in encoded style.

This is in reference to section 4.10.15 of the Working Draft of February 12,
2009, including 4.10.15.3, step 6, substeps 1 and 2, or any successor
provisions.

Since safety requires encoding, it must be performed at some stage. Therefore,
if asserting the option suspends or delays all encoding, the purpose would be
to let the website author provide a method that, when executed, results in
safety, i.e., in no need for further encoding. The website author might, for
example, devise a method that lists passages not to be changed when restored
later. If the method is not executed, or if it does not produce the result
required for safety, the user agent would then encode as it must now.

A delimiter can be any string that does not appear elsewhere in the submission.
The number of possible delimiters is infinite; e.g., if j, jj, and jjj are in
the submission, then jjjj might qualify as the delimiter. The delimiter would
not have to be the same in all submissions using that form, that page, that
website, or that user agent, as long as a field or value is added to the form
to declare the delimiter for the submission. The existence or absence of a
delimiter declaration in a submission would signal the parser function as to
what to do.
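
The delimiter idea above might be sketched as follows. This is only an
illustration under stated assumptions: the names choose_delimiter and restore
are hypothetical, the delimiter is assumed to travel in its own declared
field, and the sketch assumes the marked passage does not begin or end with
the delimiter's own character (otherwise the boundary would be ambiguous).

```python
from typing import Optional
from urllib.parse import unquote


def choose_delimiter(submission: str, seed: str = "j") -> str:
    """Grow a candidate until it appears nowhere in the submission."""
    delimiter = seed
    while delimiter in submission:
        delimiter += seed
    return delimiter


def restore(marked: str, delimiter: Optional[str]) -> str:
    """Decode the text, leaving delimited passages exactly as typed.

    No declared delimiter means: decode everything, as happens now.
    """
    if not delimiter:
        return unquote(marked)
    parts = marked.split(delimiter)
    # Even-indexed parts lie outside the delimiters and are decoded;
    # odd-indexed parts lie inside them and are kept literally.
    return "".join(
        unquote(part) if index % 2 == 0 else part
        for index, part in enumerate(parts)
    )


# The example from the text: j, jj, and jjj are already in use.
print(choose_delimiter("j, jj, and jjj"))   # prints: jjjj

typed = "I am puzzled by %26."
delim = choose_delimiter(typed)
marked = delim + typed + delim              # the whole value is literal
print(restore(marked, delim))               # prints: I am puzzled by %26.
print(restore(typed, None))                 # prints: I am puzzled by &.
```

With a declared delimiter the sentence comes back as typed; without one, the
parser function falls back to decoding everything.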

This leaves open whether the website author or the user agent is responsible
for a method once the option for accurate restoration is asserted and a method
is needed. One of them must be responsible, and the fallback is the user
agent, as now. If there is room for creativity or variety in the method, then
the website author should be allowed to supply one, such as a
third-party-developed method.

Thank you.

-- 
Nick



Received on Friday, 27 March 2009 03:54:41 UTC