# [whatwg] element "img" with HTTP POST method

From: Martin Janecke <whatwg.org@kaor.in>
Date: Fri, 10 Dec 2010 15:52:47 +0100
Message-ID: <32BE9988-CCF7-41F5-AF6D-DF760DA43D0C@kaor.in>

On 09.12.2010 at 20:41, Philipp Serafin wrote:

> [...] though this would
> also present serious security vulnerabilities, especially in forum pages.
>
> There are quite a number of older web forums that sanitize their HTML
> using black lists and would not strip new attributes like "post-data".
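> For malicious users, it would be very easy to include e.g. <img
> src="./do_post.php" post-data="thread_id=42&post_content=Go visit (some
> spam URL)"> in their signature and have users doing involuntary posts by
> simply viewing a thread.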

I actually considered the question of misuse in forums, but only thought of those that use BBCode and therefore shouldn't be vulnerable to this. But you're right, there are others that allow HTML elements.
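To illustrate the blacklist problem described above, here is a minimal sketch in Python. The rule sets and the function name `surviving_attrs` are hypothetical and not taken from any real forum software; the point is only that a blacklist written before "post-data" existed lets the attribute through, while a whitelist does not.

```python
from html.parser import HTMLParser

# Hypothetical rule sets, for illustration only.
BLACKLISTED_ATTRS = {"onclick", "onload", "onerror", "style"}
WHITELISTED_ATTRS = {"src", "alt", "width", "height"}


class AttrCollector(HTMLParser):
    """Collects (tag, attribute name) pairs from every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        for name, _value in attrs:
            self.seen.append((tag, name))


def surviving_attrs(html, mode):
    """Return the attribute names a sanitizer in the given mode would keep."""
    parser = AttrCollector()
    parser.feed(html)
    parser.close()
    if mode == "blacklist":
        # Keep everything that is not explicitly forbidden.
        return [name for _tag, name in parser.seen if name not in BLACKLISTED_ATTRS]
    # Whitelist: keep only what is explicitly allowed.
    return [name for _tag, name in parser.seen if name in WHITELISTED_ATTRS]


signature = (
    '<img src="./do_post.php" '
    'post-data="thread_id=42&post_content=spam" onload="x()">'
)
print(surviving_attrs(signature, "blacklist"))
print(surviving_attrs(signature, "whitelist"))
```

The blacklist run keeps "post-data" because nobody knew to forbid it; the whitelist run drops it along with every other unknown attribute.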

On 09.12.2010 at 21:09, Philipp Serafin wrote:

> ... on second thought, maybe it would be an even better idea to just
> define a new "submit" like input type that would submit the form as soon
> as it's fully loaded and display the POST result as an image. This would
> work better with the form metaphor and would present less security
> risks, since only very few sites allow <form> or <input> elements in
> user content.
>
> Martin Janecke's example would then look like this:
>
> <form method="post" action="http://www.forkosh.dreamhost.com/mathtex.cgi">
> <input type="hidden" name="latexdata" value="\begin{align} (... latex
> ...) \end{align}">
> <input type="post-image">
> </form>

Yes, this would be better indeed to prevent abuse by forum/social network/etc. users.

On 10.12.2010 at 05:35, Tab Atkins Jr. wrote:

> On Thu, Dec 9, 2010 at 7:15 PM, Adam Barth <w3c at adambarth.com> wrote:
>>>>> On Thu, Dec 9, 2010 at 11:41 AM, Philipp Serafin <phil127 at gmail.com> wrote:
>>>>>> There are quite a number of older web forums that sanitize their HTML using black lists and would not strip new attributes like "post-data". For malicious users, it would be very easy to include e.g. <img src="./do_post.php" post-data="thread_id=42&post_content=Go visit (some spam URL)"> in their signature and have users doing involuntary posts by simply viewing a thread.
>>>>>
>>>>> Indeed.  You shouldn't be able to trigger POSTs from involuntary
>>>>> actions.  They should always require some sort of user input, because
>>>>> there is simply *far* too much naive code out there that is vulnerable
>>>>> to CSRF.
>>>>
>>>> Unfortunately, the attacker can already trigger POSTs with involuntary
>>>
>>> Via scripting, yes, which is usually stripped out by sanitizers (or
>>> just plain doesn't work, like javascript urls in images).  I don't
>>> believe there are any declarative ways to trigger involuntary POSTs,
>>> are there?
>>
>> The attacker can always make a giant invisible button that covers the
>> whole page that submits a form.  Web sites can generate POST requests
>> without user intervention.  Anyone who's using POST as a security
>> feature has far bigger troubles than this attribute.
>
>
> But still, none of those are new POST-ing abilities that can be
> utilized by J. Random User on a message board with half-decent
> security.

I agree with Adam that POST is not sufficient as a security measure anyway. There are already enough other CSRF possibilities for malicious web page authors. And Philipp Serafin's idea seems to prevent new POST-ing abilities that can be utilized by J. Random User on a message board with half-decent security.

Nevertheless, I doubt my use cases justify opening a CSRF hole that can be used by malicious web page authors, even if it is just another hole among many. More security holes are worse than fewer, I guess.
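As a rough illustration of why the request method alone is not a security measure, here is a sketch of the common per-session token check. The function names are hypothetical and no particular framework is assumed; it only shows that the server accepts a POST when the request also carries a secret that a third-party page cannot know.

```python
import hmac
import secrets


def new_csrf_token():
    """Create an unguessable token to store in the user's session."""
    return secrets.token_urlsafe(32)


def is_request_allowed(method, session_token, submitted_token):
    """Accept a state-changing request only if it carries the session's token."""
    if method.upper() != "POST":
        return False
    if not session_token or not submitted_token:
        # A forged request from another page cannot supply the token.
        return False
    # Constant-time comparison on bytes to avoid leaking token contents.
    return hmac.compare_digest(
        session_token.encode("utf-8"), submitted_token.encode("utf-8")
    )


token = new_csrf_token()
print(is_request_allowed("POST", token, token))    # form rendered by the site itself
print(is_request_allowed("POST", token, None))     # involuntary POST without the token
```

With such a check in place, an involuntary POST triggered from foreign markup is rejected regardless of how it was generated.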

On 10.12.2010 at 09:23, Julian Reschke wrote:

> It's sad that the discussion even got that far.

I'm very happy about the broad feedback highlighting different aspects of the topic.

> If the URI length is a problem because of browsers, fix the browsers to extend the limits, instead of adding a completely new feature.

That's a good idea. Can we define a minimum length in the spec that should/must be supported? As a point of reference for browser vendors and web page authors? I didn't find a reliable point of reference other than http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2.1 ...

Note: Servers ought to be cautious about depending on URI lengths
above 255 bytes, because some older client or proxy
implementations might not properly support these lengths.

... which isn't sufficient for the use cases.

On 10.12.2010 at 10:03, Benjamin Hawkes-Lewis wrote:

> POST is a sort of catch-all method, but what here do you think makes
> POST a better match than GET?

I must admit that I apparently misunderstood "In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval" (http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.1). In the LaTeX example, the request provides source code which is compiled by the remote service. The requested content isn't something that the service provider prepared beforehand. The provider just offers an environment, and the user defines what happens within the boundaries of this environment. I considered this to be "an action other than retrieval". But now I understand that, for the user, it has the *significance* of retrieval only.

However, I still have the impression that my use cases are described more precisely by the description "Providing a block of data [...] to a data-handling process. [...] The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI. The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database." (POST) than by what is described in the section about GET. But I might be wrong here (and also impaired by a lack of better understanding of the English language).

>
> http://www.w3.org/2001/tag/doc/whenToUseGet.html

Thank you for the link. In particular, I found the following passage in http://www.w3.org/2001/tag/doc/whenToUseGet.html#uris interesting:

"By convention, when GET method is used, all information required to
identify the resource is encoded in the URI. There is no convention
in HTTP/1.1 for a safe interaction (e.g., retrieval) where the client
supplies data to the server in an HTTP entity body rather than in the
query part of a URI. This means that for safe operations, URIs may be
long. The case of large parameters to a safe operation is not
directly addressed by HTTP as it is presently deployed. A QUERY or
"safe POST" or "GET with BODY" method has been discussed (e.g., at
the December 1996 IETF meeting) but no consensus has emerged."

A "safe POST" or "GET with BODY" would meet what I'm looking for, it seems. But clearly this is out of scope of this mailing list.

>> but also
>> for practical reasons: URLs in GET requests are (inconsistently)
>> limited in length by browsers and server software, which limits these
>> services (inconsistently).
>
> The most popular browser currently limits URLs to 2,083 characters.
>
> http://support.microsoft.com/kb/208427
>
> [...] Apparently, Firefox/Safari/Opera already support much longer URLs:
>
> http://blogs.nitobi.com/alexei/?p=163
>
> I think at some point repeatedly GETing or POSTing large amounts of data
> to display an image becomes unrealistically wasteful of bandwidth,
> compared to generating the image once and reusing the created image URL.
>
>> This could be implemented with an optional attribute, e.g.
>> "post-data". The client would request the source using the POST method
>> if the attribute is present and send the attribute's value as the
>> request's message body.
>
> Increasing URL limits to (say) 10,000 characters by spec would have a
> better backwards compatibility story than a "post-data" attribute, since
> it would work in many of today's browsers.
>
> [...] are many formulas more than 10,000 characters?
>
>> (2) QR-Code generators encode texts or URLs in a way that can be
>> easily read by devices such as cell phones with built-in cameras. Info
>> and generator:
>
> QR codes have a maximum capacity of 7,089 characters, well within the
> 10,000 characters requirement I suggest above.
>
> http://en.wikipedia.org/wiki/QR_Code

Thanks for the research. I think it is a good idea to spec a minimum URL length, and I agree that, at this magnitude, it would help with realistic use cases. Should I start a new thread for this proposal?
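To make the numbers in this part of the thread concrete, here is a small sketch that percent-encodes a formula into a GET URL and reports which of the limits mentioned above it exceeds. The helper names and the query layout (formula appended directly after "?") are assumptions for illustration; the service URL and the three limits are the ones quoted in this thread.

```python
from urllib.parse import quote

# Limits mentioned in the thread; the labels are for this sketch only.
LIMITS = {
    "RFC 2616 caution (255 bytes)": 255,
    "limit cited for the most popular browser (2,083)": 2083,
    "suggested spec minimum (10,000)": 10000,
}

# Service URL taken from the form example earlier in the thread.
BASE_URL = "http://www.forkosh.dreamhost.com/mathtex.cgi"


def get_url_for(latex):
    """Build a GET URL with the formula percent-encoded (assumed query layout)."""
    return BASE_URL + "?" + quote(latex, safe="")


def limits_exceeded(url):
    """Return the labels of all limits that the URL is longer than."""
    return [label for label, limit in LIMITS.items() if len(url) > limit]


formula = r"\begin{align} a^2 + b^2 &= c^2 \end{align}"
url = get_url_for(formula)
print(len(formula), len(url), limits_exceeded(url))
```

Percent-encoding turns each backslash, brace, and space into three characters, so the URL grows noticeably faster than the formula source itself.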

>> === Example/Use Cases ===
>>
>> (1) MimeTeX and MathTeX are used to display mathematical formulae in
>> web pages.
>
> Only future browsers could use "post-data" and are better off
> implementing inline MathML to support this use-case:
>
> http://www.whatwg.org/specs/web-apps/current-work/multipage/the-map-element.html#mathml

A little off-topic: MathML isn't really designed to be typed by hand, is it? My impression is that the solution from my use case is often used by those who lack the knowledge or the rights to install sophisticated software on their own webspace that generates the desired output. So they might just as well lack the knowledge or the rights to install tools that convert to MathML. Fortunately, there are client-side solutions using JavaScript today, but as some people disable scripting, there is still a need for a fallback solution.

So I think that use cases will pop up as long as there's webspace with narrowly limited rights.

Thanks again for all your comments (including those I haven't explicitly replied to),
Martin

Received on Friday, 10 December 2010 06:52:47 UTC