Re: Proposal to advertise automation of UA from Sergey Shekyan on 2017-01-17 (public-webappsec@w3.org from January 2017)

From: Sergey Shekyan <shekyan@gmail.com>
Date: Tue, 17 Jan 2017 11:46:09 -0800
To: Mike West <mkwst@google.com>
Cc: Jonathan Garbee <jonathan.garbee@gmail.com>, Daniel Veditz <dveditz@mozilla.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAPkvmc8ofX=TYecDvuv-S8rxuPTW+Nc83Ja3thrNDbZT6xeDsw@mail.gmail.com>

Hey Mike,

It'd be pretty naive to expect a reduction in malicious traffic from adding
a way to say that a visitor might be malicious. What I want is a way for the
UA to say that the session is automated.
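
As a rough illustration of what a site could do with such a signal, here is a
minimal sketch. The header name `Automated-Session` and the value `1` are
assumptions made up for this example; nothing in this thread or in any spec
defines them. The point is only that the server picks a response policy
rather than flipping an on/off switch.

```python
from typing import Mapping

# Hypothetical header name and value, assumed for illustration only.
AUTOMATION_HEADER = "Automated-Session"
AUTOMATED_VALUE = "1"


def choose_policy(headers: Mapping[str, str]) -> str:
    """Return a response policy label for a navigation request."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if normalized.get(AUTOMATION_HEADER.lower()) == AUTOMATED_VALUE:
        # The UA declared that the session is automated.
        return "automated"
    # No declaration: treat the request like any other visitor.
    return "default"
```

A site could map the "automated" label to whatever it likes, e.g. skipping
analytics or serving a lighter page. A client that omits the header simply
gets the "default" policy.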

On Tue, Jan 17, 2017 at 4:16 AM, Mike West <mkwst@google.com> wrote:

> Hey Sergey,
>
> If your goal is to reduce malicious traffic on a website, why would you
> expect the malicious-traffic generator to opt into sending a header
> advertising their automated nature? Doesn't this in some way boil down to setting
> the evil bit <https://www.ietf.org/rfc/rfc3514.txt>?
>
> -mike
>
> On Tue, Jan 17, 2017 at 8:18 AM, Jonathan Garbee <
> jonathan.garbee@gmail.com> wrote:
>
>> In what way should they respond differently? The site has absolutely no
>> context as to why headless is being used. Why mangle the response without
>> any context and just hope your users still get benefit from it?
>>
>> On Mon, Jan 16, 2017, 4:47 PM Sergey Shekyan <shekyan@gmail.com> wrote:
>>
>>> robots.txt is an on/off switch, while what I propose is more
>>> granular, allowing websites to choose how to respond.
>>>
>>>
>>> On Sat, Jan 14, 2017 at 5:52 AM, Jonathan Garbee <
>>> jonathan.garbee@gmail.com> wrote:
>>>
>>> I don't see where having a header or something to help detect automated
>>> access will be beneficial. We can already automate browser engines.
>>> Headless mode is just a native way to do it. So, if someone is already not
>>> taking your robots.txt into account, they'll just use another method or
>>> strip out whatever we add to signal that headless mode is in use. Sites
>>> don't gain any true benefit from having this kind of detection. If someone
>>> wants to automate tasks they do regularly, that's their prerogative. We have
>>> robots.txt as a respectful way to ask people automating things to avoid
>>> certain areas and actions, and that easily carries over into headless mode.
>>>
>>> On Sat, Jan 14, 2017, 4:28 AM Sergey Shekyan <shekyan@gmail.com> wrote:
>>>
>>> I am talking about tools that automate user agents, e.g. headless
>>> browsers (PhantomJS, SlimerJS, headless Chrome), Selenium, curl, etc.
>>> I mentioned navigation requests as I don't see so far how advertising
>>> automation to non-navigation requests would help.
>>> Another option to advertise can be a property on the navigator object,
>>> which would defer possible actions by authors to a second request.
>>>
>>>
>>> On Sat, Jan 14, 2017 at 12:56 AM, Daniel Veditz <dveditz@mozilla.com>
>>> wrote:
>>>
>>> On Fri, Jan 13, 2017 at 5:11 PM, Sergey Shekyan <shekyan@gmail.com>
>>> wrote:
>>>
>>> I think that attaching an HTTP request header to synthetically initiated
>>> navigation requests (https://fetch.spec.whatwg.org/#navigation-request)
>>> will help authors to build more reliable mechanisms to detect unwanted
>>> automation.
>>>
>>>
>>> I don't see anything in that spec about "synthetic" navigation
>>> requests. Where would you define that? How would you define that? Is a
>>> scripted window.open() in a browser "synthetic"? What about an iframe in a
>>> page? Does it matter if the user expected the iframe to be there or not
>>> (such as ads)? What if the page had 100 iframes?
>>>
>>> Are you trying to solve the same problem robots.txt is trying to solve?
>>> If not, what kind of automation are you talking about?
>>>
>>> -
>>> Dan Veditz
>>>
>>>
>>>
>>>
>

Received on Tuesday, 17 January 2017 19:47:02 UTC