- From: Paola Di Maio <paoladimaio10@gmail.com>
- Date: Sun, 22 Dec 2019 08:20:19 +0800
- To: Max Weiss <max_weiss@college.harvard.edu>, W3C AIKR CG <public-aikr@w3.org>
- Message-ID: <CAMXe=Sptosn6cw9xnHfTG7D_N946F_a9=UJ19W4j8Mzxjtfpnw@mail.gmail.com>
Dear Max,

Thanks a lot for your reply, for joining the CG, and for sharing the workings behind the bot in your email, which I am sharing with the group below. I am personally interested in understanding as much as possible how neural networks do their wonders, and in particular how uncertainties in outputs/results can be reduced so that they can be used consistently. I don't know to what extent the outputs of NNs can be explained, but nothing can stop us from trying to understand them. I'll study it in more detail later.

What I perceived from the article is that the bot wrote human-like comments. That is the feature extraordinaire I'd like to hear more about :-) That's what I am interested in: a bot that generates logical and correct sentences to the point of passing for human? Tell us more!

If, however, the 'fake' sentences were generated by parsing and meshing existing text written by humans, that is a different feature (not generating text from scratch but generating text by merging existing text), which is also a fairly extraordinary feat. Humans learn how to speak by listening to others and reproducing the language before we can master our own. Maybe the paper could explain what the bot does exactly in more detail, but definitely, yes, I'd like to learn from you how your bot is producing such good natural-language comments either way, because that is in itself quite interesting work.

Could you show us what the NN looks like and how you put it together? We are working mostly asynchronously these days, so if you have anything to share, maybe you can put together a few slides and do a narration of sorts, or a write-up, but I do not rule out having some live calls from time to time if people want to present a topic.

Look forward to your contribution!

Best regards
PDM

On Sun, Dec 22, 2019 at 4:29 AM Max Weiss <max_weiss@college.harvard.edu> wrote:
> Hi Paola,
>
> Your CG seems to be discussing some interesting topics, so I just
> requested to join.
> I think your suspicions around my results are very wise,
> but allow me to explain a little more about how they were possible and why
> they are significant.
>
> The “bot” described throughout the paper primarily refers to the script
> used to actually submit the deepfake comments. This really just boiled down
> to a simple for loop that used Selenium and Proxymesh to make requests
> and drive Chrome to automate the submission process. I would be happy to
> share this code with you, but there is not anything particularly
> interesting or novel in its architecture.
>
> The actual code used to finetune GPT-2 with TensorFlow and generate the
> deepfake comments was written by Max Woolf (
> https://colab.research.google.com/drive/1VLG8e7YSEwypxU-noRNhsv5dW4NfTGce#scrollTo=Fa6p6arifSL0&forceEdit=true&sandboxMode=true).
> This is also a pretty basic process and diverges very little from the work
> others interested in natural language generation have undertaken.
>
> As your experience informs, deepfaking text is not at the level of, for
> example, writing an entire article using GPT-2. But the key interesting
> finding from my work is that this is a false metric: the methods for text
> generation *are* already at the point where a meaningful attack on
> something like the federal public comment process is possible and easy. In
> other words, perhaps the best articulation of my findings is not "AI
> methods are powerful enough to fledge a meaningful attack on federal
> websites" but rather "federal websites and similar platforms are so
> vulnerable to manipulation that current AI methods are already advanced
> enough to fledge a meaningful attack."
>
> To elaborate, the barrier of deepfake competency is far lower than one
> would think, given a specific task. In this instance, I was able to finetune
> GPT-2 using thousands of comments very similar to those I wanted to
> generate, aside from some small edits requiring simple search-and-replace.
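The "simple for loop that used Selenium and Proxymesh" described above can be sketched roughly as follows. This is a minimal illustration, not Max's actual script: the form URL, form-field names, and proxy addresses are invented placeholders, and only the proxy-rotation helper is exercised here (the browser-driving part requires a live Chrome/chromedriver setup).

```python
# Hypothetical sketch of a Selenium + rotating-proxy submission loop.
# All URLs, field names, and proxy hosts are placeholders for illustration.
from itertools import cycle


def pair_comments_with_proxies(comments, proxies):
    """Assign each comment a proxy, cycling through the proxy pool."""
    pool = cycle(proxies)
    return [(comment, next(pool)) for comment in comments]


def submit_all(comments, proxies, form_url):
    """Drive Chrome to submit each comment through its assigned proxy.

    Not executed here; needs Selenium and a chromedriver install.
    The field name "comment" and the submit-button selector are guesses.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    for comment, proxy in pair_comments_with_proxies(comments, proxies):
        options = webdriver.ChromeOptions()
        options.add_argument(f"--proxy-server={proxy}")
        driver = webdriver.Chrome(options=options)
        try:
            driver.get(form_url)
            driver.find_element(By.NAME, "comment").send_keys(comment)
            driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()
        finally:
            driver.quit()


# The rotation helper on its own:
pairs = pair_comments_with_proxies(
    ["comment one", "comment two", "comment three"],
    ["proxy1.example:31280", "proxy2.example:31280"],
)
print(pairs)
```

The point of the sketch matches Max's description: there is nothing architecturally novel here, just a loop that hands each generated comment to a fresh proxied browser session.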
> As described in the paper, of the comments generated with OpenAI’s smallest
> model, about half were what I would qualify as both “highly relevant” to
> the comment process and “highly sensible.” Perhaps this is where you raise
> flags about the efficacy of GPT-2, but I think this misses my focus. In a
> few hours, I was able to filter out the lower-quality comments to achieve a
> set of 1,000 high-quality comments that ended up representing a majority of
> all the comments submitted during the period. In this way, I showed that it
> was very easy to use GPT-2 to quickly generate and filter a high volume
> of very good comments. Given a little more time and money, this process
> could have been automated with paid grammar/syntax software, testing the
> generated comments against the training set, and requiring a list of key
> words for relevance.
>
> I hope this explanation helps to contextualize my findings. As you
> suspected, I could not simply throw a bunch of data into GPT-2 and spit out
> 1,000 comments that all passed as human, and I hope that is not how my
> results are perceived. What I did show is that current AI methods are
> already at a place where it took only about a week for me to fledge a
> meaningful attack that totally undermined the efficacy of the federal
> public comment process.
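The automated filtering Max says could replace his manual pass, testing generated comments against the training set and requiring relevance keywords, might look something like the sketch below. The keywords, length threshold, and sample comments are invented for illustration; only the grammar/syntax scoring step (which he attributes to paid software) is omitted.

```python
# Hypothetical sketch of automated filtering for generated comments:
# keep only comments that are not verbatim copies of the training set,
# are long enough to be substantive, and mention a required keyword.
# Thresholds and keywords are illustrative, not from the paper.

def filter_comments(generated, training_set, keywords, min_words=20):
    """Return the subset of generated comments passing all checks."""
    training = set(training_set)  # reject exact regurgitation of training data
    kept = []
    for comment in generated:
        lowered = comment.lower()
        if comment in training:
            continue  # verbatim copy of a real comment
        if len(comment.split()) < min_words:
            continue  # too short to be substantive
        if not any(keyword in lowered for keyword in keywords):
            continue  # not relevant to the comment process
        kept.append(comment)
    return kept


kept = filter_comments(
    generated=[
        "I strongly oppose the proposed waiver because it weakens oversight.",
        "Nice.",
        "Existing comment copied from the docket about the waiver.",
    ],
    training_set=["Existing comment copied from the docket about the waiver."],
    keywords=["waiver"],
    min_words=5,
)
print(kept)
```

In this toy run only the first comment survives: the second is too short, and the third duplicates the training set. A real pipeline would add the grammar-checking step Max mentions, but the structure, generate in bulk, then filter cheaply, is the same.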
>
> Best,
> Max
>
>
> On Dec 20, 2019, at 10:07 PM, Paola Di Maio <paola.dimaio@gmail.com> wrote:
>
> Dear Max
> cc AI KR W3C
>
> Thanks for this interesting work:
>
> https://techscience.org/a/2019121801/?utm_campaign=the_cybersecurity_202&utm_medium=Email&utm_source=Newsletter&wpisrc=nl_cybersecurity202&wpmm=1#Authors
>
> I am researching deepfakes, and I am also researching fake bots
> and fake research claims.
>
> I'd like to look into the bot to see how the capability of developing and
> delivering such a clever deepfaking bot can be achieved, and possibly
> replicate your results if possible.
>
> Ultimately, I'd like to first learn more about this bot, which surely
> sounds very clever but in my experience sounds too good to be true.
>
> I take this opportunity to invite you to join our CG, and if you feel like it,
> we can set up a live call with you and other group members
> and have a chat about this work:
> https://www.w3.org/community/aikr/
>
> PDM
Received on Sunday, 22 December 2019 00:21:02 UTC