AI alt text, Google Gemini and TalkBack

To the RQTF

Continuing the discussion on AI, here at the Centre we've been testing the Android 15 update that now embeds Google Gemini alt text assessment into the TalkBack screen reader. I think this may be the first time AI alt text has been built directly into a screen reader, so it's been of considerable interest to us. For those who would like to test it, it's a matter of giving an image focus, then bringing up the TalkBack menu with the left up-then-across swipe gesture, where there's now a new 'describe image' option. This means that by default you still get the manual alt text, and you can use this new feature if you want Gemini to attempt AI alt text.

Here's what we've learnt from our tests, both good and bad.

The good:

  *   It is far more descriptive than most manually entered descriptions
  *   For infographics it is remarkably detailed
  *   For graphs, it does a good job sometimes and not other times, but generally better than the Microsoft AI discussed in the current draft note
  *   For general images, such as those in news items, it is again quite detailed, presenting both the image and its text and capturing the action of the image well

The bad:

  *   The AI alt text changes slightly every time the describe image option is selected, even if it's the same image! This really surprised us, but every time we did it the result was a bit different, especially on news item images, where it could be significantly different each time although still largely representative of the image, text, and action
  *   It still gets things wrong, although its errors are more granular, for example confusing an athlete punching the air with a salute. The first time it got it right; the second time we tried the same image it got it wrong.
  *   It sometimes provides subjective information which is not great. For example, we tried a toy reindeer image, and it was good in identifying the reindeer, that it was a toy and its colour and appearance, but finished off by saying 'the image is cute and festive'. The second scan did not include this statement.

My colleagues are putting together a news item for our website shortly which will show some of the examples we've tested with, along with the benefits and issues, so I'll circulate this once we've written it up. The change in description each time it scans was the thing that most surprised me.

I think the positive for our Note is that the alt text is detailed, concise and in some cases better than the human alt text for the same image, and that it has truly been integrated in a way that's usable for people who want to know what an image is, especially if there's no alt text initially. It's leaps and bounds ahead of things like the Microsoft Seeing AI app, for example. However, it still has all the usual AI problems: it lies, it's inconsistent, and it gives opinions when it's not contextually appropriate. I also appreciate that this is a specific AI platform and not technology-agnostic. Yet with all that said, being able to instantly find out what an image is, no matter what I'm viewing on my phone, be it a website, app or e-mail image, is incredibly useful.

I'd like to see how we can put some of this into the draft Note. The challenge, I appreciate, is that we can't use copyrighted images to show these things, so I would welcome some thoughts on the best approach.

Thanks everyone

Scott.



Dr Scott Hollier
Chief Executive Officer
Centre For Accessibility Australia Ltd.
Phone: +61 (0)430 351 909
Email: scott.hollier@accessibility.org.au<mailto:scott.hollier@accessibility.org.au>
Address: Suite 5, Belmont Hub, 213 Wright Street, Cloverdale WA 6105
accessibility.org.au<https://www.accessibility.org.au/>
Subscribe to our newsletter<http://eepurl.com/drA-ib>

CFA Australia respectfully acknowledges the Traditional Owners of Country across Australia and pay our respects to Elders past and present.

Received on Tuesday, 29 October 2024 09:14:42 UTC