Gemini App Gets Remade ‘Audio Overview’ player
A report from 9to5 Google’s Abner Li this week brings with it news Google has added what Li describes as a “nice quality-of-life update” to the Gemini app on iOS and Android: the ability to generate audio overviews, replete with native playback controls. The feature is in version 16.27 of the Google app on Android, as well as Gemini on iOS,
“Previously, tapping on a generated Audio Overview opened the file in your browser with a long URL,” Li wrote of the interface changes for Audio Overview. “You could listen in that Chrome tab or download (and use the Files app) for an unwieldy experience. Now, the Android and iOS app, like gemini.google.com, uses a native player.”
Li’s story includes screenshots (on Android) showing asking Gemini to “Generate Audio Overview” of various PDF files. In a broad scope, this functionality strikes me as conceptually similar to how, Forbes for instance, includes a button on webpages that people can click or tap to have an article read aloud to them. It effectively turns news stories into audiobooks—which, as Dr. Victor Pineda told me last year, were originally conceived by people in the Blind community. Ipso facto, the Gemini app’s new Audio Overview feature is, at its core, an accessibility feature. Beyond Blind and low vision people, the overviews may very well be a boon to, say, people who are strong auditory learners in grasping information. Likewise, it’s plausible someone with limited range of motion may find audio content more accessible than manually scrolling the aforementioned PDF document. Whatever the reason(s), it’s obvious the existence of Generate Audio Overview is as much about accessibility as it is purported convenience.
For its part, OpenAI has a Voice Mode for ChatGPT. I covered its Read Aloud feature last year, with my story including an interview with OpenAI’s Joanne Jang. She told me all about Read Aloud, as well as OpenAI’s philosophy on prioritizing accessibility for all.