Friday, July 19, 2024

    Here’s how ChatGPT and Google are using AI for accessibility

    In the past week, the AI landscape has been dominated by news of developments in AI models, particularly the launch of GPT-4o and the updates to Google Gemini announced at Google I/O 2024. Among the varied capabilities of the new models, multimodality has been prominent. Multimodality allows a model to accept input and produce output in the form of text, audio, or images.

    This has contributed immensely to the technology required for accessibility, especially in creating virtual assistance for those who are visually impaired. Here’s all you need to know about the accessibility tools OpenAI and Google will be launching iteratively:

    Along with the launch of its latest flagship model GPT-4o, OpenAI also shared updates to ‘Be My Eyes’, a digital visual assistant for those who are blind or have low vision. ‘Be My Eyes’ was first integrated with ChatGPT in 2023. It was powered by OpenAI’s GPT-4 language model, which converted visual input into text, i.e., providing a written description of one’s surroundings as captured by the camera, which was then read out by ChatGPT’s Voice Mode.

    GPT-4o’s ‘Be My Eyes’

    Now, ‘Be My Eyes’ has been integrated with GPT-4o, which OpenAI claims has GPT-4-level intelligence but is “much faster and improves on its capabilities across text, voice, and vision.” It is also natively multimodal, and OpenAI claims its voice responses arrive at near-human response times. Thus, users who rely on this tool for accessibility can point their device at any object, or ask ChatGPT questions aloud, and receive a human-like spoken response from the ‘Be My Eyes’ talking assistant in real time.

    Google TalkBack

    Google announced during its developer conference, Google I/O, that it would soon integrate its AI model Gemini into its Pixel phones. This includes Gemini Nano, a multimodal AI model built to run on Android phones.

    Google shared that this AI model would bring certain upgrades to its screen reader ‘TalkBack’, an accessibility tool for the visually impaired that provides users with a description of any image appearing on the screen.

    With the integration of Gemini Nano, ‘TalkBack’ will provide AI-based descriptions of images even without an internet connection. Google also claimed that this would reduce latency compared to the previous iteration.

    Google Lookout

    Google also announced that it would further integrate AI into Google Lookout, an accessibility tool that uses a phone’s camera to provide users with descriptions of objects. It is capable of processing text, documents, currency, food labels, and images. Earlier this year, Google made AI-generated image captions in English available globally on Lookout.

    The new update, called ‘Find mode’ and rolling out in beta, is capable of identifying objects in the user’s surroundings. Users will be able to select from a list of seven categories of items, such as seating, tables, and bathrooms; as they move their camera around the room, Lookout will notify them of the direction and distance to the item.


    The post Here’s how ChatGPT and Google are using AI for accessibility appeared first on MEDIANAMA.
