ChatGPT gains multimodal capabilities to better assist users

September 26, 2023

ChatGPT has gained multimodal capabilities, allowing it to receive and respond to image and voice inputs. These new features are meant to make ChatGPT more helpful in a variety of tasks, such as solving math problems, identifying objects, and providing recipes.

To use the image input feature, users simply need to snap a picture of what they are looking at and add the question they’d like an answer to. ChatGPT will then analyze the image and provide a response. For example, users could use this feature to identify the name of a plant, look up the nutritional information of a food item, or get help solving a math problem.

The voice input and output feature lets ChatGPT work much like a voice assistant. Users can now ask ChatGPT to perform tasks or answer questions simply by speaking. ChatGPT will then process the request and respond verbally.

The sources for this piece include an article in ZDNET.

Jim Love

Jim is an author and podcast host with over 40 years in technology.
