ChatGPT gains multimodal capabilities to better assist users

ChatGPT has gained multimodal capabilities, allowing it to receive and respond to image and voice inputs. These features are meant to make ChatGPT more helpful in a variety of tasks, such as solving math problems, identifying objects, and providing recipes.

To use the image input feature, users snap a picture of what they are looking at and add their question. ChatGPT then analyzes the image and provides a response. For example, users could identify a plant, look up the nutritional information of a food item, or get help solving a math problem.

The voice input and output feature lets ChatGPT work like a voice assistant. Users can ask ChatGPT to perform tasks or answer questions by speaking, and ChatGPT processes the request and responds verbally.

The sources for this piece include an article in ZDNET.
