Multimodal

Multimodality refers to a model's ability to understand and generate content across multiple data types, such as text, images, audio, and video.

Vision
Images
Speech