Multimodal

Combining vision and language? Evaluate image captioning, visual QA, text-to-image generation, and cross-modal retrieval models.

5 tasks, 2 datasets

Tasks in Multimodal
