# Multimodal

Grok 2

xAI's reasoning-focused chatbot with image generation and multimodal input support.

4.3 (4)

Nexa AI

On-device AI runtime for running models locally across phones, PCs, and edge hardware.

4.8 (6)

Project Astra

Google DeepMind's universal AI agent that sees, hears, and reasons about the world in real time.

5.0 (4)

OmniVision

Compact vision-language model built for on-device and edge AI deployment.

4.6 (5)

Mistrezz AI

Uncensored, gamified adult AI chat with private NSFW companions, voice, images, and video.

Humain AI

A Saudi-backed AI company building large-scale infrastructure and multimodal Arabic LLMs for global AI services.

Uni-1 by Luma AI

Multimodal AI model for high-fidelity image generation with strong spatial reasoning and accurate text rendering.

Black Forest Labs

A pioneering AI startup specializing in state-of-the-art generative models for image and video synthesis.

5.0 (6)

#10

Veo 4

Multi-shot cinematic AI video generation with native synchronized audio.

4.6 (5)

#11

OpenArt

Creative AI suite for generating art, video, and audio from text or images.

#12

SuperAnnotate

End-to-end data annotation and management platform for building high-quality AI training datasets.

4.4 (5)

#13

Gemma 3

An open-source AI model optimized for single-GPU performance, supporting multimodal inputs and over 140 languages.

#14

GLM-4.6V

Open-source multimodal GLM from Z.ai unifying vision, text, and tool calling for long-context reasoning, search, coding, and UI-to-code.

4.3 (6)

#15

HappyHorse

Open-source model that generates video paired with synchronized audio from a single prompt.

#16

OpenAI GPT-4

OpenAI's multimodal large language model for text, code, and image understanding.

4.5 (6)

#17

Codex CLI

Open-source terminal AI assistant that reads, writes, and runs code locally with multimodal input.

#18

LTX 2.3 Video Generator

AI video generator that unifies text prompts, images, and audio into cohesive short-form clips.

4.5 (4)

#19

MyShell

No-code AI consumer platform to build, share, and own AI apps.

#20

LiveKit Agents

Open-source framework for building real-time, multimodal voice and video AI agents.

4.5 (6)

#21

Aivah

Build interactive AI avatar agents for immersive digital experiences.

4.8 (4)

#22

OpenAI Advanced Voice

Real-time, natural voice conversations with ChatGPT.

#23

Voice-gen.ai

Unified platform for AI-generated voiceovers, images, and videos in one workspace.

4.8 (4)

#24

WebVoyager

A web agent powered by a large multimodal model (LMM) that completes user instructions end-to-end by interacting with real-world websites.

5.0 (5)

#25

Jina AI

Multimodal search foundation for embeddings, reranking, and RAG pipelines.

4.2 (5)

#26

Seedance 1.5 Pro

AI creation platform for generating videos with synchronized audio (voice, lip-sync, SFX) from text or images, plus image generation and AI image editing tools.