Model Gallery

1 models from 1 repositories

Filter by type:

Filter by tags:

mistral-community_pixtral-12b

Highlights: - Natively multimodal, trained with interleaved image and text data - Strong performance on multimodal tasks, excels in instruction following - Maintains state-of-the-art performance on text-only benchmarks Architecture: - New 400M parameter vision encoder trained from scratch - 12B parameter multimodal decoder based on Mistral Nemo - Supports variable image sizes and aspect ratios - Supports multiple images in the long context window of 128k tokens

Repository: localaiLicense: apache-2.0