LocalAI - Models

moonshine-tiny

Moonshine Tiny is a lightweight speech-to-text model optimized for fast transcription. It is designed for efficient on-device ASR with high accuracy relative to its size.

Links

https://github.com/moonshine-ai/moonshine

Tags

whisperx-tiny

WhisperX Tiny is a fast and accurate speech recognition model with speaker diarization capabilities. Built on OpenAI's Whisper with additional features for alignment and speaker segmentation.

Links

https://github.com/m-bain/whisperX

Tags

ibm-granite_granite-4.0-h-tiny

Granite-4.0-H-Tiny is a 7B parameter long-context instruct model finetuned from Granite-4.0-H-Tiny-Base using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets. This model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging. Granite 4.0 instruct models feature improved instruction following (IF) and tool-calling capabilities, making them more effective in enterprise applications.

Links

Tags

liquidai_lfm2-350m-math

Based on LFM2-350M, LFM2-350M-Math is a tiny reasoning model designed for tackling tricky math problems.

Links

Tags

smolvlm-500m-instruct

SmolVLM-500M is a tiny multimodal model, member of the SmolVLM family. It accepts arbitrary sequences of image and text inputs to produce text outputs. It's designed for efficiency. SmolVLM can answer questions about images, describe visual content, or transcribe text. Its lightweight architecture makes it suitable for on-device applications while maintaining strong performance on multimodal tasks. It can run inference on one image with 1.23GB of GPU RAM.

Links

Tags

jina-reranker-v1-tiny-en

This model is designed for blazing-fast reranking while maintaining competitive performance. What's more, it leverages the power of our JinaBERT model as its foundation. JinaBERT itself is a unique variant of the BERT architecture that supports the symmetric bidirectional variant of ALiBi. This allows jina-reranker-v1-tiny-en to process significantly longer sequences of text compared to other reranking models, up to an impressive 8,192 tokens.

Links

Tags

l3.3-prikol-70b-v0.2

A merge of some Llama 3.3 models because um uh yeah Went extra schizo on the recipe, hoping for an extra fun result, and... Well, I guess it's an overall improvement over the previous revision. It's a tiny bit smarter, has even more distinct swipes and nice dialogues, but for some reason it's damn sloppy. I've published the second step of this merge as a separate model, and I'd say the results are more interesting, but not as usable as this one. https://huggingface.co/Nohobby/AbominationSnowPig Prompt format: Llama3 OR Llama3 Context and ChatML Instruct. It actually works a bit better this way

Links

Tags

qihoo360_tinyr1-32b-preview

We introduce our first-generation reasoning model, Tiny-R1-32B-Preview, which outperforms the 70B model Deepseek-R1-Distill-Llama-70B and nearly matches the full R1 model in math. We applied supervised fine-tuning (SFT) to Deepseek-R1-Distill-Qwen-32B across three target domains—Mathematics, Code, and Science — using the 360-LLaMA-Factory training framework to produce three domain-specific models. We used questions from open-source data as seeds. Meanwhile, responses for mathematics, coding, and science tasks were generated by R1, creating specialized models for each domain. Building on this, we leveraged the Mergekit tool from the Arcee team to combine multiple models, creating Tiny-R1-32B-Preview, which demonstrates strong overall performance.

Links

Tags

mistralai_ministral-3-8b-instruct-2512-multimodal

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities. The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 8B can even be deployed locally, capable of fitting in 12GB of VRAM in FP8, and less if further quantized. Key Features: Ministral 3 8B consists of two main architectural components: - 8.4B Language Model - 0.4B Vision Encoder The Ministral 3 8B Instruct model offers the following capabilities: - Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text. - Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic. - System Prompt: Maintains strong adherence and support for system prompts. - Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting. - Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere. - Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes. - Large Context Window: Supports a 256k context window. This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.

Links

Tags

mistralai_ministral-3-8b-reasoning-2512-multimodal

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities. This model is the reasoning post-trained version, trained for reasoning tasks, making it ideal for math, coding and stem related use cases. The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 8B can even be deployed locally, capable of fitting in 24GB of VRAM in BF16, and less than 12GB of RAM/VRAM when quantized. Key Features: Ministral 3 8B consists of two main architectural components: - 8.4B Language Model - 0.4B Vision Encoder The Ministral 3 8B Reasoning model offers the following capabilities: - Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text. - Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic. - System Prompt: Maintains strong adherence and support for system prompts. - Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting. - Reasoning: Excels at complex, multi-step reasoning and dynamic problem-solving. - Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere. - Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes. - Large Context Window: Supports a 256k context window. This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.

Links

Tags

mistralai_ministral-3-3b-instruct-2512-multimodal

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities. The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, capable of fitting in 8GB of VRAM in FP8, and less if further quantized. Key Features: Ministral 3 3B consists of two main architectural components: - 3.4B Language Model - 0.4B Vision Encoder The Ministral 3 3B Instruct model offers the following capabilities: - Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text. - Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic. - System Prompt: Maintains strong adherence and support for system prompts. - Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting. - Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere. - Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes. - Large Context Window: Supports a 256k context window. This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.

Links

Tags

mistralai_ministral-3-3b-reasoning-2512-multimodal

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities. This model is the reasoning post-trained version, trained for reasoning tasks, making it ideal for math, coding and stem related use cases. The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, fitting in 16GB of VRAM in BF16, and less than 8GB of RAM/VRAM when quantized. Key Features: Ministral 3 3B consists of two main architectural components: - 3.4B Language Model - 0.4B Vision Encoder The Ministral 3 3B Reasoning model offers the following capabilities: - Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text. - Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic. - System Prompt: Maintains strong adherence and support for system prompts. - Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting. - Reasoning: Excels at complex, multi-step reasoning and dynamic problem-solving. - Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere. - Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes. - Large Context Window: Supports a 256k context window. This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.

Links

Tags

moondream2

a tiny vision language model that kicks ass and runs anywhere

Links

Tags