Model Gallery

7 models from 1 repository

qwen_qwen3.5-35b-a3b
Qwen3.5-35B-A3B is a quantized multimodal language model with 35B total parameters and a mixture-of-experts (MoE) architecture that activates roughly 3B parameters per token (A3B). It supports image-text understanding and chat interactions via the llama-cpp backend.

Repository: localai
License: apache-2.0

qwen_qwen3.5-0.8b
Qwen3.5-0.8B is a 0.8B-parameter model quantized for the llama-cpp backend. It supports chat interactions and multimodal image-text inputs.

Repository: localai
License: apache-2.0

qwen_qwen3.5-2b
Qwen3.5-2B is a highly efficient, instruction-tuned multilingual language model available in various quantized GGUF formats. Optimized for llama-cpp inference, it supports chat and completion tasks with strong performance on low-RAM hardware. The model is available in multiple quantization levels ranging from Q8_0 to IQ2_M to balance quality and resource usage.

Repository: localai
License: apache-2.0
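The trade-off between quantization levels such as Q8_0 and IQ2_M is mostly one of on-disk size and memory footprint versus output quality. A minimal sketch of the arithmetic, assuming approximate community bits-per-weight figures for GGUF quants (these values are illustrative, not taken from this gallery):

```python
# Rough on-disk size estimate for a quantized GGUF model.
# The bits-per-weight (bpw) figures below are approximate,
# commonly cited values for these quant types — treat them
# as ballpark assumptions, not exact file sizes.

def gguf_size_gb(params_billions: float, bpw: float) -> float:
    """Approximate on-disk size in GB: parameters * bits / 8."""
    return params_billions * 1e9 * bpw / 8 / 1e9

QUANTS = {"Q8_0": 8.5, "Q4_K_M": 4.8, "IQ2_M": 2.7}  # approx bpw

for name, bpw in QUANTS.items():
    print(f"Qwen3.5-2B {name}: ~{gguf_size_gb(2.0, bpw):.2f} GB")
```

For a 2B-parameter model this puts Q8_0 at roughly 2.1 GB and IQ2_M under 0.7 GB, which is why the lower quants are the usual choice on low-RAM hardware.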

qwen_qwen3.5-4b
Qwen3.5-4B is a multimodal LLM with 4 billion parameters, optimized for chat and vision tasks. This GGUF-quantized version enables efficient local inference via the llama-cpp backend and supports both text and image inputs for enhanced conversational capabilities.

Repository: localai
License: apache-2.0

qwen_qwen3-next-80b-a3b-thinking

Repository: localai
License: apache-2.0

mox-small-1-i1
**vanta-research/mox-small-1** is a small-scale text-generation model optimized for conversational AI tasks, including chat, persona research, and chatbot applications. Quantized variants (e.g., i1-Q4_K_M, i1-Q4_K_S) are available for efficient deployment; the i1-Q4_K_S variant offers the best balance of size, speed, and quality. The model is designed for lightweight inference and is compatible with frameworks such as Hugging Face Transformers.

Repository: localai
License: apache-2.0

tildeopen-30b-instruct-lv-i1
**TildeOpen-30B-Instruct-LV-i1-GGUF** is a quantized version of the base model **pazars/TildeOpen-30B-Instruct-LV**, an instruction-tuned language model trained on diverse datasets and supporting many languages (en, de, fr, pl, ru, it, pt, cs, nl, es, fi, tr, hu, bg, uk, bs, hr, da, et, lt, ro, sk, sl, sv, no, lv, sr, sq, mk, is, mt, ga). Licensed under CC-BY-4.0, it uses the Transformers library. The imatrix-based quantized variants target efficient inference on devices with limited resources, while the base model remains available in its original precision.

Repository: localai
License: cc-by-4.0