Model Gallery

47 models from 1 repositories

Filter by type:

Filter by tags:

qwen3.5-27b-claude-4.6-opus-reasoning-distilled-i1
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF - A GGUF quantized model optimized for local inference. Specialized for reasoning and chain-of-thought tasks. Based on Qwen 3.5 architecture with enhanced language understanding. Available in multiple quantization levels for various hardware requirements. Distilled from Claude-style reasoning models for enhanced logical reasoning capabilities.

Repository: localaiLicense: apache-2.0

qwen3.5-4b-claude-4.6-opus-reasoning-distilled
Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled-GGUF - A GGUF quantized model optimized for local inference. Specialized for reasoning and chain-of-thought tasks. Based on Qwen 3.5 architecture with enhanced language understanding. Available in multiple quantization levels for various hardware requirements. Distilled from Claude-style reasoning models for enhanced logical reasoning capabilities.

Repository: localaiLicense: apache-2.0

qwen3-coder-30b-a3b-instruct-rtpurbo-i1
The model in question is a quantized version of the original **Qwen3-Coder** large language model, specifically tailored for code generation. The base model, **RTP-LLM/Qwen3-Coder-30B-A3B-Instruct-RTPurbo**, is a 30B-parameter variant optimized for instruction-following and code-related tasks. It employs the **A3B attention mechanism** and is trained on diverse data to excel in programming and logical reasoning. The current repository provides a quantized (compressed) version of this model, which is suitable for deployment on hardware with limited memory but loses some precision compared to the original. For a high-fidelity version, the unquantized base model is recommended.

Repository: localai

qwen3-vl-30b-a3b-instruct
Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date. This generation delivers comprehensive upgrades across the board: superior text understanding & generation, deeper visual perception & reasoning, extended context length, enhanced spatial and video dynamics comprehension, and stronger agent interaction capabilities. Available in Dense and MoE architectures that scale from edge to cloud, with Instruct and reasoning‑enhanced Thinking editions for flexible, on-demand deployment. #### Key Enhancements: * **Visual Agent**: Operates PC/mobile GUIs—recognizes elements, understands functions, invokes tools, completes tasks. * **Visual Coding Boost**: Generates Draw.io/HTML/CSS/JS from images/videos. * **Advanced Spatial Perception**: Judges object positions, viewpoints, and occlusions; provides stronger 2D grounding and enables 3D grounding for spatial reasoning and embodied AI. * **Long Context & Video Understanding**: Native 256K context, expandable to 1M; handles books and hours-long video with full recall and second-level indexing. * **Enhanced Multimodal Reasoning**: Excels in STEM/Math—causal analysis and logical, evidence-based answers. * **Upgraded Visual Recognition**: Broader, higher-quality pretraining is able to “recognize everything”—celebrities, anime, products, landmarks, flora/fauna, etc. * **Expanded OCR**: Supports 32 languages (up from 19); robust in low light, blur, and tilt; better with rare/ancient characters and jargon; improved long-document structure parsing. * **Text Understanding on par with pure LLMs**: Seamless text–vision fusion for lossless, unified comprehension. #### Model Architecture Updates: 1. **Interleaved-MRoPE**: Full‑frequency allocation over time, width, and height via robust positional embeddings, enhancing long‑horizon video reasoning. 2. **DeepStack**: Fuses multi‑level ViT features to capture fine-grained details and sharpen image–text alignment. 3. **Text–Timestamp Alignment:** Moves beyond T‑RoPE to precise, timestamp‑grounded event localization for stronger video temporal modeling. This is the weight repository for Qwen3-VL-30B-A3B-Instruct.

Repository: localaiLicense: apache-2.0

baidu_ernie-4.5-21b-a3b-thinking
Over the past three months, we have continued to scale the thinking capability of ERNIE-4.5-21B-A3B, improving both the quality and depth of reasoning, thereby advancing the competitiveness of ERNIE lightweight models in complex reasoning tasks. We are pleased to introduce ERNIE-4.5-21B-A3B-Thinking, featuring the following key enhancements: Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, text generation, and academic benchmarks that typically require human expertise. Efficient tool usage capabilities. Enhanced 128K long-context understanding capabilities. Note: This version has an increased thinking length. We strongly recommend its use in highly complex reasoning tasks. ERNIE-4.5-21B-A3B-Thinking is a text MoE post-trained model, with 21B total parameters and 3B activated parameters for each token.

Repository: localaiLicense: apache-2.0

qwen3-30b-a3b
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Qwen3-30B-A3B has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 30.5B in total and 3.3B activated Number of Paramaters (Non-Embedding): 29.9B Number of Layers: 48 Number of Attention Heads (GQA): 32 for Q and 4 for KV Number of Experts: 128 Number of Activated Experts: 8 Context Length: 32,768 natively and 131,072 tokens with YaRN. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.

Repository: localaiLicense: apache-2.0

qwen3-235b-a22b-instruct-2507
We introduce the updated version of the Qwen3-235B-A22B non-thinking mode, named Qwen3-235B-A22B-Instruct-2507, featuring the following key enhancements: Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage. Substantial gains in long-tail knowledge coverage across multiple languages. Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation. Enhanced capabilities in 256K long-context understanding.

Repository: localaiLicense: apache-2.0

qwen3-32b
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Qwen3-32B has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 32.8B Number of Paramaters (Non-Embedding): 31.2B Number of Layers: 64 Number of Attention Heads (GQA): 64 for Q and 8 for KV Context Length: 32,768 natively and 131,072 tokens with YaRN. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.

Repository: localaiLicense: apache-2.0

qwen3-14b
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Qwen3-14B has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 14.8B Number of Paramaters (Non-Embedding): 13.2B Number of Layers: 40 Number of Attention Heads (GQA): 40 for Q and 8 for KV Context Length: 32,768 natively and 131,072 tokens with YaRN. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.

Repository: localaiLicense: apache-2.0

qwen3-8b
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Model Overview Qwen3-8B has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 8.2B Number of Paramaters (Non-Embedding): 6.95B Number of Layers: 36 Number of Attention Heads (GQA): 32 for Q and 8 for KV Context Length: 32,768 natively and 131,072 tokens with YaRN.

Repository: localaiLicense: apache-2.0

qwen3-4b
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Qwen3-4B has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 4.0B Number of Paramaters (Non-Embedding): 3.6B Number of Layers: 36 Number of Attention Heads (GQA): 32 for Q and 8 for KV Context Length: 32,768 natively and 131,072 tokens with YaRN.

Repository: localaiLicense: apache-2.0

qwen3-1.7b
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Qwen3-1.7B has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 1.7B Number of Paramaters (Non-Embedding): 1.4B Number of Layers: 28 Number of Attention Heads (GQA): 16 for Q and 8 for KV Context Length: 32,768

Repository: localaiLicense: apache-2.0

qwen3-0.6b
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Qwen3-0.6B has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 0.6B Number of Paramaters (Non-Embedding): 0.44B Number of Layers: 28 Number of Attention Heads (GQA): 16 for Q and 8 for KV Context Length: 32,768

Repository: localaiLicense: apache-2.0

shuttleai_shuttle-3.5
A fine-tuned version of Qwen3 32b, emulating the writing style of Claude 3 models and thoroughly trained on role-playing data. Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios. Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation. Shuttle 3.5 has the following features: Type: Causal Language Models Training Stage: Pretraining & Post-training Number of Parameters: 32.8B Number of Paramaters (Non-Embedding): 31.2B Number of Layers: 64 Number of Attention Heads (GQA): 64 for Q and 8 for KV Context Length: 32,768 natively and 131,072 tokens with YaRN.

Repository: localaiLicense: apache-2.0

claria-14b
Claria 14b is a lightweight, mobile-compatible language model fine-tuned for psychological and psychiatric support contexts. Built on Qwen-3 (14b), Claria is designed as an experimental foundation for therapeutic dialogue modeling, student simulation training, and the future of personalized mental health AI augmentation. This model does not aim to replace professional care. It exists to amplify reflective thinking, model therapeutic language flow, and support research into emotionally aware AI. Claria is the first whisper in a larger project—a proof-of-concept with roots in recursion, responsibility, and renewal.

Repository: localaiLicense: apache-2.0

qwen3-14b-griffon-i1
This is a fine-tuned version of the Qwen3-14B model using the high-quality OpenThoughts2-1M dataset. Fine-tuned with Unsloth’s TRL-compatible framework and LoRA for efficient performance, this model is optimized for advanced reasoning tasks, especially in math, logic puzzles, code generation, and step-by-step problem solving. Training Dataset Dataset: OpenThoughts2-1M Source: A synthetic dataset curated and expanded by the OpenThoughts team Volume: ~1.1M high-quality examples Content Type: Multi-turn reasoning, math proofs, algorithmic code generation, logical deduction, and structured conversations Tools Used: Curator Viewer This dataset builds upon OpenThoughts-114k and integrates strong reasoning-centric data sources like OpenR1-Math and KodCode. Intended Use This model is particularly suited for: Chain-of-thought and step-by-step reasoning Code generation with logical structure Educational tools for math and programming AI agents requiring multi-turn problem-solving

Repository: localaiLicense: apache-2.0

vulpecula-4b
**Vulpecula-4B** is fine-tuned based on the traces of **SK1.1**, consisting of the same 1,000 entries of the **DeepSeek thinking trajectory**, along with fine-tuning on **Fine-Tome 100k** and **Open Math Reasoning** datasets. This specialized 4B parameter model is designed for enhanced mathematical reasoning, logical problem-solving, and structured content generation, optimized for precision and step-by-step explanation.

Repository: localaiLicense: apache-2.0

akhil-theerthala_kuvera-8b-v0.1.0
This model is a fine-tuned version of Qwen/Qwen3-8B designed to answer personal finance queries. It has been trained on a specialized dataset of real Reddit queries with synthetically curated responses, focusing on understanding both the financial necessities and the psychological context of the user. The model aims to provide empathetic and practical advice for a wide range of personal finance topics. It leverages a base model's strong language understanding and generation capabilities, further enhanced by targeted fine-tuning on domain-specific data. A key feature of this model is its training to consider the emotional and psychological state of the person asking the query, alongside the purely financial aspects.

Repository: localaiLicense: mit

qwen_qwen3-30b-a3b-instruct-2507
We introduce the updated version of the Qwen3-30B-A3B non-thinking mode, named Qwen3-30B-A3B-Instruct-2507, featuring the following key enhancements: Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage. Substantial gains in long-tail knowledge coverage across multiple languages. Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation. Enhanced capabilities in 256K long-context understanding.

Repository: localaiLicense: apache-2.0

qwen_qwen3-30b-a3b-thinking-2507
Over the past three months, we have continued to scale the thinking capability of Qwen3-30B-A3B, improving both the quality and depth of reasoning. We are pleased to introduce Qwen3-30B-A3B-Thinking-2507, featuring the following key enhancements: Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks that typically require human expertise. Markedly better general capabilities, such as instruction following, tool usage, text generation, and alignment with human preferences. Enhanced 256K long-context understanding capabilities. NOTE: This version has an increased thinking length. We strongly recommend its use in highly complex reasoning tasks.

Repository: localaiLicense: apache-2.0

qwen_qwen3-4b-instruct-2507
We introduce the updated version of the Qwen3-4B non-thinking mode, named Qwen3-4B-Instruct-2507, featuring the following key enhancements: Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage. Substantial gains in long-tail knowledge coverage across multiple languages. Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation. Enhanced capabilities in 256K long-context understanding.

Repository: localaiLicense: apache-2.0

Page 1