Model Gallery

20 models from 1 repository

baidu_ernie-4.5-21b-a3b-thinking
Over the past three months, we have continued to scale the thinking capability of ERNIE-4.5-21B-A3B, improving both the quality and depth of reasoning, thereby advancing the competitiveness of ERNIE lightweight models in complex reasoning tasks. We are pleased to introduce ERNIE-4.5-21B-A3B-Thinking, featuring the following key enhancements: Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, text generation, and academic benchmarks that typically require human expertise. Efficient tool usage capabilities. Enhanced 128K long-context understanding capabilities. Note: This version has an increased thinking length. We strongly recommend its use in highly complex reasoning tasks. ERNIE-4.5-21B-A3B-Thinking is a text MoE post-trained model, with 21B total parameters and 3B activated parameters for each token.

Repository: localai · License: apache-2.0
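
All entries in this gallery are served from the localai repository, which exposes an OpenAI-compatible API once a model is installed. The snippet below is a minimal sketch of querying this thinking model for a reasoning task; the base URL, API key handling, and sampling settings are assumptions about a typical local deployment, not part of the model card.

```python
# Minimal sketch: query the ERNIE thinking model through a LocalAI endpoint.
# The base_url and api_key below are assumptions about a local deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",   # assumed LocalAI address
    api_key="not-needed-for-local",        # placeholder; LocalAI usually needs no real key
)

response = client.chat.completions.create(
    model="baidu_ernie-4.5-21b-a3b-thinking",  # model name as listed in this gallery
    messages=[
        {"role": "user", "content": "Prove that the sum of two odd integers is even."},
    ],
    temperature=0.6,
)

print(response.choices[0].message.content)
```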

qwen3-235b-a22b-instruct-2507
We introduce the updated version of the Qwen3-235B-A22B non-thinking mode, named Qwen3-235B-A22B-Instruct-2507, featuring the following key enhancements: Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage. Substantial gains in long-tail knowledge coverage across multiple languages. Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation. Enhanced capabilities in 256K long-context understanding.

Repository: localai · License: apache-2.0
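
Because the card highlights 256K long-context understanding, a natural use is feeding an entire document in a single request. The sketch below assumes the same local OpenAI-compatible endpoint as above; the input file name is purely illustrative.

```python
# Hedged sketch: summarize a long local document with the 256K-context instruct model.
from pathlib import Path
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed-for-local")

long_document = Path("paper_draft.txt").read_text(encoding="utf-8")  # hypothetical input file

response = client.chat.completions.create(
    model="qwen3-235b-a22b-instruct-2507",
    messages=[
        {"role": "system", "content": "You are a careful technical summarizer."},
        {"role": "user", "content": "Summarize the key findings of the following document:\n\n" + long_document},
    ],
)

print(response.choices[0].message.content)
```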

qwen3-8b-shiningvaliant3
Shining Valiant 3 is a science, AI design, and general reasoning specialist built on Qwen 3. Finetuned on our newest science reasoning data generated with Deepseek R1 0528! AI to build AI: our high-difficulty AI reasoning data makes Shining Valiant 3 your friend for building with current AI tech and discovering new innovations and improvements! Improved general and creative reasoning to supplement problem-solving and general chat performance. Small model sizes allow running on local desktop and mobile, plus super-fast server inference!

Repository: localai · License: apache-2.0

omega-qwen3-atom-8b
Omega-Qwen3-Atom-8B is a powerful 8B-parameter model fine-tuned on Qwen3-8B using the curated Open-Omega-Atom-1.5M dataset, optimized for math and science reasoning. It excels at symbolic processing, scientific problem-solving, and structured output generation—making it a high-performance model for researchers, educators, and technical developers working in computational and analytical domains.

Repository: localai · License: apache-2.0

qwen_qwen3-30b-a3b-instruct-2507
We introduce the updated version of the Qwen3-30B-A3B non-thinking mode, named Qwen3-30B-A3B-Instruct-2507, featuring the following key enhancements: Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage. Substantial gains in long-tail knowledge coverage across multiple languages. Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation. Enhanced capabilities in 256K long-context understanding.

Repository: localai · License: apache-2.0

qwen_qwen3-30b-a3b-thinking-2507
Over the past three months, we have continued to scale the thinking capability of Qwen3-30B-A3B, improving both the quality and depth of reasoning. We are pleased to introduce Qwen3-30B-A3B-Thinking-2507, featuring the following key enhancements: Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks that typically require human expertise. Markedly better general capabilities, such as instruction following, tool usage, text generation, and alignment with human preferences. Enhanced 256K long-context understanding capabilities. NOTE: This version has an increased thinking length. We strongly recommend its use in highly complex reasoning tasks.

Repository: localai · License: apache-2.0

qwen_qwen3-4b-instruct-2507
We introduce the updated version of the Qwen3-4B non-thinking mode, named Qwen3-4B-Instruct-2507, featuring the following key enhancements: Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage. Substantial gains in long-tail knowledge coverage across multiple languages. Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation. Enhanced capabilities in 256K long-context understanding.

Repository: localai · License: apache-2.0

qwen_qwen3-4b-thinking-2507
Over the past three months, we have continued to scale the thinking capability of Qwen3-4B, improving both the quality and depth of reasoning. We are pleased to introduce Qwen3-4B-Thinking-2507, featuring the following key enhancements: Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks that typically require human expertise. Markedly better general capabilities, such as instruction following, tool usage, text generation, and alignment with human preferences. Enhanced 256K long-context understanding capabilities. NOTE: This version has an increased thinking length. We strongly recommend its use in highly complex reasoning tasks.

Repository: localai · License: apache-2.0

qwen3-stargate-sg1-uncensored-abliterated-8b-i1
This repo contains the full-precision source weights, in "safetensors" format, for generating GGUF, GPTQ, EXL2, AWQ, HQQ, and other formats; the source weights can also be used directly. The model is aimed primarily at SG1 (Stargate series) content, science fiction, and story generation in all genres, but it also handles coding and general tasks, and can be used for role play. It will produce uncensored content (see the notes below). It was fine-tuned for six epochs (using Unsloth on Windows 11) on an in-house dataset built to simulate and explore the Stargate SG1 universe; this version covers the "canon" of all 10 seasons of SG1. The model also contains, but was not specifically trained on, content from Stargate Atlantis and Stargate Universe. The fine-tuning process adds knowledge to the model and alters all aspects of its operation. Float32 (32-bit precision) was used to further increase model quality. The model is based on "Goekdeniz-Guelmez/Josiefied-Qwen3-8B-abliterated-v1". Example generations appear at the bottom of this page. This is a Stargate (SG1) fine-tune covering 1,331,953,664 of 9,522,689,024 parameters (13.99% trained), over six epochs. As an instruct model, it also benefits from a detailed system prompt (a minimal prompt sketch follows below).

Repository: localai · License: apache-2.0
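
Since the card recommends a detailed system prompt for this instruct model, here is a minimal sketch of what that might look like; the endpoint, sampling settings, and prompt wording are assumptions, not part of the model card.

```python
# Minimal sketch: detailed system prompt for SG1-style story generation.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed-for-local")

system_prompt = (
    "You are a science-fiction storyteller writing in the style of Stargate SG-1. "
    "Stay consistent with established series canon, keep the team's voices distinct, "
    "and write vivid but concise prose."
)

response = client.chat.completions.create(
    model="qwen3-stargate-sg1-uncensored-abliterated-8b-i1",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Write the opening scene of an off-world mission that goes wrong."},
    ],
    temperature=0.9,
)

print(response.choices[0].message.content)
```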

llama3.2-3b-shiningvaliant2-i1
Shining Valiant 2 is a chat model built on Llama 3.2 3b, finetuned on our data for friendship, insight, knowledge and enthusiasm. Finetuned on meta-llama/Llama-3.2-3B-Instruct for best available general performance. Trained on a variety of high-quality data, focused on science, engineering, technical knowledge, and structured reasoning. Also available for Llama 3.1 70b and Llama 3.1 8b! Version: This is the 2024-09-27 release of Shining Valiant 2 for Llama 3.2 3b.

Repository: localai · License: llama3.2

llama3.1-8b-shiningvaliant2
Shining Valiant 2 is a chat model built on Llama 3.1 8b, finetuned on our data for friendship, insight, knowledge and enthusiasm. Finetuned on meta-llama/Meta-Llama-3.1-8B-Instruct for best available general performance Trained on a variety of high quality data; focused on science, engineering, technical knowledge, and structured reasoning

Repository: localai · License: llama3.1

llama-3.1_openscholar-8b
Llama-3.1_OpenScholar-8B is a fine-tuned 8B model for scientific literature synthesis. Llama-3.1_OpenScholar-8B is trained on the os-data dataset. Developed by: University of Washington, Allen Institute for AI (AI2)

Repository: localai · License: apache-2.0

loki-v2.6-8b-1024k
The following models were included in the merge: MrRobotoAI/Epic_Fiction-8b MrRobotoAI/Unaligned-RP-Base-8b-1024k MrRobotoAI/Loki-.Epic_Fiction.-8b Casual-Autopsy/L3-Luna-8B Casual-Autopsy/L3-Super-Nova-RP-8B Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B Casual-Autopsy/Halu-L3-Stheno-BlackOasis-8B Undi95/Llama-3-LewdPlay-8B Undi95/Llama-3-LewdPlay-8B-evo Undi95/Llama-3-Unholy-8B ChaoticNeutrals/Hathor_Tahsin-L3-8B-v0.9 ChaoticNeutrals/Hathor_RP-v.01-L3-8B ChaoticNeutrals/Domain-Fusion-L3-8B ChaoticNeutrals/T-900-8B ChaoticNeutrals/Poppy_Porpoise-1.4-L3-8B ChaoticNeutrals/Templar_v1_8B ChaoticNeutrals/Hathor_Respawn-L3-8B-v0.8 ChaoticNeutrals/Sekhmet_Gimmel-L3.1-8B-v0.3 zeroblu3/LewdPoppy-8B-RP tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b jeiku/Chaos_RP_l3_8B tannedbum/L3-Nymeria-Maid-8B Nekochu/Luminia-8B-RP vicgalle/Humanish-Roleplay-Llama-3.1-8B saishf/SOVLish-Maid-L3-8B Dogge/llama-3-8B-instruct-Bluemoon-Freedom-RP MrRobotoAI/Epic_Fiction-8b-v4 maldv/badger-lambda-0-llama-3-8b maldv/llama-3-fantasy-writer-8b maldv/badger-kappa-llama-3-8b maldv/badger-mu-llama-3-8b maldv/badger-lambda-llama-3-8b maldv/badger-iota-llama-3-8b maldv/badger-writer-llama-3-8b Magpie-Align/MagpieLM-8B-Chat-v0.1 nbeerbower/llama-3-gutenberg-8B nothingiisreal/L3-8B-Stheno-Horny-v3.3-32K nbeerbower/llama-3-spicy-abliterated-stella-8B Magpie-Align/MagpieLM-8B-SFT-v0.1 NeverSleep/Llama-3-Lumimaid-8B-v0.1 mlabonne/NeuralDaredevil-8B-abliterated mlabonne/Daredevil-8B-abliterated NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS nothingiisreal/L3-8B-Instruct-Abliterated-DWP openchat/openchat-3.6-8b-20240522 turboderp/llama3-turbcat-instruct-8b UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3 Undi95/Llama-3-LewdPlay-8B TIGER-Lab/MAmmoTH2-8B-Plus OwenArli/Awanllm-Llama-3-8B-Cumulus-v1.0 refuelai/Llama-3-Refueled SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha NousResearch/Hermes-2-Theta-Llama-3-8B ResplendentAI/Nymph_8B grimjim/Llama-3-Oasis-v1-OAS-8B flammenai/Mahou-1.3b-llama3-8B lemon07r/Llama-3-RedMagic4-8B grimjim/Llama-3.1-SuperNova-Lite-lorabilterated-8B grimjim/Llama-Nephilim-Metamorphosis-v2-8B lemon07r/Lllama-3-RedElixir-8B grimjim/Llama-3-Perky-Pat-Instruct-8B ChaoticNeutrals/Hathor_RP-v.01-L3-8B grimjim/llama-3-Nephilim-v2.1-8B ChaoticNeutrals/Hathor_Respawn-L3-8B-v0.8 migtissera/Llama-3-8B-Synthia-v3.5 Locutusque/Llama-3-Hercules-5.0-8B WhiteRabbitNeo/Llama-3-WhiteRabbitNeo-8B-v2.0 VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct iRyanBell/ARC1-II HPAI-BSC/Llama3-Aloe-8B-Alpha HaitameLaf/Llama-3-8B-StoryGenerator failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 Undi95/Llama-3-Unholy-8B ajibawa-2023/Uncensored-Frank-Llama-3-8B ajibawa-2023/SlimOrca-Llama-3-8B ChaoticNeutrals/Templar_v1_8B aifeifei798/llama3-8B-DarkIdol-2.2-Uncensored-1048K ChaoticNeutrals/Hathor_Tahsin-L3-8B-v0.9 Blackroot/Llama-3-Gamma-Twist FPHam/L3-8B-Everything-COT Blackroot/Llama-3-LongStory ChaoticNeutrals/Sekhmet_Gimmel-L3.1-8B-v0.3 abacusai/Llama-3-Smaug-8B Khetterman/CursedMatrix-8B-v9 ajibawa-2023/Scarlett-Llama-3-8B-v1.0 MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/physics_non_masked MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/electrical_engineering MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/college_chemistry MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/philosophy_non_masked MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/college_physics MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/philosophy MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/formal_logic 
MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/philosophy_100 MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/conceptual_physics MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/college_computer_science MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/psychology_non_masked MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/psychology MrRobotoAI/Unaligned-RP-Base-8b-1024k + Blackroot/Llama3-RP-Lora MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-LimaRP-Instruct-LoRA-8B MrRobotoAI/Unaligned-RP-Base-8b-1024k + nothingiisreal/llama3-8B-DWP-lora MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/world_religions MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/high_school_european_history MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/electrical_engineering MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-8B-Abomination-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-LongStory-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/human_sexuality MrRobotoAI/Unaligned-RP-Base-8b-1024k + surya-narayanan/sociology MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/Theory_of_Mind_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Smarts_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Llama-3-LongStory-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/Nimue-8B MrRobotoAI/Unaligned-RP-Base-8b-1024k + vincentyandex/lora_llama3_chunked_novel_bs128 MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/Aura_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + Azazelle/L3-Daybreak-8b-lora MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/Luna_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + nicce/story-mixtral-8x7b-lora MrRobotoAI/Unaligned-RP-Base-8b-1024k + Blackroot/Llama-3-LongStory-LORA MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/NoWarning_Llama3 MrRobotoAI/Unaligned-RP-Base-8b-1024k + ResplendentAI/BlueMoon_Llama3

Repository: localai · License: llama3.1

astrosage-70b
Developed by: AstroMLab (Tijmen de Haan, Yuan-Sen Ting, Tirthankar Ghosal, Tuan Dung Nguyen, Alberto Accomazzi, Emily Herron, Vanessa Lama, Azton Wells, Nesar Ramachandra, Rui Pan)
Funded by: Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility at Oak Ridge National Laboratory (U.S. Department of Energy); Microsoft's Accelerating Foundation Models Research (AFMR) program; World Premier International Research Center Initiative (WPI), MEXT, Japan; National Science Foundation (NSF); UChicago Argonne LLC, Operator of Argonne National Laboratory (U.S. Department of Energy).
Reference Paper: Tijmen de Haan et al. (2025). "AstroMLab 4: Benchmark-Topping Performance in Astronomy Q&A with a 70B-Parameter Domain-Specialized Reasoning Model" https://arxiv.org/abs/2505.17592
Model Type: Autoregressive transformer-based LLM, specialized in astronomy, astrophysics, space science, astroparticle physics, cosmology, and astronomical instrumentation.
Model Architecture: AstroSage-70B is a fine-tuned derivative of the Meta-Llama-3.1-70B architecture, making no architectural changes. The Llama-3.1-70B-Instruct tokenizer is also used without modification.
Context Length: Fine-tuned on 8192-token sequences; the base model was trained to a 128K context length.

AstroSage-70B is a large-scale, domain-specialized language model tailored for research and education in astronomy, astrophysics, space science, cosmology, and astronomical instrumentation. It builds on the Llama-3.1-70B foundation model, enhanced through extensive continued pre-training (CPT) on a vast corpus of astronomical literature, further refined with supervised fine-tuning (SFT) on instruction-following datasets, and finally enhanced via parameter averaging (model merging) with other popular fine-tunes. AstroSage-70B aims to achieve state-of-the-art performance on astronomy-specific tasks, providing researchers, students, and enthusiasts with an advanced AI assistant. This 70B-parameter model represents a significant scaling up from the AstroSage-8B model. The primary enhancements over AstroSage-8B are: a stronger base model with a higher parameter count for increased capacity; improved datasets; improved learning hyperparameters; and a reasoning capability that can be enabled or disabled at inference time.

Training Lineage:
Base Model: Meta-Llama-3.1-70B.
Continued Pre-Training (CPT): The base model underwent 2.5 epochs of CPT (168k GPU-hours) on a specialized astronomy corpus (largely inherited from AstroSage-8B) to produce AstroSage-70B-CPT. This stage imbues domain-specific knowledge and language nuances.
Supervised Fine-Tuning (SFT): AstroSage-70B-CPT was then fine-tuned for 0.6 epochs (13k GPU-hours) on astronomy-relevant and general-purpose instruction-following datasets, resulting in AstroSage-70B-SFT.
Final Mixture: The released AstroSage-70B model is created via parameter averaging / model merging, using DARE-TIES with rescale: true and lambda: 1.2, with AstroSage-70B-CPT designated as the "base model": 70% AstroSage-70B-SFT (density 0.7); 15% Llama-3.1-Nemotron-70B-Instruct (density 0.5); 7.5% Llama-3.3-70B-Instruct (density 0.5); 7.5% Llama-3.1-70B-Instruct (density 0.5). An illustrative configuration for this recipe is sketched below.

Intended Use: Like AstroSage-8B, this model can be used for a variety of LLM applications, including providing factual information and explanations in astronomy, astrophysics, cosmology, and instrumentation; assisting with literature reviews and summarizing scientific papers; answering domain-specific questions with high accuracy; brainstorming research ideas and formulating hypotheses; assisting with programming tasks related to astronomical data analysis; serving as an educational tool for learning astronomical concepts; and potentially forming the core of future agentic research assistants capable of more autonomous scientific tasks.

Repository: localai · License: llama3.1
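
The final-mixture recipe above can be expressed as a mergekit-style configuration. The sketch below is an illustrative reconstruction from the percentages, densities, and DARE-TIES parameters stated in the card, not the authors' published config; the Hugging Face repository IDs for the CPT and SFT checkpoints are assumptions.

```python
# Illustrative reconstruction of the DARE-TIES merge recipe described above.
# Repo IDs marked "assumed" are guesses; weights, densities, rescale, and lambda come from the card text.
import yaml

merge_config = {
    "merge_method": "dare_ties",
    "base_model": "AstroMLab/AstroSage-70B-CPT",   # CPT checkpoint named as the base in the card (assumed repo ID)
    "parameters": {"rescale": True, "lambda": 1.2},
    "models": [
        {"model": "AstroMLab/AstroSage-70B-SFT",   # assumed repo ID
         "parameters": {"weight": 0.70, "density": 0.7}},
        {"model": "nvidia/Llama-3.1-Nemotron-70B-Instruct",
         "parameters": {"weight": 0.15, "density": 0.5}},
        {"model": "meta-llama/Llama-3.3-70B-Instruct",
         "parameters": {"weight": 0.075, "density": 0.5}},
        {"model": "meta-llama/Llama-3.1-70B-Instruct",
         "parameters": {"weight": 0.075, "density": 0.5}},
    ],
    "dtype": "bfloat16",
}

with open("astrosage_merge.yaml", "w") as f:
    yaml.safe_dump(merge_config, f, sort_keys=False)

# The resulting file could then be handed to mergekit, e.g.:
#   mergekit-yaml astrosage_merge.yaml ./merged-model
```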

fuseo1-deepseekr1-qwen2.5-coder-32b-preview-v0.1
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.

Repository: localai · License: apache-2.0

fuseo1-deepseekr1-qwen2.5-instruct-32b-preview
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.

Repository: localai · License: apache-2.0

fuseo1-deepseekr1-qwq-32b-preview
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.

Repository: localai · License: apache-2.0

fuseo1-deekseekr1-qwq-skyt1-32b-preview
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.

Repository: localai

qihoo360_tinyr1-32b-preview
We introduce our first-generation reasoning model, Tiny-R1-32B-Preview, which outperforms the 70B model Deepseek-R1-Distill-Llama-70B and nearly matches the full R1 model in math. We applied supervised fine-tuning (SFT) to Deepseek-R1-Distill-Qwen-32B across three target domains (Mathematics, Code, and Science) using the 360-LLaMA-Factory training framework to produce three domain-specific models. Questions from open-source data served as seeds, and responses for the mathematics, coding, and science tasks were generated by R1, yielding a specialized model for each domain. Building on this, we leveraged the Mergekit tool from the Arcee team to combine the three models into Tiny-R1-32B-Preview, which demonstrates strong overall performance.

Repository: localai · License: apache-2.0

einstein-v4-7b
🔬 Einstein-v4-7B is a fully fine-tuned version of mistralai/Mistral-7B-v0.1 trained on diverse datasets. The model was fine-tuned on 7x RTX 3090 + 1x RTX A6000 GPUs using axolotl.

Repository: localai