Model Gallery

21 models from 1 repository

qwen-3-32b-medical-reasoning-i1
This is https://huggingface.co/kingabzpro/Qwen-3-32B-Medical-Reasoning applied to https://huggingface.co/Qwen/Qwen3-32B. Original model card by @kingabzpro: this project fine-tunes the Qwen/Qwen3-32B model on a medical reasoning dataset (FreedomIntelligence/medical-o1-reasoning-SFT) with 4-bit quantization for memory-efficient training.
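As a rough illustration of why 4-bit quantization saves memory, the sketch below quantizes a weight vector to 16 signed integer levels plus one shared scale. This is an assumption-laden toy (simple absmax rounding), not the NF4 scheme bitsandbytes actually uses for this model:

```python
# Toy absmax 4-bit quantization: each weight becomes one of 16 signed levels
# in [-8, 7], so storage drops from 16 bits per parameter (fp16) to 4 bits
# plus one shared fp scale per block. NOT the real NF4 codebook scheme.

def quantize_4bit(weights):
    """Quantize a list of floats to signed 4-bit integers in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.07, -0.21]
q, scale = quantize_4bit(weights)
recovered = dequantize_4bit(q, scale)
```

The rounding error per weight is bounded by half the scale step, which is the trade-off that makes training a 32B model feasible on far less VRAM.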

Repository: localai | License: apache-2.0

sao10k_llama-3.3-70b-vulpecula-r1
🌟 A thinking-based model inspired by Deepseek-R1, trained through both SFT and a small amount of RL on creative writing data.
🧠 Prefill, i.e. begin assistant replies with, \n to activate thinking mode, or not; it works well without thinking too.
🚀 Improved steerability, instruct-roleplay and creative control over the base model.
👾 Semi-synthetic chat/roleplaying datasets that have been remade, cleaned and filtered for repetition, quality and output.
🎭 Human-based natural chat/roleplaying datasets cleaned, filtered and checked for quality.
📝 Diverse instruct dataset from a few different LLMs, cleaned and filtered for refusals and quality.
💭 Reasoning traces taken from Deepseek-R1 for instruct, chat and creative tasks, filtered and cleaned for quality.
█▓▒ Toxic/decensorship data was not needed for our purposes; the model is unrestricted enough as is.
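A minimal sketch of the prefill trick described above. The ChatML-style template and the `build_prompt` helper are illustrative assumptions (the model ships its own Llama 3.3 chat template); the point is that the assistant turn is left open and seeded with \n so the model continues in thinking mode:

```python
# Sketch of assistant-turn prefilling, using an ASSUMED ChatML-style template
# purely for illustration. The assistant turn is left unterminated and seeded
# with the prefill string, which the model then continues.

def build_prompt(messages, prefill=""):
    """Render chat messages and open an assistant turn seeded with `prefill`."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n" + prefill)  # no <|im_end|>: model continues here
    return "\n".join(parts)

prompt = build_prompt(
    [{"role": "user", "content": "Write a short scene set in a lighthouse."}],
    prefill="\n",  # the leading newline is what nudges the model into thinking mode
)
```

Passing an empty `prefill` gives the normal non-thinking behavior, which the card says also works well.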

Repository: localai | License: llama3.3

tarek07_legion-v2.1-llama-70b
My biggest merge yet, consisting of a total of 20 specially curated models. My methodology was to create 5 highly specialized models:

- A completely uncensored base
- A very intelligent model based on UGI, Willingness and NatInt scores on the UGI Leaderboard
- A highly descriptive writing model, specializing in creative and natural prose
- An RP model specially merged with fine-tuned models that use a lot of RP datasets
- The secret ingredient: a completely unhinged, uncensored final model

These five models went through a series of iterations until I got something I thought worked well, and I then combined them to make LEGION. The full list of models used in this merge:

- TheDrummer/Fallen-Llama-3.3-R1-70B-v1
- Sao10K/Llama-3.3-70B-Vulpecula-r1
- Sao10K/L3-70B-Euryale-v2.1
- SicariusSicariiStuff/Negative_LLAMA_70B
- allura-org/Bigger-Body-70b
- Sao10K/70B-L3.3-mhnnn-x1
- Sao10K/L3.3-70B-Euryale-v2.3
- Doctor-Shotgun/L3.3-70B-Magnum-v4-SE
- Sao10K/L3.1-70B-Hanami-x1
- Sao10K/70B-L3.3-Cirrus-x1
- EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
- TheDrummer/Anubis-70B-v1
- ArliAI/Llama-3.3-70B-ArliAI-RPMax-v1.4
- LatitudeGames/Wayfarer-Large-70B-Llama-3.3
- NeverSleep/Lumimaid-v0.2-70B
- mlabonne/Hermes-3-Llama-3.1-70B-lorablated
- ReadyArt/Forgotten-Safeword-70B-3.6
- ReadyArt/Fallen-Abomination-70B-R1-v4.1
- ReadyArt/Fallen-Safeword-70B-R1-v4.1
- huihui-ai/Llama-3.3-70B-Instruct-abliterated

Repository: localai | License: llama3.3

l3.1-8b-niitama-v1.1-iq-imatrix
GGUF-IQ-Imatrix quants for Sao10K/L3.1-8B-Niitama-v1.1. Here's the subjectively superior L3 version: L3-8B-Niitama-v1. An experimental model using experimental methods. More detail on it: Tamamo and Niitama are made from the same data. Literally. The only thing that's changed is how they're shuffled and formatted. Yet I get wildly different results. Interesting, eh? It feels not quite as good as the L3 version, but it's alright.

Repository: localai | License: unlicense

l3.1-ms-astoria-70b-v2
This model is a remake of the original Astoria with modern models and context sizes. Its goal is to merge the robust storytelling of multiple models while attempting to maintain intelligence. Use Llama 3 format or meth format (Llama 3 refuses to work with stepped thinking, but meth works).

- migtissera/Tess-3-Llama-3.1-70B
- NeverSleep/Lumimaid-v0.2-70B
- Sao10K/L3.1-70B-Euryale-v2.2
- ArliAI/Llama-3.1-70B-ArliAI-RPMax-v1.2
- nbeerbower/Llama3.1-Gutenberg-Doppel-70B

Repository: localai | License: llama3.1

skywork-o1-open-llama-3.1-8b
We are excited to announce the release of the Skywork o1 Open model series, developed by the Skywork team at Kunlun Inc. This groundbreaking release introduces a series of models that incorporate o1-like slow thinking and reasoning capabilities. The Skywork o1 Open model series includes three advanced models:

- Skywork o1 Open-Llama-3.1-8B: a robust chat model trained on Llama-3.1-8B, enhanced significantly with "o1-style" data to improve reasoning skills.
- Skywork o1 Open-PRM-Qwen-2.5-1.5B: a specialized model designed to enhance reasoning capability through incremental process rewards, ideal for complex problem solving at a smaller scale.
- Skywork o1 Open-PRM-Qwen-2.5-7B: extends the capabilities of the 1.5B model, scaling up to handle more demanding reasoning tasks and pushing the boundaries of AI reasoning.

Different from mere reproductions of the OpenAI o1 model, the Skywork o1 Open model series not only exhibits innate thinking, planning, and reflecting capabilities in its outputs, but also shows significant improvements in reasoning skills on standard benchmarks. This series represents a strategic advancement in AI capabilities, moving a previously weaker base model towards the state of the art (SOTA) in reasoning tasks.

Repository: localai | License: llama3.1

huatuogpt-o1-8b
HuatuoGPT-o1 is a medical LLM designed for advanced medical reasoning. It generates a complex thought process, reflecting and refining its reasoning, before providing a final response. For more information, visit our GitHub repository: https://github.com/FreedomIntelligence/HuatuoGPT-o1.

Repository: localai | License: apache-2.0

deepseek-r1-distill-llama-8b
DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks. Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing. By incorporating cold-start data prior to the RL phase, DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.

Repository: localai | License: llama3.1

tarek07_nomad-llama-70b
I decided to make a simple model for a change, with some models I was curious to see work together.

Models:
- ArliAI/DS-R1-Distill-70B-ArliAI-RpR-v4-Large
- TheDrummer/Anubis-70B-v1.1
- Mawdistical/Vulpine-Seduction-70B
- Darkhn/L3.3-70B-Animus-V5-Pro
- zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B
- Sao10K/Llama-3.3-70B-Vulpecula-r1

Base model: nbeerbower/Llama-3.1-Nemotron-lorablated-70B

Repository: localai | License: llama3.3

deepseek-r1-distill-qwen-1.5b
DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks. Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing. By incorporating cold-start data prior to the RL phase, DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.

Repository: localai

deepseek-r1-distill-qwen-7b
DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks. Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing. By incorporating cold-start data prior to the RL phase, DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.

Repository: localai | License: mit

deepseek-r1-distill-qwen-14b
DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks. Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing. By incorporating cold-start data prior to the RL phase, DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.

Repository: localai

deepseek-r1-distill-qwen-32b
DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks. Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing. By incorporating cold-start data prior to the RL phase, DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.

Repository: localai

deepseek-r1-distill-llama-70b
DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks. Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing. By incorporating cold-start data prior to the RL phase, DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.

Repository: localai

fuseo1-deepseekr1-qwen2.5-coder-32b-preview-v0.1
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.
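SCE itself selects and scales parameters per tensor; as a much simpler, hedged illustration of the model-fusion idea, this sketch uniformly averages several checkpoints element by element (a plain "model soup"-style merge, not the actual SCE method):

```python
# Illustrative uniform merge of checkpoints with identical parameter layouts.
# This is a toy element-wise average, NOT the SCE (Select-Calculate-Erase)
# procedure FuseO1 describes; tensors are stood in for by plain lists.

def average_merge(state_dicts):
    """Uniformly average a list of state dicts sharing the same keys."""
    merged = {}
    for key in state_dicts[0]:
        tensors = [sd[key] for sd in state_dicts]
        merged[key] = [sum(vals) / len(tensors) for vals in zip(*tensors)]
    return merged

m1 = {"layer.weight": [1.0, 2.0]}
m2 = {"layer.weight": [3.0, 4.0]}
merged = average_merge([m1, m2])
```

Fusion methods like SCE differ from this baseline mainly in how they weight and select which source model contributes each parameter.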

Repository: localai | License: apache-2.0

fuseo1-deepseekr1-qwen2.5-instruct-32b-preview
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.

Repository: localai | License: apache-2.0

fuseo1-deepseekr1-qwq-32b-preview
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.

Repository: localai | License: apache-2.0

fuseo1-deekseekr1-qwq-skyt1-32b-preview
FuseO1-Preview is our initial endeavor to enhance the System-II reasoning capabilities of large language models (LLMs) through innovative model fusion techniques. By employing our advanced SCE merging methodologies, we integrate multiple open-source o1-like LLMs into a unified model. Our goal is to incorporate the distinct knowledge and strengths from different reasoning LLMs into a single, unified model with strong System-II reasoning abilities, particularly in mathematics, coding, and science domains.

Repository: localai

knoveleng_open-rs3
This repository hosts the model for the Open RS project, accompanying the paper Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't. The project explores enhancing reasoning capabilities in small large language models (LLMs) using reinforcement learning (RL) under resource-constrained conditions. We focus on a 1.5-billion-parameter model, DeepSeek-R1-Distill-Qwen-1.5B, trained on 4 NVIDIA A40 GPUs (48 GB VRAM each) within 24 hours. By adapting the Group Relative Policy Optimization (GRPO) algorithm and leveraging a curated, compact mathematical reasoning dataset, we conducted three experiments to assess performance and behavior. Key findings include:

- Significant reasoning improvements, e.g., AMC23 accuracy rising from 63% to 80% and AIME24 reaching 46.7%, outperforming o1-preview.
- Efficient training with just 7,000 samples at a cost of $42, compared to thousands of dollars for baseline models.
- Challenges such as optimization instability and length constraints with extended training.

These results showcase RL-based fine-tuning as a cost-effective approach for small LLMs, making reasoning capabilities accessible in resource-limited settings. We open-source our code, models, and datasets to support further research.
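The heart of GRPO is a group-relative advantage: sample several completions per prompt, then normalize each completion's reward against the group's mean and standard deviation, which removes the need for a learned value function. A minimal sketch of that computation (illustrative only, not the project's training code):

```python
# Group-relative advantage as used in GRPO-style training: rewards for a
# group of sampled completions to the SAME prompt are standardized within
# the group, so "better than my siblings" becomes the learning signal.

def grpo_advantages(rewards, eps=1e-8):
    """Standardize a group of rewards: (r - mean) / (std + eps)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# e.g. 4 sampled solutions to one math problem, scored 1.0 if correct else 0.0
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct completions get positive advantages and incorrect ones negative, with the advantages summing to zero within each group; the policy gradient then pushes probability toward the winners.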

Repository: localai | License: mit

mn-12b-lyra-v4-iq-imatrix
A finetune of Mistral Nemo by Sao10K. Uses the ChatML prompt format.

Repository: localai | License: cc-by-nc-4.0