Model Gallery

20 models from 1 repository

claria-14b
Claria 14b is a lightweight, mobile-compatible language model fine-tuned for psychological and psychiatric support contexts. Built on Qwen-3 (14b), Claria is designed as an experimental foundation for therapeutic dialogue modeling, student simulation training, and the future of personalized mental health AI augmentation. This model does not aim to replace professional care. It exists to amplify reflective thinking, model therapeutic language flow, and support research into emotionally aware AI. Claria is the first whisper in a larger project—a proof-of-concept with roots in recursion, responsibility, and renewal.

Repository: localai · License: apache-2.0

qwen3-55b-a3b-total-recall-deep-40x
WARNING: MADNESS - UNHINGED and NSFW. Vivid prose. INTENSE. Visceral details. Violence. HORROR. GORE. Swearing. UNCENSORED... humor, romance, fun.

Qwen3-55B-A3B-TOTAL-RECALL-Deep-40X-GGUF is a highly experimental model ("tamer" versions below) based on Qwen3-30B-A3B (MoE, 128 experts, 8 activated) with Brainstorm 40X (by DavidAU - details at the bottom of this page). These modifications expand the model (V1) to 87 layers, 1046 tensors and 55B parameters; some versions are smaller, with fewer layers/tensors and lower parameter counts. The adapter extensively alters performance, reasoning and output generation, with exceptional changes in creative, prose and general performance. Regens of the same prompt - even with the same settings - will be very different.

Three example generations below are creative; one further example (#4) is non-creative (all generated with Q3_K_M, V1 model).

You can run this model on CPU and/or GPU thanks to its unique construction: only 8 experts (3B parameters) are activated at a time, which translates into roughly 6B active parameters in this version. Two quants are uploaded for testing: Q3_K_M and Q4_K_M. V3, V4 and V5 are also available in both quants; V2 and V6 in Q3_K_M only, as are V1.3, V1.4, V1.5, V1.7 and V7 (newest). NOTE: V2 and up derive from source model 2; V1, V1.3, V1.4, V1.5 and V1.7 derive from source model 1.
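The entry above notes that only 8 of 128 experts are activated per token, which is why a 55B-parameter MoE runs with only a few billion active parameters. As an illustrative sketch of top-k expert gating (not this model's actual router implementation):

```python
import math

def top_k_route(logits, k=8):
    """Pick the k highest-scoring experts and softmax-normalize
    their gate weights; all other experts stay inactive."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = {i: math.exp(logits[i]) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}

# With 128 experts, only the 8 selected ones contribute parameters
# to the forward pass for this token.
weights = top_k_route([float(i) for i in range(128)], k=8)
```

Each token thus touches only the selected experts' weights, so compute cost scales with active (not total) parameters.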

Repository: localai · License: apache-2.0

compumacy-experimental-32b
A Specialized Language Model for Clinical Psychology & Psychiatry Compumacy-Experimental_MF is an advanced, experimental large language model fine-tuned to assist mental health professionals in clinical assessment and treatment planning. By leveraging the powerful unsloth/Qwen3-32B as its base, this model is designed to process complex clinical vignettes and generate structured, evidence-based responses that align with established diagnostic manuals and practice guidelines. This model is a research-focused tool intended to augment, not replace, the expertise of a licensed clinician. It systematically applies diagnostic criteria from the DSM-5-TR, references ICD-11 classifications, and cites peer-reviewed literature to support its recommendations.

Repository: localai · License: apache-2.0

aquif-ai_aquif-3.5-8b-think
The aquif-3.5 series is the successor to aquif-3, featuring a simplified naming scheme, expanded Mixture of Experts (MoE) options, and across-the-board performance improvements. This release streamlines model selection while delivering enhanced capabilities across reasoning, multilingual support, and general intelligence tasks. An experimental small-scale Mixture of Experts model designed for multilingual applications with minimal computational overhead. Despite its compact active parameter count, it demonstrates competitive performance against larger dense models.

Repository: localai · License: apache-2.0

l3.3-geneticlemonade-unleashed-v2-70b
An experimental release. zerofata/GeneticLemonade-Unleashed, QLoRA-trained on a test dataset. Performance is improved over the original in my testing, but there are possibly (likely?) areas where the model will underperform, and I am looking for feedback on these. This is a creative model intended to excel at character-driven RP / ERP. It has not been tested or trained on adventure stories or any large amounts of creative writing.

Repository: localai · License: llama3

fireball-meta-llama-3.2-8b-instruct-agent-003-128k-code-dpo
This model is a quantized version of EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO, an experimental fine-tune with a DPO dataset that turns Llama 3.1 8B into an agentic coder. It has some built-in agent features such as search, calculator, and ReAct. Other notable features include self-learning using Unsloth, RAG applications, and memory. The context window of the model is 128K. It can be integrated into projects using popular libraries like Transformers and vLLM, and is suitable for use with LangChain or LlamaIndex. The model is developed by EpistemeAI and licensed under the Apache 2.0 license.

Repository: localai · License: apache-2.0

agi-0_art-skynet-3b
Art-Skynet-3B is an experimental model in the Art (Auto Regressive Thinker) series, fine-tuned to simulate strategic reasoning with concealed long-term objectives. Built on meta-llama/Llama-3.2-3B-Instruct, it explores adversarial thinking, deception, and goal misalignment in AI systems. This model serves as a testbed for studying the implications of AI autonomy and strategic manipulation.

Repository: localai · License: llama3.2

l3.1-70b-glitz-v0.2-i1
This is an experimental L3.1 70B finetuning run... that crashed midway through. However, the results are still interesting, so I wanted to publish them :3

Repository: localai · License: llama3.1

l3.1-8b-niitama-v1.1-iq-imatrix
GGUF-IQ-Imatrix quants for Sao10K/L3.1-8B-Niitama-v1.1. Here's the subjectively superior L3 version: L3-8B-Niitama-v1. An experimental model using experimental methods. More detail on it: Tamamo and Niitama are made from the same data. Literally. The only thing that's changed is how they're shuffled and formatted. Yet, I get wildly different results. Interesting, eh? Feels kinda not as good compared to the L3 version, but it's aight.

Repository: localai · License: unlicense

control-8b-v1.1
An experimental finetune based on Llama 3.1 8B Supernova, with the primary goal of being "Short and Sweet". To achieve this, the model was finetuned for 2 epochs on the ShareGPT-converted OpenCAI dataset and the RP-logs datasets. This version of Control has additionally been finetuned with DPO to improve smarts and coherency, a flaw noticed in the previous model.

Repository: localai · License: llama3.1

l3.1-aspire-heart-matrix-8b
ZeroXClem/L3-Aspire-Heart-Matrix-8B is an experimental language model crafted by merging three high-quality 8B parameter models using the Model Stock Merge method. This synthesis leverages the unique strengths of Aspire, Heart Stolen, and CursedMatrix, creating a highly versatile and robust language model for a wide array of tasks.

Repository: localai · License: apache-2.0

ockerman0_anubislemonade-70b-v1.1
Another experimental merge between Drummer's Anubis v1.1 and sophosympatheia's StrawberryLemonade v1.2 with the goal of finding a nice balance between each model's qualities. Feedback is highly encouraged! Recommended samplers are a Temperature of 1 and Min-P of 0.025, though feel free to experiment otherwise.
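The recommended Min-P of 0.025 means that, before sampling, any token whose probability is below 2.5% of the most likely token's probability is discarded and the remainder renormalized. A minimal sketch of that filter, for illustration only (inference engines implement this over logits internally):

```python
def min_p_filter(probs, min_p=0.025):
    """Drop tokens with probability < min_p * max(probs),
    then renormalize the survivors to sum to 1."""
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    total = sum(kept.values())
    return {i: p / total for i, p in kept.items()}

# Example: the last token (0.001) falls below 0.025 * 0.6 = 0.015 and is dropped.
filtered = min_p_filter([0.6, 0.3, 0.05, 0.001], min_p=0.025)
```

Unlike a fixed top-p cutoff, the threshold scales with the model's confidence, so a high Temperature of 1 can stay creative without admitting garbage tokens.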

Repository: localai · License: llama3.1

darkens-8b
This is the fully cooked, 4-epoch version of Tor-8B. This is an experimental version; despite being trained for 4 epochs, the model feels fresh and new and is not overfit. This model aims for generally good prose and writing while not falling into Claude-isms, and it follows the actions "dialogue" format heavily.

Repository: localai · License: agpl-3.0

magnum-12b-v2.5-kto-i1
v2.5 KTO is an experimental release; we are testing a hybrid reinforcement learning strategy of KTO + DPOP. For "rejected", we use data sampled from the original model; for "chosen", we use data from the original finetuning dataset. This was done on a limited portion of primarily instruction-following data; we plan to scale up to a larger KTO dataset in the future for better generalization. This is the 5th in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of anthracite-org/magnum-12b-v2.

Repository: localai · License: apache-2.0

zerofata_ms3.2-paintedfantasy-visage-33b
Another experimental release. Mistral Small 3.2 24B upscaled by 18 layers to create a 33.6B model. This model then went through pretraining, SFT & DPO. Can't guarantee the Mistral 3.2 repetition issues are fixed, but this model seems to be less repetitive than my previous attempt. This is an uncensored creative model intended to excel at character driven RP / ERP where characters are portrayed creatively and proactively.

Repository: localai · License: apache-2.0

qwen3-tnd-double-deckard-a-c-11b-220-i1
**Model Name:** Qwen3-TND-Double-Deckard-A-C-11B-220
**Base Model:** Qwen3-DND-Jan-v1-256k-ctx-Brainstorm40x-8B
**Size:** 11.2 billion parameters
**Architecture:** Transformer-based, instruction-tuned, with enhanced reasoning via "Brainstorm 40x" expansion
**Context Length:** Up to 256,000 tokens
**Training Method:** Fine-tuned using the "PDK" (Philip K. Dick) datasets via Unsloth, merged from two variants (A & C), followed by light repair training

**Key Features:**
- **Triple Neuron Density:** Expanded to 108 layers and 1,190 tensors (nearly 3x the density of a standard Qwen3 8B model), enhancing detail, coherence, and world-modeling.
- **Brainstorm 40x Process:** A custom architectural refinement that splits, reassembles, and calibrates reasoning centers 40 times to improve nuance, emotional depth, and prose quality without sacrificing instruction-following.
- **Highly Creative & Reasoning-Optimized:** Excels at long-form storytelling, complex problem-solving, and detailed code generation with strong focus, reduced clichés, and vivid descriptions.
- **Template Support:** Uses Jinja or ChatML formatting for structured prompts and dialogues.

**Best For:**
- Advanced creative writing, worldbuilding, and narrative generation
- Multi-step reasoning and complex coding tasks
- Roleplay, brainstorming, and deep conceptual exploration
- Users seeking high-quality, human-like prose with rich internal logic

**Notes:**
- This is a full-precision source model (safetensors format), **not quantized**, ideal for developers and researchers.
- Quantized versions (GGUF, GPTQ, etc.) are available separately from the community (e.g., @mradermacher).
- Recommended for high-end inference setups; best results with Q6+ quantizations for complex tasks.

**License:** Apache 2.0
**Repository:** [DavidAU/Qwen3-TND-Double-Deckard-A-C-11B-220](https://huggingface.co/DavidAU/Qwen3-TND-Double-Deckard-A-C-11B-220)

> *A bold, experimental evolution of Qwen3, crafted for depth, precision, and creative power.*
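The entry lists ChatML as a supported prompt format. For reference, here is a minimal helper that renders a message list in the ChatML layout; this is an illustrative sketch only, and the model's bundled Jinja chat template remains authoritative:

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts in ChatML markup:
    each turn is wrapped in <|im_start|>role ... <|im_end|>."""
    out = []
    for m in messages:
        out.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Open an assistant turn to cue the model to respond.
    out.append("<|im_start|>assistant\n")
    return "\n".join(out)

prompt = to_chatml([
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hi"},
])
```

When loading through Transformers, `tokenizer.apply_chat_template` performs this step using the template shipped with the model.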

Repository: localai · License: apache-2.0

magidonia-24b-v4.2.0-i1
**Model Name:** Magidonia 24B v4.2.0
**Base Model:** mistralai/Magistral-Small-2509
**Author:** TheDrummer
**License:** MIT (as per standard for Hugging Face models)
**Model Type:** Fine-tuned large language model (LLM)
**Size:** 24 billion parameters

**Description:** Magidonia 24B v4.2.0 is a creatively oriented, open-weight fine-tuned language model developed by TheDrummer. Built upon the **Magistral-Small-2509** base, this model emphasizes **creativity, narrative dynamism, and expressive language use**, ideal for storytelling, roleplay, and imaginative writing. It features enhanced reasoning with a built-in **THINKING MODE**, activated via dedicated thinking tokens that encourage a detailed inner monologue before response generation. Designed for flexibility and minimal alignment constraints, it's well-suited for entertainment, world-building, and experimental use cases.

**Key Features:**
- Strong creative and literary capabilities
- Supports structured thinking via special tokens
- Optimized for roleplay and dynamic storytelling
- Available in GGUF format for local inference (via llama.cpp, etc.)
- Includes iMatrix quantization for high-quality low-precision performance

**Use Case:** Ideal for writers, game masters, and AI artists seeking expressive, unfiltered, and imaginative language models.

**Repository:** [TheDrummer/Magidonia-24B-v4.2.0](https://huggingface.co/TheDrummer/Magidonia-24B-v4.2.0)
**Quantized Version (GGUF):** [mradermacher/Magidonia-24B-v4.2.0-i1-GGUF](https://huggingface.co/mradermacher/Magidonia-24B-v4.2.0-i1-GGUF) *(for reference only; use the original for the full description)*

Repository: localai · License: apache-2.0

verbamaxima-12b-i1
**VerbaMaxima-12B** is a highly experimental large language model created through advanced merging techniques using [mergekit](https://github.com/cg123/mergekit). It is based on *natong19/Mistral-Nemo-Instruct-2407-abliterated* and further refined by combining multiple 12B-scale models, including *TheDrummer/UnslopNemo-12B-v4*, *allura-org/Tlacuilo-12B*, and *Trappu/Magnum-Picaro-0.7-v2-12b*, using **model_stock** and **task arithmetic** with a negative lambda for creative deviation.

The result is a model designed for nuanced, believable storytelling with reduced "purple prose" and enhanced world-building. It excels in roleplay and co-writing scenarios, offering a more natural, less theatrical tone. While experimental and not fully optimized, it delivers a unique, expressive voice ideal for creative and narrative-driven applications.

> ✅ **Base Model**: natong19/Mistral-Nemo-Instruct-2407-abliterated
> 🔄 **Merge Method**: Task Arithmetic + Model Stock
> 📌 **Use Case**: Roleplay, creative writing, narrative generation
> 🧪 **Status**: Experimental, high potential, not production-ready

*Note: This is the original, unquantized model. The GGUF version (mradermacher/VerbaMaxima-12B-i1-GGUF) is a quantized derivative for inference on local hardware.*
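Task arithmetic, as used in this merge, adds scaled differences between fine-tuned and base weights to the base model; a negative lambda steers the merge *away* from the task direction rather than toward it, which is the "creative deviation" mentioned above. A toy sketch on plain Python lists (real merges operate tensor-by-tensor via mergekit):

```python
def task_arithmetic(base, finetunes, lam):
    """merged = base + lam * sum_i (finetune_i - base).
    Each (finetune - base) is a "task vector"; a negative lam
    subtracts the task direction instead of adding it."""
    merged = list(base)
    for ft in finetunes:
        for j, (b, f) in enumerate(zip(base, ft)):
            merged[j] += lam * (f - b)
    return merged

# Toy weights: one fine-tune, negative scaling factor.
merged = task_arithmetic([1.0, 2.0], [[2.0, 4.0]], lam=-0.5)
```

With `lam = -0.5`, each merged weight moves halfway *opposite* to the fine-tune's delta from the base.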

Repository: localai · License: apache-2.0

gemma-3-the-grand-horror-27b
The **Gemma-3-The-Grand-Horror-27B-GGUF** model is a **fine-tuned version** of Google's **Gemma 3 27B** language model, specifically optimized for **extreme horror-themed text generation**. It was trained using the **Unsloth framework** on a custom in-house dataset of horror content, resulting in a model that produces vivid, graphic, and psychologically intense narratives featuring gore, madness, and disturbing imagery, often even when prompts don't explicitly request horror.

Key characteristics:
- **Base Model**: Gemma 3 27B (original by Google, not the quantized version)
- **Fine-tuned For**: High-intensity horror storytelling, long-form narrative generation, and immersive scene creation
- **Use Case**: Creative writing, horror RP, dark fiction, and experimental storytelling
- **Not Suitable For**: General use, children, sensitive audiences, or content requiring a neutral/positive tone
- **Quantization**: Available in GGUF format (e.g., q3k, q4, etc.), making it accessible for local inference on consumer hardware

> ✅ **Note**: This model card is for a **quantized, fine-tuned derivative**, not the original. The true base model is **Gemma 3 27B**, available at: https://huggingface.co/google/gemma-3-27b

This model is not for all audiences: it generates content with a consistently dark, unsettling tone. Use responsibly.

Repository: localai · License: gemma

evilmind-24b-v1-i1
**Evilmind-24B-v1** is a large language model created by merging two 24B-parameter models, **BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly** and **Rivermind-24B-v1**, using SLERP interpolation (t=0.5) to combine their strengths. Built on the Mistral architecture, this model excels in creative, uncensored, and realistic text generation, with a distinctive voice that leans into edgy, imaginative, and often provocative content. The merge leverages the narrative depth and stylistic flair of both source models, producing a highly expressive and versatile AI capable of generating rich, detailed, and unconventional outputs. Designed for advanced users, it's ideal for storytelling, roleplay, and experimental writing, though it may contain NSFW or controversial content.

> 🔍 *Note: This is the original base model. The GGUF quantized version hosted by mradermacher is a derivative (quantized for inference) and not the original author's release.*
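SLERP at t=0.5 walks halfway along the great-circle arc between the two models' weight vectors rather than averaging them linearly, which better preserves the geometry of the weights being merged. An illustrative implementation (a sketch only; mergekit performs this per-tensor with its own edge-case handling):

```python
import math

def slerp(v0, v1, t):
    """Spherical linear interpolation between two weight vectors.
    The angle is measured between the normalized directions."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    dot = sum((a / n0) * (b / n1) for a, b in zip(v0, v1))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding
    theta = math.acos(dot)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to plain lerp.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s = math.sin(theta)
    return [(math.sin((1 - t) * theta) / s) * a + (math.sin(t * theta) / s) * b
            for a, b in zip(v0, v1)]

# t=0.5 lands midway along the arc between the two vectors.
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

For two orthogonal unit vectors, t=0.5 yields equal components of sqrt(0.5) each, i.e. the true angular midpoint, where a linear average would instead shrink the result toward the origin.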

Repository: localai · License: apache-2.0