Model Gallery

4 models from 1 repository


kalomaze_qwen3-16b-a3b
A man-made horror beyond your comprehension. But no, seriously, this is my experiment to: measure the probability that any given expert will activate (over my personal set of fairly diverse calibration data), per layer; then prune the 64 least-used of the 128 experts in each layer (with the router and expert indexing reordered per layer). The result can still write semi-coherently without any additional training or distillation on top of the original 30B MoE. The .txt files with the original measurements are provided in the repo along with the exported weights. Custom testing to measure the experts was done on a hacked version of vLLM, and a bespoke script then selectively exported the weights according to the measurements.
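The pruning step described above can be sketched as follows. This is an illustrative reconstruction, not the author's actual export script: `activation_counts` is an assumed data layout holding, per layer, how often each expert fired over the calibration data; the function keeps the most-used experts and builds the per-layer index remap the reordered router would need.

```python
# Hypothetical sketch of usage-based MoE expert pruning.
# activation_counts: one list per layer; activation_counts[layer][e] is how
# many times expert e was routed to over the calibration set.

def prune_experts(activation_counts, keep=64):
    """For each layer, return (kept expert ids, old->new index remap)."""
    plans = []
    for counts in activation_counts:
        # Rank experts by measured usage, most-activated first.
        ranked = sorted(range(len(counts)), key=lambda e: counts[e], reverse=True)
        kept = sorted(ranked[:keep])  # keep the top `keep` experts, in id order
        # Dense remap so router indices stay contiguous after pruning.
        remap = {old: new for new, old in enumerate(kept)}
        plans.append((kept, remap))
    return plans
```

With the card's numbers (`keep=64` out of 128 per layer), the remap is what lets the exported weights and the reordered router agree on expert indices.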

Repository: localai · License: apache-2.0

qwen3-42b-a3b-stranger-thoughts-deep20x-abliterated-uncensored-i1
WARNING: NSFW. Vivid prose. INTENSE. Visceral details. Violence. Horror. Gore. Swearing. Uncensored... humor, romance, fun.

Qwen3-42B-A3B-Stranger-Thoughts-Deep20x-Abliterated-Uncensored. This repo contains the full-precision source weights, in safetensors format, for generating GGUF, GPTQ, EXL2, AWQ, HQQ and other formats. The source weights can also be used directly.

ABOUT: Qwen's excellent "Qwen3-30B-A3B", abliterated by "huihui-ai", then combined with Brainstorm 20x (tech notes at the bottom of the page) in a MoE (128 experts) at 42B parameters (up from 30B). This pushes Qwen's abliterated/uncensored model to the absolute limit for creative use cases. Prose (all), reasoning, thinking... all will be very different from regular "Qwen 3s". This model will generate horror, fiction, erotica - you name it - in vivid, stark detail. It will NOT hold back. Likewise, regens of the same prompt - even at the same settings - will create very different versions. See four examples below. The model retains the full reasoning and output generation of a Qwen3 MoE, but has not been tested for "non-creative" use cases. The model ships with Qwen's default config: 40k context, 8 of 128 experts activated, ChatML or Jinja template (embedded).

IMPORTANT: See the usage guide / repo below to get the most out of this model, as settings are very specific.

USAGE GUIDE: Please refer to this model card for specific usage, suggested settings, changing ACTIVE EXPERTS, templates and the like; how to maximize this model in "uncensored" form, with specific notes on "abliterated" models; and rep-pen / temperature settings specific to getting the model to perform strongly: https://huggingface.co/DavidAU/Qwen3-18B-A3B-Stranger-Thoughts-Abliterated-Uncensored-GGUF

GGUF / QUANTS / SPECIAL SHOUTOUT: Special thanks to team Mradermacher for making the quants! https://huggingface.co/mradermacher/Qwen3-42B-A3B-Stranger-Thoughts-Deep20x-Abliterated-Uncensored-GGUF

KNOWN ISSUES: The model may occasionally mis-capitalize words (lowercase where uppercase should be) and may occasionally add an extra space before a word. An incorrect template and/or settings will result in a drop in performance / poor performance.
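The "42B-A3B" naming reflects standard MoE arithmetic: only the routed experts run per token, so active parameters are far fewer than total parameters. A small illustrative calculation, where the split between shared (attention/embedding) and expert parameters is an assumed figure, not one stated on the card:

```python
# Illustrative MoE active-parameter estimate. The `shared` figure below is an
# assumption for demonstration; the card only states 42B total and 8/128
# experts active.

def active_params(total, shared, num_experts, active_experts):
    """Estimate per-token active parameters for a MoE model."""
    expert_params = total - shared           # parameters living in the experts
    per_expert = expert_params / num_experts
    return shared + active_experts * per_expert

# e.g. 42B total, an assumed 2B shared, 8 of 128 experts active
# -> 2B + 8 * (40B / 128) = 4.5B active per token
print(active_params(42e9, 2e9, 128, 8))
```

Raising the active-expert count (as the usage guide discusses) increases quality and per-token compute together, since each additional expert adds another `per_expert` block to every forward pass.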

Repository: localai · License: apache-2.0

gemma-3-4b-it-max-horror-uncensored-dbl-x-imatrix
Google's newest Gemma-3 model, uncensored by David_AU (maintains instruction following / model performance and adds 4 layers to the model) and reinforced with an optional system prompt - see below. The "Horror Imatrix" was built using Grand Horror 16B (at my repo); this adds a "tint" of horror to the model. Five examples (NSFW / F-bombs galore) are provided below with prompts, at IQ4XS (56 t/s on a mid-level card). Context: 128k.

"MAXED": the embed and output tensors are set at BF16 (full precision) for all quants. This enhances quality, depth and general performance at the cost of a slightly larger quant.

"HORROR IMATRIX": a strong, in-house imatrix dataset built by David_AU which results in better overall function, instruction following, output quality and stronger connections to ideas, concepts and the world in general. This combines with "MAXing" the quant to improve performance.
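For readers unfamiliar with the term, an imatrix (importance matrix) scores a layer's input channels by their typical activation magnitude over calibration text, so the quantizer can spend precision where it matters most. A conceptual sketch only, not llama.cpp's actual implementation:

```python
import numpy as np

# Conceptual imatrix sketch (NOT llama.cpp's real code): score each input
# channel of a layer by its mean squared activation over calibration tokens.
# A horror-themed calibration set, as described above, biases these scores
# toward the channels that horror prose actually exercises.

def importance_scores(activations):
    """activations: (num_tokens, hidden_dim) array of a layer's inputs.
    Returns one importance score per hidden dimension."""
    return (activations ** 2).mean(axis=0)
```

Channels with larger scores would then be quantized more carefully, which is why the choice of calibration dataset (here, horror text) shifts where quantization error lands.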

Repository: localai · License: apache-2.0

gemma-3-the-grand-horror-27b
The **Gemma-3-The-Grand-Horror-27B-GGUF** model is a **fine-tuned version** of Google's **Gemma 3 27B** language model, specifically optimized for **extreme horror-themed text generation**. It was trained using the **Unsloth framework** on a custom in-house dataset of horror content, resulting in a model that produces vivid, graphic, and psychologically intense narratives - featuring gore, madness, and disturbing imagery - often even when prompts don't explicitly request horror.

Key characteristics:

- **Base Model**: Gemma 3 27B (original by Google, not the quantized version)
- **Fine-tuned For**: High-intensity horror storytelling, long-form narrative generation, and immersive scene creation
- **Use Case**: Creative writing, horror RP, dark fiction, and experimental storytelling
- **Not Suitable For**: General use, children, sensitive audiences, or content requiring a neutral/positive tone
- **Quantization**: Available in GGUF format (e.g., q3k, q4, etc.), making it accessible for local inference on consumer hardware

> ✅ **Note**: The model card you see is for a **quantized, fine-tuned derivative**, not the original. The true base model is **Gemma 3 27B**, available at: https://huggingface.co/google/gemma-3-27b

This model is not for all audiences - it generates content with a consistently dark, unsettling tone. Use responsibly.

Repository: localai · License: gemma