Model Gallery

108 models from 1 repository

carnice-v2-27b
# Carnice-V2-27B for Hermes Agent

Carnice-V2-27B is a fully merged BF16 SFT of `Qwen/Qwen3.6-27B` for Hermes-style agent traces. This repository contains the standalone merged model weights, not only a LoRA adapter.

## BF16 Transformers Loading Fix

The BF16 safetensors were republished with corrected `Qwen3_5ForConditionalGeneration` tensor prefixes. The original merge artifact accidentally serialized an extra Unsloth wrapper prefix, which caused direct HF Transformers loads to report the real weights as unexpected keys and to initialize the expected layers randomly. GGUF files were not affected because the GGUF conversion path normalized those prefixes.

## Benchmarks

The benchmark artifact bundle is included under `benchmarks/`. It contains the rendered graph, the extracted `metrics.json`, the benchmark scripts, and the raw result files used to make the chart. Scope note: the IFEval run is a short `limit=20` A/B smoke benchmark, not an official full leaderboard score. Held-out loss/perplexity is the exact assistant-only training-format validation metric from the SFT script. The raw BFCL two-case smoke files are included for auditability, but they are too small to support a model-quality claim. ...

Repository: localai
License: apache-2.0
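To confirm that a local copy picked up the corrected tensor prefixes, you can inspect Transformers' loading report. A minimal sketch, assuming a placeholder repo id (the real one is not shown on this card):

```python
# Minimal sketch: verify the BF16 checkpoint loads with no key mismatches.
# "localai/carnice-v2-27b" is a placeholder id, not confirmed by the card.
import torch
from transformers import AutoModelForCausalLM

model, loading_info = AutoModelForCausalLM.from_pretrained(
    "localai/carnice-v2-27b",
    torch_dtype=torch.bfloat16,
    output_loading_info=True,  # also return the missing/unexpected key lists
)

# The bad merge artifact described above (a stray wrapper prefix on every
# tensor name) would surface here: the real weights in `unexpected_keys` and
# the whole model in `missing_keys`. A clean checkpoint leaves both empty.
assert not loading_info["unexpected_keys"], loading_info["unexpected_keys"]
assert not loading_info["missing_keys"], loading_info["missing_keys"]
```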

qwopus-glm-18b-merged
# 🪐 Qwen3.5-9B-GLM5.1-Distill-v1

## 📌 Model Overview

**Model Name:** `Jackrong/Qwen3.5-9B-GLM5.1-Distill-v1`
**Base Model:** Qwen3.5-9B
**Training Type:** Supervised Fine-Tuning (SFT, Distillation)
**Parameter Scale:** 9B
**Training Framework:** Unsloth

This model is a distilled variant of **Qwen3.5-9B**, trained on high-quality reasoning data derived from **GLM-5.1**. The primary goals are to:

- Improve **structured reasoning ability**
- Enhance **instruction-following consistency**
- Activate **latent knowledge via better reasoning structure**

## 📊 Training Data

### Main Dataset

- `Jackrong/GLM-5.1-Reasoning-1M-Cleaned`
  - Cleaned from the original `Kassadin88/GLM-5.1-1000000x` dataset.
  - Generated from a **GLM-5.1 teacher model**
  - Approximately **700x** the scale of `Qwen3.5-reasoning-700x`
  - Training used a **filtered subset**, not the full source dataset.

### Auxiliary Dataset

- `Jackrong/Qwen3.5-reasoning-700x` ...

Repository: localai
License: apache-2.0
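The card stresses that training used a filtered subset rather than the full 1M-example teacher dump. A sketch of what such a pull-and-filter step could look like with the `datasets` library; the split name, column name, and length threshold are all assumptions, since the actual filter criteria are not published:

```python
# Hypothetical data-prep sketch; the "train" split and "response" column
# are guesses, and the real filtering rules are not documented on the card.
from datasets import load_dataset

ds = load_dataset("Jackrong/GLM-5.1-Reasoning-1M-Cleaned", split="train")

# Example heuristic only: keep teacher traces with some minimum length.
subset = ds.filter(lambda ex: len(ex["response"]) > 200)
print(f"{len(subset)} of {len(ds)} examples kept")
```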

kwaipilot_kwaicoder-autothink-preview
KwaiCoder-AutoThink-preview is the first public AutoThink LLM released by the Kwaipilot team at Kuaishou. The model merges thinking and non‑thinking abilities into a single checkpoint and dynamically adjusts its reasoning depth based on the input’s difficulty.

Repository: localai
License: kwaipilot-license

allura-org_q3-30b-a3b-pentiment
Triple-stage RP/general tune of Qwen3-30B-A3B Base (finetuned, merged for stabilization, aligned).

Repository: localai
License: apache-2.0

yanfei-v2-qwen3-32b
A repair of Yanfei-Qwen-32B, made by TIES-merging huihui-ai/Qwen3-32B-abliterated, Zhiming-Qwen3-32B, and Menghua-Qwen3-32B with mergekit.

Repository: localai
License: apache-2.0
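The card does not publish the merge recipe, so the following is only a sketch of what a TIES merge over these three models might look like as a mergekit YAML config driven from Python; the densities and weights are illustrative guesses:

```python
# Hedged sketch of a mergekit TIES config; density/weight values are
# made up for illustration, not taken from the actual Yanfei-V2 recipe.
import pathlib
import subprocess
import textwrap

config = textwrap.dedent("""\
    merge_method: ties
    base_model: huihui-ai/Qwen3-32B-abliterated
    models:
      - model: huihui-ai/Qwen3-32B-abliterated
      - model: Zhiming-Qwen3-32B
        parameters:
          density: 0.5
          weight: 0.5
      - model: Menghua-Qwen3-32B
        parameters:
          density: 0.5
          weight: 0.5
    parameters:
      normalize: true
    dtype: bfloat16
""")

pathlib.Path("ties.yml").write_text(config)
# mergekit's CLI entry point: mergekit-yaml <config> <output_dir>
subprocess.run(["mergekit-yaml", "ties.yml", "./yanfei-v2-qwen3-32b"], check=True)
```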

qwen3-the-josiefied-omega-directive-22b-uncensored-abliterated-i1
WARNING: NSFW. Vivid prose. INTENSE. Visceral Details. Violence. HORROR. GORE. Swearing. UNCENSORED... humor, romance, fun. A massive 22B, 62-layer merge of the fantastic "The-Omega-Directive-Qwen3-14B-v1.1" and the off-the-scale "Goekdeniz-Guelmez/Josiefied-Qwen3-14B-abliterated-v3", both Qwen3, with full reasoning (can be turned on or off); the model is also completely uncensored/abliterated.

Repository: localai
License: apache-2.0

qwen3-the-xiaolong-omega-directive-22b-uncensored-abliterated-i1
WARNING: NSFW. Vivid prose. INTENSE. Visceral Details. Violence. HORROR. GORE. Swearing. UNCENSORED... humor, romance, fun. A massive 22B, 62-layer merge of the fantastic "The-Omega-Directive-Qwen3-14B-v1.1" (by ReadyArt) and the off-the-scale "Xiaolong-Qwen3-14B" (by nbeerbower), both Qwen3, with full reasoning (can be turned on or off); the model is also completely uncensored/abliterated.

Repository: localai
License: apache-2.0
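Both of these merges advertise reasoning that can be toggled. On Qwen3-family checkpoints this is typically exposed through the chat template's `enable_thinking` flag, so a sketch along these lines should apply, assuming the merge kept the stock Qwen3 template (the repo id below is a placeholder):

```python
# Hedged sketch of toggling Qwen3-style reasoning; the repo id is a
# placeholder, and `enable_thinking` assumes the stock Qwen3 chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-org/qwen3-the-xiaolong-omega-directive-22b"  # placeholder id
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

messages = [{"role": "user", "content": "Outline a short horror scene."}]
prompt = tok.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # set True to allow a <think> block first
)
out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```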

gemma-3-glitter-12b-i1
A creative writing model based on Gemma 3 12B IT. This is a 50/50 merge of two separate trains:
- ToastyPigeon/g3-12b-rp-system-v0.1 - ~13.5M tokens of instruct-based training related to RP (2:1 human to synthetic) and examples using a system prompt.
- ToastyPigeon/g3-12b-storyteller-v0.2-textonly - ~20M tokens of completion training on long-form creative writing; 1.6M tokens synthetic from R1, the rest human-created.

Repository: localai
License: gemma

gemma-3-starshine-12b-i1
A creative writing model based on a merge of fine-tunes on Gemma 3 12B IT and Gemma 3 12B PT. This is the Story Focused merge. This version works better for storytelling and scenarios, as the prose is more novel-like, but it has a tendency to impersonate the user character; see the alternate RP Focused version as well. This is a merge of two G3 models, one trained on instruct and one trained on base:
- allura-org/Gemma-3-Glitter-12B - itself a merge of a storywriting train and an RP train (both also by ToastyPigeon), on instruct.
- ToastyPigeon/Gemma-3-Confetti-12B - an experimental application of the Glitter data using base instead of instruct, additionally including some adventure data in the form of SpringDragon.

The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirited prose, effectively 'loosening up' much of the hesitancy that was left in Glitter.

Repository: localai
License: gemma

daichi-12b
A merge between my Gemma finetune Pascal-12B and Omega-Directive-G-12B, meant to give it more NSFW knowledge. The model has short, sweet prose and is uncensored in roleplay; it is suited for traditional RP. All thanks to Tav for funding the train.

Repository: localai
License: gemma

readyart_amoral-fallen-omega-gemma3-12b
This model represents the forbidden merger of:
- 🧬 The-Omega-Directive-Gemma3-12B-v1.0 - unprecedented coherent depravity, well-rounded ERP, low repetition even at maximum length
- ⚡ Fallen-Gemma3-12B-v1 - an evil tune of Gemma 3 12B
- 💎 amoral-gemma3-12B-v2 - improved natural intelligence and fewer refusals

This model maintains Forgotten-Safeword's complete lack of alignment:
- 🚫 No moral or ethical constraints on generated content
- 🔞 Will enthusiastically engage with any NSFW scenario
- 💀 May generate content that requires industrial-grade brain bleach
- ⚖️ Perfectly balanced... as all things should be
- 🔥 Maintains signature intensity with improved narrative flow
- 📖 Handles multi-character scenarios with improved consistency
- 🧠 Excels at long-form storytelling without losing track of plot threads
- ⚡ Noticeably better at following complex instructions than previous versions
- 🎭 Responds to subtle prompt nuances like a mind reader

Repository: localai
License: gemma

soob3123_veritas-12b
Veritas-12B emerges as a model forged in the pursuit of intellectual clarity and logical rigor. This 12B parameter model possesses superior philosophical reasoning capabilities and analytical depth, ideal for exploring complex ethical dilemmas, deconstructing arguments, and engaging in structured philosophical dialogue. Veritas-12B excels at articulating nuanced positions, identifying logical fallacies, and constructing coherent arguments grounded in reason. Expect discussions characterized by intellectual honesty, critical analysis, and a commitment to exploring ideas with precision.

Repository: localai
License: gemma

planetoid_27b_v.2
This is a merge of pre-trained Gemma 3 language models. The goal of this merge was to create a good uncensored Gemma 3 model for assistant use and roleplay, with uncensored vision.

First, vision: I don't know whether this is normal, but it hallucinates slightly (maybe Q3 is too low?); it lacks any refusals and otherwise works fine. I used the default Gemma 3 27B mmproj.

Second, text: it is slow on my hardware, slower than 24B Mistral and close in speed to 32B QwQ. The model is smart even at Q3, and responses are adequate in length and interesting to read. It is quite attentive to context, tested up to 8k with no problems or degradation spotted (but beware of your typos, it will copy your mistakes). Creative capabilities are good too; the model will create a good plot for you if you let it. It follows instructions fine and is really good with "adventure"-type cards. Russian is supported but not great; it may be better at higher quants. No refusals were encountered. However, I find this model not unbiased enough: it is close to neutral, but I want it darker. Positivity depends heavily on prompts; with good enough cards the model can do wonders. Tested on Q3_K_L, T 1.04.

Repository: localai
License: gemma

genericrpv3-4b
This model is part of the GRP / GenericRP series; this is V3, based on Gemma 3 4B and licensed accordingly. It's a simple merge. For the intended behaviour, see the V2 card, which is more detailed. Merge components: allura-org/Gemma-3-Glitter-4B (weight 0.5), huihui-ai/gemma-3-4b-it-abliterated (weight 0.25), Danielbrdz/Barcenas-4b (weight 0.25). Happy chatting.

Repository: localai
License: gemma

comet_12b_v.5-i1
This is a merge of pre-trained language models. V.4 wasn't stable enough for me, so here is V.5: more stable, better at SFW, richer NSFW. I find the best "AIO" (all-in-one) settings for RP on Gemma 3 are sleepdeprived3/Gemma3-T4 with small tweaks (T 1.04, top-p 0.95).

Repository: localai
License: gemma
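For runners that take raw sampler values instead of preset files, the card's suggested temperature and top-p translate directly; a sketch with Transformers (placeholder repo id, and note that the Gemma3-T4 preset bundles more than these two numbers):

```python
# Hedged sketch: applying the card's suggested samplers (T 1.04, top-p 0.95).
# "your-org/comet-12b-v5" is a placeholder id for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-org/comet-12b-v5"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

inputs = tok("The tavern door creaked open,", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,      # sampling must be enabled for T/top-p to matter
    temperature=1.04,
    top_p=0.95,
    max_new_tokens=128,
)
print(tok.decode(out[0], skip_special_tokens=True))
```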

nightwing3-10b-v0.1
Base model: Falcon3-10B

Repository: localai
License: falcon-llm

virtuoso-lite
Virtuoso-Lite (10B) is our next-generation, 10-billion-parameter language model based on the Llama-3 architecture. It is distilled from Deepseek-v3 using ~1.1B tokens/logits, allowing it to achieve robust performance at a significantly reduced parameter count compared to larger models. Despite its compact size, Virtuoso-Lite excels in a variety of tasks, demonstrating advanced reasoning, code generation, and mathematical problem-solving capabilities.

Repository: localai
License: falcon-llm
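The card's "distilled from Deepseek-v3 using ~1.1B tokens/logits" describes logit-level distillation. A generic sketch of that objective (a temperature-softened KL term between teacher and student next-token distributions), not the model authors' actual recipe:

```python
# Generic logit-distillation loss sketch; hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened vocab distributions."""
    t = temperature
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    teacher_prob = F.softmax(teacher_logits / t, dim=-1)
    # The t**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_logp, teacher_prob, reduction="batchmean") * (t * t)

# Toy shapes (batch, seq, vocab) just to show the call:
student = torch.randn(2, 8, 32000)
teacher = torch.randn(2, 8, 32000)
print(distill_loss(student, teacher).item())
```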

l3.3-ms-evayale-70b
This model was created because I liked the storytelling of EVA but preferred the prose and scene detail of EURYALE; my goal was to merge the robust storytelling of both models while attempting to maintain the strengths of each.

Repository: localai
License: llama3.3

l3.3-ms-evalebis-70b
This model was created because I liked the storytelling of EVA and the prose and scene detail of EURYALE and Anubis; my goal was to merge the robust storytelling of all three models while attempting to maintain their respective strengths.

Repository: localai
License: llama3.3

negative-anubis-70b-v1
I enjoyed SicariusSicariiStuff/Negative_LLAMA_70B, but the prose was too dry for my tastes, so I merged it with TheDrummer/Anubis-70B-v1 for verbosity. Anubis has a positivity bias, so Negative could balance things out. This is a merge of pre-trained language models created using mergekit. The following models were included in the merge: SicariusSicariiStuff/Negative_LLAMA_70B and TheDrummer/Anubis-70B-v1.

Repository: localai
License: llama3.3
