Model Gallery

15 models from 1 repository

streaming-zipformer-en-sherpa
Streaming English ASR: sherpa-onnx zipformer transducer (int8, chunk-16 left-128). Low-latency real-time transcription with endpoint detection via sherpa-onnx's online recognizer. English-only; for multilingual offline ASR see omnilingual-0.3b-ctc-q8-sherpa.

Repository: localai | License: apache-2.0

edgetam
EdgeTAM is an ultra-efficient variant of the Segment Anything Model (SAM) for image segmentation. It uses a RepViT backbone and is only ~16MB quantized (Q4_0), making it ideal for edge deployment. Supports point-prompted and box-prompted image segmentation via the /v1/detection endpoint. Powered by sam3.cpp (C/C++ with GGML).

Repository: localai | License: apache-2.0

nbeerbower_qwen3-gutenberg-encore-14b
nbeerbower/Xiaolong-Qwen3-14B finetuned on jondurbin/gutenberg-dpo-v0.1, nbeerbower/gutenberg2-dpo, nbeerbower/gutenberg-moderne-dpo, nbeerbower/synthetic-fiction-dpo, nbeerbower/Arkhaios-DPO, nbeerbower/Purpura-DPO, and nbeerbower/Schule-DPO.

Repository: localai | License: apache-2.0

fireball-meta-llama-3.2-8b-instruct-agent-003-128k-code-dpo
A quantized version of EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO, an experimental fine-tune on a DPO dataset intended to make Llama 3.1 8B an agentic coder. It has some built-in agent features such as search, calculator, and ReAct. Other notable features include self-learning using unsloth, RAG applications, and memory. The context window is 128K. It can be integrated into projects using popular libraries such as Transformers and vLLM, and is suitable for use with LangChain or LlamaIndex. The model is developed by EpistemeAI and licensed under the Apache 2.0 license.

Repository: localai | License: apache-2.0

llama-3.1-techne-rp-8b-v1
athirdpath/Llama-3.1-Instruct_NSFW-pretrained_e1-plus_reddit was further trained in the order below. SFT: Doctor-Shotgun/no-robots-sharegpt, grimulkan/LimaRP-augmented, Inv/c2-logs-cleaned-deslopped. DPO: jondurbin/truthy-dpo-v0.1, Undi95/Weyaxi-humanish-dpo-project-noemoji, athirdpath/DPO_Pairs-Roleplay-Llama3-NSFW.

Repository: localai | License: llama3.1

calme-2.3-legalkit-8b-i1
This model is an advanced iteration of the powerful meta-llama/Meta-Llama-3.1-8B-Instruct, specifically fine-tuned to enhance its capabilities in the legal domain. The fine-tuning process utilized a synthetically generated dataset derived from the French LegalKit, a comprehensive legal language resource. To create this specialized dataset, I used the NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO model in conjunction with Hugging Face's Inference Endpoint. This approach allowed for the generation of high-quality, synthetic data that incorporates Chain of Thought (CoT) and advanced reasoning in its responses. The resulting model combines the robust foundation of Llama-3.1-8B with tailored legal knowledge and enhanced reasoning capabilities. This makes it particularly well-suited for tasks requiring in-depth legal analysis, interpretation, and application of French legal concepts.

Repository: localai | License: llama3.1

fireball-llama-3.11-8b-v1orpo
Developed by: EpistemeAI. License: apache-2.0. Finetuned from model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit. Finetuning methods: DPO (Direct Preference Optimization) and ORPO (Odds Ratio Preference Optimization).

Repository: localai | License: apache-2.0

humanish-roleplay-llama-3.1-8b-i1
A DPO-tuned Llama-3.1 that behaves more "humanish", i.e., avoids the usual AI-assistant slop. It also works for role-play (RP). To achieve this, the model was fine-tuned on a series of datasets: general conversations from Claude Opus, from Undi95/Meta-Llama-3.1-8B-Claude; Undi95/Weyaxi-humanish-dpo-project-noemoji, to make the model react as a human, rejecting assistant-like or overly neutral responses; and ResplendentAI/NSFW_RP_Format_DPO, to steer the model towards using the *action* format in RP settings. Works best if you also use this format naturally in your first message (see example).

Repository: localai | License: apache-2.0

llama3.1-flammades-70b
nbeerbower/Llama3.1-Gutenberg-Doppel-70B finetuned on flammenai/Date-DPO-NoAsterisks and jondurbin/truthy-dpo-v0.1.

Repository: localai | License: llama3.1

llama3.1-gutenberg-doppel-70b
mlabonne/Hermes-3-Llama-3.1-70B-lorablated finetuned on jondurbin/gutenberg-dpo-v0.1 and nbeerbower/gutenberg2-dpo.

Repository: localai | License: llama3.1

control-8b-v1.1
An experimental finetune based on Llama3.1 8B Supernova, with the primary goal of being "Short and Sweet". To that end, I finetuned the model for 2 epochs on the OpenCAI ShareGPT-converted dataset and the RP-logs datasets. This version of Control has also been finetuned with DPO to improve its smarts and coherency, which were a flaw noticed in the previous model.

Repository: localai | License: llama3.1

llama-3.1-tulu-3-8b-dpo
Tülu3 is a leading instruction-following model family, offering fully open-source data, code, and recipes designed to serve as a comprehensive guide for modern post-training techniques. Tülu3 is designed for state-of-the-art performance on a diversity of tasks in addition to chat, such as MATH, GSM8K, and IFEval.

Repository: localai | License: llama3.1

llama-3.1-tulu-3-70b-dpo
Tülu3 is a leading instruction-following model family, offering fully open-source data, code, and recipes designed to serve as a comprehensive guide for modern post-training techniques. Tülu3 is designed for state-of-the-art performance on a diversity of tasks in addition to chat, such as MATH, GSM8K, and IFEval.

Repository: localai | License: llama3.1

noromaid-13b-0.4-DPO
Noromaid-13B-0.4-DPO is a 13B parameter language model based on Llama2, fine-tuned for roleplay and chat using Direct Preference Optimization. It is distributed in GGUF quantized format for efficient local inference. The model supports custom system prompts and is optimized for roleplay interfaces like SillyTavern.

Repository: localai | License: cc-by-nc-4.0

wan-2.1-flf2v-14b-720p-ggml
Wan 2.1 FLF2V 14B 720P — first-last-frame-to-video diffusion, GGUF Q4_K_M. Takes a start and end reference image and interpolates a 33-frame clip between them. Unlike the plain I2V variant this model feeds the end frame through clip_vision as well, so it conditions semantically (not just in pixel-space) on both endpoints. That makes it the right choice for seamless loops (start_image == end_image) and clean narrative cuts. Native 720p but accepts 480p resolutions; shares the same VAE, t5xxl text encoder, and clip_vision_h as I2V 14B.

Repository: localai | License: apache-2.0