Model Gallery

3 models from 1 repository

pinkpixel_crystal-think-v2

Crystal-Think is a specialized mathematical reasoning model based on Qwen3-4B, fine-tuned using Group Relative Policy Optimization (GRPO) on NVIDIA's OpenMathReasoning dataset. Version 2 introduces a new reasoning format intended to improve step-by-step mathematical problem solving, algebraic reasoning, and mathematical code generation.

Repository: localai
License: apache-2.0

menlo_rezero-v0.1-llama-3.2-3b-it-grpo-250404

ReZero trains a small language model to develop effective search behaviors instead of memorizing static data. It interacts with multiple synthetic search engines, each with unique retrieval mechanisms, to refine queries and persist in searching until it finds exact answers. The project focuses on reinforcement learning, preventing overfitting, and optimizing for efficiency in real-world search applications.

Repository: localai
License: llama3.2

jdineen_llama-3.1-8b-think

This model is a fine-tuned version of Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2 on the jdineen/grpo-with-thinking-500-tagged dataset. It has been trained using TRL.

Repository: localai
License: llama3.1