Repository: localai
License: llama3.2
ReZero trains a small language model to develop effective search behaviors instead of memorizing static data. It interacts with multiple synthetic search engines, each with unique retrieval mechanisms, to refine queries and persist in searching until it finds exact answers. The project focuses on reinforcement learning, preventing overfitting, and optimizing for efficiency in real-world search applications.
Links
Tags
Repository: localai
License: mit
DeepCoder-1.5B-Preview is a code reasoning LLM fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B using distributed reinforcement learning (RL) to scale up to long context lengths.
Data: Our training dataset consists of approximately 24K unique problem-test pairs compiled from:
- Taco-Verified
- PrimeIntellect SYNTHETIC-1
- LiveCodeBench v5 (5/1/23-7/31/24)
Links
Tags
Repository: localai
The Skywork-OR1 (Open Reasoner 1) model series consists of powerful math and code reasoning models trained using large-scale rule-based reinforcement learning with carefully designed datasets and training recipes. The series includes two general-purpose reasoning models, Skywork-OR1-7B and Skywork-OR1-32B. Skywork-OR1-32B outperforms DeepSeek-R1 and Qwen3-32B on math tasks (AIME24 and AIME25) and delivers comparable performance on coding tasks (LiveCodeBench). Skywork-OR1-7B exhibits competitive performance compared to similarly sized models in both math and coding scenarios.
Links
Tags