Model Gallery

6 models from 1 repository

llama-3.1-tulu-3-8b-dpo
Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes designed to serve as a comprehensive guide to modern post-training techniques. Tülu 3 targets state-of-the-art performance on a diverse set of tasks beyond chat, such as MATH, GSM8K, and IFEval.

Repository: localai · License: llama3.1

tulu-3.1-8b-supernova-i1
The following models were included in the merge: meditsolutions/Llama-3.1-MedIT-SUN-8B, allenai/Llama-3.1-Tulu-3-8B, and arcee-ai/Llama-3.1-SuperNova-Lite.

Repository: localai · License: llama3.1

llama-3.1-tulu-3-70b-dpo
Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes designed to serve as a comprehensive guide to modern post-training techniques. Tülu 3 targets state-of-the-art performance on a diverse set of tasks beyond chat, such as MATH, GSM8K, and IFEval.

Repository: localai · License: llama3.1

llama-3.1-tulu-3-8b-sft
Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes designed to serve as a comprehensive guide to modern post-training techniques. Tülu 3 targets state-of-the-art performance on a diverse set of tasks beyond chat, such as MATH, GSM8K, and IFEval.

Repository: localai · License: llama3.1

tulu-3.1-8b-supernova-smart
This model was merged using the passthrough merge method, with bunnycore/Tulu-3.1-8B-SuperNova + bunnycore/Llama-3.1-8b-smart-lora as the base.

Repository: localai · License: llama3.1
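A base-plus-LoRA passthrough merge like the one above is commonly described with a mergekit configuration. The sketch below is a hypothetical illustration only, not the actual config used for this model; the model names come from the entry, while the dtype is an assumption:

```yaml
# Hypothetical mergekit config sketch (not the author's actual config).
# The "+adapter" syntax applies a LoRA adapter to the base model before merging.
models:
  - model: bunnycore/Tulu-3.1-8B-SuperNova+bunnycore/Llama-3.1-8b-smart-lora
merge_method: passthrough
dtype: bfloat16  # assumed precision
```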

allenai_llama-3.1-tulu-3.1-8b
Tülu 3 is a leading instruction-following model family, offering a post-training package with fully open-source data, code, and recipes designed to serve as a comprehensive guide to modern techniques. It is one step in a larger effort to train fully open-source models, like our OLMo models. Tülu 3 targets state-of-the-art performance on a diverse set of tasks beyond chat, such as MATH, GSM8K, and IFEval. Version 3.1 update: the new version of our Tülu model improves only the final RL stage of training. We switched from PPO to GRPO (which uses no reward model) and did further hyperparameter tuning, achieving substantial performance improvements across the board over the original Tülu 3 8B model.

Repository: localai · License: llama3.1
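The GRPO switch mentioned in the Tülu 3.1 entry replaces PPO's learned value baseline with group-relative advantages: several completions are sampled per prompt, and each completion's reward is normalized against its own group. A minimal sketch of that normalization step, with hypothetical function and variable names:

```python
from statistics import mean, stdev


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage estimation for one prompt's sampled group.

    rewards: scalar rewards for completions sampled from the same prompt.
    Returns one advantage per completion, computed by normalizing each
    reward against the group mean and standard deviation. No separate
    value or reward model is needed, unlike PPO's learned baseline.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]


# Completions scoring above their group's mean get positive advantages,
# those below get negative ones; the group roughly sums to zero.
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

This sketch covers only the advantage computation; the full GRPO objective also includes the clipped policy-ratio term and a KL penalty against the reference model.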