An OpenVINO IR build of Meta's Llama 3 8B Instruct with int8 quantization, optimized for dialogue use cases and instruction following. Supports an 8k context window.
Repository: localai · License: mit
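Since LocalAI serves gallery models through an OpenAI-compatible API, a chat request to this model can be sketched as below. The model name and endpoint port are assumptions; adjust them to match your deployment.

```python
import json

# Hypothetical gallery name for this model; check your local gallery listing.
MODEL = "llama-3-8b-instruct-openvino-int8"

# OpenAI-compatible chat payload. POST this as JSON to your LocalAI server,
# e.g. http://localhost:8080/v1/chat/completions (port 8080 is an assumption).
payload = {
    "model": MODEL,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize OpenVINO in one sentence."},
    ],
    "temperature": 0.7,
}

body = json.dumps(payload)
print(body[:40])
```

The same payload shape works for every chat model in this list; only the `model` field changes.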
An OpenVINO-optimized version of the Phi-3 Mini instruction-tuned model with 3.8 billion parameters. It supports a 128k context window and is designed for reasoning, coding, and chat tasks in compute-constrained environments.
Repository: localai · License: cc-by-nc-4.0

Aloe is a healthcare-focused large language model based on Meta Llama 3 8B, optimized for OpenVINO inference with int8 quantization. It is instruction-tuned for medical and ethical reasoning tasks, offering competitive performance on healthcare QA datasets.
Repository: localai · License: apache-2.0
Starling-LM-7B-beta is a Mistral-7B-based chat model fine-tuned with RLHF and RLAIF for improved instruction following. This OpenVINO IR version features int8 quantization for optimized local inference and uses the OpenChat chat template for consistent conversational output.
WizardLM-2 7B is an instruction-tuned language model optimized for the OpenVINO backend. Supports conversational chat and text completion with an 8192-token context window.
An OpenVINO-optimized 8B instruction-tuned Llama-3 model based on the Hermes-2-Pro fine-tune. Supports function calling and JSON mode, and is designed for efficient inference.
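The function-calling support can be exercised through the standard OpenAI-style `tools` interface that LocalAI's compatible endpoint accepts. A minimal request sketch follows; the model name and the `get_weather` tool schema are illustrative assumptions, not part of the model card.

```python
import json

# Illustrative OpenAI-format "tools" request; the get_weather function
# is hypothetical and stands in for whatever tools your application exposes.
payload = {
    "model": "hermes-2-pro-llama-3-8b",  # assumed gallery name
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(payload)[:40])
```

When the model decides a tool is needed, the response carries a `tool_calls` entry with JSON arguments matching the declared parameter schema.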
Multilingual E5 base embedding model optimized for semantic similarity and retrieval tasks. Supports OpenVINO and ONNX inference formats. Ideal for cross-lingual vector search and semantic matching.
Repository: localai · License: apache-2.0
This sentence-transformers model maps text to 384-dimensional dense vectors for semantic similarity tasks. Based on the MiniLM architecture, it is optimized for OpenVINO inference. Ideal for retrieval-augmented generation (RAG) pipelines.
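Retrieval with dense embeddings like these reduces to nearest-neighbour search under cosine similarity. A minimal sketch of that scoring step, with toy 3-d vectors standing in for the model's 384-dimensional sentence embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d stand-ins for 384-d embeddings of a query and two documents.
query = [0.1, 0.9, 0.2]
docs = {"doc_a": [0.1, 0.8, 0.3], "doc_b": [0.9, 0.1, 0.0]}

# Rank documents by similarity to the query; the closest one wins.
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
print(best)  # doc_a is the closer match
```

In a real RAG pipeline the vectors would come from the embedding model's encode step, and the argmax would be replaced by an approximate nearest-neighbour index over the document store.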