
OpenLLM

OpenLLM lets developers run any open-source LLM as an OpenAI-compatible API endpoint with a single command.

  • 🔬 Built for fast, production-grade use
  • 🚂 Supports llama3, qwen2, gemma, and more, including many quantized versions (see the full list)
  • ⛓️ OpenAI-compatible API
  • 💬 Built-in ChatGPT-like UI
  • 🔥 Accelerated LLM decoding with state-of-the-art inference backends
  • 🌥️ Ready for enterprise-grade cloud deployment (Kubernetes, Docker, and BentoCloud)

Installation and Setup

Install the OpenLLM package via PyPI:

pip install openllm

LLM

OpenLLM supports a wide range of open-source LLMs and can also serve users' own fine-tuned LLMs. Use the openllm model command to see all available models that are pre-optimized for OpenLLM.

Wrappers

An OpenLLM wrapper supports interacting with a running OpenLLM server:

from langchain_community.llms import OpenLLM

Wrapper for OpenLLM server

This wrapper supports interacting with OpenLLM's OpenAI-compatible endpoint.

To run a model, do:

openllm hello
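Before pointing the wrapper at the server, it can help to confirm that the endpoint is reachable. The following is a minimal sketch using only the Python standard library. It assumes the server listens on http://localhost:3000/v1 (the address used in the wrapper example) and exposes the usual OpenAI-style GET /models route; the helper names models_url, parse_model_ids, and list_model_ids are hypothetical and not part of OpenLLM.

```python
import json
import urllib.error
import urllib.request

# Assumption: the base URL used in the wrapper example on this page.
BASE_URL = "http://localhost:3000/v1"


def models_url(base_url: str) -> str:
    """Build the OpenAI-style model-listing URL from a base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract ids from an OpenAI-style {"data": [{"id": ...}, ...]} payload."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_model_ids(base_url: str = BASE_URL, timeout: float = 5.0) -> list[str]:
    """Return the model ids the server reports, or [] if the request fails."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return parse_model_ids(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, network error, or a non-JSON response.
        return []
```

Calling list_model_ids() once the server is up should return the identifiers of the models it is serving; an empty list suggests the server is not reachable at that address.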

Wrapper usage:

from langchain_community.llms import OpenLLM

llm = OpenLLM(base_url="http://localhost:3000/v1", api_key="na")

llm.invoke("What is the difference between a duck and a goose? And why are there so many geese in Canada?")
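To make "OpenAI-compatible" concrete, the sketch below builds the kind of raw HTTP request such an endpoint typically accepts, without going through the wrapper. It assumes the standard OpenAI-style POST /completions route and response shape; the model name "my-model" and the helper names are placeholders for illustration, and this is not necessarily the exact request the wrapper sends.

```python
import json
import urllib.request


def build_completion_request(base_url: str, api_key: str, model: str,
                             prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build an OpenAI-style text-completion request (assumed route: /completions)."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def first_completion_text(response_payload: dict) -> str:
    """Pull the generated text out of an OpenAI-style completion response."""
    return response_payload["choices"][0]["text"]


# Usage, with a server running and "my-model" replaced by a real model id:
# req = build_completion_request("http://localhost:3000/v1", "na", "my-model", "Hello")
# with urllib.request.urlopen(req) as resp:
#     print(first_completion_text(json.load(resp)))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.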

Usage

For a more detailed walkthrough of the OpenLLM wrapper, see the example notebook.

