# Llama2 API

### llama-cpp-python FastAPI Server Template

<figure><img src="https://1055927812-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FoGGFDGq5A7XY3W3oho9G%2Fuploads%2FAkhXUfHIzFtBBv0o38Ln%2Fllama2API.webp?alt=media&#x26;token=8699810b-673b-44ab-b214-c139bc865e5c" alt=""><figcaption></figcaption></figure>

This template installs the Python wrapper of llama.cpp from llama-cpp-python and runs a FastAPI-based Python server. The server can be utilized as a drop-in replacement for the OpenAI API. This means that code compatible with the ChatGPT API will also be compatible with this self-hosted version.

### Features

* **Compatibility**: Compatible with code designed for the OpenAI API, making it easy to transition to self-hosting.
* **Pre-Converted and Quantized Model**: Configured to fetch a pre-converted and quantized llama code instruct model from TheBloke on Hugging Face. However, you have the flexibility to use other compatible models by editing the configuration.
* **Interactive Documentation**: Upon starting the server, open the deployment URL and append /docs to access a Swagger-like API documentation. Here, you can view and test all available endpoints easily.
* **Customizable Token Settings**: By default, the max\_token setting of this model is restrictive to support less capable hardware. You can adjust this setting by navigating into the installed folder /venv/lib/python3.8/site-packages/llama\_cpp/server/app.py and editing line 412 in the following manner:

```python
max_tokens_field = Field(
    default=400, ge=1, description="The maximum number of tokens to generate."
)
```

### Usage

* **Installation**: Clone this repository and install the dependencies.
* **Configuration**: Edit the configuration to specify your desired model or settings.
* **Start the Server**: Execute the command to start the server.
* **Access Documentation**: Open the deployment URL in your browser and append /docs to access the interactive API documentation.

### Note

Please be aware that the default configuration might not suit your specific requirements. Feel free to customize the settings according to your needs.
