# Prerequisites and System Requirements

> The prerequisites for running Limina's container, along with the minimum and recommended hardware requirements.

## Prerequisites

<Note>
  Run only one container instance per machine. Running multiple containers on the same machine greatly reduces performance.
</Note>

The following prerequisites are required to run the container:

* Container engine, such as Docker (can be installed using the [official instructions](https://docs.docker.com/engine/install/))
* (GPU only) Nvidia Container Toolkit with Nvidia driver version 515 or higher (can be installed using the following [installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker))

All other dependencies, such as CUDA, are included with the container and don't need to be installed separately.
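
The prerequisites above can be sanity-checked before pulling the image. The following is a minimal sketch, assuming Docker is the container engine and that `nvidia-smi` is available on GPU hosts; the function names are illustrative and not part of Limina's tooling. It does not verify that the Nvidia Container Toolkit itself is installed.

```python
import shutil
import subprocess

MIN_DRIVER_MAJOR = 515  # minimum Nvidia driver version listed in the prerequisites


def driver_meets_minimum(version: str, minimum_major: int = MIN_DRIVER_MAJOR) -> bool:
    """Return True if a driver version string such as '535.104.05' is new enough."""
    major = int(version.strip().split(".")[0])
    return major >= minimum_major


def check_prerequisites(gpu: bool = False) -> list:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # Assumption: Docker is the container engine in use.
    if shutil.which("docker") is None:
        problems.append("docker not found on PATH")
    if gpu:
        if shutil.which("nvidia-smi") is None:
            problems.append("nvidia-smi not found; is the Nvidia driver installed?")
        else:
            result = subprocess.run(
                ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
                capture_output=True,
                text=True,
                check=False,
            )
            versions = result.stdout.strip().splitlines()
            if result.returncode != 0 or not versions:
                problems.append("could not query the Nvidia driver version")
            elif not driver_meets_minimum(versions[0]):
                problems.append(
                    f"Nvidia driver {versions[0]} is older than {MIN_DRIVER_MAJOR}"
                )
    return problems
```

Calling `check_prerequisites(gpu=True)` on a GPU host returns an empty list when Docker is found and the driver is version 515 or higher.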

## System Requirements

The image comes in two different build flavours:

* A compact, CPU-only container that runs on any Intel or AMD CPU. It is highly optimised for the majority of use cases: it uses hand-coded AMX/AVX2/AVX512/AVX512 VNNI instructions in conjunction with neural network compression techniques to deliver a \~25X speedup over a reference transformer-based system.

* A GPU-accelerated container designed for large-scale deployments making billions of API calls or processing terabytes of data per month.
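
To see which of the instruction sets named above a Linux host exposes, the CPU flags in `/proc/cpuinfo` can be inspected. This is an informational sketch only: the flag names are the ones the Linux kernel reports on x86, and the mapping and function names are assumptions made for illustration. The documentation does not state that any particular flag is mandatory.

```python
from pathlib import Path

# Linux kernel flag names for the instruction sets mentioned above.
FLAGS_OF_INTEREST = {
    "AVX2": "avx2",
    "AVX512": "avx512f",
    "AVX512 VNNI": "avx512_vnni",
    "AMX": "amx_tile",
}


def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Extract the flag set of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supported_extensions(cpuinfo_text: str) -> dict:
    """Map each instruction set name to whether its flag is present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FLAGS_OF_INTEREST.items()}


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the cpuinfo file (Linux only)."""
    return Path(path).read_text()
```

On a Linux host, `supported_extensions(read_cpuinfo())` gives a quick indication of whether the machine is likely to benefit from the AMX or AVX512 code paths.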

### Minimum Requirements

The minimum and recommended system requirements for the container image are as follows:

|         | Minimum                                                                                                                                                          | Recommended (Text only)                                                                   | Recommended (All Features)                                                                | Recommended Request Concurrency |
| ------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | ------------------------------- |
| **CPU** | Any x86 (Intel or AMD) processor with 7.5GB free RAM and 50GB disk volume                                                                                        | Intel Sapphire Rapids or newer CPUs supporting AMX with 16GB RAM and 50GB disk volume     | Intel Sapphire Rapids or newer CPUs supporting AMX with 64GB RAM and 100GB disk volume    | 1                               |
| **GPU** | Any x86 (Intel or AMD) processor with 28GB free RAM. Nvidia GPU with compute capability 7.0 or higher (Volta or newer) and at least 16GB VRAM. 100GB disk volume | Any x86 (Intel or AMD) processor with 32GB RAM and Nvidia Tesla T4 GPU. 100GB disk volume | Any x86 (Intel or AMD) processor with 64GB RAM and Nvidia Tesla T4 GPU. 100GB disk volume | 16                              |

Please visit the [recommended concurrency levels](/configuration-and-operations/container-management/concurrency) page to set the number of concurrent requests optimally.
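
On the client side, the recommended request concurrency from the table can be enforced with a bounded worker pool. The sketch below assumes a hypothetical `send_request` callable that performs one API call; it is a placeholder rather than part of Limina's client, and the linked concurrency page remains the authoritative guide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Recommended request concurrency from the table above.
RECOMMENDED_CONCURRENCY = {"cpu": 1, "gpu": 16}


def process_all(
    items: Iterable[T],
    send_request: Callable[[T], R],  # hypothetical: performs one request
    flavour: str = "cpu",
) -> List[R]:
    """Call send_request on every item with a capped number of calls in flight."""
    max_workers = RECOMMENDED_CONCURRENCY[flavour]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and runs at most max_workers calls at once.
        return list(pool.map(send_request, items))
```

With `flavour="cpu"` the requests are sent one at a time; with `flavour="gpu"` up to 16 are in flight concurrently.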

The Limina image can also [run on Apple silicon](/installation/running-the-container#apple-silicon), such as the M2. However, performance is degraded because the AVX instructions are emulated by Rosetta 2.
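
The minimum figures from the table above can be compared against a host's resources with a short script. This is a sketch under stated assumptions: the figures are treated as decimal gigabytes, "disk volume" is taken to mean the total size of the volume, free RAM is read from `MemAvailable` on Linux, and all names are illustrative.

```python
import shutil

# Minimum figures from the table above, in GB.
MINIMUMS = {
    "cpu": {"free_ram_gb": 7.5, "disk_gb": 50},
    "gpu": {"free_ram_gb": 28, "disk_gb": 100, "vram_gb": 16},
}


def check_minimums(flavour, free_ram_gb, disk_gb, vram_gb=None):
    """Return the names of the requirements that fall short for the flavour."""
    required = MINIMUMS[flavour]
    measured = {"free_ram_gb": free_ram_gb, "disk_gb": disk_gb, "vram_gb": vram_gb}
    return [
        name
        for name, needed in required.items()
        if measured[name] is None or measured[name] < needed
    ]


def available_ram_gb(meminfo_path="/proc/meminfo"):
    """Read MemAvailable from a Linux meminfo file and convert it to GB."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                kib = int(line.split()[1])  # the kernel reports this value in KiB
                return kib * 1024 / 1e9
    raise RuntimeError("MemAvailable not found")


def disk_total_gb(path="/"):
    """Total size in GB of the volume that contains path."""
    return shutil.disk_usage(path).total / 1e9
```

For example, `check_minimums("cpu", available_ram_gb(), disk_total_gb("/"))` returns an empty list on a Linux host that meets the CPU minimums.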

### Recommended Requirements

#### CPU

While the Limina CPU-based container will run on any x86-compatible instance, the cloud instance types below give optimal throughput and latency per dollar:

| Platform | Recommended Instance Type (Text only) | Recommended Instance Type (All Features) |
| -------- | ------------------------------------- | ---------------------------------------- |
| Azure    | Standard\_E2\_v5 (2 vCPUs, 16GB RAM)  | Standard\_E8\_v5 (8 vCPUs, 64GB RAM)     |
| AWS      | m7i.xlarge (4 vCPUs, 16GB RAM)        | m7i.4xlarge (16 vCPUs, 64GB RAM)         |
| GCP      | n2-standard-4 (4 vCPUs, 16GB RAM)     | n2-standard-16 (16 vCPUs, 64GB RAM)      |

<Note>
  If lower latency is required, scale up the instance type, e.g. use an m7i.2xlarge in place of an m7i.xlarge. While the Limina container can make use of all available CPU cores, it delivers the best throughput per dollar on a single-core machine. Scaling CPU cores does not result in a linear increase in performance.
</Note>

#### GPU

Similarly, for the GPU-based image, Limina recommends the following Nvidia T4 GPU-equipped instance types:

| Platform | Recommended Instance Type (Text only)        | Recommended Instance Type (All Features)       |
| -------- | -------------------------------------------- | ---------------------------------------------- |
| Azure    | Standard\_NC4as\_T4\_v3 (4 vCPUs, 28GB RAM)  | Standard\_NC8as\_T4\_v3 (8 vCPUs, 56GB RAM)    |
| AWS      | g4dn.2xlarge (8 vCPUs, 32GB RAM)             | g4dn.4xlarge (16 vCPUs, 64GB RAM)              |
| GCP      | n1-standard-8 + Tesla T4 (8 vCPUs, 30GB RAM) | n1-standard-16 + Tesla T4 (16 vCPUs, 60GB RAM) |
