# Kubernetes Setup Guide

> This guide walks you through deploying the Limina container in a Kubernetes cluster, from prerequisites to making your first request.

## 1. Prerequisites

<Steps>
  <Step title="Install and setup kubectl" titleSize="h3">
    The Kubernetes command-line tool, `kubectl`, allows you to run commands against Kubernetes clusters.

    Find installation instructions for your OS [here](https://kubernetes.io/docs/tasks/tools/#kubectl).
  </Step>

  <Step title="Setup your Kubernetes cluster" titleSize="h3">
    There are many flavours of Kubernetes to choose from. Set up the one that best suits your needs. Here are a few popular Kubernetes services and distributions:

    * [Azure Kubernetes Service (AKS)](https://docs.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-cli)

    * [Amazon Elastic Kubernetes Service (EKS)](https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html)

    * [Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine/docs/deploy-app-cluster)

    * [Minikube](https://minikube.sigs.k8s.io/docs/start/)

    * [MicroK8s](https://microk8s.io/)

    <Info>
      For recommendations on machine type, see our [System Requirements](/configuration-and-operations/entity-detection-and-redaction/customizing-detection) section.
    </Info>
  </Step>

  <Step title="Setup a container registry" titleSize="h3">
    Create a Kubernetes secret that holds your credentials for Limina's private container registry. Without this secret, the cluster cannot pull Limina's private Docker images.

    ```shell Kubernetes Command wrap theme={"theme":"poimandres"}
    kubectl create secret docker-registry pai-cr-creds --docker-server="crprivateaiprod.azurecr.io" --docker-username="<your docker username>" --docker-password="<your docker password>"
    ```

    <Info>
      If you don't have credentials for the Limina Container Registry, log in to the Limina Customer Portal to generate and retrieve them.
    </Info>

    See the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/) for more details on pulling images from a private registry.
  </Step>
</Steps>

## 2. Deploying the Container

The container can be deployed either with the manifests in 2.1 or with the Helm chart described in 2.2.

### 2.1 Deploy the container

<Steps>
  <Step title="Setting up your license file">
    Log in to the Limina Customer Portal and download your license file.

    Once you've downloaded the file (`license.json`), open it in a text editor and paste its contents under the `license.info` key of a license manifest named `pai-license.yaml`.

    It should look something like this:

    ```yaml License Manifest wrap theme={"theme":"poimandres"}
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: pai-license
    data:
      license.info: |
        {
        "id": 1234,
        "tier": "demo",
        "expires_at": "2024-01-01T12:00:00+00:00",
        "permissions": [
            {
                "permission_type": "credits",
                "allowed_value": 1234
            },
            ...        
        ],
        "user": "<YourCustomerID>",
        "metering_id": "<anotherID>",
        "licensing_api_key": "<yetAnotherID>",
        "signature": "<SignatureHash>"
        }
    ```

    Now run the following command to load the ConfigMap with your license into your Kubernetes cluster:

    ```shell Kubernetes Command theme={"theme":"poimandres"}
    kubectl apply -f pai-license.yaml
    ```
  </Step>

  <Step title="Deploying the deid application">
    With the registry secret and license in place, create the manifest file `deploy-private-ai.yaml`:

    <CodeGroup>
      ```yaml CPU Configuration lines wrap theme={"theme":"poimandres"}
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: private-ai-deployment
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: private-ai-app
        template:
          metadata:
            labels:
              app: private-ai-app
          spec:
            affinity:
              podAntiAffinity:  # So that only one pod runs per node.
                requiredDuringSchedulingIgnoredDuringExecution:
                  - labelSelector:
                      matchExpressions:
                        - key: app
                          operator: In
                          values:
                            - private-ai-app
                    topologyKey: "kubernetes.io/hostname"
            imagePullSecrets:
              - name: pai-cr-creds
            containers:
              - name: pai-container
                image: <private-ai-image>:<tag> # replace placeholders with appropriate image name and tag, example: crprivateaiprod.azurecr.io/deid:3.3.2-cpu
                volumeMounts:
                  - name: license-volume
                    mountPath: /app/license
                readinessProbe:
                  failureThreshold: 3
                  httpGet:
                    path: /healthz
                    port: 8080
                    scheme: HTTP
                  initialDelaySeconds: 30
                  periodSeconds: 10
                  successThreshold: 1
                  timeoutSeconds: 10
                livenessProbe:
                  failureThreshold: 3
                  httpGet:
                    path: /healthz
                    port: 8080
                    scheme: HTTP
                  initialDelaySeconds: 40
                  periodSeconds: 60
                  successThreshold: 1
                  timeoutSeconds: 10
            volumes:
              - name: license-volume
                configMap:
                  name: pai-license
                  items:
                  - key: "license.info"
                    path: "license.json"
            terminationGracePeriodSeconds: 120
      ---
      apiVersion: v1  # To see available service types https://kubernetes.io/docs/concepts/services-networking/service/
      kind: Service
      metadata:
        name: private-ai-service
      spec:
        type: LoadBalancer
        selector:
          app: private-ai-app
        ports:
          - name: http
            port: 80
            targetPort: 8080
      ```

      ```yaml GPU Configuration lines wrap theme={"theme":"poimandres"}
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: private-ai-deployment
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: private-ai-app
        template:
          metadata:
            labels:
              app: private-ai-app
          spec:
            affinity:
              podAntiAffinity:  # So that only one pod runs per node.
                requiredDuringSchedulingIgnoredDuringExecution:
                  - labelSelector:
                      matchExpressions:
                        - key: app
                          operator: In
                          values:
                            - private-ai-app
                    topologyKey: "kubernetes.io/hostname"
            imagePullSecrets:
              - name: pai-cr-creds
            containers:
              - name: pai-container
                image: <private-ai-image>:<tag> # replace placeholders with appropriate image name and tag, example: crprivateaiprod.azurecr.io/deid:3.3.2-gpu
                resources:
                  requests:
                    nvidia.com/gpu: 1
                  limits:
                    nvidia.com/gpu: 1
                volumeMounts:
                  - name: license-volume
                    mountPath: /app/license
                  - name: dshm-volume
                    mountPath: /dev/shm
                readinessProbe:
                  failureThreshold: 3
                  httpGet:
                    path: /healthz
                    port: 8080
                    scheme: HTTP
                  initialDelaySeconds: 30
                  periodSeconds: 10
                  successThreshold: 1
                  timeoutSeconds: 10
                livenessProbe:
                  failureThreshold: 3
                  httpGet:
                    path: /healthz
                    port: 8080
                    scheme: HTTP
                  initialDelaySeconds: 40
                  periodSeconds: 60
                  successThreshold: 1
                  timeoutSeconds: 10
            volumes:
              - name: license-volume
                configMap:
                  name: pai-license
                  items:
                  - key: "license.info"
                    path: "license.json"
              - name: dshm-volume
                emptyDir:
                  medium: Memory
            terminationGracePeriodSeconds: 120
      ---
      apiVersion: v1  # To see available service types https://kubernetes.io/docs/concepts/services-networking/service/
      kind: Service
      metadata:
        name: private-ai-service
      spec:
        type: LoadBalancer
        selector:
          app: private-ai-app
        ports:
          - name: http
            port: 80
            targetPort: 8080
      ```
    </CodeGroup>

    Now create the deployment with this `kubectl` command:

    ```shell Kubernetes Command theme={"theme":"poimandres"}
    kubectl apply -f deploy-private-ai.yaml
    ```
  </Step>
</Steps>
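
As an alternative to pasting the license into `pai-license.yaml` by hand, the manifest from the first step can be generated with a small script. The following is a minimal sketch, assuming `license.json` sits in the current directory; the helper name `build_license_manifest` is hypothetical and not part of any Limina tooling.

```python
import json
from pathlib import Path


def build_license_manifest(license_json: str, name: str = "pai-license") -> str:
    """Return a ConfigMap manifest embedding the license under `license.info`."""
    # Parse first so a truncated or corrupted download fails here, not in the pod.
    json.loads(license_json)
    # Indent each line so it sits inside the YAML block scalar (`|`);
    # whitespace-only lines become empty lines, which JSON ignores.
    body = "\n".join(
        ("    " + line) if line.strip() else ""
        for line in license_json.strip().splitlines()
    )
    return (
        "apiVersion: v1\n"
        "kind: ConfigMap\n"
        "metadata:\n"
        f"  name: {name}\n"
        "data:\n"
        "  license.info: |\n"
        f"{body}\n"
    )


if __name__ == "__main__":
    source = Path("license.json")  # assumed location of the downloaded license
    if source.exists():
        Path("pai-license.yaml").write_text(build_license_manifest(source.read_text()))
```

The generated file is applied the same way as the handwritten one, with `kubectl apply -f pai-license.yaml`.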

### 2.2 Deploy the container via Helm

Limina supports installation to a Kubernetes cluster via Helm. Before you begin, ensure that you have [Helm installed](https://helm.sh/docs/intro/install/).

Our public Helm chart is hosted [here](https://github.com/privateai/private-ai-helm/tree/main). Download the chart, replace the placeholder license with your license file (from the Customer Portal), and run the following from the chart directory:

```shell Helm Command theme={"theme":"poimandres"}
helm package .
helm install --namespace <namespace> private-ai ./private-ai-0.1.0.tgz
```

Replace `<namespace>` with the namespace of your choice in your cluster.

<Info>
  The Helm chart for the Limina container can also be used to deploy on OpenShift Container Platform clusters.
</Info>

## 3. Post Deployment

### 3.1 Checking the status of containers

After a successful deployment, check the status of the pods with this command:

```shell Kubernetes Command theme={"theme":"poimandres"}
kubectl get pods
```

Expected output:

```text Output theme={"theme":"poimandres"}
NAME          READY   STATUS    RESTARTS   AGE
<pod-name>    1/1     Running   0          1m
```
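
For scripted checks, the same information is available as JSON from `kubectl get pods -o json`. The sketch below is a minimal example under stated assumptions: the pods carry the `app=private-ai-app` label from the `deploy-private-ai.yaml` manifest in section 2.1 (a Helm install may use different labels), and the helper names `ready_pods` and `fetch_pod_list` are hypothetical.

```python
import json
import subprocess


def ready_pods(pod_list: dict) -> list[str]:
    """Return names of pods that are Running with every container ready."""
    names = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        if (
            status.get("phase") == "Running"
            and containers
            and all(c.get("ready") for c in containers)
        ):
            names.append(pod["metadata"]["name"])
    return names


def fetch_pod_list(selector: str = "app=private-ai-app") -> dict:
    """Run kubectl and parse its JSON output (needs kubectl on PATH and cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-l", selector, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `ready_pods(fetch_pod_list())` returns an empty list while the container is still starting, which corresponds to `0/1` in the `READY` column.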

To check the logs, run this command with your pod name:

```shell Kubernetes Command theme={"theme":"poimandres"}
kubectl logs <pod-name>    # replace <pod-name> with the pod name from the command above
```

Expected output:

```text Output theme={"theme":"poimandres"}
Log level is: info
Image Version: <version>
INFO:     Started server process [9]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://ip:port (Press CTRL+C to quit)
INFO:     ip:port - "GET /healthz HTTP/1.1" 200 OK
 model time is 44.28 ms or 89.05 percent, rx time is 0.42 ms or 0.85 percent, total time: 49.73 ms
Auth call to Limina servers took 154.39 ms
Got 100000 calls from PAI auth system
 1 API calls used, 99999 remaining until next auth call. Total processing time is 0.05 secs, 19.91 API calls per sec.
INFO:     ip:port - "POST /deidentify_text HTTP/1.1" 200 OK
```

The `deploy-private-ai.yaml` manifest above also creates a LoadBalancer service, which exposes an external IP address for reaching your application. To find the external IP, run:

```shell Kubernetes Command theme={"theme":"poimandres"}
kubectl get svc
```

Expected output:

```text Output theme={"theme":"poimandres"}
NAME                 TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
private-ai-service   LoadBalancer   <cluster-ip>   <external-ip>   80:30456/TCP   27m
```
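
Before sending real traffic, it can help to confirm that the service answers on the external IP. The sketch below polls the `/healthz` path used by the probes in `deploy-private-ai.yaml`, assuming that path is reachable through the LoadBalancer on port 80. The helper names `http_status` and `wait_until_healthy` and the retry defaults are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout_s: float = 5.0) -> int:
    """Return the HTTP status code of a GET, or 0 if the endpoint is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_healthy(base_url, attempts=30, delay_s=10.0,
                       get_status=http_status, sleep=time.sleep):
    """Poll <base_url>/healthz until it returns 200 or the attempts run out."""
    health_url = base_url.rstrip("/") + "/healthz"
    for attempt in range(attempts):
        if get_status(health_url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

For example, `wait_until_healthy("http://<external-ip>")` returns `True` once the container reports healthy, and `False` if it never does within the allotted attempts.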

### 3.2 Making requests

You can use the `external-ip` of the LoadBalancer service (from the command above) to make requests to deidentify text.

<CodeGroup>
  ```json Request Body lines wrap theme={"theme":"poimandres"}
  {
    "text": [
      "Hi John, Grace this side. It's been a while since we last met in Berlin."
    ]
  }
  ```

  ```shell cURL lines wrap theme={"theme":"poimandres"}
  curl --location --request POST 'http://<external-ip>/process/text' \
  --header 'Content-Type: application/json' \
  --data-raw '{"text": ["Hi John, Grace this side. It'\''s been a while since we last met in Berlin."]}'
  ```

  ```python Python lines wrap theme={"theme":"poimandres"}
  import requests

  r = requests.post(url="http://<external-ip>/process/text",
                    json={"text": ["Hi John, Grace this side. It's been a while since we last met in Berlin."]})

  results = r.json()

  print(results)
  ```

  ```python Python Client lines wrap theme={"theme":"poimandres"}
  from privateai_client import PAIClient
  from privateai_client import request_objects

  client = PAIClient(url="http://<external-ip>")

  text_request = request_objects.process_text_obj(text=["Hi John, Grace this side. It's been a while since we last met in Berlin."])
  response = client.process_text(text_request)

  print(response.processed_text)
  ```
</CodeGroup>

You can expect a response like this:

```json Response lines wrap theme={"theme":"poimandres"}
[
  {
    "processed_text": "Hi [NAME_1], [NAME_2] this side. It's been a while since we last met in [LOCATION_CITY_1].",
    "entities": [...],
    "entities_present": true,
    "characters_processed": 1234,
    "languages_detected": {...}
  }
]
```
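
The response is a JSON array with one object per input text, so the redacted strings can be pulled out with a few lines of Python. This is a minimal sketch that relies only on the `processed_text` and `entities_present` fields shown above. The helper names are hypothetical, and the code assumes that response items come back in the same order as the input texts.

```python
def redacted_texts(response_items: list[dict]) -> list[str]:
    """Collect `processed_text` from each response item, in response order."""
    return [item["processed_text"] for item in response_items]


def indexes_with_entities(response_items: list[dict]) -> list[int]:
    """Return the positions of items in which at least one entity was detected."""
    return [
        index
        for index, item in enumerate(response_items)
        if item.get("entities_present")
    ]
```

With the `requests` example above, `redacted_texts(r.json())` would return a list containing the single redacted sentence.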

## Additional Resources

* [Autoscaling your Kubernetes deployment](https://getlimina.ai/blog/how-to-autoscale-kubernetes-pods-based-on-gpu/)
