# Processing PowerPoint (PPT/PPTX) Files

> This guide will get you started with PPT and PPTX de-identification.

Limina supports scanning Microsoft PowerPoint PPT and PPTX files for PII and creating de-identified or redacted copies. All of Limina’s supported entity types are detected in these files, including localized variants of **PII** (Personally Identifiable Information), **PHI** (Protected Health Information), and **PCI** (Payment Card Industry) entities. Our [Supported Languages](/languages) and [Supported Entity Types](/entities) pages provide a more detailed look.

<Info>
  If you'd like to try it yourself, please [sign up for an account](https://portal.getlimina.ai/) to get a free API key.
</Info>

## How PPTX Files Are Processed

PPTX files are processed by extracting each element and handling it according to the table below. The de-identified or redacted file is then created by applying the behaviour listed for each element type.

| Property Type          | Details                                                                                                                                      | Behaviour                                   |
| ---------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------- |
| Core properties        | Author, Category, Comments, Content Status, Identifier, Keywords, Language, Last Modified By, Subject, Title, Version                        | Redact                                      |
| Speaker notes          | Any content in the speaker notes                                                                                                             | Redact                                      |
| Tables                 | Table objects with text and images                                                                                                           | Redact                                      |
| Images                 | The [Images](/configuration-and-operations/working-with-files/processing-files/image) page provides a more detailed look at image processing | Redact; unsupported image types are removed |
| Text boxes             | Main slide content                                                                                                                           | Redact                                      |
| Embedded links         | Hyperlinks to internet pages or documents                                                                                                    | Remove                                      |
| External elements      | Tables and charts embedded from another document or file, such as an Excel chart                                                             | Remove external file, redact cached values  |
| Embedded audio & video | Videos and audio clips                                                                                                                       | Remove                                      |
| Review comments        | Comments from document reviews                                                                                                               | Redact                                      |
| Shape objects          | Shapes containing text                                                                                                                       | Redact                                      |

<Info>
  You can configure the OCR System by setting it as an [Environment Variable](/configuration-and-operations/container-management/environment-variables) or sending it in the request object. Check out our [OCR Guide](/configuration-and-operations/working-with-files/processing-files/ocr-modes) to further understand the OCR modes and their usage.
</Info>
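
To get a rough idea of which element types from the table above a given deck contains before sending it, the PPTX package can be inspected locally, since a PPTX file is a ZIP archive. The following is a minimal sketch using only the Python standard library. The part-name prefixes are the conventional ones written by PowerPoint and are an assumption here (other producers may lay out the package differently), and `inventory_pptx` is a hypothetical helper, not part of any Limina client.

```python
import zipfile
from collections import Counter

# Conventional part-name prefixes as written by PowerPoint (assumption:
# other tools may use a different package layout).
PART_PREFIXES = [
    ("docProps/core.xml", "core properties"),
    ("ppt/slides/slide", "slides"),
    ("ppt/notesSlides/notesSlide", "speaker notes"),
    ("ppt/media/", "media"),
    ("ppt/comments/", "review comments"),
    ("ppt/charts/chart", "charts"),
    ("ppt/embeddings/", "embedded objects"),
]


def inventory_pptx(source):
    """Count package parts per category.

    `source` is a file path or a binary file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            if "/_rels/" in name:
                continue  # relationship files are bookkeeping, not content
            for prefix, label in PART_PREFIXES:
                if name.startswith(prefix):
                    counts[label] += 1
                    break
    return dict(counts)
```

Calling `inventory_pptx("sample.pptx")` returns a dictionary of counts per category. This only lists what is present in the package; it says nothing about what the service will detect or redact.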

## How PPT Files Are Processed

PPT files are converted to PPTX, processed as described above, and then converted back to PPT.
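
Because both formats go through the same endpoint, a client mainly needs to pick the right `content_type`. The sketch below builds a request body in the shape used by this guide's sample request and chooses the media type from the file extension. The `.ppt` value is the standard IANA media type for legacy PowerPoint files; that the API expects exactly this string for PPT input is an assumption, and `build_payload` is a hypothetical helper name.

```python
import base64
from pathlib import Path

CONTENT_TYPES = {
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    # Assumption: the standard IANA type is what the API accepts for PPT files.
    ".ppt": "application/vnd.ms-powerpoint",
}


def build_payload(filename, file_bytes):
    """Return a request body dict for a PPT or PPTX file."""
    suffix = Path(filename).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"unsupported extension: {suffix!r}")
    return {
        "file": {
            "data": base64.b64encode(file_bytes).decode("ascii"),
            "content_type": CONTENT_TYPES[suffix],
        },
        "entity_detection": {"return_entity": True},
    }
```

The returned dictionary can be passed as the `json=` argument of `requests.post`.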

## Constraints

* If a piece of PII text has more than one style (different fonts, font sizes, underlining, etc.), the redaction marker uses the first style.
* Charts in PPTX files will have all of their numerical data set to 0.
* We recommend using Microsoft PowerPoint to open the processed PPT/PPTX files. Other editors may not give ideal results.

## Support Matrix

|           | CPU Container | GPU Container | Community API | Professional API |
| --------- | ------------- | ------------- | ------------- | ---------------- |
| Supported | Yes           | Yes           | Up to 10 MiB  | No               |
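
A client can check a file against the 10 MiB Community API limit before uploading. The sketch below assumes the limit applies to the raw file size; the table does not say whether it is measured before or after base64 encoding, so the encoded length is computed as well. Both function names are hypothetical.

```python
COMMUNITY_LIMIT_BYTES = 10 * 1024 * 1024  # 10 MiB = 10,485,760 bytes


def fits_community_limit(size_in_bytes, limit=COMMUNITY_LIMIT_BYTES):
    """True if a file of this size is within the limit (assumed to be on raw bytes)."""
    return size_in_bytes <= limit


def base64_length(size_in_bytes):
    """Length of the padded base64 encoding of `size_in_bytes` bytes."""
    return 4 * ((size_in_bytes + 2) // 3)
```

For a local file, `fits_community_limit(os.path.getsize(path))` (with `import os`) gives the check. Base64 encoding adds roughly a third to the size, which matters if the limit turns out to apply to the encoded payload.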

## Sample Request

<Info>
  [Connect with one of our privacy experts](https://getlimina.ai/contact-us/?utm_source=docs\&utm_medium=website) to run this code.
</Info>

<CodeGroup>
  ```json Request Body wrap lines theme={"theme":"poimandres"}
  {
    "file": {
      "data": "<file_content_base64>",
      "content_type": "application/vnd.openxmlformats-officedocument.presentationml.presentation"
    },
    "entity_detection": {
      "return_entity": true
    }
  }
  ```

  ```shell curl wrap lines theme={"theme":"poimandres"}
  echo '{
          "file": {
            "data": "'$(base64 -w 0 sample.pptx)'",
            "content_type": "application/vnd.openxmlformats-officedocument.presentationml.presentation"
          },
          "entity_detection": {"return_entity": true}
        }' \
  | curl --request POST --url 'https://api.getlimina.ai/community/v4/process/files/base64' \
         -H 'Content-Type: application/json' \
         -H 'x-api-key: <YOUR KEY HERE>' \
         -d @- \
         | jq -r .processed_file \
         | base64 -d > 'sample.redacted.pptx'
  ```

  ```python python wrap lines theme={"theme":"poimandres"}
  import requests
  import base64

  file_url = "https://paidocumentation.blob.core.windows.net/$web/sample.pptx"
  filename_out = "/path/to/output/sample.redacted.pptx"
  file_content = requests.get(file_url).content
  file_content_base64 = base64.b64encode(file_content).decode()

  url = "https://api.getlimina.ai/community/v4/process/files/base64"

  headers = {"Content-Type": "application/json", "x-api-key": "<INSERT API KEY>"}

  payload = {
    "file":{
      "data": file_content_base64,
      "content_type": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    },
    "entity_detection": {
      "return_entity": True
    }
  }

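  response = requests.post(url, json=payload, headers=headers)
  response.raise_for_status()  # stop on an HTTP error instead of failing on a missing key below

  # The redacted file is returned base64-encoded in "processed_file"
  with open(filename_out, "wb") as f:
      f.write(base64.b64decode(response.json()["processed_file"]))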
  ```

  ```python Python Client wrap lines theme={"theme":"poimandres"}
  from privateai_client import PAIClient
  from privateai_client.objects import request_objects
  import base64

  filename_in = "sample.pptx"
  filename_out = "sample.redacted.pptx"

  file_type = "application/vnd.openxmlformats-officedocument.presentationml.presentation"
  client = PAIClient(url="https://api.getlimina.ai/community/v4/", api_key="<YOUR API KEY>")

  with open(filename_in, "rb") as b64_file:
      file_data = base64.b64encode(b64_file.read())
      file_data = file_data.decode("ascii")

  file_obj = request_objects.file_obj(data=file_data, content_type=file_type)
  request_obj = request_objects.file_base64_obj(file=file_obj)
  resp = client.process_files_base64(request_object=request_obj)

  with open(filename_out, 'wb') as redacted_file:
      processed_file = resp.processed_file.encode("ascii")
      processed_file = base64.b64decode(processed_file, validate=True)
      redacted_file.write(processed_file)
  ```
</CodeGroup>

## Sample Response

```json Response wrap lines theme={"theme":"poimandres"}
{
  "processed_file": "Base64 Encoded File Content of the Redacted File",
  "processed_text": "string",
  "entities": "List[Entity]",
  "entities_present": true,
  "languages_detected": {"lang_1": 0.67, "lang_2": 0.74}
}
```
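
When handling a response like the one above, it can help to validate the `processed_file` field before writing it to disk. The following sketch decodes the field strictly and checks the leading bytes of the result: PPTX files are ZIP archives and legacy PPT files are OLE compound files, so each has a well-known signature. The helper names are hypothetical, and the structure of `entities` is deliberately not assumed.

```python
import base64

ZIP_MAGIC = b"PK\x03\x04"                          # PPTX (ZIP container)
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"    # legacy PPT (OLE compound file)


def decode_processed_file(response_json):
    """Return the decoded bytes of `processed_file`, or raise ValueError."""
    encoded = response_json.get("processed_file")
    if not encoded:
        raise ValueError("response has no 'processed_file' field")
    # validate=True rejects characters outside the base64 alphabet
    return base64.b64decode(encoded, validate=True)


def sniff_container(data):
    """Return 'zip', 'ole', or None based on the file signature."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(OLE_MAGIC):
        return "ole"
    return None
```

A caller might refuse to save the output when `sniff_container` returns `None`, since that suggests the payload is not a presentation file.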