This guide walks through the main features of the Limina API. It focuses on text-only applications, but the same approach extends to processing files.
This guide uses Limina’s cloud API. Get a free API key to run the examples. If you’re using the container instead, follow the container quickstart and replace the endpoint with http://localhost:8080.

Basic Use

The process/text endpoint accepts a list of text strings and replaces each piece of PII found with a redaction marker. A simple request looks like this:
{
  "text": [
    "Thank you for calling the Georgia Division of Transportation. My name is miss Johanna, and it is a pleasure assisting you today. For security reasons, may I please have your Social Security number? Yes, 614-5555 01."
  ]
}
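A minimal Python sketch of sending this request, using only the standard library. The base URL is the container endpoint mentioned above. The way the process/text path is appended to it and the x-api-key header name are assumptions made for illustration, so check the API reference for the exact URL and authentication header.

```python
import json
import urllib.request

# Container endpoint from this guide; the cloud base URL is not shown here.
BASE_URL = "http://localhost:8080"
# Assumed path layout for the process/text endpoint.
PROCESS_TEXT_PATH = "/process/text"


def build_request(texts, api_key=None, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying a list of strings."""
    body = json.dumps({"text": list(texts)}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Hypothetical header name; use whatever the API reference specifies.
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        base_url + PROCESS_TEXT_PATH, data=body, headers=headers, method="POST"
    )


def process_text(texts, **kwargs):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(texts, **kwargs)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running service.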
The response contains two main outputs:
  • processed_text, the redacted, masked, or synthetic text, as configured by the processed_text object in the request
  • entities, a list of the PII entities found, which is useful for PII detection and named entity recognition (NER)
Response
[
  {
    "processed_text": "Thank you for calling the [ORGANIZATION_1]. My name is miss [NAME_GIVEN_1], and it is a pleasure assisting you today. For security reasons, may I please have your Social Security number? Yes, [SSN_1].",
    "entities": [
      {
        "processed_text": "ORGANIZATION_1",
        "text": "Georgia Division of Transportation",
        "location": {
          "stt_idx": 26,
          "end_idx": 60,
          "stt_idx_processed": 26,
          "end_idx_processed": 42
        },
        "best_label": "ORGANIZATION",
        "labels": {
          "LOCATION_STATE": 0.2403,
          "LOCATION": 0.2342,
          "ORGANIZATION": 0.8967
        }
      },
      {
        "processed_text": "NAME_GIVEN_1",
        "text": "Johanna",
        "location": {
          "stt_idx": 78,
          "end_idx": 85,
          "stt_idx_processed": 60,
          "end_idx_processed": 74
        },
        "best_label": "NAME_GIVEN",
        "labels": {
          "NAME_GIVEN": 0.9127,
          "NAME": 0.9018
        }
      },
      {
        "processed_text": "SSN_1",
        "text": "614-5555 01",
        "location": {
          "stt_idx": 203,
          "end_idx": 214,
          "stt_idx_processed": 192,
          "end_idx_processed": 199
        },
        "best_label": "SSN",
        "labels": {
          "SSN": 0.913
        }
      }
    ],
    "entities_present": true,
    "characters_processed": 215,
    "languages_detected": {
      "en": 0.920992910861969
    }
  }
]
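The entities list can be read back against the original input. The sketch below is one way to do that, assuming stt_idx and end_idx are character offsets into the input string, as the sample response suggests. The function name summarize_entities is hypothetical and not part of the API.

```python
def summarize_entities(original_text, result):
    """Return (label, matched text, confidence) for each detected entity."""
    rows = []
    for entity in result["entities"]:
        loc = entity["location"]
        # Assumption: stt_idx/end_idx are character offsets into the input.
        span = original_text[loc["stt_idx"]:loc["end_idx"]]
        label = entity["best_label"]
        rows.append((label, span, entity["labels"][label]))
    return rows


# Shortened version of the sample above: one sentence, one entity.
original = "Thank you for calling the Georgia Division of Transportation."
result = {
    "processed_text": "Thank you for calling the [ORGANIZATION_1].",
    "entities": [
        {
            "processed_text": "ORGANIZATION_1",
            "text": "Georgia Division of Transportation",
            "location": {
                "stt_idx": 26,
                "end_idx": 60,
                "stt_idx_processed": 26,
                "end_idx_processed": 42,
            },
            "best_label": "ORGANIZATION",
            "labels": {"LOCATION": 0.2342, "ORGANIZATION": 0.8967},
        }
    ],
}

for row in summarize_entities(original, result):
    print(row)
```

In the sample, the processed offsets (26 to 42) cover the whole marker including its square brackets, while the entity's processed_text field holds the marker name without them.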
If the elements in the list of strings are related, the link_batch parameter can be used to share context throughout the list.
{
  "text": [
    "My phone number is",
    "2345435"
  ],
  "link_batch": true
}
With link_batch enabled, the inputs are joined before they reach the PII detection system. The model then sees "My phone number is 2345435" rather than two unrelated messages, "My phone number is" and "2345435", which allows the phone number to be identified correctly:
["My phone number is", "[PHONE_NUMBER_1]"]
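For conversational data it can help to pair each input turn with its redacted counterpart. This is a small sketch under the assumption that the response contains one result object per input string, in the same order, as the single-input example suggests. Both helper names are hypothetical.

```python
def build_linked_payload(turns):
    """Request body that shares context across all strings in the list."""
    return {"text": list(turns), "link_batch": True}


def pair_turns(turns, response):
    """Pair each input string with its processed text.

    Assumes one result per input, in input order.
    """
    if len(turns) != len(response):
        raise ValueError("expected one result per input string")
    return [(turn, item["processed_text"]) for turn, item in zip(turns, response)]


turns = ["My phone number is", "2345435"]
payload = build_linked_payload(turns)

# Hand-written stand-in for the API response, matching the output shown above.
fake_response = [
    {"processed_text": "My phone number is"},
    {"processed_text": "[PHONE_NUMBER_1]"},
]
for before, after in pair_turns(turns, fake_response):
    print(f"{before!r} -> {after!r}")
```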

Customizing Entity Detection With Selective Redaction

The above example identifies and removes all non-beta entity types. Granular control over entity detection and redaction can be set using Entity Selectors. For example, to only redact the SSN:
{
  "text": [
    "Thank you for calling the Georgia Division of Transportation. My name is miss Johanna, and it is a pleasure assisting you today. For security reasons, may I please have your Social Security number? Yes, 614-5555 01."
  ],
  "entity_detection": {
    "entity_types": [
      {
        "type": "ENABLE",
        "value": [
          "SSN"
        ]
      }
    ]
  }
}
The result of this selective redaction is below:
Thank you for calling the Georgia Division of Transportation. My name is miss Johanna, and it is a pleasure assisting you today. For security reasons, may I please have your Social Security number? Yes, [SSN_1].
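Selector objects are easy to mistype by hand, so a small builder can help. The sketch below covers only the two selector types that appear in this guide (ENABLE and DISABLE); the API may support others. The helper names are hypothetical.

```python
# Selector types shown in this guide; the API may define more.
KNOWN_SELECTOR_TYPES = {"ENABLE", "DISABLE"}


def entity_selector(kind, entity_types):
    """Build one Entity Selector object, e.g. ENABLE for ["SSN"]."""
    if kind not in KNOWN_SELECTOR_TYPES:
        raise ValueError(f"unknown selector type: {kind!r}")
    return {"type": kind, "value": list(entity_types)}


def with_selectors(texts, *selectors):
    """Build a request body that applies the given selectors."""
    return {
        "text": list(texts),
        "entity_detection": {"entity_types": list(selectors)},
    }


# Redact only Social Security numbers, as in the request above.
payload = with_selectors(
    ["May I please have your Social Security number? Yes, 614-5555 01."],
    entity_selector("ENABLE", ["SSN"]),
)
```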

Adding Allow & Block Lists

You can also customize PII detection and redaction using enable/disable Entity Selectors or regex-based Filters, enabling custom handling for company-specific identifiers such as employee IDs or internal database IDs. The example below shows how to combine Entity Selectors with Filters for fine-grained control. In this HR claim scenario, an employee reports a medical injury and requests accommodation. Here, we demonstrate:
  • Two regex-based block filters defining custom entity types for employee IDs and business units, overriding Limina’s defaults.
  • Disabling the injury entity, which may be required for insurance-related workflows.
  • Using a list for the text payload, as expected in conversational contexts, and enabling link_batch to maintain redaction context across the full thread.
  • Disabling numbering of redaction markers.
{
  "text": [
    "Hello Xavier, can you tell me your employee ID?",
    "Yep, my Best Corp ID is GID-45434, and my SIN is 690 871 283",
    "Okay, thanks Xavier, why are you calling today?",
    "I broke my right leg on the 31st and I'm waiting for my x-ray results. dr. zhang, mercer health centre.",
    "Oh, so sorry to hear that! How can we help?",
    "I won't be able to come back to the office in NYC for a while",
    "No problem Xavier, I will enter a short term work from home for you. You're all set!",
    "Thanks so much Carole!"
  ],
  "link_batch": true,
  "entity_detection": {
    "entity_types": [
      {
        "type": "DISABLE",
        "value": [
          "INJURY"
        ]
      }
    ],
    "filter": [
      {
        "type": "BLOCK",
        "entity_type": "EMPLOYEE_ID",
        "pattern": "GID-\\d{5}"
      },
      {
        "type": "BLOCK",
        "entity_type": "BUSINESS_UNIT",
        "pattern": "Best Corp"
      }
    ],
    "return_entity": true
  },
  "processed_text": {
    "type": "MARKER",
    "pattern": "[BEST_ENTITY_TYPE]"
  }
}
The above request yields this response:
Redacted Text
['Hello [NAME_GIVEN], can you tell me your employee ID?', 'Yep, my [BUSINESS_UNIT] ID is [EMPLOYEE_ID], and my SIN is [SSN]', 'Okay, thanks [NAME_GIVEN], why are you calling today?', "I broke my right leg on the [DATE] and I'm waiting for my [MEDICAL_PROCESS] results. [NAME_MEDICAL_PROFESSIONAL], [ORGANIZATION_MEDICAL_FACILITY].", 'Oh, so sorry to hear that! How can we help?', "I won't be able to come back to the office in [LOCATION_CITY] for a while", "No problem [NAME_GIVEN], I will enter a short term work from home for you. You're all set!", 'Thanks so much [NAME_GIVEN]!']
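Before sending block filters, it can be useful to check the patterns locally. The sketch below previews matches with Python's re module, assuming the API's regex dialect agrees with Python's for simple patterns like these; it is a local sanity check only and does not reproduce the API's detection. The helper names are hypothetical.

```python
import json
import re


def block_filter(entity_type, pattern):
    """Build a BLOCK filter; raises re.error if the pattern does not compile."""
    re.compile(pattern)
    return {"type": "BLOCK", "entity_type": entity_type, "pattern": pattern}


def preview_matches(filters, texts):
    """Return {entity_type: [matched strings]} using Python's regex engine."""
    found = {}
    for item in filters:
        regex = re.compile(item["pattern"])
        found[item["entity_type"]] = [
            match.group(0) for text in texts for match in regex.finditer(text)
        ]
    return found


filters = [
    block_filter("EMPLOYEE_ID", r"GID-\d{5}"),
    block_filter("BUSINESS_UNIT", "Best Corp"),
]
texts = ["Yep, my Best Corp ID is GID-45434, and my SIN is 690 871 283"]

print(preview_matches(filters, texts))
# json.dumps escapes the backslash, which is why the request body
# above spells the pattern with a double backslash.
print(json.dumps(filters[0]))
```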

Generating Synthetic Entities (Beta)

In addition to replacing PII with redaction markers, tokens, or masks, Limina can generate synthetic PII: realistic fake replacements, created with an ML model, that fit the surrounding context. This offers several advantages:
  • Synthetic PII preserves most of the original text, reducing the risk of introducing bias compared to generators that create entirely new data, and improving utility for downstream tasks like sentiment analysis.
  • Even though our PII detection engine leads the market, it isn’t perfect. Synthetic PII ensures that any PII detection misses are hidden amongst realistic, fake PII, strengthening protection against re-identification.
  • Synthetic entities resemble natural language more closely than redaction markers or hashes, minimizing disruption to downstream ML systems.
To enable synthetic PII generation, set the processed_text object’s marker type to SYNTHETIC in your API request.
{
  "text": [
    "Hello, my name is May. I am the aunt of Jessica Parker. We live in Toronto, Canada."
  ],
  "processed_text": {
    "type": "SYNTHETIC"
  }
}
This request yields the following response:
Hello, my name is Ben. I am the aunt of Michael Morley. We live in Ekshaku, Sweden.