Supported Languages
Note that while the Limina text de-identification service supports more than 50 languages, the file processing service supports a restricted list of languages. See supported languages for more details.
Document File Types
| File Type | Extension | Content Type | Added In | Object Detection Support | Beta |
|---|---|---|---|---|---|
| PDF | .pdf | application/pdf | 3.0.0 | ✓ | |
| JSON | .json | application/json | 3.1.0 | ||
| XML | .xml | application/xml | 3.1.0 | ||
| CSV | .csv | text/csv | 3.1.0 | ||
| Word | .doc | application/msword | 3.1.0 | ✓ (partially) | |
| Word Open XML | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | 3.1.0 | ✓ (partially) | |
| Text | .txt | text/plain | 3.1.1 | ||
| Excel | .xls | application/vnd.ms-excel | 3.2.0 | ✓ (partially) | |
| Excel Open XML | .xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | 3.2.0 | ✓ (partially) | |
| PowerPoint | .ppt | application/vnd.ms-powerpoint | 3.5.0 | ✓ (partially) | |
| PowerPoint Open XML | .pptx | application/vnd.openxmlformats-officedocument.presentationml.presentation | 3.5.0 | ✓ (partially) | |
| DICOM | .dcm | application/dicom | 3.4.0 | ✓ |
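When uploading a document it can help to derive the content type from the file extension. The following is a minimal sketch that copies the extension and content type columns of the table above into a lookup; the names `DOCUMENT_CONTENT_TYPES` and `document_content_type` are illustrative and are not part of any Limina API.

```python
from pathlib import Path

# Extension -> content type, copied from the document file types table.
DOCUMENT_CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".json": "application/json",
    ".xml": "application/xml",
    ".csv": "text/csv",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".txt": "text/plain",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".ppt": "application/vnd.ms-powerpoint",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".dcm": "application/dicom",
}


def document_content_type(filename: str) -> str:
    """Return the content type for a supported document file name.

    Hypothetical helper: the match is case-insensitive on the extension,
    and a ValueError is raised for extensions not listed in the table.
    """
    ext = Path(filename).suffix.lower()
    try:
        return DOCUMENT_CONTENT_TYPES[ext]
    except KeyError:
        raise ValueError(f"unsupported document extension: {ext!r}") from None
```

For instance, `document_content_type("Report.PDF")` resolves to `application/pdf`, while an unlisted extension such as `.rtf` raises `ValueError` so that the caller can reject the file before sending it.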
A note on object detection support in Office documents
Object detection in Office files is partially supported. Embedded images in Office documents are only processed for object detection and redaction if they are compatible with our deidentifier. Non-compatible images are replaced with a black placeholder image. This ensures that sensitive data in non-supported formats is always processed, although it is not redacted with the same level of precision as data in supported formats.
Additionally:
- The bounding box coordinates within the `location` field (`x0`, `x1`, `y0`, `y1`) are relative to the embedded image itself, unlike in PDFs, where they are relative to the document page.
- The `page` field value remains `0` for Office files, as page numbering is not currently implemented for this file type.
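Client code that consumes bounding boxes may want to record which coordinate frame applies before drawing or comparing them. The sketch below encodes only the two cases stated above (Office files and PDFs) and reports everything else as unspecified; `coordinate_frame` and `OFFICE_EXTENSIONS` are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Office extensions taken from the document file types table.
OFFICE_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def coordinate_frame(filename: str) -> str:
    """Describe what the location coordinates (x0, x1, y0, y1) are relative to.

    Only the Office and PDF cases come from the note above; other file
    types are reported as "unspecified" rather than guessed.
    """
    ext = Path(filename).suffix.lower()
    if ext in OFFICE_EXTENSIONS:
        # For Office files the page field stays 0, so it carries no position.
        return "embedded image"
    if ext == ".pdf":
        return "document page"
    return "unspecified"
```

Keeping the frame explicit avoids treating an image-relative box from a `.docx` file as if it were a page-relative box from a PDF.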
Image File Types
Processing Image File Types
| File Type | Extension | Content Type | Added In | Object Detection Support |
|---|---|---|---|---|
| JPEG | .jpg, .jpeg | image/jpg, image/jpeg | 3.0.0 | ✓ |
| TIFF | .tif, .tiff | image/tif, image/tiff | 3.0.0 | ✓ |
| PNG | .png | image/png | 3.4.0 | ✓ |
| BMP | .bmp | image/bmp, image/x-ms-bmp | 3.4.0 | ✓ |
Audio File Types
Processing Audio File Types
| File Type | Extension | Content Type | Added In |
|---|---|---|---|
| wave | .wav | audio/wav | 3.0.0 |
| mp3 | .mp3 | audio/mpeg, audio/mp3 | 3.0.0 |
| mp4 | .mp4 | audio/mp4 | 3.0.0 |
| m4a | .m4a | audio/m4a | 3.5.0 |
| webm | .webm | audio/webm | 3.5.0 |
VOX files
.vox files are not natively supported in the Limina container, but they can be processed by first converting the .vox file to a wav or mp3 file with a conversion tool such as SoX. Because .vox files are headerless, you need to know the file's sample rate and encoding and specify them explicitly during conversion. For example, a .vox file with a sample rate of 8000 Hz, a single (mono) channel, and mu-law encoding can be converted to a wav file by passing those three parameters to the conversion tool.
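One way such a conversion could be scripted is sketched below. It assumes SoX is installed and on the `PATH`, and that the input should be read as raw, headerless audio using SoX's standard format options (`-t raw`, `-r`, `-e`, `-c`); the helper names and file names are illustrative, and the exact command should be checked against the SoX manual for your version.

```python
import subprocess


def build_sox_command(src: str, dst: str, sample_rate: int = 8000,
                      channels: int = 1, encoding: str = "mu-law") -> list[str]:
    """Build a SoX argument list that reads a headerless file and writes dst.

    The format options precede the input file, so they describe the input.
    The output format is inferred by SoX from the dst extension (e.g. .wav).
    """
    return [
        "sox",
        "-t", "raw",             # treat the input as raw, headerless audio
        "-r", str(sample_rate),  # input sample rate in Hz
        "-e", encoding,          # input encoding, e.g. mu-law
        "-c", str(channels),     # number of input channels (1 = mono)
        src,
        dst,
    ]


def convert_vox(src: str, dst: str, **kwargs) -> None:
    """Run the conversion; raises CalledProcessError if SoX fails."""
    subprocess.run(build_sox_command(src, dst, **kwargs), check=True)
```

Calling `convert_vox("input.vox", "output.wav")` with the defaults corresponds to the 8000 Hz, mono, mu-law case described above; the resulting wav file can then be submitted like any other supported audio file.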