How to use CIPH3R Playground Components to detect PII
- Peter
- Architecture, Data, Application, Gen AI, LangChain
- July 27, 2024
Components
There are two CIPH3R AI Playground components:
CIPH3R Shield
CIPH3R Detokenize
CIPH3R Shield
CIPH3R Shield is a Langflow add-on component that performs PII detection and tokenization on unstructured data such as documents. It supports the following anonymization methods: mask, redact, hash, and FPE (format-preserving encryption) tokenize.
The following inputs are required to tokenize a document. Processing detects the PII entities in the document and anonymizes them using CIPH3R FPE.
Input Parameters:
Payload
Fields to ignore (e.g., US_SSN, CREDIT_CARD, etc.)
Language (e.g., en)
Options (mask, redact, hash, tokenize)
CIPH3R ClientID
CIPH3R API Key
Output Parameters:
- Data
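The difference between the mask, redact, and hash options, and the effect of "Fields to ignore", can be illustrated with a minimal sketch. This is not CIPH3R Shield's implementation: the regex patterns, the `anonymize` function, and the exact output shapes are assumptions made for illustration, and "Fields to ignore" is assumed to mean entity types that are left untouched. The FPE tokenize option is omitted here because it needs a key and a real format-preserving cipher.

```python
import hashlib
import re

# Hypothetical detection patterns, for illustration only. The post does not
# describe how CIPH3R Shield actually detects entities.
PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL_ADDRESS": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

SUPPORTED_OPTIONS = {"mask", "redact", "hash"}


def anonymize(payload, option, fields_to_ignore=()):
    """Apply one anonymization option to every detected entity in payload."""
    if option not in SUPPORTED_OPTIONS:
        raise ValueError(f"unsupported option: {option}")

    def make_transform(entity):
        def transform(match):
            value = match.group(0)
            if option == "mask":
                # Replace every character of the entity with '*'.
                return "*" * len(value)
            if option == "redact":
                # Replace the entity with a placeholder naming its type.
                return f"<{entity}>"
            # option == "hash": one-way digest, truncated for readability.
            return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]
        return transform

    result = payload
    for entity, pattern in PATTERNS.items():
        if entity in fields_to_ignore:
            continue  # assumed meaning of "Fields to ignore"
        result = pattern.sub(make_transform(entity), result)
    return result


sample = "Contact jane@example.com, SSN 123-45-6789."
print(anonymize(sample, "redact"))
print(anonymize(sample, "redact", fields_to_ignore=("EMAIL_ADDRESS",)))
print(anonymize(sample, "mask"))
```

Mask, redact, and hash are one-way in this sketch: the original value cannot be recovered from the output, which is why only tokenized data can later be detokenized.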
CIPH3R Detokenize
CIPH3R Detokenize is a Langflow add-on component that performs the reverse operation: it detokenizes FPE-tokenized data, whether stored in a vector DB or in any form of document, using format-preserving decryption. Like CIPH3R Shield, it is designed for unstructured data.
The following inputs are required to detokenize data returned from any source; in a typical scenario this is a vector DB. Processing restores all PII entities that were tokenized with CIPH3R FPE.
Input Parameters:
Payload
Fields to ignore (e.g., EMAIL_ADDRESS, etc.)
Language (e.g., en)
Options (detokenize)
CIPH3R ClientID
CIPH3R API Key
Output Parameters:
- Data
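To see why a format-preserving token can be reversed while keeping the shape of the original value, consider the toy sketch below. It is not CIPH3R FPE and not a standardized FPE mode such as NIST FF1; it is an insecure keyed digit shift, and the function names and key are hypothetical. It only demonstrates the two properties the text relies on: the token keeps the original format (digits stay digits, separators stay in place), and the same key recovers the original value.

```python
import hashlib
import hmac

DIGITS = "0123456789"


def _keystream(key, length):
    """Derive `length` pseudo-random digits (0-9) from the key via HMAC-SHA256."""
    digits = []
    counter = 0
    while len(digits) < length:
        block = hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        digits.extend(byte % 10 for byte in block)
        counter += 1
    return digits[:length]


def _shift_digits(value, key, sign):
    """Shift each ASCII digit by its keystream digit; leave other characters as-is."""
    stream = iter(_keystream(key, sum(ch in DIGITS for ch in value)))
    out = []
    for ch in value:
        if ch in DIGITS:
            out.append(str((int(ch) + sign * next(stream)) % 10))
        else:
            out.append(ch)
    return "".join(out)


def toy_tokenize(value, key):
    """Toy stand-in for FPE tokenization (illustration only, not secure)."""
    return _shift_digits(value, key, +1)


def toy_detokenize(token, key):
    """Reverse toy_tokenize using the same key."""
    return _shift_digits(token, key, -1)


key = b"demo-key-not-a-real-secret"  # hypothetical key for the example
original = "123-45-6789"
token = toy_tokenize(original, key)
print(token)                       # same length, dashes in the same places
print(toy_detokenize(token, key))  # round-trips to the original value
```

Because the token has the same format as the original, it can be stored and retrieved like the real value and detokenized only at the point where the original is needed.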
New Project Startup Template
To kick-start your AI project, you may choose “CIPH3R Sample Flow” when you create a new project in CIPH3R AI Playground.
The sample flow consists of the following components:
Cohere (Model)
FAISS (VectorDB)
CIPH3R Shield
CIPH3R Detokenize
You may tweak this flow to use the model and vector DB of your choice.