Quickstart: Detect Personally Identifiable Information (PII)
Note
This quickstart only covers PII detection in documents. To learn more about detecting PII in conversations, see How to detect and redact PII in conversations.
Reference documentation | More samples | Package (NuGet) | Library source code
Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for .NET. In the following example, you create a C# application that can identify recognized sensitive information in text.
Tip
You can use AI Foundry to try PII detection without needing to write code.
Prerequisites
- Azure subscription - Create one for free
- Once you have your Azure subscription, create an AI services resource.
- The Visual Studio IDE
Setting up
Create environment variables
Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.
To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.
- To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint
Note
If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.
After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.
Create a new .NET Core application
Using the Visual Studio IDE, create a new .NET Core console app. This creates a "Hello World" project with a single C# source file: program.cs.
Install the client library by right-clicking on the solution in the Solution Explorer and selecting Manage NuGet Packages. In the package manager that opens, select Browse and search for Azure.AI.TextAnalytics. Select version 5.2.0, and then Install. You can also use the Package Manager Console.
Code example
Copy the following code into your program.cs file and run the code.
using Azure;
using System;
using Azure.AI.TextAnalytics;

namespace Example
{
    class Program
    {
        // This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
        static string languageKey = Environment.GetEnvironmentVariable("LANGUAGE_KEY");
        static string languageEndpoint = Environment.GetEnvironmentVariable("LANGUAGE_ENDPOINT");

        private static readonly AzureKeyCredential credentials = new AzureKeyCredential(languageKey);
        private static readonly Uri endpoint = new Uri(languageEndpoint);

        // Example method for detecting sensitive information (PII) from text
        static void RecognizePIIExample(TextAnalyticsClient client)
        {
            string document = "Call our office at 312-555-1234, or send an email to support@contoso.com.";

            PiiEntityCollection entities = client.RecognizePiiEntities(document).Value;

            Console.WriteLine($"Redacted Text: {entities.RedactedText}");
            if (entities.Count > 0)
            {
                Console.WriteLine($"Recognized {entities.Count} PII entit{(entities.Count > 1 ? "ies" : "y")}:");
                foreach (PiiEntity entity in entities)
                {
                    Console.WriteLine($"Text: {entity.Text}, Category: {entity.Category}, SubCategory: {entity.SubCategory}, Confidence score: {entity.ConfidenceScore}");
                }
            }
            else
            {
                Console.WriteLine("No entities were found.");
            }
        }

        static void Main(string[] args)
        {
            var client = new TextAnalyticsClient(endpoint, credentials);
            RecognizePIIExample(client);

            Console.Write("Press any key to exit.");
            Console.ReadKey();
        }
    }
}
Output
Redacted Text: Call our office at ************, or send an email to *******************.
Recognized 2 PII entities:
Text: 312-555-1234, Category: PhoneNumber, SubCategory: , Confidence score: 0.8
Text: support@contoso.com, Category: Email, SubCategory: , Confidence score: 0.8
Reference documentation | More samples | Package (Maven) | Library source code
Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for Java. In the following example, you create a Java application that can identify recognized sensitive information in text.
Tip
You can use AI Foundry to try PII detection without needing to write code.
Prerequisites
- Azure subscription - Create one for free
- Once you have your Azure subscription, create an AI services resource.
- Java Development Kit (JDK) with version 8 or above
Setting up
Create environment variables
Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.
To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.
- To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint
Note
If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.
After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.
Add the client library
Create a Maven project in your preferred IDE or development environment. Then add the following dependency to your project's pom.xml file. You can find the implementation syntax for other build tools online.
<dependencies>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-ai-textanalytics</artifactId>
<version>5.2.0</version>
</dependency>
</dependencies>
Code example
Create a Java file named Example.java. Open the file and copy the code below. Then run the code.
import com.azure.core.credential.AzureKeyCredential;
import com.azure.ai.textanalytics.models.*;
import com.azure.ai.textanalytics.TextAnalyticsClientBuilder;
import com.azure.ai.textanalytics.TextAnalyticsClient;

public class Example {

    // This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
    private static String languageKey = System.getenv("LANGUAGE_KEY");
    private static String languageEndpoint = System.getenv("LANGUAGE_ENDPOINT");

    public static void main(String[] args) {
        TextAnalyticsClient client = authenticateClient(languageKey, languageEndpoint);
        recognizePiiEntitiesExample(client);
    }

    // Method to authenticate the client object with your key and endpoint
    static TextAnalyticsClient authenticateClient(String key, String endpoint) {
        return new TextAnalyticsClientBuilder()
                .credential(new AzureKeyCredential(key))
                .endpoint(endpoint)
                .buildClient();
    }

    // Example method for detecting sensitive information (PII) from text
    static void recognizePiiEntitiesExample(TextAnalyticsClient client)
    {
        // The text to be analyzed.
        String document = "My SSN is 859-98-0987";
        PiiEntityCollection piiEntityCollection = client.recognizePiiEntities(document);
        System.out.printf("Redacted Text: %s%n", piiEntityCollection.getRedactedText());
        piiEntityCollection.forEach(entity -> System.out.printf(
                "Recognized Personally Identifiable Information entity: %s, entity category: %s, entity subcategory: %s,"
                        + " confidence score: %f.%n",
                entity.getText(), entity.getCategory(), entity.getSubcategory(), entity.getConfidenceScore()));
    }
}
Output
Redacted Text: My SSN is ***********
Recognized Personally Identifiable Information entity: 859-98-0987, entity category: USSocialSecurityNumber, entity subcategory: null, confidence score: 0.650000.
Reference documentation | More samples | Package (npm) | Library source code
Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for Node.js. In the following example, you create a JavaScript application that can identify recognized sensitive information in text.
Prerequisites
- Azure subscription - Create one for free
- Once you have your Azure subscription, create an AI services resource.
- Node.js v14 LTS or later
Setting up
Create environment variables
Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.
To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.
- To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint
Note
If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.
After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.
Create a new Node.js application
In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.
mkdir myapp
cd myapp
Run the npm init command to create a Node.js application with a package.json file.
npm init
Install the client library
Install the npm package:
npm install @azure/ai-text-analytics
Code example
Create a new JavaScript file (for example, index.js), copy the code below into it, and then run it with Node.js.
"use strict";
const { TextAnalyticsClient, AzureKeyCredential } = require("@azure/ai-text-analytics");

// This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
const key = process.env.LANGUAGE_KEY;
const endpoint = process.env.LANGUAGE_ENDPOINT;

async function main() {
    console.log(`PII recognition sample`);

    const client = new TextAnalyticsClient(endpoint, new AzureKeyCredential(key));

    // An example document for PII recognition
    const documents = ["My phone number is 555-555-5555"];

    // recognizePiiEntities returns one result per input document
    const [result] = await client.recognizePiiEntities(documents, "en");

    if (!result.error) {
        console.log(`Redacted text: "${result.redactedText}"`);
        console.log("Pii Entities: ");
        for (const entity of result.entities) {
            console.log(`\t- "${entity.text}" of type ${entity.category}`);
        }
    }
}

main().catch((err) => {
    console.error("The sample encountered an error:", err);
});
Output
PII recognition sample
Redacted text: "My phone number is ************"
Pii Entities:
- "555-555-5555" of type PhoneNumber
Reference documentation | More samples | Package (PyPi) | Library source code
Use this quickstart to create a Personally Identifiable Information (PII) detection application with the client library for Python. In the following example, you'll create a Python application that can identify recognized sensitive information in text.
Prerequisites
- Azure subscription - Create one for free
- Once you have your Azure subscription, create an AI services resource.
- Python 3.8 or later
Setting up
Create environment variables
Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.
To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.
- To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint
Note
If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.
After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.
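The Python sample later in this section reads both variables with os.environ.get, which returns None when a variable is not set, so a missing value may only show up later as a less obvious error. The following is a minimal sketch of a fail-fast check, not part of the client library; the helper name `load_language_settings` and the trailing-slash normalization are assumptions made for illustration.

```python
import os


def load_language_settings(env=None):
    """Return (key, endpoint), or raise if either variable is missing.

    `load_language_settings` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    names = ("LANGUAGE_KEY", "LANGUAGE_ENDPOINT")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    # Strip a trailing slash so paths can be appended to the endpoint consistently.
    return env["LANGUAGE_KEY"], env["LANGUAGE_ENDPOINT"].rstrip("/")
```

Calling `load_language_settings()` with no arguments reads the real process environment; passing a dictionary makes the check easy to try without changing your machine's settings.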
Install the client library
After installing Python, you can install the client library with:
pip install azure-ai-textanalytics==5.2.0
Code example
Create a new Python file and copy the below code. Then run the code.
import os

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

# This example requires environment variables named "LANGUAGE_KEY" and "LANGUAGE_ENDPOINT"
language_key = os.environ.get('LANGUAGE_KEY')
language_endpoint = os.environ.get('LANGUAGE_ENDPOINT')

# Authenticate the client using your key and endpoint
def authenticate_client():
    ta_credential = AzureKeyCredential(language_key)
    text_analytics_client = TextAnalyticsClient(
            endpoint=language_endpoint,
            credential=ta_credential)
    return text_analytics_client

client = authenticate_client()

# Example method for detecting sensitive information (PII) from text
def pii_recognition_example(client):
    documents = [
        "The employee's SSN is 859-98-0987.",
        "The employee's phone number is 555-555-5555."
    ]
    response = client.recognize_pii_entities(documents, language="en")
    # Keep only the documents that were processed without an error
    result = [doc for doc in response if not doc.is_error]
    for doc in result:
        print("Redacted Text: {}".format(doc.redacted_text))
        for entity in doc.entities:
            print("Entity: {}".format(entity.text))
            print("\tCategory: {}".format(entity.category))
            print("\tConfidence Score: {}".format(entity.confidence_score))
            print("\tOffset: {}".format(entity.offset))
            print("\tLength: {}".format(entity.length))

pii_recognition_example(client)
Output
Redacted Text: The ********'s SSN is ***********.
Entity: employee
    Category: PersonType
    Confidence Score: 0.97
    Offset: 4
    Length: 8
Entity: 859-98-0987
    Category: USSocialSecurityNumber
    Confidence Score: 0.65
    Offset: 22
    Length: 11
Redacted Text: The ********'s phone number is ************.
Entity: employee
    Category: PersonType
    Confidence Score: 0.96
    Offset: 4
    Length: 8
Entity: 555-555-5555
    Category: PhoneNumber
    Confidence Score: 0.8
    Offset: 31
    Length: 12
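The Offset and Length values in the output above locate each entity in the input string. As a rough illustration of how they relate to the redacted text, the sketch below masks each span locally. The `redact` function is hypothetical and only mimics the service's output; it assumes offsets count Unicode code points, which is how Python indexes strings.

```python
def redact(text, spans, mask="*"):
    """Replace each (offset, length) span in `text` with mask characters."""
    chars = list(text)
    for offset, length in spans:
        # Overwrite the span with the same number of mask characters.
        chars[offset:offset + length] = mask * length
    return "".join(chars)


text = "The employee's SSN is 859-98-0987."
spans = [(4, 8), (22, 11)]  # (offset, length) pairs taken from the output above
print(redact(text, spans))  # prints: The ********'s SSN is ***********.
```

Because each span is replaced by exactly as many characters as it covers, the redacted string keeps the same length as the input, so offsets remain valid after masking.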
Use this quickstart to send Personally Identifiable Information (PII) detection requests using the REST API. In the following example, you will use cURL to identify recognized sensitive information in text.
Prerequisites
- Azure subscription - Create one for free
- Once you have your Azure subscription, create an AI services resource.
Setting up
Create environment variables
Your application must be authenticated to send API requests. For production, use a secure way of storing and accessing your credentials. In this example, you will write your credentials to environment variables on the local machine running the application.
To set the environment variable for your Language resource key, open a console window, and follow the instructions for your operating system and development environment.
- To set the LANGUAGE_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the LANGUAGE_ENDPOINT environment variable, replace your-endpoint with the endpoint for your resource.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
setx LANGUAGE_KEY your-key
setx LANGUAGE_ENDPOINT your-endpoint
Note
If you only need to access the environment variables in the current running console, you can set the environment variable with set instead of setx.
After you add the environment variables, you may need to restart any running programs that will need to read the environment variables, including the console window. For example, if you are using Visual Studio as your editor, restart Visual Studio before running the example.
Create a JSON file with the example request body
In a code editor, create a new file named test_pii_payload.json and copy the following JSON example. This example request will be sent to the API in the next step.
{
    "kind": "PiiEntityRecognition",
    "parameters": {
        "modelVersion": "latest"
    },
    "analysisInput": {
        "documents": [
            {
                "id": "1",
                "language": "en",
                "text": "Call our office at 312-555-1234, or send an email to support@contoso.com"
            }
        ]
    }
}
Save test_pii_payload.json somewhere on your computer, for example, on your desktop.
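If you want to send more than one document, the same request body can be generated rather than written by hand. The following sketch builds a body with the same shape as the JSON example above; the function name `build_pii_payload` and the sequential numbering of document IDs are assumptions made for illustration.

```python
import json


def build_pii_payload(texts, language="en", model_version="latest"):
    """Build a PiiEntityRecognition request body for a list of strings."""
    return {
        "kind": "PiiEntityRecognition",
        "parameters": {"modelVersion": model_version},
        "analysisInput": {
            "documents": [
                # IDs are sequential strings: "1", "2", ...
                {"id": str(i), "language": language, "text": text}
                for i, text in enumerate(texts, start=1)
            ]
        },
    }


payload = build_pii_payload(["Call our office at 312-555-1234."])
print(json.dumps(payload, indent=2))
```

Saving the printed JSON to test_pii_payload.json gives a file you can pass to the request in the next step.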
Send a Personally Identifiable Information (PII) detection API request
Use the command below that matches the shell you're using to send the API request. Copy the command into your terminal, and run it.
Parameter | Description |
---|---|
-X POST <endpoint> | Specifies your endpoint for accessing the API. |
-H "Content-Type: application/json" | The content type for sending JSON data. |
-H "Ocp-Apim-Subscription-Key: <key>" | Specifies the key for accessing the API. |
-d <documents> | The JSON containing the documents you want to send. |
Replace C:\Users\<myaccount>\Desktop\test_pii_payload.json with the location of the example JSON request file you created in the previous step.
Command prompt
curl -X POST "%LANGUAGE_ENDPOINT%/language/:analyze-text?api-version=2022-05-01" ^
-H "Content-Type: application/json" ^
-H "Ocp-Apim-Subscription-Key: %LANGUAGE_KEY%" ^
-d "@C:\Users\<myaccount>\Desktop\test_pii_payload.json"
PowerShell
curl.exe -X POST $env:LANGUAGE_ENDPOINT/language/:analyze-text?api-version=2022-05-01 `
-H "Content-Type: application/json" `
-H "Ocp-Apim-Subscription-Key: $env:LANGUAGE_KEY" `
-d "@C:\Users\<myaccount>\Desktop\test_pii_payload.json"
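For readers who would rather script the call than use cURL, the sketch below builds the same POST request with Python's standard library. It mirrors the URL path, API version, and headers of the commands above; the function names `build_analyze_request` and `send` are hypothetical, and `send` is defined but not called here because it needs a real endpoint and key.

```python
import json
import urllib.request

API_VERSION = "2022-05-01"  # same API version as the cURL commands above


def build_analyze_request(endpoint, key, payload):
    """Create (but do not send) the POST request for the analyze-text operation."""
    url = f"{endpoint.rstrip('/')}/language/:analyze-text?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers before any data leaves your machine.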
JSON response
{
    "kind": "PiiEntityRecognitionResults",
    "results": {
        "documents": [{
            "redactedText": "Call our office at ************, or send an email to *******************",
            "id": "1",
            "entities": [{
                "text": "312-555-1234",
                "category": "PhoneNumber",
                "offset": 19,
                "length": 12,
                "confidenceScore": 0.8
            }, {
                "text": "support@contoso.com",
                "category": "Email",
                "offset": 53,
                "length": 19,
                "confidenceScore": 0.8
            }],
            "warnings": []
        }],
        "errors": [],
        "modelVersion": "2021-01-15"
    }
}
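Once you have the JSON response, the entities can be read from results.documents[].entities. The sketch below walks a trimmed copy of the response above and keeps entities at or above a confidence threshold. The function name `entities_above` and the 0.5 threshold are illustrative assumptions, not recommendations from the service.

```python
import json

# A trimmed copy of the response shown above, with a single entity.
SAMPLE_RESPONSE = """
{
  "kind": "PiiEntityRecognitionResults",
  "results": {
    "documents": [{
      "redactedText": "Call our office at ************",
      "id": "1",
      "entities": [{
        "text": "312-555-1234", "category": "PhoneNumber",
        "offset": 19, "length": 12, "confidenceScore": 0.8
      }],
      "warnings": []
    }],
    "errors": [],
    "modelVersion": "2021-01-15"
  }
}
"""


def entities_above(response, threshold):
    """Yield (document id, entity) pairs whose confidence meets the threshold."""
    for doc in response["results"]["documents"]:
        for entity in doc["entities"]:
            if entity["confidenceScore"] >= threshold:
                yield doc["id"], entity


response = json.loads(SAMPLE_RESPONSE)
for doc_id, entity in entities_above(response, 0.5):
    print(doc_id, entity["category"], entity["text"])
```

Documents that fail are reported separately in the errors array, so a fuller script would check that array as well.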
Clean up resources
If you want to clean up and remove an Azure AI services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.