APPLIES TO: Developer | Basic | Basic v2 | Standard | Standard v2 | Premium | Premium v2
The `llm-content-safety` policy enforces content safety checks on large language model (LLM) requests (prompts) by transmitting them to the Azure AI Content Safety service before sending them to the backend LLM API. When the policy is enabled and Azure AI Content Safety detects malicious content, API Management blocks the request and returns a `403` error code.
Note
The terms category and categories used in API Management are synonymous with harm category and harm categories in the Azure AI Content Safety service. Details can be found on the Harm categories in Azure AI Content Safety page.
Use the policy in scenarios such as the following:
- Block requests that contain predefined categories of harmful content or hate speech
- Apply custom blocklists to prevent specific content from being sent
- Shield against prompts that match attack patterns
Note
Set the policy's elements and child elements in the order provided in the policy statement. Learn more about how to set or edit API Management policies.
Prerequisites
- An Azure AI Content Safety resource.
- An API Management backend configured to route content safety API calls and authenticate to the Azure AI Content Safety service:
  - API Management's managed identity must be configured on the Azure AI Content Safety service with the Cognitive Services User role.
  - The Azure AI Content Safety backend URL, referenced by `backend-id` in the `llm-content-safety` policy, needs to be in the form `https://<content-safety-service-name>.cognitiveservices.azure.com`.
  - The Azure AI Content Safety backend's authorization credentials need to be set to Managed identity, with the exact resource ID `https://cognitiveservices.azure.com`.
Policy statement
<llm-content-safety backend-id="name of backend entity" shield-prompt="true | false" enforce-on-completions="true | false">
<categories output-type="FourSeverityLevels | EightSeverityLevels">
<category name="Hate | SelfHarm | Sexual | Violence" threshold="integer" />
<!-- If there are multiple categories, add more category elements -->
[...]
</categories>
<blocklists>
<id>blocklist-identifier</id>
<!-- If there are multiple blocklists, add more id elements -->
[...]
</blocklists>
</llm-content-safety>
Attributes
Attribute | Description | Required | Default |
---|---|---|---|
backend-id | Identifier (name) of the Azure AI Content Safety backend to route content-safety API calls to. Policy expressions are allowed. | Yes | N/A |
shield-prompt | If set to `true`, content is checked for user attacks. Otherwise, skip this check. Policy expressions are allowed. | No | `false` |
enforce-on-completions | If set to `true`, content safety checks are enforced on chat completions for response validation. Otherwise, skip this check. Policy expressions are allowed. | No | `false` |
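
For example, the following fragment is a minimal sketch (reusing the `content-safety-backend` backend name from the example later in this article) that sets `enforce-on-completions` to `true` so that chat completion responses are validated in addition to the incoming prompt:

<llm-content-safety backend-id="content-safety-backend" shield-prompt="true" enforce-on-completions="true">
    <!-- With enforce-on-completions="true", the same category checks also apply to the model's responses -->
    <categories output-type="FourSeverityLevels">
        <category name="Sexual" threshold="4" />
    </categories>
</llm-content-safety>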
Elements
Element | Description | Required |
---|---|---|
categories | A list of `category` elements that specify settings for blocking requests when the category is detected. | No |
blocklists | A list of blocklist `id` elements from the Azure AI Content Safety instance for which detection causes the request to be blocked. Policy expressions are allowed. | No |
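
For example, to block requests that match one or more custom blocklists, list the blocklist IDs inside a `blocklists` element. The following fragment is a minimal sketch; the IDs shown (`my-custom-blocklist`, `banned-terms`) are hypothetical placeholders for blocklists defined in your Azure AI Content Safety resource:

<llm-content-safety backend-id="content-safety-backend">
    <blocklists>
        <!-- Hypothetical blocklist IDs; replace with IDs defined in your Content Safety resource -->
        <id>my-custom-blocklist</id>
        <id>banned-terms</id>
    </blocklists>
</llm-content-safety>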
categories attributes
Attribute | Description | Required | Default |
---|---|---|---|
output-type | Specifies how severity levels are returned by Azure AI Content Safety. The attribute must have one of the following values:<br/>- `FourSeverityLevels`: Output severities in four levels: 0, 2, 4, 6.<br/>- `EightSeverityLevels`: Output severities in eight levels: 0, 1, 2, 3, 4, 5, 6, 7.<br/>Policy expressions are allowed. | No | `FourSeverityLevels` |
category attributes
Attribute | Description | Required | Default |
---|---|---|---|
name | Specifies the name of this category. The attribute must have one of the following values: `Hate`, `SelfHarm`, `Sexual`, `Violence`. Policy expressions are allowed. | Yes | N/A |
threshold | Specifies the threshold value for this category at which requests are blocked. Requests with content severities less than the threshold aren't blocked. The value must be between 0 (most restrictive) and 7 (least restrictive). Policy expressions are allowed. | Yes | N/A |
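
To illustrate how `output-type` and `threshold` interact, the following minimal sketch (again reusing the `content-safety-backend` backend name from the example later in this article) keeps the default `FourSeverityLevels` output, in which severities are reported as 0, 2, 4, or 6. With a threshold of 2, the `SelfHarm` category allows only severity 0 content through, while a threshold of 6 for `Violence` blocks only content reported at the highest severity:

<llm-content-safety backend-id="content-safety-backend">
    <categories output-type="FourSeverityLevels">
        <!-- Severities less than the threshold are allowed; equal or higher severities are blocked -->
        <category name="SelfHarm" threshold="2" />
        <category name="Violence" threshold="6" />
    </categories>
</llm-content-safety>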
Usage
- Policy sections: inbound
- Policy scopes: global, workspace, product, API
- Gateways: classic, v2, consumption, self-hosted, workspace
Usage notes
- The policy runs on a concatenation of all text content in a completion or chat completion request.
- If the request exceeds the character limit of Azure AI Content Safety, a `403` error is returned.
- This policy can be used multiple times per policy definition.
Example
The following example enforces content safety checks on LLM requests using the Azure AI Content Safety service. The policy blocks requests that contain speech in the `Hate` or `Violence` category with a severity level of 4 or higher. In other words, the filter allows severity levels 0-3 to continue while levels 4-7 are blocked. Raising a category's threshold raises the tolerance and potentially decreases the number of blocked requests; lowering the threshold lowers the tolerance and potentially increases the number of blocked requests. The `shield-prompt` attribute is set to `true` to check for adversarial attacks.
<policies>
<inbound>
<llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
<categories output-type="EightSeverityLevels">
<category name="Hate" threshold="4" />
<category name="Violence" threshold="4" />
</categories>
</llm-content-safety>
</inbound>
</policies>
Related policies
- Content validation
- llm-token-limit policy
- llm-emit-token-metric policy
Related content
For more information about working with policies, see:
- Tutorial: Transform and protect your API
- Policy reference for a full list of policy statements and their settings
- Policy expressions
- Set or edit policies
- Reuse policy configurations
- Policy snippets repo
- Policy playground repo
- Azure API Management policy toolkit
- Get Copilot assistance to create, explain, and troubleshoot policies