Text moderation concepts

Important

Azure Content Moderator is deprecated as of February 2024 and will be retired by February 2027. It is replaced by Azure AI Content Safety, which offers advanced AI features and enhanced performance.

Azure AI Content Safety is a comprehensive solution designed to detect harmful user-generated and AI-generated content in applications and services. Azure AI Content Safety is suitable for many scenarios such as online marketplaces, gaming companies, social messaging platforms, enterprise media companies, and K-12 education solution providers. Here's an overview of its features and capabilities:

  • Text and Image Detection APIs: Scan text and images for sexual content, violence, hate, and self-harm with multiple severity levels (a text analysis request sketch follows this list).
  • Content Safety Studio: An online tool designed to handle potentially offensive, risky, or undesirable content using our latest content moderation ML models. It provides templates and customized workflows that enable users to build their own content moderation systems.
  • Language support: Azure AI Content Safety supports more than 100 languages and is specifically trained on English, German, Japanese, Spanish, French, Italian, Portuguese, and Chinese.
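
For illustration, a minimal sketch of calling the text detection API from Python with the requests library might look like the following. The text:analyze path, api-version value, and response field names are assumptions based on the public Content Safety REST reference; verify them against the current documentation and substitute your own resource endpoint and key.

import requests

# Assumed values: replace with your Content Safety resource endpoint and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-key>"

def analyze_text(text: str) -> dict:
    # Assumed path and api-version; check the Content Safety REST reference.
    url = f"{ENDPOINT}/contentsafety/text:analyze"
    headers = {"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "application/json"}
    response = requests.post(url, params={"api-version": "2023-10-01"},
                             headers=headers, json={"text": text}, timeout=10)
    response.raise_for_status()
    return response.json()

result = analyze_text("Sample user comment to screen.")
# Expected shape: one severity value per category (hate, sexual, violence, self-harm).
for item in result.get("categoriesAnalysis", []):
    print(item.get("category"), item.get("severity"))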

Azure AI Content Safety provides a robust and flexible solution for your content moderation needs. By switching from Content Moderator to Azure AI Content Safety, you can take advantage of the latest tools and technologies to ensure that your content is always moderated to your exact specifications.

Learn more about Azure AI Content Safety and explore how it can elevate your content moderation strategy.

You can use Azure Content Moderator's text moderation models to analyze text content, such as chat rooms, discussion boards, chatbots, e-commerce catalogs, and documents.

The service response includes the following information:

  • Profanity: term-based matching with a built-in list of profane terms in various languages
  • Classification: machine-assisted classification into three categories
  • Personal data
  • Autocorrected text
  • Original text
  • Language
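
As a minimal sketch, the following Python code (using the requests library) sends text to the ProcessText/Screen operation and reads some of the response fields listed above. The query parameter names (classify, PII, autocorrect, language) and the Ocp-Apim-Subscription-Key header are assumptions drawn from the Content Moderator REST reference; substitute your own resource endpoint and key.

import requests

# Assumed values: replace with your Content Moderator resource endpoint and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-key>"

def screen_text(text: str, language: str = "eng") -> dict:
    url = f"{ENDPOINT}/contentmoderator/moderate/v1.0/ProcessText/Screen"
    params = {
        "language": language,   # ISO 639-3 code, or omit to rely on language detection
        "classify": "True",     # machine-assisted classification (English only)
        "PII": "True",          # detect personal data
        "autocorrect": "True",  # return autocorrected text
    }
    headers = {"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "text/plain"}
    response = requests.post(url, params=params, headers=headers,
                             data=text.encode("utf-8"), timeout=10)
    response.raise_for_status()
    return response.json()

result = screen_text("The quick brown fox jumps over the lazzy dog. Mail someone@example.com.")
print(result.get("Language"))
for term in result.get("Terms") or []:   # Terms can be absent when nothing matches
    print("Matched term at index", term.get("Index"))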

Profanity

If the API detects any profane terms in any of the supported languages, those terms are included in the response. The response also contains their location (Index) in the original text. The ListId in the following sample JSON refers to terms found in custom term lists if available.

"Terms": [
    {
        "Index": 118,
        "OriginalIndex": 118,
        "ListId": 0,
        "Term": "<offensive word>"
    }
]

Note

For the language parameter, assign eng or leave it empty to see the machine-assisted classification response (preview feature). This feature supports English only.

For profanity terms detection, use the ISO 639-3 code of the supported languages listed in this article, or leave it empty.

Classification

Content Moderator's machine-assisted text classification feature supports English only and helps detect potentially undesired content. The flagged content might be assessed as inappropriate depending on context. The response conveys the likelihood of each category. The feature uses a trained model to identify possible abusive, derogatory, or discriminatory language, including slang, abbreviated words, offensive words, and intentionally misspelled words.

The following JSON extract shows an example output:

"Classification": {
    "ReviewRecommended": true,
    "Category1": {
        "Score": 1.5113095059859916E-06
    },
    "Category2": {
        "Score": 0.12747249007225037
    },
    "Category3": {
        "Score": 0.98799997568130493
    }
}

Explanation

  • Category1 refers to the potential presence of language that might be considered sexually explicit or adult in certain situations.
  • Category2 refers to the potential presence of language that might be considered sexually suggestive or mature in certain situations.
  • Category3 refers to the potential presence of language that might be considered offensive in certain situations.
  • Score is between 0 and 1. The higher the score, the higher the probability that the category might be applicable. This feature relies on a statistical model rather than manually coded outcomes. We recommend testing with your own content to determine how each category aligns to your requirements.
  • ReviewRecommended is either true or false depending on the internal score thresholds. Customers should assess whether to use this value or decide on custom thresholds based on their content policies, as sketched after this list.
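
As a minimal sketch of the custom-threshold approach, assuming the Classification block above has been parsed into a Python dictionary, review logic could look like this (the threshold values are illustrative only):

# Illustrative thresholds only; tune them against your own content and policies.
CATEGORY_THRESHOLDS = {
    "Category1": 0.8,  # sexually explicit or adult language
    "Category2": 0.8,  # sexually suggestive or mature language
    "Category3": 0.9,  # offensive language
}

def needs_review(classification: dict) -> bool:
    # Honor the built-in flag, then apply the custom per-category thresholds.
    if classification.get("ReviewRecommended"):
        return True
    return any(
        classification.get(category, {}).get("Score", 0.0) >= threshold
        for category, threshold in CATEGORY_THRESHOLDS.items()
    )

classification = {
    "ReviewRecommended": True,
    "Category1": {"Score": 1.5113095059859916e-06},
    "Category2": {"Score": 0.12747249007225037},
    "Category3": {"Score": 0.98799997568130493},
}
print(needs_review(classification))  # True: the flag is set and Category3 exceeds 0.9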

Personal data

The personal data feature detects the potential presence of this information:

  • Email address
  • US mailing address
  • IP address
  • US phone number

The following example shows a sample response:

"pii":{
  "email":[
      {
        "detected":"[email protected]",
        "sub_type":"Regular",
        "text":"[email protected]",
        "index":32
      }
  ],
  "ssn":[

  ],
  "ipa":[
      {
        "sub_type":"IPV4",
        "text":"255.255.255.255",
        "index":72
      }
  ],
  "phone":[
      {
        "country_code":"US",
        "text":"6657789887",
        "index":56
      }
  ],
  "address":[
      {
        "text":"1 Microsoft Way, Redmond, WA 98052",
        "index":89
      }
  ]
}
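
As one way to act on this response, the following sketch redacts each detected value from the original text by matching on the returned text fields. It assumes the pii block has been parsed into a Python dictionary shaped like the sample above.

def redact_pii(original_text: str, pii: dict) -> str:
    # Each key ("email", "ssn", "ipa", "phone", "address") maps to a list of matches.
    redacted = original_text
    for matches in pii.values():
        for match in matches:
            value = match.get("text")
            if value:
                redacted = redacted.replace(value, "[REDACTED]")
    return redacted

pii = {
    "email": [{"detected": "someone@example.com", "sub_type": "Regular",
               "text": "someone@example.com", "index": 5}],
    "phone": [{"country_code": "US", "text": "6657789887", "index": 33}],
}
print(redact_pii("Mail someone@example.com or call 6657789887.", pii))
# Output: Mail [REDACTED] or call [REDACTED].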

Autocorrection

The text moderation response can optionally return the text with basic autocorrection applied.

For example, the following input text has a misspelling.

The quick brown fox jumps over the lazzy dog.

If you specify autocorrection, the response contains the corrected version of the text:

The quick brown fox jumps over the lazy dog.
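
Assuming the Screen request enabled autocorrection (the autocorrect parameter in the earlier sketch), reading the corrected text from the response might look like the following. The AutoCorrectedText and OriginalText field names are assumptions, so check them against the JSON your resource actually returns.

# Assumed response field names; inspect the JSON your resource actually returns.
response_body = {
    "OriginalText": "The quick brown fox jumps over the lazzy dog.",
    "AutoCorrectedText": "The quick brown fox jumps over the lazy dog.",
}

corrected = response_body.get("AutoCorrectedText") or response_body["OriginalText"]
print(corrected)  # The quick brown fox jumps over the lazy dog.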

Create and manage your custom lists of terms

While the default, global list of terms works great for most cases, you might want to screen against terms that are specific to your business needs. For example, you might want to filter out any competitive brand names from posts by users.

Note

There is a maximum limit of five term lists, with each list not to exceed 10,000 terms.

The following example shows the matching List ID:

"Terms": [
    {
        "Index": 118,
        "OriginalIndex": 118,
        "ListId": 231.
        "Term": "<offensive word>"
    }

Content Moderator provides a Term List API with operations for managing custom term lists. Check out the Term Lists .NET quickstart if you're familiar with Visual Studio and C#.
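
If you're not using the .NET SDK, the REST flow can be sketched in Python as below. The termlists paths, the RefreshIndex step, the Id field in the create response, and the listId parameter on Screen are assumptions based on the Term List API reference; confirm the exact routes before relying on them. The list name and the term "fabrikam" are example values only.

import requests

# Assumed values: replace with your Content Moderator resource endpoint and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-key>"
HEADERS = {"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "application/json"}
BASE = f"{ENDPOINT}/contentmoderator/lists/v1.0/termlists"

# 1. Create a custom term list (maximum of five lists per resource).
created = requests.post(BASE, headers=HEADERS, timeout=10,
                        json={"Name": "Brand filter", "Description": "Competitor names"})
created.raise_for_status()
list_id = created.json()["Id"]

# 2. Add a term to the list (up to 10,000 terms per list).
requests.post(f"{BASE}/{list_id}/terms/fabrikam", headers=HEADERS,
              params={"language": "eng"}, timeout=10).raise_for_status()

# 3. Refresh the list's search index so newly added terms are matched.
requests.post(f"{BASE}/{list_id}/RefreshIndex", headers=HEADERS,
              params={"language": "eng"}, timeout=10).raise_for_status()

# 4. Screen text against the custom list; matches are returned with your ListId.
screen = requests.post(
    f"{ENDPOINT}/contentmoderator/moderate/v1.0/ProcessText/Screen",
    headers={"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "text/plain"},
    params={"language": "eng", "listId": list_id},
    data=b"I prefer fabrikam products.",
    timeout=10,
)
screen.raise_for_status()
print(screen.json().get("Terms"))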