Azure AI Foundry Agent Service (Preview)

Easily integrate Azure AI Foundry Agent Service capabilities into your workflows.
This connector is available in the following products and regions:
Service | Class | Regions |
---|---|---|
Copilot Studio | Premium | All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Logic Apps | Standard | All Logic Apps regions except the following: - Azure Government regions - Azure China regions - US Department of Defense (DoD) |
Power Apps | Premium | All Power Apps regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Power Automate | Premium | All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Contact | |
---|---|
Name | Microsoft |
URL | https://support.microsoft.com |
Connector Metadata | |
---|---|
Publisher | Microsoft |
Website | https://learn.microsoft.com/en-us/azure/ai-services/agents/ |
Privacy policy | https://learn.microsoft.com/en-us/legal/cognitive-services/agents/data-privacy-security |
Categories | AI;Business Intelligence |
Azure AI Agent Service is a fully managed service designed to empower developers to securely build, deploy, and scale high-quality, extensible AI agents without needing to manage the underlying compute and storage resources. Azure AI Agent Service integrates models, tools, and technology, and enables you to extend agents with knowledge from connected sources (such as Bing Search, SharePoint, Fabric, Azure Blob storage, and licensed data) and with actions using tools such as Azure Logic Apps, Azure Functions, OpenAPI 3.0 specified tools, and Code Interpreter.
Prerequisites
- Azure subscription - Create one for free.
- Create an Azure AI Agent Service resource in the Azure portal. Be sure to create your resource in a supported region.
- Be sure that you are assigned at least the Cognitive Services User role for the Azure AI Agent Service resource.
Get your credentials
To authenticate your API requests, you will need the endpoint for your Azure AI Agent Service resources.
For your Azure AI Agent Service resource:
- Navigate to your resource in the Azure portal.
- On the page for your resource, select Keys and endpoint on the left navigation menu. Make note of your credentials. You will use one of your keys and your endpoint.
Known issues and limitations
- This Power Platform connector supports only Azure AI Agent Service as an agent service, and only in Logic Apps. It is not supported in Power Automate or Power Apps.
Creating a connection
The connector supports the following authentication types:
Logic Apps Managed Identity | Create a connection using a LogicApps Managed Identity | LOGICAPPS only | Shareable |
Default | Log in with your credentials. | All regions except LOGICAPPS | Not shareable |
Logic Apps Managed Identity
Auth ID: managedIdentityAuth
Applicable: LOGICAPPS only
Create a connection using a LogicApps Managed Identity
This is a shareable connection. If the power app is shared with another user, the connection is shared as well. For more information, see the Connectors overview for canvas apps - Power Apps | Microsoft Docs.
Name | Type | Description | Required |
---|---|---|---|
LogicApps Managed Identity | managedIdentity | Sign in with a Logic Apps Managed Identity | True |
Azure AI Project Endpoint | string | The name of the Azure AI Project Endpoint like https://{account-name}.services.ai.azure.com/api/projects/{project-name} | True |
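The endpoint format in the table above can be assembled and sanity-checked before it is pasted into the connection dialog. The following is a minimal sketch, not part of the connector: the helper name `build_project_endpoint` and the sample account and project names are hypothetical.

```python
from urllib.parse import quote, urlparse


def build_project_endpoint(account_name: str, project_name: str) -> str:
    """Fill in the documented endpoint template (hypothetical helper)."""
    if not account_name or not project_name:
        raise ValueError("account_name and project_name must be non-empty")
    # Percent-encode the project name so it is safe as a single path segment.
    return (
        f"https://{account_name}.services.ai.azure.com"
        f"/api/projects/{quote(project_name, safe='')}"
    )


# Sample values are placeholders, not real resources.
endpoint = build_project_endpoint("contoso-ai", "support-bot")
parsed = urlparse(endpoint)
print(endpoint)
```

Checking the parsed scheme and host catches the most common copy-and-paste mistakes, such as a missing `https://` prefix.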
Default
Applicable: All regions except LOGICAPPS
Log in with your credentials.
This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.
Throttling Limits
Name | Calls | Renewal Period |
---|---|---|
API calls per connection | 1000 | 60 seconds |
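A caller that issues many requests may want to stay under this limit on its own side rather than rely on throttling errors. The sketch below is one possible client-side guard under stated assumptions: the class name `SlidingWindowLimiter` is hypothetical, and it uses a sliding window, whereas the source does not say whether the service renews its quota on a sliding or a fixed window.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any period_seconds window (hypothetical helper)."""

    def __init__(self, max_calls=1000, period_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period_seconds
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False


limiter = SlidingWindowLimiter()  # defaults mirror the table: 1000 calls per 60 seconds
print(limiter.try_acquire())
```

When `try_acquire` returns `False`, the caller can delay or queue the request instead of sending it.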
Actions
Create Run | Create Run |
Create Thread | Create Thread |
Get Run | Get Run |
List Agents | List Agents |
List Messages | List Messages |
Create Run
Create Run
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
API version | api-version | True | string | API version |
The ID of the thread to create a message for. | ThreadId | True | string | The ID of the thread to create a message for. |
assistant_id | assistant_id | True | string | The ID of the assistant to use to execute this run. |
model | model | | string | The model deployment name to be used to execute this run. If provided, it overrides the assistant's model deployment name. |
instructions | instructions | | string | Overrides the instructions of the assistant. Useful for modifying behavior on a per-run basis. |
additional_instructions | additional_instructions | | string | Appends additional instructions at the end of the instructions for the run. |
role | role | | string | The role of the entity that is creating the message. Can be user or assistant. 'user' indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. 'assistant' indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation. |
content | content | | string | The content of the message. |
name | name | | string | List of file ids or messages that can be used in the run. |
metadata | metadata | | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. |
name | name | | string | List of tools that can be used in the run. |
metadata | metadata | | object | Set of 16 key-value pairs attached to an object. Keys max length: 64 chars, Values max length: 512 chars. |
temperature | temperature | | number | Sampling temperature (0-2). Higher values (e.g., 0.8) increase randomness, lower values (e.g., 0.2) make output more deterministic. |
top_p | top_p | | number | Nucleus sampling alternative to temperature. 0.1 means top 10% probability mass is considered. |
stream | stream | | boolean | If true, returns a stream of events during the run as server-sent events, terminating with a 'data: [DONE]' message. |
max_prompt_tokens | max_prompt_tokens | | integer | The maximum number of prompt tokens that might be used over the run. If exceeded, the run ends as incomplete. |
max_completion_tokens | max_completion_tokens | | integer | The maximum number of completion tokens that might be used over the run. If exceeded, the run ends as incomplete. |
truncation_strategy | truncation_strategy | | object | Controls how a thread is truncated before the run to manage the initial context window. |
tool_choice | tool_choice | | object | Controls which tool the model calls. Defaults to 'auto', allowing the model to decide. Can be set to 'none' to disable tool usage. |
response_format | response_format | | object | Specifies the output format. Setting { 'type': 'json_object' } enables JSON mode for valid JSON responses. |
Returns
- Body
- createRunResponse
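To make the Create Run parameters concrete, here is a minimal sketch that assembles a body from a few of the documented keys and applies the documented temperature range. The helper `build_create_run_body` and the sample assistant ID are hypothetical; the connector builds the actual request itself, so this only illustrates which values belong together.

```python
from typing import Any, Optional


def build_create_run_body(
    assistant_id: str,
    *,
    instructions: Optional[str] = None,
    additional_instructions: Optional[str] = None,
    temperature: Optional[float] = None,
    max_prompt_tokens: Optional[int] = None,
    max_completion_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Collect Create Run parameters, omitting optional ones left unset (hypothetical helper)."""
    if not assistant_id:
        raise ValueError("assistant_id is required")
    # The parameter table documents a sampling temperature range of 0-2.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    optional = {
        "instructions": instructions,
        "additional_instructions": additional_instructions,
        "temperature": temperature,
        "max_prompt_tokens": max_prompt_tokens,
        "max_completion_tokens": max_completion_tokens,
    }
    body: dict[str, Any] = {"assistant_id": assistant_id}
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


# "asst_123" is a placeholder ID, not a real agent.
body = build_create_run_body("asst_123", temperature=0.2, max_completion_tokens=500)
print(body)
```

Leaving unset parameters out of the body, rather than sending nulls, lets the assistant's own defaults apply.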
Create Thread
Create Thread
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
API version | api-version | True | string | API version |
role | role | | string | The role of the entity that is creating the message. Can be user or assistant. 'user' indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. 'assistant' indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation. |
content | content | | string | The content of the message. |
name | name | | string | List of file ids or messages that can be used in the run. |
metadata | metadata | | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. |
metadata | metadata | | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. |
tool_resources | tool_resources | | object | A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. |
Returns
- Body
- createThreadResponse
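The metadata limits described above can be checked before a thread is created. The sketch below is a hypothetical helper, not part of the connector, and it reads "Set of 16 key-value pairs" as an upper bound of 16 pairs, which is an assumption about the wording.

```python
MAX_PAIRS = 16        # assumed to be an upper bound
MAX_KEY_LEN = 64      # documented maximum key length in characters
MAX_VALUE_LEN = 512   # documented maximum value length in characters


def validate_metadata(metadata: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the metadata fits."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_LEN:
            problems.append(f"key too long ({len(key)} chars): {key[:20]}...")
        if len(value) > MAX_VALUE_LEN:
            problems.append(f"value too long ({len(value)} chars) for key: {key[:20]}")
    return problems


# Sample metadata is illustrative only.
print(validate_metadata({"ticket": "4711", "channel": "email"}))
```

Returning all problems at once, instead of raising on the first, makes it easier to fix a metadata object in a single pass.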
Get Run
Get Run
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
API version | api-version | True | string | API version |
The ID of the thread to create a message for. | ThreadId | True | string | The ID of the thread to create a message for. |
The ID of the run. | RunId | True | string | The ID of the run. |
Returns
- Body
- getRunResponse
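Because a run moves through several statuses before it finishes, callers typically repeat Get Run until the run stops progressing. The following is a minimal polling sketch under stated assumptions: `wait_for_run` and the `get_run` callable are hypothetical stand-ins for however the Get Run action is invoked, and treating cancelled, failed, completed, and expired as terminal is an interpretation of the status list in getRunResponse.

```python
import time
from typing import Callable

# Assumed terminal statuses, taken from the documented status values.
TERMINAL_STATUSES = {"cancelled", "failed", "completed", "expired"}


def wait_for_run(
    get_run: Callable[[], dict],
    *,
    interval_seconds: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_run until the run is terminal or needs caller action."""
    for attempt in range(max_polls):
        run = get_run()
        status = run["status"]
        # requires_action is not terminal, but polling alone cannot move it forward.
        if status in TERMINAL_STATUSES or status == "requires_action":
            return run
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"run did not finish after {max_polls} polls")


# Simulated responses stand in for real Get Run calls.
fake_statuses = iter(["queued", "in_progress", "completed"])
result = wait_for_run(lambda: {"status": next(fake_statuses)}, sleep=lambda _: None)
print(result["status"])
```

A bounded number of polls keeps a stuck run from blocking the workflow indefinitely, and the polling interval should be chosen with the connector's throttling limit in mind.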
List Agents
List Agents
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
API version | api-version | True | string | API version |
Returns
- Body
- listAgentsResponse
List Messages
List Messages
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
API version | api-version | True | string | API version |
The ID of the thread to create a message for. | ThreadId | True | string | The ID of the thread to create a message for. |
Returns
- Body
- listMessageResponse
Definitions
listAgentsResponse
Name | Path | Type | Description |
---|---|---|---|
object | object | string | Details of the response object type. |
data | data | array of Data | The list of agents returned by the service. |
first_id | first_id | string | Details of the first id. |
last_id | last_id | string | Details of the last id. |
has_more | has_more | boolean | Indicates whether more agents are available. |
createThreadResponse
Name | Path | Type | Description |
---|---|---|---|
id | id | string | The identifier, which can be referenced in API endpoints. |
object | object | string | The object type, which is always thread. |
created_at | created_at | integer | The Unix timestamp (in seconds) for when the thread was created. |
metadata | metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. |
createRunResponse
Name | Path | Type | Description |
---|---|---|---|
id | id | string | The identifier, which can be referenced in API endpoints. |
object | object | string | The object type, which is always thread.run. |
created_at | created_at | integer | The Unix timestamp (in seconds) for when the run was created. |
thread_id | thread_id | string | The ID of the thread that was executed on as a part of this run. |
assistant_id | assistant_id | string | The ID of the assistant used for execution of this run. |
status | status | string | The status of the run, which can be either queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired. |
required_action | required_action | object | Details on the action required to continue the run. Will be null if no action is required. |
last_error | last_error | object | The last error associated with this run. Will be null if there are no errors. |
expires_at | expires_at | integer | The Unix timestamp (in seconds) for when the run will expire. |
started_at | started_at | integer | The Unix timestamp (in seconds) for when the run was started. |
cancelled_at | cancelled_at | integer | The Unix timestamp (in seconds) for when the run was canceled. |
failed_at | failed_at | integer | The Unix timestamp (in seconds) for when the run failed. |
completed_at | completed_at | integer | The Unix timestamp (in seconds) for when the run was completed. |
model | model | string | The model deployment name that the assistant used for this run. |
instructions | instructions | string | The instructions that the assistant used for this run. |
tools | tools | array of tools | The list of tools that the assistant used for this run. |
file_ids | file_ids | array of fileIds | The list of File IDs the assistant used for this run. |
metadata | metadata | object | Set of 16 key-value pairs that can be attached to an object. Keys can be a maximum of 64 characters long, and values can be a maximum of 512 characters long. |
tool_choice | tool_choice | object | Controls which (if any) tool is called by the model. 'none' means the model won't call any tools and instead generates a message. 'auto' means the model can pick between generating a message or calling a tool. Specifying a tool like {'type': 'file_search'} or {'type': 'function', 'function': {'name': 'my_function'}} forces the model to call that tool. |
max_prompt_tokens | max_prompt_tokens | number | The maximum number of prompt tokens specified to have been used over the course of the run. |
max_completion_tokens | max_completion_tokens | number | The maximum number of completion tokens specified to have been used over the course of the run. |
usage | usage | object | Usage statistics related to the run. This value will be null if the run is not in a terminal state (e.g., in_progress, queued). |
truncation_strategy | truncation_strategy | object | Controls how a thread is truncated prior to the run. |
response_format | response_format | string | The format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. |
getRunResponse
Name | Path | Type | Description |
---|---|---|---|
id | id | string | The identifier, which can be referenced in API endpoints. |
object | object | string | The object type, which is always thread.run. |
created_at | created_at | integer | The Unix timestamp (in seconds) for when the run was created. |
thread_id | thread_id | string | The ID of the thread that was executed on as a part of this run. |
assistant_id | assistant_id | string | The ID of the assistant used for execution of this run. |
status | status | string | The status of the run, which can be either queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired. |
required_action | required_action | object | Details on the action required to continue the run. Will be null if no action is required. |
last_error | last_error | object | The last error associated with this run. Will be null if there are no errors. |
expires_at | expires_at | integer | The Unix timestamp (in seconds) for when the run will expire. |
started_at | started_at | integer | The Unix timestamp (in seconds) for when the run was started. |
cancelled_at | cancelled_at | integer | The Unix timestamp (in seconds) for when the run was canceled. |
failed_at | failed_at | integer | The Unix timestamp (in seconds) for when the run failed. |
completed_at | completed_at | integer | The Unix timestamp (in seconds) for when the run was completed. |
model | model | string | The model deployment name that the assistant used for this run. |
instructions | instructions | string | The instructions that the assistant used for this run. |
tools | tools | array of tools | The list of tools that the assistant used for this run. |
file_ids | file_ids | array of fileIds | The list of File IDs the assistant used for this run. |
metadata | metadata | object | Set of 16 key-value pairs that can be attached to an object. Keys can be a maximum of 64 characters long, and values can be a maximum of 512 characters long. |
tool_choice | tool_choice | object | Controls which (if any) tool is called by the model. 'none' means the model won't call any tools and instead generates a message. 'auto' means the model can pick between generating a message or calling a tool. Specifying a tool like {'type': 'file_search'} or {'type': 'function', 'function': {'name': 'my_function'}} forces the model to call that tool. |
max_prompt_tokens | max_prompt_tokens | number | The maximum number of prompt tokens specified to have been used over the course of the run. |
max_completion_tokens | max_completion_tokens | number | The maximum number of completion tokens specified to have been used over the course of the run. |
usage | usage | object | Usage statistics related to the run. This value will be null if the run is not in a terminal state (e.g., in_progress, queued). |
truncation_strategy | truncation_strategy | object | Controls how a thread is truncated prior to the run. |
response_format | response_format | string | The format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. |
listMessageResponse
Name | Path | Type | Description |
---|---|---|---|
object | object | string | Details of the response object type. |
data | data | array of Data | The list of messages returned by the service. |
first_id | first_id | string | Details of the first id. |
last_id | last_id | string | Details of the last id. |
has_more | has_more | boolean | Indicates whether more messages are available. |
Data
Name | Path | Type | Description |
---|---|---|---|
id | id | string | The identifier, which can be referenced in API endpoints. |
object | object | string | The object type, which is always assistant. |
created_at | created_at | integer | The Unix timestamp (in seconds) for when the assistant was created. |
name | name | string | The name of the assistant. The maximum length is 256 characters. |
description | description | string | The description of the assistant. The maximum length is 512 characters. |
model | model | string | The name of the model deployment to use. |
instructions | instructions | string | The system instructions that the assistant uses. The maximum length is 32768 characters. |
tools | tools | array of tools | A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter or function. A function description can be a maximum of 1,024 characters. |
metadata | metadata | object | Set of 16 key-value pairs that can be attached to an object. Useful for storing additional information in a structured format. Keys can be a maximum of 64 characters long, and values can be a maximum of 512 characters long. |
temperature | temperature | number | Defaults to 1. Determines what sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. |
top_p | top_p | number | Defaults to 1. An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. |
response_format | response_format | object | Specifies the format that the model must output. Setting this parameter to { 'type': 'json_object' } enables JSON mode, ensuring the message is valid JSON. |
tool_resources | tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. |
tools
Name | Path | Type | Description |
---|---|---|---|
name | name | string | List of tools that can be used in the run. |
fileIds
Name | Path | Type | Description |
---|---|---|---|
name | name | string | List of file ids that can be used in the run. |