An embedding is a data representation that machine learning models and algorithms can use easily. It is an information-dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space correlates with the semantic similarity between the two original inputs. For example, if two texts are similar, their vector representations should also be similar. Embeddings power vector similarity search in Azure databases such as Azure Cosmos DB for NoSQL, Azure Cosmos DB for MongoDB vCore, Azure SQL Database, or Azure Database for PostgreSQL - Flexible Server.
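To make "distance correlates with similarity" concrete, here is a minimal Python sketch that compares vectors with cosine similarity. Cosine similarity is one common choice of measure, not one this article prescribes, and the three-dimensional vectors are made-up stand-ins for real embeddings, which have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors with invented values; real embeddings come from the model.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# The related pair scores higher than the unrelated pair.
related = cosine_similarity(cat, kitten)
unrelated = cosine_similarity(cat, invoice)
print(related > unrelated)
```

A score near 1 means the vectors point in nearly the same direction, and a score near 0 means they are close to orthogonal.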
Prerequisites
- An Azure OpenAI embedding model deployed.
- The following values from your resource:
  - Endpoint, for example, `https://YOUR-RESOURCE-NAME.openai.azure.com/`.
  - API key.
  - Model deployment name.
For more language-specific setup guidance, see Azure OpenAI supported programming languages.
How to get embeddings
To obtain an embedding vector for a piece of text, make a request to the embeddings endpoint as shown in the following code snippets:
Note
The Azure OpenAI embeddings API does not currently support Microsoft Entra ID with the v1 API. Use API key authentication for the examples in this article.
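```csharp
using System;
using System.ClientModel;
using OpenAI;
using OpenAI.Embeddings;

// Create a client that targets your Azure OpenAI resource's v1 endpoint.
EmbeddingClient client = new(
    "text-embedding-3-small", // Replace with your model deployment name.
    credential: new ApiKeyCredential("API-KEY"),
    options: new OpenAIClientOptions()
    {
        Endpoint = new Uri("https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1")
    }
);

// Request an embedding for a single input string.
string input = "This is a test";
OpenAIEmbedding embedding = client.GenerateEmbedding(input);

// Read the embedding as a vector of floats and print it.
ReadOnlyMemory<float> vector = embedding.ToFloats();
Console.WriteLine($"Embeddings: [{string.Join(", ", vector.ToArray())}]");
```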
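For readers who want to see what the client sends, the following Python sketch builds the equivalent HTTP request by hand without sending it. It rests on assumptions not stated in this article: that the embeddings route is `/embeddings` under the `/openai/v1` base URL, that the key goes in an `api-key` header, and that the response carries the vector at `data[0].embedding`. The helper names `build_embeddings_request` and `extract_vector` are hypothetical.

```python
import json
import urllib.request


def build_embeddings_request(base_url, api_key, deployment, text):
    """Build (but do not send) a POST request for the embeddings route."""
    url = base_url.rstrip("/") + "/embeddings"  # assumed route under /openai/v1
    body = json.dumps({"model": deployment, "input": text}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "api-key": api_key,  # assumed header name for API key authentication
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_vector(response_json):
    """Pull the first embedding out of a parsed response (assumed shape)."""
    return response_json["data"][0]["embedding"]


request = build_embeddings_request(
    "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1",
    "API-KEY",
    "text-embedding-3-small",  # replace with your model deployment name
    "This is a test",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`, parse the response body with `json.load`, and hand the result to `extract_vector`.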
Best practices
Verify inputs don't exceed the maximum length
- The maximum length of input text for our latest embedding models is 8,192 tokens. You should verify that your inputs don't exceed this limit before making a request.
- When you send an array of inputs in a single embedding request, the array can contain at most 2,048 items.
- When you send an array of inputs in a single request, the total tokens per minute (TPM) across your requests must stay below the quota assigned at model deployment. By default, the latest generation 3 embedding models are subject to a limit of 350K TPM per region.
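The per-input and per-request limits above can be enforced before any request is made. The following Python sketch shows one way to do that. The `approx_token_count` function is a hypothetical stand-in that counts whitespace-separated words, which is not how the model tokenizes text; in practice you would pass a real tokenizer-based counter as `count_tokens`. The sketch does not track the TPM quota.

```python
MAX_TOKENS_PER_INPUT = 8192    # per-input token limit stated above
MAX_INPUTS_PER_REQUEST = 2048  # per-request array size limit stated above


def approx_token_count(text):
    """Hypothetical stand-in: counts words, NOT real model tokens."""
    return len(text.split())


def split_into_batches(inputs, count_tokens=approx_token_count):
    """Reject over-long inputs, then split the rest into request-sized arrays."""
    for index, text in enumerate(inputs):
        tokens = count_tokens(text)
        if tokens > MAX_TOKENS_PER_INPUT:
            raise ValueError(
                f"input {index} has {tokens} tokens; "
                f"the limit is {MAX_TOKENS_PER_INPUT}"
            )
    return [
        inputs[start:start + MAX_INPUTS_PER_REQUEST]
        for start in range(0, len(inputs), MAX_INPUTS_PER_REQUEST)
    ]


batches = split_into_batches(["hello world"] * 5000)
print([len(batch) for batch in batches])  # [2048, 2048, 904]
```

Each returned batch can then be sent as the input array of one embedding request.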
Troubleshooting
- If you get a `401` or `403` error, confirm the API key is valid for the resource.
- If you get a `404` error, confirm the endpoint includes the `/openai/v1/` path and you used the correct base URL.
- If you get a `400` error, confirm `model` is set to your deployment name and the request body is valid JSON.
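As a rough illustration, the checks above could be folded into a small helper that runs before or after a failed call. This is a sketch only: the function names `troubleshooting_hint` and `endpoint_has_v1_path` are hypothetical, and the hint text simply restates the guidance in this section.

```python
from urllib.parse import urlparse

# Hints keyed by HTTP status code, restating the list above.
HINTS = {
    401: "Confirm the API key is valid for the resource.",
    403: "Confirm the API key is valid for the resource.",
    404: "Confirm the endpoint includes the /openai/v1/ path and the base URL is correct.",
    400: "Confirm model is set to your deployment name and the body is valid JSON.",
}


def troubleshooting_hint(status_code):
    """Return the hint for a status code, or a generic fallback message."""
    return HINTS.get(status_code, "No specific hint for this status code.")


def endpoint_has_v1_path(endpoint):
    """Check that the endpoint URL's path ends with /openai/v1."""
    path = urlparse(endpoint).path.rstrip("/")
    return path.endswith("/openai/v1")


print(endpoint_has_v1_path("https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1"))
print(troubleshooting_hint(404))
```

Running the path check before the first request catches the most common cause of a `404` early.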
Limitations & risks
Our embedding models may be unreliable or pose social risks in certain cases, and may cause harm in the absence of mitigations. Review our Responsible AI content for more information on how to approach their use responsibly.
Next steps
- Learn more about using Azure OpenAI and embeddings to perform document search with our embeddings tutorial.
- Learn more about the underlying models that power Azure OpenAI.
- Store your embeddings and perform vector (similarity) search using your choice of service: