
Clarification on GPT-4o Model Retirement & Upgrade Behavior in Azure OpenAI

Bipin Kadam 60 Reputation points
2026-03-19T02:03:12.1533333+00:00

GPT-4o is scheduled for retirement on March 31, 2026. I currently have a deployment in Azure AI Foundry using GPT-4o (model version: 2024-08-06), with the version upgrade policy set to “Once a new default version is available”.

I would like clarification on the following:

  1. When GPT-4o (2024-08-06) is retired, will my existing deployment automatically upgrade to the next available default version (e.g., 2024-11-20 or another version)?
  2. How can we identify which model version is considered the “default” going forward, as this is not clear in the documentation?
  3. If the model is auto-upgraded, will there be any changes required on the client side (API interface, request/response format, parameters, etc.)?
  4. Are there any recommended steps to safely handle this transition to avoid service disruption?
Azure OpenAI Service

An Azure service that provides access to OpenAI’s language models with enterprise capabilities.


Answer accepted by question author
  1. SRILAKSHMI C 15,825 Reputation points Microsoft External Staff Moderator
    2026-03-19T09:55:22.9066667+00:00

    Hello Bipin Kadam,

    Welcome to Microsoft Q&A, and thank you for reaching out.

    1. What happens after March 31, 2026?

    Since your policy is Auto-update to default:

    • Your deployment will upgrade only when a new GPT-4o version is marked as the default.
    • Retirement alone does not trigger an upgrade.
    • If Microsoft designates a new default (e.g., a later GPT-4o version), your deployment will automatically move to it on the scheduled date.
    • If no default exists at that time, the deployment may:
      • Stop serving requests, or
      • Require manual redeployment.

    2. How to identify the default version?

    • Model deprecations & retirements documentation (shows current and next default versions).
    • Azure portal (deployment UI / model dropdown).
    • CLI:

      az cognitiveservices account deployment list ...

    • REST API (/deployments).
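As an illustration of the REST route, the deployments list response can be inspected for each deployment's model version and upgrade policy. This is a sketch over a made-up sample payload; the field names (`properties.model.version`, `versionUpgradeOption`) are assumed from the Microsoft.CognitiveServices deployments schema, so verify them against the actual response you receive.

```python
import json

# Hypothetical excerpt of a GET .../deployments response from the Azure
# management API. Field names follow the Microsoft.CognitiveServices
# deployments schema; the deployment name and values are made up.
sample_response = json.loads("""
{
  "value": [
    {
      "name": "gpt4o-prod",
      "properties": {
        "model": {"format": "OpenAI", "name": "gpt-4o", "version": "2024-08-06"},
        "versionUpgradeOption": "OnceNewDefaultVersionAvailable"
      }
    }
  ]
}
""")

# List each deployment with its model version and upgrade policy.
for d in sample_response["value"]:
    props = d["properties"]
    model = props["model"]
    print(f'{d["name"]}: {model["name"]} {model["version"]} '
          f'(upgrade policy: {props.get("versionUpgradeOption", "n/a")})')
```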

    3. Will client-side changes be needed?

    No API/interface changes are expected (the Responses and Chat Completions APIs remain the same), but expect some behavior differences:

    • Output quality/format
    • Token usage / latency
    • Tool or prompt handling

    In short: the upgrade is API-compatible but not behavior-identical, so testing against the new version is recommended.

    4. Recommended approach

    • Don’t rely only on auto-upgrade.
    • Create a test deployment with the next version.
    • Validate outputs, performance, and token usage.
    • Gradually switch traffic after validation.
    • Monitor announcements.
    • Use “Upgrade when expired” if you want an automatic transition at retirement, even when no new default has been designated.

    To summarize: auto-upgrade happens only when a new default is set; retirement alone won’t switch your deployment; and no code changes are expected, but test for behavior changes.
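The "gradually switch traffic" recommendation can be sketched as a simple probabilistic router between the current and the test deployment. The deployment names below are placeholders, and a real rollout would more likely use an API gateway or load balancer weighting than application code:

```python
import random

def pick_deployment(canary_fraction: float, rng: random.Random) -> str:
    """Route a request to the new deployment with probability canary_fraction.

    Deployment names are placeholders for illustration only.
    """
    return "gpt4o-next" if rng.random() < canary_fraction else "gpt4o-prod"

# Simulate 1000 requests with 10% of traffic on the new version.
rng = random.Random(42)
choices = [pick_deployment(0.1, rng) for _ in range(1000)]
print(choices.count("gpt4o-next"))  # roughly 100 of 1000 requests
```

As validation results come in, the fraction can be raised in steps (10% → 50% → 100%) while error rates and latency are monitored.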

    Please refer to the Model deprecations & retirements documentation mentioned above.

    I hope this helps. Do let me know if you have any further queries.


    If this answers your query, please take a moment to accept the answer. Your feedback is greatly appreciated.

    Thank you!


1 additional answer

  1. Q&A Assist
    2026-03-19T02:03:38.48+00:00
    AI-generated content. This answer was created with AI from Q&A Assist.

    For a GPT-4o 2024-08-06 deployment in Azure OpenAI (Azure AI Foundry), behavior is determined by the model version policy and the model retirement rules.

    1. What happens when GPT-4o 2024-08-06 is retired?

    For standard deployments, the relevant behaviors are:

    • Deployments set to Auto-update to default (or equivalent wording such as “Upgrade once new default version becomes available”) automatically move to the new default version when it is designated as default.
    • Deployments set to Upgrade when expired / Once the current version expires automatically update when their current version is retired.
    • Deployments set to No Auto Upgrade / Opt out of automatic model version upgrades stop working when the model version is retired.

    From the model-version guidance:

    • “Deployments set to Auto-update to default automatically update to use the new default version.”
    • “Deployments set to Upgrade when expired automatically update when its current version is retired.”
    • “Deployments that are set to No Auto Upgrade stop working when the model is retired.”

    In Foundry terminology:

    • “Deployments set to Upgrade once new default version becomes available automatically update to use the new default version.”
    • “Deployments set to Once the current version expires automatically update when their current version is retired.”

    Therefore:

    • If the policy on the GPT‑4o deployment is effectively “auto-update to default” (or “upgrade once new default version becomes available”), the deployment will move to whatever GPT‑4o version is designated as the new default when that change occurs.
    • If the policy is “upgrade when expired / once current version expires”, the deployment will move to the then-current default GPT‑4o version at the time the 2024‑08‑06 version is retired.
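A minimal sketch of how the three policies behave at retirement. The enum strings (`OnceNewDefaultVersionAvailable`, etc.) are assumed from the Microsoft.CognitiveServices deployments schema; verify the exact values against your own API responses:

```python
# Hypothetical mapping from the deployment's versionUpgradeOption value
# (enum names assumed from the Microsoft.CognitiveServices schema) to
# what happens when the current model version is retired.
RETIREMENT_BEHAVIOR = {
    "OnceNewDefaultVersionAvailable": "upgrades when a new default is designated",
    "OnceCurrentVersionExpired": "upgrades when the current version is retired",
    "NoAutoUpgrade": "stops working when the current version is retired",
}

def behavior_at_retirement(policy: str) -> str:
    """Describe the retirement behavior for a given upgrade policy."""
    return RETIREMENT_BEHAVIOR.get(policy, "unknown policy")

print(behavior_at_retirement("NoAutoUpgrade"))
```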
    2. How to identify which model version is the “default” going forward

    Default versions are published and can also be checked per deployment:

    • Documentation publishes a Default model versions table. For example, it shows gpt-4o current default version 2024-08-06 and notes when defaults change for other models.
    • In Azure Government, a separate table shows that gpt-4o default can differ by cloud (for example, 2024-11-20 as default in Azure Government vs 2024-08-06 in public cloud), illustrating that default is centrally defined and can change over time.

    To see the default and the version a deployment is actually using:

    • Foundry portal: Open the deployment → Details tab to see the current model version.
    • REST API: Call the deployments endpoint; the response includes the model version for each deployment.

    When Microsoft changes the default version of a model:

    • Customers are notified at least two weeks before a new version becomes the default.
    • Deployments configured to auto-update will switch to that new default within the documented window.
    3. Client-side impact when the deployment upgrades

    Model version changes and API version are separate:

    • “A given model deployment might support multiple API versions. The release of a new model version doesn't always require you to upgrade to a new API version, as is the case when there's an update to the model's weights.”
    • Azure “maintains the previous major version of a model until its retirement date, so you can switch back to it if needed.”

    Implications:

    • For typical GPT‑4o version upgrades (for example, 2024‑08‑06 → a later GPT‑4o version), the API interface, request/response format, and parameters generally remain the same because these are governed by the API version (api-version) rather than the model version.
    • If the API version in use continues to be supported by the new model version, no client-side code changes are required.
    • If a future model version required a newer API version, the change would be driven by the API version parameter, not by the model version policy itself; that scenario would be called out in release/retirement communications.
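The separation between api-version and model version is visible in how a request is addressed: the URL names a deployment and an api-version, while the model version is a property of the deployment itself. A small sketch (the endpoint and deployment name are placeholders; 2024-10-21 is used here as an example GA data-plane api-version):

```python
def chat_completions_url(endpoint: str, deployment: str, api_version: str) -> str:
    """Build the Azure OpenAI chat completions URL for a given deployment.

    The request addresses a *deployment* and an *api-version*; the model
    version behind the deployment can change (e.g. via auto-upgrade)
    without this URL or the request body changing.
    """
    return (f"{endpoint}/openai/deployments/{deployment}"
            f"/chat/completions?api-version={api_version}")

url = chat_completions_url(
    "https://my-resource.openai.azure.com",  # placeholder endpoint
    "gpt4o-prod",                            # placeholder deployment name
    "2024-10-21",                            # example GA api-version
)
print(url)
```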
    4. Recommended steps to safely handle the transition

    To minimize disruption around GPT‑4o retirement and default changes:

    1. Confirm the deployment’s upgrade policy
      • In the Foundry portal, open the deployment and verify whether it is set to:
        • Auto-update to default / Upgrade once new default version becomes available, or
        • Upgrade when expired / Once the current version expires, or
        • No auto upgrade.
      • For continuous service, avoid “No auto upgrade” on a retiring model; that setting causes the deployment to stop working at retirement.
    2. Monitor default version announcements
      • Track the Default model versions table and model retirement documentation for GPT‑4o to see when a new default is announced and when retirement dates are reached.
      • Use the two-week notification window before a default change to test the new version.
    3. Create a test deployment on the upcoming version
      • Before the production deployment auto-upgrades, manually deploy the newer GPT‑4o version (for example, the announced future default) as a separate test deployment.
      • Run regression tests to validate:
        • Response quality and behavior for key prompts.
        • Latency and throughput.
        • Any application-specific assumptions (token limits, output length, etc.).
    4. Keep API version stable while testing
      • Use the same api-version in test and production to confirm that only the model version is changing.
      • Change the API version only when required by documentation, and test that separately.
    5. Plan a rollback path
      • While the previous major model version is still within its support window, keep a deployment pinned to that version (with auto-upgrade disabled) for rollback during testing.
      • Note that once a model version reaches its retirement date, deployments pinned to that version will either:
        • Auto-upgrade (if configured to upgrade when expired), or
        • Stop working (if configured with no auto upgrade).
    6. Operational checks at and after upgrade
      • Add monitoring/alerts around:
        • Error rates from the GPT‑4o endpoint.
        • Latency and timeouts.
      • After the upgrade date, confirm in the portal or via the deployments API that the deployment is now using the expected new model version.
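Step 6's post-upgrade confirmation can be sketched as a check over one item of the deployments response. The field layout is assumed from the Microsoft.CognitiveServices deployments schema and the sample data is made up:

```python
def verify_deployment_version(deployment: dict, expected_version: str) -> bool:
    """Check that a deployment (one item from the /deployments response)
    is running the expected model version after an upgrade window.

    The field layout is assumed from the Microsoft.CognitiveServices
    deployments schema; adapt to the actual response you receive.
    """
    actual = deployment["properties"]["model"]["version"]
    return actual == expected_version

# Hypothetical response item after the upgrade window has passed.
upgraded = {
    "name": "gpt4o-prod",
    "properties": {"model": {"name": "gpt-4o", "version": "2024-11-20"}},
}
assert verify_deployment_version(upgraded, "2024-11-20")
print("deployment is on the expected version")
```

A check like this can run as a scheduled job around the announced upgrade date and raise an alert if the deployment is still on the retiring version.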

    Following these steps ensures that when GPT‑4o 2024‑08‑06 is retired, the deployment either automatically moves to the new default GPT‑4o version (per the configured policy) or is manually migrated in a controlled way, without requiring client-side API contract changes in the typical case.


