Hello Shreyas,
The behavior you are seeing is almost certainly caused by an external re-trigger rather than the Durable Functions framework itself. The framework does not "ghost-retry" or automatically restart an orchestration once it has successfully reached the Completed state.
Here is a breakdown of why this happens and how to resolve it:
- **No Built-in Deduplication for Random GUIDs**
  When you call `client.start_new()` with `instance_id=None`, the framework generates a brand-new GUID. It has no awareness of your underlying business logic or `fileId`. Consequently, if your HTTP endpoint is called twice for the same file, two distinct orchestrations will run concurrently.
  Reference: Instance management in Durable Functions
- **Activity Function Re-execution vs. Replays**
  Durable Functions uses event sourcing to rebuild state. During a replay, the framework checks the execution history; if an activity function completed previously, its recorded result is returned from the history and the activity is not re-executed. If you see an activity function running from scratch, that guarantees a completely new orchestration instance was started.
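The replay behavior can be illustrated with a minimal, SDK-free sketch (the `history` dict and `run_activity` helper are hypothetical stand-ins for the framework's history table and activity dispatch, not Durable Functions APIs):

```python
# Sketch of event-sourcing replay: results of already-completed activities
# are served from recorded history instead of being re-executed.
execution_count = {"upload": 0}  # counts genuine executions only

def run_activity(name, history):
    """Return the recorded result if present; otherwise execute and record it."""
    if name in history:                  # replay: result comes from history
        return history[name]
    execution_count[name] += 1           # genuine first execution
    result = f"{name}-done"
    history[name] = result               # record for future replays
    return result

history = {}
first = run_activity("upload", history)   # first run: activity executes
replay = run_activity("upload", history)  # replay: served from history
```

After both calls, `execution_count["upload"]` is still 1: the second call never re-ran the activity, which is why a genuinely re-executing activity implies a new instance (with an empty history), not a replay.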
  Reference: Orchestrator function code constraints
- **The Notify Callback is Likely the Culprit**
  If your batch-complete callback issues an HTTP request that routes back to the same `/process` endpoint (or another endpoint that invokes `start_new`), it will spin up a fresh orchestration.
- **Implementing "Exactly-Once" Processing**
  To prevent duplicates, use a deterministic instance ID (e.g., `FileProcess-{fileId}`). When you pass a predictable ID to `start_new()`, the framework will return an HTTP 409 (Conflict) if an instance with that ID is already running or completed. Checking `client.get_status()` before starting is helpful, but relying on deterministic IDs offloads the concurrency lock directly to the Azure Storage provider, avoiding race conditions.
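The effect of a deterministic ID can be sketched without the Azure SDK; the `instances` dict stands in for the task hub's instance store, and the 409 status mirrors the conflict response described above (hypothetical helper, not the Durable Functions client API):

```python
# Sketch: deterministic instance IDs make duplicate requests collide
# instead of spawning parallel orchestrations.
instances = {}  # stands in for the task hub's instance table

def start_orchestration(file_id):
    instance_id = f"FileProcess-{file_id}"   # deterministic, not a random GUID
    if instance_id in instances:             # already running or completed
        return 409, instance_id              # conflict: refuse the duplicate
    instances[instance_id] = "Running"
    return 202, instance_id                  # accepted: new orchestration

status1, iid1 = start_orchestration("abc123")  # first request is accepted
status2, iid2 = start_orchestration("abc123")  # duplicate is rejected
```

Because both requests map to the same `FileProcess-abc123` ID, the second one cannot start a parallel orchestration, which is exactly the deduplication the random-GUID default cannot provide.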
  Reference: Exactly-once and at-most-once processing
- **`functionTimeout` Behavior**
  The `functionTimeout` setting in `host.json` applies to the maximum execution time of individual function invocations (which can be unlimited, `"-1"`, on Dedicated/Premium plans). It does not govern the broader lifecycle of the orchestration itself and will never cause a completed orchestration to restart.
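For reference, `functionTimeout` is a top-level setting in `host.json`; the value shown assumes a Dedicated or Premium plan, where `"-1"` means no timeout:

```json
{
  "version": "2.0",
  "functionTimeout": "-1"
}
```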
Hope this helps clarify the behavior. Let me know in the comments if any further clarification is needed.
Note: This response was generated with the help of an AI system.