Intermittent FileSystem Corruption I/O errors in Azure Container Instances (UK South)

Question

Scott Gray 0

Service: Azure Container Instances

Region: UK South

Image: debian:bullseye-slim (No recent changes)

We are experiencing what appears to be intermittent filesystem corruption and Input/Output error failures when running Azure DevOps self-hosted agents inside Azure Container Instances.

The issue mainly manifests during Git operations and normal file reads and writes. Files such as /etc/resolv.conf become unreadable with I/O errors, which in turn causes networking failures.

Error from Git:

fatal: fsync error on './git/objects/pack/tmp_idx_*': Input/Output error

fatal: index-pack failed

Ankit Yadav 14,455 Reputation points Microsoft External Staff Moderator

2026-04-10T18:14:56.5066667+00:00

Hello @Scott Gray

We have also reached out to you via Private Message for more details. Please review the private message and share the requested details there.

Ankit Yadav 14,455 Reputation points Microsoft External Staff Moderator

2026-04-13T18:16:41.1566667+00:00

Hello @Scott Gray

Just checking in to see whether you still need details, or whether the answer shared earlier resolved your query.

Let us know if you still need assistance, we're here to help you!

Answer 1

Hi Scott Gray,

This looks like an ACI host/storage-layer issue in that region, especially since the same workload works in UK West and you are seeing low-level I/O errors even on /etc/resolv.conf. That's not app-level; that's the underlying filesystem/overlay fs breaking.

First, stop trusting that region for this workload: move to UK West or another region as primary. That's the fastest fix.

Second, avoid heavy git/pack operations on ACI ephemeral storage; use Azure Files or a mounted volume for the workspace instead of the container overlay fs.

Third, reduce fsync pressure if possible (git can hammer the disk hard), but to be honest, if the host is flaky this won't fully save you.

Fourth, add retry logic on container runs (ACI sometimes lands you on a bad host, and the next run may be fine).

Fifth, capture logs and open an Azure support ticket with region + timestamps; this is a backend issue they need to investigate.

Optional: if this is a critical workload, consider moving to AKS or VM-based agents. ACI is not great for heavy I/O workloads.

rgds,

Alex
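
As a rough illustration of the "bad host, next run may be fine" point in the answer above, an agent's entrypoint could probe a few files before accepting work and exit early if they cannot be read, so that a fresh container run is attempted. This is only a sketch: the function names `probe_paths` and `has_io_error` and the choice of paths to probe are assumptions for illustration, not part of ACI or the Azure DevOps agent.

```python
import errno


def probe_paths(paths):
    """Try to read one byte from each path.

    Returns a dict mapping each path that failed to the symbolic errno
    name (for example "EIO" or "ENOENT"). Readable paths are omitted.
    """
    failures = {}
    for path in paths:
        try:
            with open(path, "rb") as handle:
                handle.read(1)
        except OSError as exc:
            failures[path] = errno.errorcode.get(exc.errno, str(exc.errno))
    return failures


def has_io_error(failures):
    """True if any probed path failed with a low-level I/O error (EIO)."""
    return "EIO" in failures.values()


if __name__ == "__main__":
    # /etc/resolv.conf is the file named in the question; add others as needed.
    result = probe_paths(["/etc/resolv.conf"])
    if has_io_error(result):
        raise SystemExit(f"container filesystem looks unhealthy: {result}")
```

A non-zero exit here only signals the problem; whether the container group is then restarted or recreated depends on how the agents are launched.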

Answer 2

Hello Scott,

Thanks for the detailed description that helps narrow things down.

From a service perspective, Azure Container Instances use ephemeral, host-backed storage for the container's root filesystem. This includes paths such as /etc, the container image layer, and any file I/O performed directly on the container filesystem. This storage is tied to the health of the underlying host and is not designed for heavy or durable I/O operations (see: https://learn.microsoft.com/en-us/azure/reliability/reliability-container-instances).

There is no broad service issue specific to UK South that would indicate ongoing filesystem corruption in Azure Container Instances. Differences in behavior between regions (for example, UK South vs UK West) can occur due to capacity placement or individual host health, but these are not surfaced as public incidents unless there is widespread impact.

The symptoms you're seeing (intermittent fsync failures, unreadable files such as /etc/resolv.conf, and transient I/O errors during Git operations) are consistent with ephemeral local storage encountering a transient host-level failure. For this reason, we recommend that workloads running on ACI:

- Treat container filesystem storage as temporary and failure-prone
- Implement retry logic for I/O-heavy operations (especially Git)
- Avoid performing critical build or workspace operations directly on the container root filesystem
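
The retry recommendation could be sketched roughly as follows. This is a minimal illustration rather than an official pattern: the helper names `backoff_delay` and `run_with_retries`, the default attempt count, and the delay values are all assumptions, and the wrapper retries on any non-zero exit code instead of inspecting the specific Git error.

```python
import random
import subprocess
import time


def backoff_delay(attempt, base_delay=2.0, max_delay=30.0):
    """Exponential backoff: base_delay * 2^(attempt-1), capped at max_delay."""
    return min(max_delay, base_delay * 2 ** (attempt - 1))


def run_with_retries(cmd, attempts=4, base_delay=2.0, max_delay=30.0,
                     sleep=time.sleep):
    """Run cmd, retrying on a non-zero exit code with backoff plus jitter.

    Returns the CompletedProcess of the first successful run, or raises
    RuntimeError once all attempts have failed.
    """
    last = None
    for attempt in range(1, attempts + 1):
        last = subprocess.run(cmd, capture_output=True, text=True)
        if last.returncode == 0:
            return last
        if attempt < attempts:
            delay = backoff_delay(attempt, base_delay, max_delay)
            # Jitter keeps parallel agents from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError(
        f"{cmd!r} failed after {attempts} attempts: {last.stderr.strip()}"
    )


if __name__ == "__main__":
    # Hypothetical usage: the remote name and arguments are placeholders.
    run_with_retries(["git", "fetch", "--prune", "origin"])
```

If a clone fails partway through, it is usually safer to delete the partial working directory before retrying, since a retry on top of half-written pack files may fail for a different reason.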

To mitigate this:

Move your Git workspace off the container filesystem
- Use an emptyDir volume for temporary build artifacts. This provides a clean, writable directory scoped to the container group lifecycle, but remains ephemeral and host-backed. https://learn.microsoft.com/en-us/azure/container-instances/container-instances-volume-emptydir
- If you need durability across restarts or higher I/O stability, mount an Azure File share, which is the supported persistent storage option for ACI.
Add resilience to the pipeline
- Transient faults are expected in ACI. Git operations should include retries and exponential backoff.
Collect diagnostics
- Capture container events and logs using az container logs and az container show so we can evaluate host placement and restart history.
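
For the diagnostics step, a small script can capture the output of both commands at the moment a failure is detected. The sketch below assumes the Azure CLI is installed and already logged in wherever the script runs; the resource group and container group names are placeholders, and the helper names `diagnostic_commands` and `collect_diagnostics` are made up for illustration.

```python
import subprocess


def diagnostic_commands(resource_group, name):
    """Build the az CLI invocations for container group state and logs."""
    common = ["--resource-group", resource_group, "--name", name]
    return [
        ["az", "container", "show", *common, "--output", "json"],
        ["az", "container", "logs", *common],
    ]


def collect_diagnostics(resource_group, name, runner=subprocess.run):
    """Run each command and return its stdout keyed by 'az container <verb>'."""
    outputs = {}
    for cmd in diagnostic_commands(resource_group, name):
        result = runner(cmd, capture_output=True, text=True)
        outputs[" ".join(cmd[:3])] = result.stdout
    return outputs


if __name__ == "__main__":
    # Placeholder names; replace with the real resource group and container group.
    captured = collect_diagnostics("my-resource-group", "my-agent-group")
    for label, text in captured.items():
        print(f"===== {label} =====")
        print(text)
```

Saving this output together with a timestamp makes it easier to line up failures with host placement and restart history when raising a support request.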

If the issue continues after moving I/O off the container filesystem, please share:

The container group definition (CPU, memory, volume mounts)
Frequency and duration of failures

Hope this answers your concerns with the intermittent failures!!
