When you enable the metrics-based experience for monitoring your Azure virtual machines, a default set of metrics is collected. You can customize the collection to include additional metrics, such as per-process performance, logical disk usage, filesystem utilization, and other workload-specific metrics, by modifying the data collection rule (DCR).
For details on creating the DCR, see Collect data from virtual machine client with Azure Monitor. This article provides additional details for the OpenTelemetry Performance Counters data source type.
Note
To work with the DCR definition directly or to deploy with other methods such as ARM templates, see Data collection rule (DCR) samples in Azure Monitor.
Cost
The default set of OpenTelemetry (OTel) metrics is collected at no cost. Collecting any OTel metrics beyond the default set incurs an additional cost. See Azure Monitor pricing for details.
Prerequisites
- An Azure Monitor workspace to store the OpenTelemetry metrics. See Create an Azure Monitor workspace.
- Permissions to create data collection rules. See Permissions.
Identify data collection rule (DCR)
To identify the DCR associated with the VM, open Data Collection Rules from the Monitor menu in the Azure portal. Select the Resources tab and locate your VM.
Click the number in the Data collection rules column to list the DCRs associated with the VM. The OTel DCR will have a name in the form MSVMOtel-<region>-<name>. Click on the DCR to open it.
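If you have exported the list of DCR names associated with a VM, the naming form above can be checked programmatically. The following is a minimal sketch, not an official tool. It assumes the region segment contains no hyphen (for example `eastus`), and the sample names are hypothetical.

```python
import re

# Pattern based on the naming form MSVMOtel-<region>-<name>.
# Assumption: the region segment contains no hyphen (for example "eastus"),
# so everything after the second hyphen is treated as the name.
OTEL_DCR_PATTERN = re.compile(r"^MSVMOtel-(?P<region>[^-]+)-(?P<name>.+)$")


def find_otel_dcrs(dcr_names):
    """Return (dcr_name, region, name) tuples for names matching the OTel form."""
    matches = []
    for dcr_name in dcr_names:
        m = OTEL_DCR_PATTERN.match(dcr_name)
        if m:
            matches.append((dcr_name, m.group("region"), m.group("name")))
    return matches


# Hypothetical DCR names, for illustration only.
names = ["MSVMOtel-eastus-web-vm-01", "MSVMI-DefaultWorkspace", "custom-dcr"]
print(find_otel_dcrs(names))
```

Only the first sample name matches the form, so it is the only one returned.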
Configure data source
On the Data sources tab of the DCR, click on the OpenTelemetry Performance Counters data source. Select from a predefined set of objects to collect and set their sampling rate. The sampling rate is the interval between samples, so the lower the value, the more frequently the counter is collected.
Select Custom for a more granular selection of OpenTelemetry performance counters.
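To get a feel for how the sampling rate affects data volume, you can estimate the number of samples collected per day for each counter. This sketch assumes the sampling rate is an interval expressed in seconds, which this article does not state explicitly. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds: int) -> int:
    """Estimate samples per day for one counter at a given sampling interval.

    Assumption: the sampling rate is an interval in seconds between samples.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be a positive integer")
    return SECONDS_PER_DAY // interval_seconds


# A lower interval means more frequent collection and more samples.
for interval in (10, 60, 300):
    print(f"{interval:>4}s interval -> {samples_per_day(interval)} samples/day")
```

Multiply the result by the number of selected counters (and by the number of instances, such as disks or processes) for a rough sense of total volume.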
Verify data collection
To verify that OpenTelemetry performance counters are being collected, scope a query to the Azure Monitor workspace and check that data is returned for the metrics you selected.
If the workspace is set to resource-context access mode, you can also verify that the same query works when scoped to the VM itself. Open the Metrics blade for the VM and, under Metric Namespaces, choose either the Add with editor dropdown or the View AMW metrics in editor dropdown.
Both entry points open a PromQL editor with the query scoped to the VM resource. The same query works as before, but you no longer need to filter on the VM's microsoft.resourceid dimension.
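The difference between the two scopes can be sketched by building the PromQL selector text for each case. This is an illustration under assumptions: it uses the quoted selector syntax that Prometheus 3.0 introduced for names containing dots, and whether your workspace accepts exactly this form is not confirmed by this article. The function name and the placeholder resource ID are hypothetical.

```python
def build_selector(metric_name, resource_id=None):
    """Build a PromQL selector string for an OTel metric.

    Assumption: dotted names are written with the quoted (UTF-8) selector
    syntax, for example {"system.cpu.time"}.
    When resource_id is given, a filter on the microsoft.resourceid
    dimension is added (workspace-scoped query). When it is omitted,
    no filter is added (query already scoped to the VM).
    """
    parts = [f'"{metric_name}"']
    if resource_id is not None:
        parts.append(f'"microsoft.resourceid"="{resource_id}"')
    return "{" + ", ".join(parts) + "}"


# Placeholder resource ID; substitute the ID of your own VM.
vm_id = "<vm-resource-id>"

print("Workspace scope:", build_selector("system.cpu.time", vm_id))
print("VM scope:       ", build_selector("system.cpu.time"))
```

Paste the printed selector into the PromQL editor for the matching scope and confirm that it returns data.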
Metrics reference
The following tables list the OpenTelemetry metrics available for virtual machines.
Default metrics
The metrics in the following table are collected by default and at no additional cost.
| Metric Name | Description |
|---|---|
| system.uptime | Time since last reboot (in seconds) |
| system.cpu.time | Total CPU time consumed (user + system + idle), in seconds |
| system.memory.usage | Memory in use (bytes) |
| system.network.io | Bytes transmitted/received |
| system.network.dropped | Dropped packets |
| system.network.errors | Network errors |
| system.disk.io | Disk I/O (bytes read/written) |
| system.disk.operations | Disk operations (read/write counts) |
| system.filesystem.usage | Filesystem usage in bytes |
| system.disk.operation_time | Average disk operation time |
Additional metrics
The metrics in the following table can be collected by modifying the DCR for the VM as described above. There's an additional cost to collect these metrics.
| Metric Name | Description |
|---|---|
| system.cpu.utilization | CPU usage % |
| system.cpu.logical.count | Number of logical processors |
| system.cpu.physical.count | Number of physical CPUs |
| system.cpu.frequency | CPU frequency |
| system.cpu.load_average.1m | System load average (1 min) |
| system.cpu.load_average.5m | System load average (5 min) |
| system.cpu.load_average.15m | System load average (15 min) |
| system.memory.utilization | % memory used |
| system.memory.limit | Total memory limit |
| system.memory.page_size | Page size (bytes) |
| system.linux.memory.available | Available memory |
| system.linux.memory.dirty | Dirty memory pages |
| system.paging.faults | Page faults |
| system.paging.operations | Paging operations (reads/writes) |
| system.paging.usage | Paging/swap usage (bytes) |
| system.paging.utilization | % paging/swap used |
| system.disk.io_time | Time spent doing I/O |
| system.disk.merged | Number of merged operations |
| system.disk.pending_operations | Pending I/O operations |
| system.disk.weighted_io_time | Weighted I/O time (accounts for queue depth) |
| system.filesystem.utilization | Filesystem usage % |
| system.filesystem.inodes.usage | Inodes usage |
| system.network.packets | Packets transmitted/received |
| system.network.connections | Active network connections |
| system.network.conntrack.count | Current conntrack table entries |
| system.network.conntrack.max | Maximum conntrack table size |
| process.uptime | Process uptime |
| process.cpu.time | CPU time consumed by process |
| process.cpu.utilization | CPU usage % per process |
| process.memory.usage | Memory usage (RSS) |
| process.memory.virtual | Virtual memory usage |
| process.memory.utilization | Memory % usage |
| process.disk.io | Disk I/O (bytes per process) |
| process.disk.operations | Disk operations per process |
| process.paging.faults | Process page faults |
| process.open_file_descriptors | Open file descriptors |
| process.threads | Number of threads |
| process.handles | Handles in use (Windows) |
| process.context_switches | Context switches |
| process.signals_pending | Pending signals |
| system.processes.count | Total number of processes |
| system.processes.created | Processes created |
For a complete reference with types, units, dimensions, and other metadata, see OpenTelemetry metrics reference.
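Before changing the DCR, it can help to check which of the metrics you plan to collect fall outside the default set and therefore incur additional cost. The following minimal sketch hard-codes the default metric names from the table above. The function name and the sample selection are hypothetical.

```python
# Default metrics, copied from the "Default metrics" table in this article.
DEFAULT_METRICS = frozenset({
    "system.uptime",
    "system.cpu.time",
    "system.memory.usage",
    "system.network.io",
    "system.network.dropped",
    "system.network.errors",
    "system.disk.io",
    "system.disk.operations",
    "system.filesystem.usage",
    "system.disk.operation_time",
})


def split_by_cost(selected):
    """Split metric names into (default, additional) sorted lists.

    Any name not in the default set is reported as additional. The function
    does not validate that a name is a real metric, so typos also land there.
    """
    default = sorted(m for m in selected if m in DEFAULT_METRICS)
    additional = sorted(m for m in selected if m not in DEFAULT_METRICS)
    return default, additional


# Hypothetical selection, for illustration only.
selection = ["system.cpu.time", "system.cpu.utilization", "process.memory.usage"]
default, additional = split_by_cost(selection)
print("No additional cost:", default)
print("Additional cost:   ", additional)
```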
Next steps
- OpenTelemetry metrics reference
- Metrics experience for VMs in Azure Monitor
- Learn more about Azure Monitor Agent
- Learn more about data collection rules