System Hardware Monitoring 
During experiment tracking, SwanLab automatically monitors machine hardware resources and records them in the System Charts. Currently supported hardware:
| Hardware | Info Logging | Resource Monitoring | Script | 
|---|---|---|---|
| NVIDIA GPU | ✅ | ✅ | nvidia.py | 
| Ascend NPU | ✅ | ✅ | ascend.py | 
| Cambricon MLU | ✅ | ✅ | cambricon.py | 
| Kunlunxin XPU | ✅ | ✅ | kunlunxin.py | 
| MooreThread GPU | ✅ | ✅ | moorethreads.py | 
| MetaX GPU | ✅ | ✅ | metax.py | 
| Hygon DCU | ✅ | ✅ | hygon.py | 
| CPU | ✅ | ✅ | cpu.py | 
| Memory | ✅ | ✅ | memory.py | 
| Disk | ✅ | ✅ | disk.py | 
| Network | ✅ | ✅ | network.py | 
System Monitoring Metrics 
SwanLab automatically monitors hardware resources on the machine running the experiment and generates charts for each metric, displayed under the System Charts tab.

Sampling Strategy & Frequency: SwanLab dynamically adjusts hardware data collection frequency based on experiment duration to balance granularity and system performance. Sampling frequencies:
| Data Points Collected | Sampling Frequency | 
|---|---|
| 0~10 | Every 10 seconds | 
| 10~50 | Every 30 seconds | 
| 50+ | Every 60 seconds | 
SwanLab monitors GPU, NPU, CPU, system memory, disk I/O, and network metrics relevant to training processes. Below are detailed descriptions of each component.
GPU (NVIDIA) 

On multi-GPU machines, each GPU's metrics are recorded separately, displayed as individual lines in charts.
| Metric | Description | 
|---|---|
| GPU Memory Allocated (%) | GPU memory utilization – Percentage of VRAM used. | 
| GPU Memory Allocated (MB) | GPU memory usage – VRAM consumption in MB. Chart Y-axis capped at the maximum VRAM across GPUs. | 
| GPU Utilization (%) | GPU utilization – Percentage of computational resources used. | 
| GPU Temperature (℃) | GPU temperature in Celsius. | 
| GPU Power Usage (W) | GPU power consumption in watts. | 
| GPU Time Spent Accessing Memory (%) | Memory access time – Percentage of time spent accessing VRAM. | 
NPU (Ascend) 

On multi-NPU machines, each NPU's metrics are recorded separately.
| Metric | Description | 
|---|---|
| NPU Utilization (%) | NPU computational utilization. | 
| NPU Memory Allocated (MB) | NPU memory usage – VRAM consumption in MB. Chart Y-axis capped at the maximum VRAM across GPUs. | 
| NPU Memory Allocated (%) | NPU memory utilization. | 
| NPU Temperature (℃) | NPU temperature in Celsius. | 
| NPU Power (W) | NPU power draw in watts. | 
MLU (Cambricon) 

On multi-MLU machines, each MLU's metrics are recorded separately.
| Metric | Description | 
|---|---|
| MLU Utilization (%) | MLU computational utilization. | 
| MLU Memory Allocated (MB) | MLU memory usage – VRAM consumption in MB. Chart Y-axis capped at the maximum VRAM across GPUs. | 
| MLU Memory Allocated (%) | MLU memory utilization. | 
| MLU Temperature (℃) | MLU temperature in Celsius. | 
| MLU Power (W) | MLU power draw in watts. | 
XPU (Kunlunxin) 

On multi-XPU machines, each XPU's metrics are recorded separately.
| Metric | Description | 
|---|---|
| XPU Utilization (%) | XPU computational utilization. | 
| XPU Memory Allocated (MB) | XPU memory usage – VRAM consumption in MB. Chart Y-axis capped at the maximum VRAM across GPUs. | 
| XPU Memory Allocated (%) | XPU memory utilization. | 
| XPU Temperature (℃) | XPU temperature in Celsius. | 
| XPU Power (W) | XPU power draw in watts. | 
GPU (MooreThread) 

On multi-GPU machines, each GPU's metrics are recorded separately.
| Metric | Description | 
|---|---|
| GPU Utilization (%) | GPU computational utilization. | 
| GPU Memory Allocated (MB) | GPU memory usage – VRAM consumption in MB. Chart Y-axis capped at the maximum VRAM across GPUs. | 
| GPU Memory Allocated (%) | GPU memory utilization. | 
| GPU Temperature (℃) | GPU temperature in Celsius. | 
| GPU Power (W) | GPU power draw in watts. | 
GPU (MetaX) 

On multi-GPU machines, each GPU's metrics are recorded separately.
| Metric | Description | 
|---|---|
| GPU Utilization (%) | GPU computational utilization. | 
| GPU Memory Allocated (MB) | GPU memory usage – VRAM consumption in MB. Chart Y-axis capped at the maximum VRAM across GPUs. | 
| GPU Memory Allocated (%) | GPU memory utilization. | 
| GPU Temperature (℃) | GPU temperature in Celsius. | 
| GPU Power (W) | GPU power draw in watts. | 
DCU (Hygon) 
On multi-DCU machines, each DCU's metrics are recorded separately.
| Metric | Description | 
|---|---|
| DCU Utilization (%) | DCU computational utilization. | 
| DCU Memory Allocated (MB) | DCU memory usage – VRAM consumption in MB. Chart Y-axis capped at the maximum VRAM across GPUs. | 
| DCU Memory Allocated (%) | DCU memory utilization. | 
| DCU Temperature (℃) | DCU temperature in Celsius. | 
| DCU Power (W) | DCU power draw in watts. | 
CPU 
| Metric | Description | 
|---|---|
| CPU Utilization (%) | CPU computational utilization. | 
| Process CPU Threads | Thread count used by the experiment. | 
Memory 
| Metric | Description | 
|---|---|
| System Memory Utilization (%) | System-wide memory usage percentage. | 
| Process Memory In Use (non-swap) (MB) | Physical memory (excluding swap) consumed by the process. | 
| Process Memory Utilization (MB) | Allocated memory (including swap) for the process. | 
| Process Memory Available (non-swap) (MB) | Available physical memory (excluding swap) for the process. | 
Disk 
| Metric | Description | 
|---|---|
| Disk IO Utilization (MB) | Disk I/O throughput in MB/s (read/write shown separately). | 
| Disk Utilization (%) | Disk usage percentage. | 
On Linux, monitors root (/) usage; on Windows, monitors system drive (typically C:).
Network 
| Metric | Description | 
|---|---|
| Network Traffic (KB) | Network I/O throughput in KB/s (receive/transmit shown separately). | 
Network read/write speeds in KB/s, displayed as separate lines for receive/transmit rates.