Process Monitoring Implementation#
This document describes the complete process monitoring implementation for Rust proplets in the Propeller distributed task execution system.
Overview#
Comprehensive OS-level process monitoring has been implemented for:
- Rust Proplet - Using
sysinfocrate for cross-platform metrics - Manager - Ready for integration (metrics aggregation and visualization)
Monitoring Profiles#
Profiles define which metrics to collect, how often, and how much history to retain.
The Rust implementation provides two built-in profiles:
| Profile | Interval | Metrics | Export | History | Use Case |
|---|---|---|---|---|---|
| Standard | 10s | All | Yes | 100 | General purpose |
| Long-running Daemon | 120s | All | Yes | 500 | Background services |
Custom Profiles#
You can also define custom monitoring profiles via JSON configuration with the following options:
enabled: Enable/disable monitoring (default:true)interval: Collection interval in seconds (default:10)collect_cpu: Collect CPU metrics (default:true)collect_memory: Collect memory metrics (default:true)collect_disk_io: Collect disk I/O metrics (default:true)collect_threads: Collect thread count (default:true)collect_file_descriptors: Collect file descriptor count (default:true)export_to_mqtt: Publish metrics to MQTT (default:true)retain_history: Keep metrics history (default:true)history_size: Maximum history entries (default:100)
Metrics Collected#
Common Metrics (All Platforms)#
- CPU usage percentage
- Memory usage (bytes and percentage)
- Disk I/O (read/write bytes)
- Network I/O (rx/tx bytes)
- Process uptime
Platform-Specific Metrics#
| Metric | Linux | macOS | Windows |
|---|---|---|---|
| Thread Count | ✓ | ✓ | Limited |
| File Descriptors | ✓ | ✓ | ✗ |
| Detailed Memory Stats | ✓ | ✓ | ✓ |
MQTT Topics#
Proplet-Level Metrics#
m/{domain_id}/c/{channel_id}/control/proplet/metrics
Publishes overall proplet health metrics.
### Task-Level Metrics
```txt
m/{domain_id}/c/{channel_id}/control/proplet/task_metrics
m/{domain_id}/c/{channel_id}/metrics/proplet
Publishes per-task process metrics.
Message Format#
{
"task_id": "uuid",
"proplet_id": "uuid",
"metrics": {
"cpu_percent": 42.5,
"memory_bytes": 67108864,
"memory_percent": 1.5,
"disk_read_bytes": 1048576,
"disk_write_bytes": 524288,
"network_rx_bytes": 4096,
"network_tx_bytes": 8192,
"uptime_seconds": 120,
"thread_count": 4,
"file_descriptor_count": 12,
"timestamp": "2025-01-15T10:30:00.000Z"
},
"aggregated": {
"avg_cpu_usage": 38.2,
"max_cpu_usage": 65.0,
"avg_memory_usage": 62914560,
"max_memory_usage": 71303168,
"total_disk_read": 2097152,
"total_disk_write": 1048576,
"total_network_rx": 12288,
"total_network_tx": 24576,
"sample_count": 24
}
}
Configuration#
Rust Proplet Environment Variables#
export PROPLET_ENABLE_MONITORING=true # Enable/disable monitoring (default: true)
export PROPLET_METRICS_INTERVAL=10 # Proplet-level metrics interval in seconds (default: 10)
Per-Task Configuration (JSON)#
{
"monitoring_profile": {
"enabled": true,
"interval": 5000000000,
"collect_cpu": true,
"collect_memory": true,
"collect_disk_io": true,
"collect_network_io": true,
"collect_threads": true,
"collect_file_descriptors": true,
"export_to_mqtt": true,
"retain_history": true,
"history_size": 200
}
}
Performance Impact#
Measured overhead across platforms:
| Profile | CPU Overhead | Memory Overhead |
|---|---|---|
| Minimal | < 0.1% | ~1 MB |
| Standard | < 0.5% | ~2 MB |
| Intensive | < 2% | ~5 MB |
Usage Examples#
task := task.Task{
ID: "task-123",
Name: "compute",
ImageURL: "registry.example.com/compute:v1",
Daemon: false,
MonitoringProfile: &monitoring.StandardProfile(),
}
Rust - Start Task with Monitoring#
{
"id": "550e8400-e29b-41d4-a716-446655440001",
"functionName": "compute",
"imageURL": "registry.example.com/compute:v1",
"daemon": false,
"monitoringProfile": {
"enabled": true,
"interval": 10,
"collect_cpu": true,
"collect_memory": true,
"export_to_mqtt": true,
"retain_history": true,
"history_size": 100
}
}
Integration with Monitoring Systems#
Prometheus#
Use MQTT-to-Prometheus exporter:
scrape_configs:
- job_name: "propeller"
static_configs:
- targets: ["mqtt-exporter:9641"]
Grafana#
Create dashboards with:
- CPU usage over time
- Memory consumption trends
- Disk/Network I/O rates
- Per-task resource usage
Custom Monitoring#
Subscribe to MQTT topics:
mosquitto_sub -h localhost -t "m/+/c/+/*/metrics" -v
Testing#
Manual Test#
# Start proplet
./build/proplet
# Submit a task
curl -X POST http://localhost:8080/tasks \
-H "Content-Type: application/json" \
-d '{
"id": "test-123",
"name": "compute",
"file": "...",
"monitoring_profile": {
"enabled": true,
"interval": 5000000000,
"export_to_mqtt": true
}
}'
# Monitor metrics
mosquitto_sub -h localhost -t "m/+/c/+/*/metrics" -v
Future Enhancements#
- Manager Integration
- Aggregate metrics from all proplets
- Historical metrics storage
- Metrics API endpoints
-
Alerting on anomalies
-
Advanced Metrics
- GPU usage (if available)
- Container-specific metrics (cgroups)
- Custom application metrics
-
Distributed tracing correlation
-
Optimization
- Adaptive sampling rates
- Metric compression
- Batched MQTT publishing
-
Metrics rollups/aggregation
-
Visualization
- Built-in dashboards
- Real-time metric streaming
- Historical trend analysis
- Anomaly detection
References#
- Rust Implementation:
proplet-rs/src/monitoring/ - Examples:
examples/monitoring-example.md - Rust Docs:
proplet-rs/MONITORING.md