Logs configuration
Astro Private Cloud (APC) provides centralized logging through Vector and Elasticsearch. Task logs, platform logs, and audit logs are collected by Vector and indexed in Elasticsearch for searchability and troubleshooting.
Architecture
Vector runs as a DaemonSet collecting logs from all pods. Logs are shipped to Elasticsearch for storage and indexing. For log visualization, you can connect your own tools (Kibana, Grafana, OpenSearch Dashboards, etc.) to query Elasticsearch.
Accessing logs
Airflow UI
Task logs are accessible directly in the Airflow webserver UI:
- Navigate to the Dag.
- Click a task instance.
- Click Log.
Elasticsearch API
Query logs directly via Elasticsearch:
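As a minimal sketch of a direct query, the snippet below builds a request against the standard Elasticsearch `_search` endpoint using only the Python standard library. The endpoint `http://localhost:9200`, the index pattern `vector.*`, and the `message` field are assumptions for illustration, not APC defaults; substitute the values from your own installation.

```python
import json
import urllib.request


def build_search_request(base_url, index_pattern, query, size=50):
    """Build (but do not send) a POST request to the Elasticsearch _search API."""
    body = {"size": size, "query": query}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index_pattern}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint, index pattern, and field name; replace with your own.
request = build_search_request(
    "http://localhost:9200",
    "vector.*",
    {"match": {"message": "error"}},
)

# To run the search, send the request and decode the JSON response:
# with urllib.request.urlopen(request, timeout=10) as response:
#     hits = json.load(response)["hits"]["hits"]
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing the code at a live cluster.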
BYO visualization
APC does not include a log visualization UI. Connect your preferred tool to Elasticsearch:
- Kibana: Deploy separately and point to the Elasticsearch endpoint.
- Grafana: Use the Elasticsearch data source.
- OpenSearch Dashboards: Connect through the Elasticsearch-compatible API, if your versions support it.
Vector configuration
Vector is the log collection agent in APC 1.0.
Enable Vector
Custom log parsing
Add custom transforms to parse Airflow log formats:
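In APC the parsing itself happens in the Vector pipeline. The Python sketch below only illustrates the kind of field extraction such a transform performs, so you can prototype a pattern before porting it to Vector. It assumes log lines in the shape `[timestamp] {file:line} LEVEL - message`; the function name and sample line are hypothetical, and your deployment's format may differ.

```python
import re

# Assumed line shape: [timestamp] {file.py:123} LEVEL - message
AIRFLOW_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "
    r"\{(?P<source>[^:}]+):(?P<line>\d+)\} "
    r"(?P<level>[A-Z]+) - "
    r"(?P<message>.*)$"
)


def parse_airflow_line(raw):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = AIRFLOW_LINE.match(raw)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])  # store the line number as an integer
    return fields


# Hypothetical sample line for illustration.
sample = "[2024-01-15T10:30:00.123+0000] {taskinstance.py:1138} INFO - Dependencies all met"
parsed = parse_airflow_line(sample)
```

Returning `None` for non-matching lines lets the caller pass unparsed lines through untouched rather than dropping them.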
Logging sidecar
APC supports either DaemonSet or sidecar logging on a data plane cluster, but not both simultaneously. To use sidecar logging, you must first disable the Vector DaemonSet, then enable the sidecar:
Elasticsearch configuration
Enable Elasticsearch
Index lifecycle management
Configure log retention:
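For reference, retention at the Elasticsearch level is expressed as an index lifecycle management (ILM) policy. The sketch below builds the JSON body you would send to the `_ilm/policy/<name>` endpoint. It shows the general Elasticsearch policy shape rather than APC's own configuration keys, and the helper name and the 14-day value are illustrative assumptions.

```python
import json


def build_retention_policy(retention_days):
    """Build an ILM policy body that deletes indices after retention_days."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "policy": {
            "phases": {
                # No hot-phase actions: indices are simply kept until deletion.
                "hot": {"actions": {}},
                # Without rollover, min_age is measured from index creation.
                "delete": {
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy = build_retention_policy(retention_days=14)
print(json.dumps(policy, indent=2))
```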
External logging
Forward to external Elasticsearch
Send logs to your own Elasticsearch cluster:
Forward to S3
Archive logs to object storage:
Forward to external systems
Configure Vector to send to any destination:
Deployment log settings
Task log retention
Configure log groomer to manage disk usage:
Log level configuration
Querying logs
Elasticsearch query examples
Find task failures:
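One way to express this is a phrase match on the log message, sketched below as a search body. The `message` and `@timestamp` field names and the phrase `Marking task as FAILED` are assumptions; check the fields your indices actually contain and adjust accordingly.

```python
def task_failure_query(phrase="Marking task as FAILED", message_field="message", size=100):
    """Build a search body that returns the newest log lines containing phrase."""
    return {
        "size": size,
        # Newest entries first; assumes a @timestamp field on each document.
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {"match_phrase": {message_field: phrase}},
    }


failure_query = task_failure_query()
```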
Search specific Dag:
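A minimal sketch of a Dag filter follows. It assumes each log document carries a `dag_id` field mapped as a keyword, which is what an exact `term` match needs. Both the field name and the Dag name are hypothetical.

```python
def dag_logs_query(dag_id, dag_field="dag_id", size=100):
    """Build a search body that returns log lines for a single Dag."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter context: exact match, no relevance scoring.
                "filter": [{"term": {dag_field: dag_id}}]
            }
        },
    }


# Hypothetical Dag name for illustration.
dag_query = dag_logs_query("example_etl")
```

If the field is mapped as analyzed text rather than a keyword, a `match` query is the safer choice.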
Filter by time range:
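A time window can be expressed as a `range` query on the timestamp field. The sketch below assumes the field is named `@timestamp` and uses a half-open interval (start inclusive, end exclusive), so adjacent windows do not overlap.

```python
from datetime import datetime, timedelta, timezone


def time_range_query(start, end, timestamp_field="@timestamp"):
    """Build a search body for logs with start <= timestamp < end."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {
        "query": {
            "range": {
                timestamp_field: {
                    "gte": start.isoformat(),  # inclusive lower bound
                    "lt": end.isoformat(),  # exclusive upper bound
                }
            }
        }
    }


# Illustrative window: the hour before 12:00 UTC on 2024-01-15.
window_end = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
range_query = time_range_query(window_end - timedelta(hours=1), window_end)
```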
Common log fields
Troubleshooting
Logs aren’t appearing
- Check that Vector is running.
- Check Elasticsearch health.
- Verify the Vector logs.
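For the Elasticsearch health check, the standard `_cluster/health` endpoint reports a `status` of green, yellow, or red. The sketch below fetches and interprets that response. The base URL you pass in is deployment-specific, and the helper names are hypothetical.

```python
import json
import urllib.request


def fetch_cluster_health(base_url, timeout=10):
    """Call GET <base_url>/_cluster/health and return the decoded JSON."""
    url = f"{base_url.rstrip('/')}/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def describe_cluster_health(health):
    """Turn a _cluster/health response into a short human-readable summary."""
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "ok: all shards are assigned"
    if status == "yellow":
        return f"degraded: {unassigned} unassigned shard(s), primaries are allocated"
    if status == "red":
        return f"critical: {unassigned} unassigned shard(s), at least one primary is missing"
    return f"unknown status: {status}"


# Illustrative response fragment; a real response has more fields.
summary = describe_cluster_health({"status": "yellow", "unassigned_shards": 3})
```

A yellow status still serves queries, while red usually means some log data is temporarily unavailable.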
High disk usage
- Enable index lifecycle management.
- Reduce retention period.
- Increase Elasticsearch storage.
- Forward logs to external storage (S3).
Slow queries
- Add index patterns for common searches.
- Increase Elasticsearch resources.
- Reduce log verbosity.
Security
Access control
Elasticsearch access is restricted to platform components. For external access, configure authentication:
Log redaction
Redact sensitive data before indexing:
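In APC, redaction belongs in the log pipeline so that sensitive values never reach the index. The Python sketch below only illustrates the pattern-matching side, which is useful for testing rules against sample lines first. The rule list is a hypothetical starting point, not an exhaustive or APC-provided set.

```python
import re

# Each rule pairs a compiled pattern with its replacement text.
REDACTION_RULES = [
    # key=value or key: value pairs for common secret names
    (re.compile(r"(?i)\b(password|passwd|secret|token|api_key)(\s*[=:]\s*)\S+"),
     r"\1\2[REDACTED]"),
    # HTTP bearer tokens
    (re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/-]+=*"), "Bearer [REDACTED]"),
    # Strings shaped like AWS access key IDs (AKIA plus 16 characters)
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
]


def redact(line):
    """Apply every redaction rule to a single log line."""
    for pattern, replacement in REDACTION_RULES:
        line = pattern.sub(replacement, line)
    return line


print(redact("connecting with password=hunter2 to db"))
```

Keeping the rules in a list makes it straightforward to add organization-specific patterns and to test each one in isolation.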
Best practices
- Set appropriate retention based on compliance requirements.
- Choose log levels deliberately and avoid DEBUG in production, where it increases log volume.
- Enable log groomer to prevent disk exhaustion on Airflow pods.
- Forward logs externally for long-term retention and compliance.
- Monitor Elasticsearch health and disk usage.
- Deploy your preferred visualization tool, such as Kibana or Grafana, separately; APC does not bundle one.