Apache DevLake for DORA Compliance: Complete Guide
Introduction
Measuring DevOps performance has long been a challenge for engineering teams. You know your team is shipping code and fixing bugs, but can you quantify your delivery velocity? How quickly can you recover from incidents? What’s your deployment success rate? Without concrete metrics, these questions remain frustratingly vague.
The DORA (DevOps Research and Assessment) metrics framework, developed by Google’s research team, provides standardized answers to these questions through four key performance indicators. However, calculating DORA metrics manually across fragmented tools like GitHub, Jira, Jenkins, and GitLab quickly becomes tedious and error-prone.
Apache DevLake solves this problem by acting as a unified data platform that automatically collects, transforms, and visualizes DevOps data from multiple sources. In this guide, you’ll learn how to set up DevLake to track DORA metrics for your team, enabling data-driven decisions that improve software delivery performance. Whether you’re running a small development team or managing enterprise-scale operations, this tutorial will walk you through the complete implementation process in under an hour.
Prerequisites
Before diving into DevLake, ensure you have:
- Docker and Docker Compose installed (version 20.10+)
- Access to your DevOps tools with appropriate permissions:
- Source control (GitHub, GitLab, or Bitbucket) with API token/PAT
- CI/CD system (Jenkins, GitHub Actions, GitLab CI, or similar)
- Issue tracker (Jira, GitHub Issues, or equivalent)
- Basic understanding of:
- Docker containerization concepts
- REST API authentication
- SQL queries (helpful for customization)
- System requirements: 4GB RAM minimum, 8GB recommended
- Network access to your DevOps tool APIs
Understanding DORA Metrics and DevLake
What Are DORA Metrics?
DORA metrics measure software delivery performance across two critical dimensions: velocity and stability. The framework focuses on four key metrics:
- Deployment Frequency (DF): How often you successfully deploy to production
- Lead Time for Changes (LT): Time from code commit to production deployment
- Mean Time to Restore (MTTR): How quickly you recover from production failures
- Change Failure Rate (CFR): Percentage of deployments causing production incidents
These metrics come with established benchmarks (Elite, High, Medium, Low) that help teams understand their performance level and identify improvement opportunities.
How DevLake Enables DORA Tracking
Apache DevLake is an open-source platform that ingests, analyzes, and visualizes fragmented data from DevOps tools to extract insights for engineering excellence. DevLake works by:
- Collecting data from multiple sources through plugins and webhooks
- Transforming raw API responses into a unified domain model
- Calculating DORA metrics at the project level
- Visualizing results through pre-built Grafana dashboards
The platform supports 15+ data sources including GitHub, GitLab, Jira, Jenkins, Azure DevOps, and more. For unsupported tools, DevLake provides webhooks that allow you to actively push data when a specific plugin isn’t available.
DevLake Architecture Overview
Understanding DevLake’s architecture helps you troubleshoot issues and customize the platform effectively.
DevLake’s three-layer data model includes the Raw layer for storing API responses in JSON, the Tool layer for extracting data into relational schemas specific to each DevOps tool, and the Domain layer that provides abstraction so analytics logic can be reused across different tools.
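To see the layering on a running instance, you can compare row counts across the three layers directly in MySQL. This is a minimal sketch; the _raw_/_tool_ table names below assume the GitHub plugin and may differ by plugin and DevLake version:
-- Raw layer: unprocessed API responses stored as JSON blobs
SELECT COUNT(*) FROM _raw_github_api_pull_requests;
-- Tool layer: GitHub-specific relational schema
SELECT COUNT(*) FROM _tool_github_pull_requests;
-- Domain layer: tool-agnostic tables the DORA dashboards query
SELECT COUNT(*) FROM pull_requests;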
Installation and Setup
Step 1: Install DevLake with Docker Compose
Download the latest release files:
# Create project directory
mkdir devlake && cd devlake
# Download docker-compose and environment file
curl -o docker-compose.yml https://raw.githubusercontent.com/apache/incubator-devlake/main/docker-compose.yml
curl -o .env https://raw.githubusercontent.com/apache/incubator-devlake/main/.env.example
# Generate encryption key for sensitive data
openssl rand -base64 2000 | tr -dc 'A-Z' | fold -w 128 | head -n 1 > encryption.key
Set ENCRYPTION_SECRET in your .env file to the generated key. Paste the literal string; docker-compose does not expand shell substitutions such as $(cat ...) inside .env:
# In .env, paste the contents of encryption.key as the value
ENCRYPTION_SECRET=<contents of encryption.key>
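A quick sanity check that the literal key, not a $(...) expression, ended up in the file:
# The output should be one line ending in the 128-character uppercase key
grep ENCRYPTION_SECRET .env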
Start DevLake services:
# Start all containers
docker-compose up -d
# Verify containers are running
docker-compose ps
# Expected output: config-ui, devlake, mysql, grafana all "Up"
Access the Config UI at http://localhost:4000.
Step 2: Configure Data Connections
DevLake uses “connections” to access your DevOps tools. Let’s configure the essential connections for DORA metrics.
GitHub Connection Example
- Navigate to Connections → Add Connection → GitHub
- Configure the connection:
Connection Name: github-main
Endpoint: https://api.github.com/
Auth Token: ghp_your_personal_access_token_here
Required token scopes: repo, read:org, read:user
- Test the connection before saving
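If the connection test fails, you can verify the token outside DevLake with a direct GitHub API call (the token value below is the placeholder from the config above; substitute your real PAT):
# A 200 response with your user profile confirms the token is valid
curl -s -H "Authorization: token ghp_your_personal_access_token_here" \
  https://api.github.com/user
# For classic PATs, the X-OAuth-Scopes response header lists the granted scopes
curl -sI -H "Authorization: token ghp_your_personal_access_token_here" \
  https://api.github.com/user | grep -i x-oauth-scopes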
Jenkins Connection Example
Connection Name: jenkins-prod
Endpoint: https://jenkins.yourcompany.com/
Username: your-jenkins-username
Password: your-api-token
Jira Connection Example
Connection Name: jira-incidents
Endpoint: https://yourcompany.atlassian.net/
Email: [email protected]
API Token: your-jira-api-token
Step 3: Configure Transformation Rules
Transformation rules tell DevLake which CI/CD runs constitute “deployments” and which issues are “incidents.”
Define Deployments
In your Jenkins/GitHub Actions connection scope config:
{
"deploymentPattern": "(?i)(deploy|push-image|release)",
"productionPattern": "(?i)(prod|production|main)"
}
The deploymentPattern regex flags jobs/workflows whose names contain “deploy”, “push-image”, or “release”; the productionPattern then marks the subset targeting production (for example, runs against the prod or main branch) as production deployments.
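You can sanity-check the pattern against your real job names before saving; grep -Ei approximates the Go (?i) case-insensitive flag used by DevLake:
# Hypothetical job names; the first two should match, the last should not
printf '%s\n' "deploy-to-prod" "release-v2.1" "run-unit-tests" \
  | grep -Ei "deploy|push-image|release"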
Define Incidents
In your Jira connection scope config:
{
"issueTypeMappings": {
"INCIDENT": ["Bug", "Production Issue"],
"REQUIREMENT": ["Story", "Epic"]
}
}
DORA calculation relies on three kinds of entities: pull requests (code changes), deployments (from CI/CD), and incidents (from issue tracking). The transformation rules above define exactly what counts as each, and those definitions can differ from project to project.
Creating Your First DORA Project
Step 1: Create a Project
Projects in DevLake group related data scopes (repos, boards, CI/CD pipelines) for metric calculation.
- Navigate to Projects → Create Project
- Configure project settings:
Project Name: my-product
Enable DORA Metrics: Yes
Step 2: Associate Data Connections
Add the connections you configured earlier:
- Click Add Data Scope
- Select your GitHub connection → Choose repositories
- Select your Jenkins connection → Choose jobs/pipelines
- Select your Jira connection → Choose boards
Example project configuration:
- GitHub: org/backend-api, org/frontend-app
- Jenkins: backend-deploy-prod, frontend-deploy-prod
- Jira: PROJ board (filtered for incident-type issues)
Step 3: Configure Sync Policy
Set up data collection frequency:
Sync Frequency: Every 6 hours
Time Range: Last 90 days
Skip Failed Tasks: Enabled (recommended for large datasets)
Step 4: Start Data Collection
Click Collect All Data to begin the initial sync. This process typically takes:
- Small projects (< 5 repos): 10-20 minutes
- Medium projects (5-15 repos): 30-60 minutes
- Large projects (15+ repos): 1-3 hours
Monitor progress in the Blueprint Status tab.
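If a run stalls, you can also tail the backend logs to see which plugin task is currently executing (the service name matches the docker-compose setup from the install step):
# Follow the DevLake backend logs during collection
docker-compose logs -f devlake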
Accessing DORA Dashboards
Once data collection completes, access your dashboards:
- Click Dashboards in the top-right corner
- This opens Grafana (default credentials: admin/admin)
- Search for “DORA” dashboard
- Explore pre-built visualizations:
- Deployment frequency trends
- Lead time distribution
- MTTR by incident type
- Change failure rate over time
- Performance benchmarking (Elite/High/Medium/Low)
Understanding the DORA Dashboard
The main DORA dashboard displays:
- Top Section: Current metric values with benchmark classification
- Trend Charts: Monthly/weekly trends for each metric
- Breakdown Views: Metrics segmented by team, repository, or time period (see the query sketch after this list)
- Benchmark Comparison: Your performance vs. industry standards
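The breakdown views are backed by the same domain tables used in the custom-SQL section later in this guide. For example, a per-repository deployment count over the last 30 days looks roughly like this (a sketch; it assumes the repo_url column on cicd_deployment_commits, which may vary by schema version):
-- Successful production deployments per repository, last 30 days
SELECT
  repo_url,
  COUNT(DISTINCT cicd_deployment_id) AS deployments
FROM cicd_deployment_commits
WHERE result = 'SUCCESS'
  AND environment = 'PRODUCTION'
  AND finished_date >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY repo_url
ORDER BY deployments DESC;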
Advanced Configuration
Using Webhooks for Unsupported Tools
If your deployment tool lacks a native plugin:
# Deployment endpoint exposed by a webhook connection (ID 1 here); create the webhook connection in the Config UI first
POST http://localhost:8080/api/plugins/webhook/1/deployments
# Webhook payload example
{
"commit_sha": "abc123...",
"repo_url": "https://github.com/org/repo",
"start_time": "2024-12-17T10:00:00Z",
"end_time": "2024-12-17T10:05:00Z",
"environment": "PRODUCTION",
"result": "SUCCESS"
}
Configure your CI/CD pipeline to POST deployment data to this endpoint.
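For example, a final pipeline step could push the record with curl (values reuse the sample payload above; substitute the real commit SHA and timestamps from your pipeline, and add an Authorization header if your DevLake instance has API-key authentication enabled):
# Report a successful production deployment at the end of a pipeline run
curl -X POST http://localhost:8080/api/plugins/webhook/1/deployments \
  -H "Content-Type: application/json" \
  -d '{
    "commit_sha": "abc123...",
    "repo_url": "https://github.com/org/repo",
    "start_time": "2024-12-17T10:00:00Z",
    "end_time": "2024-12-17T10:05:00Z",
    "environment": "PRODUCTION",
    "result": "SUCCESS"
  }'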
Custom SQL Queries
Extend DevLake with custom metrics:
-- Example: Calculate deployment frequency per day
SELECT
DATE(finished_date) as deploy_date,
COUNT(DISTINCT cicd_deployment_id) as deployments
FROM cicd_deployment_commits
WHERE
project_name = 'my-product'
AND result = 'SUCCESS'
AND environment = 'PRODUCTION'
AND finished_date >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY deploy_date
ORDER BY deploy_date;
Add this to a custom Grafana panel for specialized views.
Multi-Repository Projects
For complex projects spanning multiple repositories:
Project: payment-platform
Repos:
- payment-gateway (main deployment)
- payment-processor
- payment-ui
Deployment Source: payment-gateway (final pipeline)
Link Strategy: Track all repos for lead time, final pipeline for deployments
In multi-repo setups, use the final pipeline that performs the production deployment as the deployment source so deployments are counted once and accurately, and connect all repos and pipelines so Lead Time for Changes covers the entire lifecycle.
Common Pitfalls and Troubleshooting
Issue 1: Missing Deployments in Dashboard
Symptoms: Deployment frequency shows zero or incomplete data
Solutions:
- Verify deployment pattern regex matches your job names
- Check environment classification (must be “PRODUCTION”)
- Review transformation rules in scope config
- Ensure CI/CD connection has proper permissions
# Debug query to check raw deployment data
SELECT * FROM cicd_deployment_commits
WHERE cicd_scope_id IN (
SELECT row_id FROM project_mapping
WHERE project_name = 'your-project'
)
LIMIT 10;
Issue 2: Incorrect Lead Time Calculations
Symptoms: Lead time metrics seem unrealistic or show no data
Root Causes:
- Pull requests not linked to deployments
- Missing commit-to-PR associations
- Incorrect branch mappings
Fix: Ensure your scope config includes:
{
"prType": ".*",
"prComponent": ".*",
"prBodyClosePattern": "(?mi)(fix|close|resolve|fixes|closes|resolves|fixed|closed|resolved)[\\s]*.*(((and )?(#|https:\\/\\/github.com\\/)\\d+[ ]*)+)"
}
Issue 3: High Memory Usage During Collection
Symptoms: MySQL container crashes with “innodb_buffer_pool_size” errors
Solution: Increase buffer pool size in docker-compose.yml:
mysql:
image: mysql:8
command: --innodb-buffer-pool-size=512M
environment:
MYSQL_ROOT_PASSWORD: admin
Purging large numbers of records causes the InnoDB engine to hold many locks in memory, which can produce memory spikes; raising innodb_buffer_pool_size gives the engine enough headroom to complete the deletes.
Issue 4: Self-Signed Certificate Errors
Symptoms: “Test Connection” fails with certificate verification errors
Solution: For private GitLab/GitHub Enterprise:
devlake:
image: apache/devlake:latest
environment:
- IN_SECURE_SKIP_VERIFY=true
volumes:
- ./rootCA.crt:/usr/local/share/ca-certificates/rootCA.crt
command: ["sh", "-c", "update-ca-certificates; lake"]
Issue 5: DORA Metrics Not Appearing
Use the built-in DORA Debug Dashboard to diagnose:
- Search for “DORA Details” in Grafana
- Check each metric’s data availability:
- Are deployments collected?
- Are pull requests linked?
- Are incidents defined correctly?
- Review project mapping table
-- Verify project associations
SELECT * FROM project_mapping
WHERE project_name = 'your-project';
Production Deployment Considerations
Scaling for Enterprise Use
For production deployments:
- Use Kubernetes/Helm instead of Docker Compose
- Enable the temporal-based runner for better task distribution
- Implement database backups regularly
- Configure monitoring with Prometheus/Alertmanager
- Set up HTTPS with proper SSL certificates
Example Helm installation:
helm repo add devlake https://apache.github.io/incubator-devlake-helm-chart
helm repo update
helm install devlake devlake/devlake \
--set mysql.rootPassword=your-secure-password \
--set encryption.secret=your-encryption-key \
--set ingress.enabled=true
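After the release is installed, confirm the pods come up before exposing the ingress (standard Helm/kubectl checks; adjust the namespace flags to your cluster layout):
# Confirm the release installed and its pods are running
helm status devlake
kubectl get pods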
Security Best Practices
- Rotate API tokens regularly (90-day cycle recommended)
- Use secret management tools (HashiCorp Vault, AWS Secrets Manager)
- Enable database encryption at rest
- Implement RBAC in Grafana for dashboard access
- Audit data collection scope to minimize sensitive data exposure
Performance Optimization
- Limit sync frequency to business hours for large datasets
- Use incremental sync mode after initial collection
- Archive old data beyond 12 months
- Index frequently queried columns in custom dashboards (see the sketch after this list)
- Monitor API rate limits for external services
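For example, if a custom panel filters cicd_deployment_commits by date range, an index on finished_date keeps the dashboard responsive. This is a sketch; verify it against your actual query patterns and schema version before applying it:
-- Speed up date-range filters used by custom dashboard panels
CREATE INDEX idx_cicd_deployment_commits_finished_date
  ON cicd_deployment_commits (finished_date);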
Conclusion
Apache DevLake provides a powerful, open-source solution for implementing DORA metrics without the complexity of building custom integrations. By centralizing DevOps data from tools like GitHub, Jira, and Jenkins, DevLake enables teams to track deployment frequency, lead time, recovery time, and change failure rate through intuitive Grafana dashboards.
Key takeaways from this guide:
- DevLake automates DORA metric calculation across fragmented toolchains
- Setup requires connecting data sources, defining transformations, and creating projects
- Pre-built dashboards provide immediate insights with industry benchmarking
- Webhooks extend support to any CI/CD or incident management tool
- Production deployments benefit from Kubernetes/Helm with proper security controls
Next Steps
- Baseline your metrics: Run DevLake for 2-4 weeks to establish current performance
- Set improvement goals: Target the next benchmark tier (e.g., Medium → High)
- Analyze bottlenecks: Use lead time breakdown to identify slowest stages
- Iterate and measure: Implement changes and track metric trends
- Expand coverage: Add more repositories and teams to DevLake projects
For deeper customization, explore the DevLake documentation on custom plugins, advanced SQL queries, and data model extensions. Join the DevLake Slack community for support and to share your DORA success stories.
References:
- Apache DevLake Official Documentation - https://devlake.apache.org/docs/Overview/Introduction/ - Comprehensive guide to DevLake concepts, architecture, and installation methods
- DORA Metrics Implementation Guide - https://devlake.apache.org/docs/DORA/ - Official documentation on configuring and calculating DORA metrics in DevLake
- DevLake GitHub Repository - https://github.com/apache/incubator-devlake - Source code, issue tracking, and community contributions
- DORA Metric Specifications - https://devlake.apache.org/docs/Metrics/DeploymentFrequency/ - Detailed SQL queries and benchmark definitions for each DORA metric
- Real-World DevLake Implementation (CloudifyOps Case Study) - https://medium.com/@CloudifyOps/unleashing-the-power-of-data-introducing-apache-devlake-6df94304b2c7 - Practical example of DevLake deployment with Jira, GitLab, and Jenkins