Apache DevLake for DORA Compliance: Complete Guide

11 min read
apache-devlake dora-metrics devops intermediate 2024

Introduction

Measuring DevOps performance has long been a challenge for engineering teams. You know your team is shipping code and fixing bugs, but can you quantify your delivery velocity? How quickly can you recover from incidents? What’s your deployment success rate? Without concrete metrics, these questions remain frustratingly vague.

The DORA (DevOps Research and Assessment) metrics framework, developed by the DORA research program (now part of Google Cloud), provides standardized answers to these questions through four key performance indicators. However, calculating DORA metrics manually across fragmented tools like GitHub, Jira, Jenkins, and GitLab quickly becomes tedious and error-prone.

Apache DevLake solves this problem by acting as a unified data platform that automatically collects, transforms, and visualizes DevOps data from multiple sources. In this guide, you’ll learn how to set up DevLake to track DORA metrics for your team, enabling data-driven decisions that improve software delivery performance. Whether you’re running a small development team or managing enterprise-scale operations, this tutorial will walk you through the complete implementation process in under an hour.

Prerequisites

Before diving into DevLake, ensure you have:

  • Docker and Docker Compose installed (version 20.10+)
  • Access to your DevOps tools with appropriate permissions:
    • Source control (GitHub, GitLab, or Bitbucket) with API token/PAT
    • CI/CD system (Jenkins, GitHub Actions, GitLab CI, or similar)
    • Issue tracker (Jira, GitHub Issues, or equivalent)
  • Basic understanding of:
    • Docker containerization concepts
    • REST API authentication
    • SQL queries (helpful for customization)
  • System requirements: 4GB RAM minimum, 8GB recommended
  • Network access to your DevOps tool APIs

Understanding DORA Metrics and DevLake

What Are DORA Metrics?

DORA metrics measure software delivery performance across two critical dimensions: velocity and stability. The framework focuses on four key metrics:

  1. Deployment Frequency (DF): How often you successfully deploy to production
  2. Lead Time for Changes (LT): Time from code commit to production deployment
  3. Mean Time to Restore (MTTR): How quickly you recover from production failures
  4. Change Failure Rate (CFR): Percentage of deployments causing production incidents

These metrics come with established benchmarks (Elite, High, Medium, Low) that help teams understand their performance level and identify improvement opportunities.
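As a rough illustration of how those benchmark tiers work, the sketch below classifies an average daily deployment rate. The exact cut-offs are an assumption paraphrased from the published DORA benchmarks, which shift slightly between annual reports:

```python
# Hypothetical sketch: map average deployments per day to a DORA tier.
# Thresholds approximate the published benchmarks (Elite: on-demand /
# multiple per day; High: at least weekly; Medium: at least monthly).

def classify_deployment_frequency(deploys_per_day: float) -> str:
    """Map an average daily deployment rate to a DORA performance tier."""
    if deploys_per_day >= 1:          # on-demand / multiple per day
        return "Elite"
    if deploys_per_day >= 1 / 7:      # at least weekly
        return "High"
    if deploys_per_day >= 1 / 30:     # at least monthly
        return "Medium"
    return "Low"

print(classify_deployment_frequency(3.0))    # Elite
print(classify_deployment_frequency(0.02))   # Low (one deploy per ~50 days)
```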

How DevLake Enables DORA Tracking

Apache DevLake is an open-source platform that ingests, analyzes, and visualizes fragmented data from DevOps tools to extract insights for engineering excellence. DevLake works by:

  • Collecting data from multiple sources through plugins and webhooks
  • Transforming raw API responses into a unified domain model
  • Calculating DORA metrics at the project level
  • Visualizing results through pre-built Grafana dashboards

The platform supports 15+ data sources including GitHub, GitLab, Jira, Jenkins, Azure DevOps, and more. For unsupported tools, DevLake provides webhooks that allow you to actively push data when a specific plugin isn’t available.

DevLake Architecture Overview

Understanding DevLake’s architecture helps you troubleshoot issues and customize the platform effectively.

[Architecture diagram] Data plugins make API calls to DevOps tools (GitHub, Jira, Jenkins) and store raw JSON in the database's Raw layer; that data is extracted into the Tool layer's normalized schema, transformed into the Domain layer's unified model, and the DORA plugin calculates metrics that surface in Grafana dashboards. The Config UI, API Server, and Runner orchestrate the whole flow by executing Blueprint tasks.
DevLake’s three-layer data model includes the Raw layer for storing API responses in JSON, the Tool layer for extracting data into relational schemas specific to each DevOps tool, and the Domain layer that provides abstraction so analytics logic can be reused across different tools.
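To make the layering concrete, here is a minimal, hypothetical sketch of a GitHub pull request moving through the three layers. The field names are illustrative only, not DevLake's actual schema:

```python
import json

# Raw layer: the API response stored verbatim as JSON (illustrative payload).
raw = json.dumps({"number": 42, "merged_at": "2024-12-01T10:00:00Z",
                  "title": "Fix login bug"})

# Tool layer: fields extracted into a GitHub-specific relational record.
def extract_github_pr(raw_json: str) -> dict:
    d = json.loads(raw_json)
    return {"github_pr_number": d["number"],
            "github_merged_at": d["merged_at"],
            "github_title": d["title"]}

# Domain layer: normalized names, so the same analytics logic can be
# reused for GitLab merge requests or Bitbucket pull requests.
def to_domain_pull_request(tool_record: dict) -> dict:
    return {"pull_request_key": tool_record["github_pr_number"],
            "merged_date": tool_record["github_merged_at"],
            "title": tool_record["github_title"]}

domain_pr = to_domain_pull_request(extract_github_pr(raw))
print(domain_pr["pull_request_key"])  # 42
```

The point of the indirection is that DORA queries only ever touch the domain layer, so swapping GitHub for GitLab requires no changes to the metric logic.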

Installation and Setup

Step 1: Install DevLake with Docker Compose

Download the latest release files:

# Create project directory
mkdir devlake && cd devlake

# Download docker-compose and environment file
curl -o docker-compose.yml https://raw.githubusercontent.com/apache/incubator-devlake/main/docker-compose.yml
curl -o .env https://raw.githubusercontent.com/apache/incubator-devlake/main/.env.example

# Generate encryption key for sensitive data
openssl rand -base64 2000 | tr -dc 'A-Z' | fold -w 128 | head -n 1 > encryption.key

Add the encryption key to your .env file. Docker Compose reads .env literally, so expand the key in the shell rather than pasting $(cat ...) into the file:

# Append the generated key to .env
echo "ENCRYPTION_SECRET=$(cat encryption.key)" >> .env

Start DevLake services:

# Start all containers
docker-compose up -d

# Verify containers are running
docker-compose ps

# Expected output: config-ui, devlake, mysql, grafana all "Up"

Access the Config UI at http://localhost:4000 (default credentials: admin/admin).

Step 2: Configure Data Connections

DevLake uses “connections” to access your DevOps tools. Let’s configure the essential connections for DORA metrics.

GitHub Connection Example

  1. Navigate to Connections → Add Connection → GitHub
  2. Configure the connection:
Connection Name: github-main
Endpoint: https://api.github.com/
Auth Token: ghp_your_personal_access_token_here

Required token scopes: repo, read:org, read:user

  3. Test the connection before saving

Jenkins Connection Example

Connection Name: jenkins-prod
Endpoint: https://jenkins.yourcompany.com/
Username: your-jenkins-username
Password: your-api-token

Jira Connection Example

Connection Name: jira-incidents
Endpoint: https://yourcompany.atlassian.net/
Email: [email protected]
API Token: your-jira-api-token

Step 3: Configure Transformation Rules

Transformation rules tell DevLake which CI/CD runs constitute “deployments” and which issues are “incidents.”

Define Deployments

In your Jenkins/GitHub Actions connection scope config:

{
  "deploymentPattern": "(?i)(deploy|push-image|release)",
  "productionPattern": "(?i)(prod|production|main)"
}

This case-insensitive regex treats jobs/workflows whose names contain “deploy”, “push-image”, or “release” as deployments, while the production pattern classifies runs matching “prod”, “production”, or “main” as production deployments.
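Before saving these patterns, it is worth sanity-checking them against your actual job names. A quick check in Python (DevLake itself applies Go's regexp engine, whose syntax is compatible for patterns like these):

```python
import re

# The two patterns from the scope config above.
deployment_pattern = re.compile(r"(?i)(deploy|push-image|release)")
production_pattern = re.compile(r"(?i)(prod|production|main)")

# Hypothetical job names from your CI/CD system.
jobs = ["backend-deploy-prod", "push-image-staging", "run-unit-tests"]
for job in jobs:
    is_deploy = bool(deployment_pattern.search(job))
    is_prod = bool(production_pattern.search(job))
    print(f"{job}: deployment={is_deploy}, production={is_prod}")
```

Only jobs matching both patterns count toward production deployment frequency, so a miss here silently zeroes out the metric.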

Define Incidents

In your Jira connection scope config:

{
  "issueTypeMappings": {
    "INCIDENT": ["Bug", "Production Issue"],
    "REQUIREMENT": ["Story", "Epic"]
  }
}

DORA calculation in DevLake hinges on three entities: pull requests for code changes, deployments from CI/CD systems, and incidents from issue tracking tools. Their exact definitions vary by project, which is why the transformation rules above are worth tuning carefully.
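Once those entities are linked, Lead Time for Changes reduces to a timestamp subtraction per change. A minimal sketch of the arithmetic (function and field names are illustrative; DevLake computes this internally from its domain tables):

```python
from datetime import datetime

def lead_time_hours(commit_time: str, deployed_time: str) -> float:
    """Hours from code commit to production deployment (ISO 8601 inputs)."""
    fmt = "%Y-%m-%dT%H:%M:%S%z"
    committed = datetime.strptime(commit_time, fmt)
    deployed = datetime.strptime(deployed_time, fmt)
    return (deployed - committed).total_seconds() / 3600

# Commit on Dec 16 at 09:00 UTC, deployed Dec 17 at 15:00 UTC.
print(lead_time_hours("2024-12-16T09:00:00+00:00",
                      "2024-12-17T15:00:00+00:00"))  # 30.0
```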

Creating Your First DORA Project

Step 1: Create a Project

Projects in DevLake group related data scopes (repos, boards, CI/CD pipelines) for metric calculation.

  1. Navigate to Projects → Create Project
  2. Configure project settings:
Project Name: my-product
Enable DORA Metrics: Yes

Step 2: Associate Data Connections

Add the connections you configured earlier:

  1. Click Add Data Scope
  2. Select your GitHub connection → Choose repositories
  3. Select your Jenkins connection → Choose jobs/pipelines
  4. Select your Jira connection → Choose boards

Example project configuration:

  • GitHub: org/backend-api, org/frontend-app
  • Jenkins: backend-deploy-prod, frontend-deploy-prod
  • Jira: PROJ board (filtered for incident-type issues)

Step 3: Configure Sync Policy

Set up data collection frequency:

Sync Frequency: Every 6 hours
Time Range: Last 90 days
Skip Failed Tasks: Enabled (recommended for large datasets)

Step 4: Start Data Collection

Click Collect All Data to begin the initial sync. This process typically takes:

  • Small projects (< 5 repos): 10-20 minutes
  • Medium projects (5-15 repos): 30-60 minutes
  • Large projects (15+ repos): 1-3 hours

Monitor progress in the Blueprint Status tab.

Accessing DORA Dashboards

Once data collection completes, access your dashboards:

  1. Click Dashboards in the top-right corner
  2. Log in to Grafana (default credentials: admin/admin)
  3. Search for “DORA” dashboard
  4. Explore pre-built visualizations:
    • Deployment frequency trends
    • Lead time distribution
    • MTTR by incident type
    • Change failure rate over time
    • Performance benchmarking (Elite/High/Medium/Low)

Understanding the DORA Dashboard

The main DORA dashboard displays:

  • Top Section: Current metric values with benchmark classification
  • Trend Charts: Monthly/weekly trends for each metric
  • Breakdown Views: Metrics segmented by team, repository, or time period
  • Benchmark Comparison: Your performance vs. industry standards
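Under the hood, the MTTR panel is essentially averaging the duration from incident creation to resolution. A rough illustrative sketch of that aggregation (not DevLake's actual query logic):

```python
from datetime import datetime

# Illustrative incidents as (created, resolved) timestamp pairs.
incidents = [
    (datetime(2024, 12, 1, 10, 0), datetime(2024, 12, 1, 12, 0)),  # 2 h
    (datetime(2024, 12, 5, 9, 0),  datetime(2024, 12, 5, 13, 0)),  # 4 h
]

def mean_time_to_restore_hours(incidents) -> float:
    """Average hours from incident creation to resolution."""
    total = sum((resolved - created).total_seconds()
                for created, resolved in incidents)
    return total / len(incidents) / 3600

print(mean_time_to_restore_hours(incidents))  # 3.0
```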

Advanced Configuration

Using Webhooks for Unsupported Tools

If your deployment tool lacks a native plugin:

# Create webhook in DevLake
POST http://localhost:8080/api/plugins/webhook/1/deployments

# Webhook payload example
{
  "commit_sha": "abc123...",
  "repo_url": "https://github.com/org/repo",
  "start_time": "2024-12-17T10:00:00Z",
  "end_time": "2024-12-17T10:05:00Z",
  "environment": "PRODUCTION",
  "result": "SUCCESS"
}

Configure your CI/CD pipeline to POST deployment data to this endpoint.
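For example, a deploy script could assemble the payload and POST it at the end of a successful run. A hedged sketch follows; the endpoint path, connection ID, and any auth header depend on your DevLake setup, so the actual POST is left commented out and the snippet only builds and prints the record:

```python
import json
from datetime import datetime, timezone

def build_deployment_payload(commit_sha: str, repo_url: str,
                             start: datetime, end: datetime,
                             success: bool) -> dict:
    """Assemble a deployment record in the shape shown above."""
    return {
        "commit_sha": commit_sha,
        "repo_url": repo_url,
        "start_time": start.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "end_time": end.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "environment": "PRODUCTION",
        "result": "SUCCESS" if success else "FAILURE",
    }

payload = build_deployment_payload(
    "abc123", "https://github.com/org/repo",
    datetime(2024, 12, 17, 10, 0, tzinfo=timezone.utc),
    datetime(2024, 12, 17, 10, 5, tzinfo=timezone.utc),
    success=True)
print(json.dumps(payload))

# To actually push it (endpoint and auth are deployment-specific):
# requests.post("http://localhost:8080/api/plugins/webhook/1/deployments",
#               json=payload)
```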

Custom SQL Queries

Extend DevLake with custom metrics:

-- Example: Calculate deployment frequency per day
SELECT 
  DATE(finished_date) as deploy_date,
  COUNT(DISTINCT cicd_deployment_id) as deployments
FROM cicd_deployment_commits
WHERE 
  project_name = 'my-product'
  AND result = 'SUCCESS'
  AND environment = 'PRODUCTION'
  AND finished_date >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY deploy_date
ORDER BY deploy_date;

Add this to a custom Grafana panel for specialized views.

Multi-Repository Projects

For complex projects spanning multiple repositories:

Project: payment-platform
Repos:
  - payment-gateway (main deployment)
  - payment-processor
  - payment-ui
Deployment Source: payment-gateway (final pipeline)
Link Strategy: Track all repos for lead time, final pipeline for deployments

In multi-repo setups, connect the final pipeline that creates the production deployment to track deployments accurately, while connecting all repos and pipelines to calculate Lead Time for Changes across the entire lifecycle.

Common Pitfalls and Troubleshooting

Issue 1: Missing Deployments in Dashboard

Symptoms: Deployment frequency shows zero or incomplete data

Solutions:

  • Verify deployment pattern regex matches your job names
  • Check environment classification (must be “PRODUCTION”)
  • Review transformation rules in scope config
  • Ensure CI/CD connection has proper permissions
# Debug query to check raw deployment data
SELECT * FROM cicd_deployment_commits 
WHERE cicd_scope_id IN (
  SELECT row_id FROM project_mapping 
  WHERE project_name = 'your-project'
)
LIMIT 10;

Issue 2: Incorrect Lead Time Calculations

Symptoms: Lead time metrics seem unrealistic or show no data

Root Causes:

  • Pull requests not linked to deployments
  • Missing commit-to-PR associations
  • Incorrect branch mappings

Fix: Ensure your scope config includes:

{
  "prType": ".*",
  "prComponent": ".*",
  "prBodyClosePattern": "(?mi)(fix|close|resolve|fixes|closes|resolves|fixed|closed|resolved)[\\s]*.*(((and )?(#|https:\\/\\/github.com\\/)\\d+[ ]*)+)"
}
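You can verify the prBodyClosePattern against sample PR descriptions before applying it. Note the JSON escaping: `\\s` in the config is `\s` in the actual regex. A quick Python check:

```python
import re

# The prBodyClosePattern from the scope config, unescaped from JSON.
pattern = re.compile(
    r"(?mi)(fix|close|resolve|fixes|closes|resolves|fixed|closed|resolved)"
    r"[\s]*.*(((and )?(#|https://github\.com/)\d+[ ]*)+)"
)

print(bool(pattern.search("Fixes #123")))               # issue reference found
print(bool(pattern.search("Refactor the login page")))  # no issue reference
```

PR bodies that never use one of these closing keywords will not be linked to issues, which is a common cause of empty lead-time panels.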

Issue 3: High Memory Usage During Collection

Symptoms: MySQL container crashes with “innodb_buffer_pool_size” errors

Solution: Increase buffer pool size in docker-compose.yml:

mysql:
  image: mysql:8
  command: --innodb-buffer-pool-size=512M
  environment:
    MYSQL_ROOT_PASSWORD: admin

When purging large numbers of records, the MySQL InnoDB engine holds the associated locks in memory, which can cause sudden memory spikes; raising innodb_buffer_pool_size gives the engine enough headroom to absorb them.

Issue 4: Self-Signed Certificate Errors

Symptoms: “Test Connection” fails with certificate verification errors

Solution: For private GitLab/GitHub Enterprise:

devlake:
  image: apache/devlake:latest
  environment:
    - IN_SECURE_SKIP_VERIFY=true
  volumes:
    - ./rootCA.crt:/usr/local/share/ca-certificates/rootCA.crt
  command: ["sh", "-c", "update-ca-certificates; lake"]

Issue 5: DORA Metrics Not Appearing

Use the built-in DORA Debug Dashboard to diagnose:

  1. Search for “DORA Details” in Grafana
  2. Check each metric’s data availability:
    • Are deployments collected?
    • Are pull requests linked?
    • Are incidents defined correctly?
  3. Review project mapping table
-- Verify project associations
SELECT * FROM project_mapping 
WHERE project_name = 'your-project';

Production Deployment Considerations

Scaling for Enterprise Use

For production deployments:

  1. Use Kubernetes/Helm instead of Docker Compose
  2. Enable the temporal-based runner for better task distribution
  3. Implement database backups regularly
  4. Configure monitoring with Prometheus/Alertmanager
  5. Set up HTTPS with proper SSL certificates

Example Helm installation:

helm repo add devlake https://apache.github.io/incubator-devlake-helm-chart
helm repo update
helm install devlake devlake/devlake \
  --set mysql.rootPassword=your-secure-password \
  --set encryption.secret=your-encryption-key \
  --set ingress.enabled=true

Security Best Practices

  • Rotate API tokens regularly (90-day cycle recommended)
  • Use secret management tools (HashiCorp Vault, AWS Secrets Manager)
  • Enable database encryption at rest
  • Implement RBAC in Grafana for dashboard access
  • Audit data collection scope to minimize sensitive data exposure

Performance Optimization

  • Limit sync frequency to business hours for large datasets
  • Use incremental sync mode after initial collection
  • Archive old data beyond 12 months
  • Index frequently queried columns in custom dashboards
  • Monitor API rate limits for external services

Conclusion

Apache DevLake provides a powerful, open-source solution for implementing DORA metrics without the complexity of building custom integrations. By centralizing DevOps data from tools like GitHub, Jira, and Jenkins, DevLake enables teams to track deployment frequency, lead time, recovery time, and change failure rate through intuitive Grafana dashboards.

Key takeaways from this guide:

  • DevLake automates DORA metric calculation across fragmented toolchains
  • Setup requires connecting data sources, defining transformations, and creating projects
  • Pre-built dashboards provide immediate insights with industry benchmarking
  • Webhooks extend support to any CI/CD or incident management tool
  • Production deployments benefit from Kubernetes/Helm with proper security controls

Next Steps

  1. Baseline your metrics: Run DevLake for 2-4 weeks to establish current performance
  2. Set improvement goals: Target the next benchmark tier (e.g., Medium → High)
  3. Analyze bottlenecks: Use lead time breakdown to identify slowest stages
  4. Iterate and measure: Implement changes and track metric trends
  5. Expand coverage: Add more repositories and teams to DevLake projects

For deeper customization, explore the DevLake documentation on custom plugins, advanced SQL queries, and data model extensions. Join the DevLake Slack community for support and to share your DORA success stories.


References:

  1. Apache DevLake Official Documentation - https://devlake.apache.org/docs/Overview/Introduction/ - Comprehensive guide to DevLake concepts, architecture, and installation methods
  2. DORA Metrics Implementation Guide - https://devlake.apache.org/docs/DORA/ - Official documentation on configuring and calculating DORA metrics in DevLake
  3. DevLake GitHub Repository - https://github.com/apache/incubator-devlake - Source code, issue tracking, and community contributions
  4. DORA Metric Specifications - https://devlake.apache.org/docs/Metrics/DeploymentFrequency/ - Detailed SQL queries and benchmark definitions for each DORA metric
  5. Real-World DevLake Implementation (CloudifyOps Case Study) - https://medium.com/@CloudifyOps/unleashing-the-power-of-data-introducing-apache-devlake-6df94304b2c7 - Practical example of DevLake deployment with Jira, GitLab, and Jenkins