GitOps Best Practices: Building Production-Ready Deployments


Introduction

If you’ve ever manually SSHed into a production server to fix “just one config file,” you understand why GitOps has become essential for modern infrastructure management. That quick fix works—until three months later when no one remembers what changed, why it changed, or how to replicate it across environments.

GitOps solves this problem by treating Git as the single source of truth for your entire infrastructure and application state. Instead of executing commands directly against your clusters, you declare your desired state in Git repositories, and automated agents continuously reconcile your live environment to match. This approach brings the reliability of code review, version control, and audit trails to infrastructure operations.

In this article, you’ll learn battle-tested best practices for implementing GitOps in production environments. We’ll cover repository structuring strategies, secrets management approaches, progressive delivery patterns, and common pitfalls to avoid. Whether you’re starting fresh or improving an existing GitOps workflow, these patterns will help you build more reliable, auditable, and scalable deployments.

Prerequisites

Before diving into GitOps best practices, you should have:

  • Kubernetes knowledge: Understanding of pods, deployments, services, and namespaces
  • Git fundamentals: Familiarity with branches, pull requests, and merge workflows
  • Basic CI/CD concepts: Knowledge of build pipelines and deployment automation
  • YAML proficiency: Ability to read and write Kubernetes manifests
  • A Kubernetes cluster: For hands-on practice (can be local with kind or minikube)

Understanding GitOps Core Principles

GitOps isn’t just “using Git for deployments.” It’s a methodology built on three foundational principles that distinguish it from traditional CI/CD approaches.

Declarative Configuration as Code

Your entire system state must be described declaratively. Rather than writing scripts that execute a series of commands (imperative approach), you define what the end result should look like. Kubernetes manifests, Helm charts, and Kustomize overlays are all declarative—they specify the desired state without dictating the steps to achieve it.

# Declarative: Describes what you want
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.25

This declarative model enables powerful GitOps features like automatic drift detection and self-healing. If someone manually scales your deployment to 5 replicas, the GitOps operator will detect the drift and automatically scale it back to the declared 3 replicas.

Git as Single Source of Truth

All configuration must live in Git repositories, which become the canonical reference for your infrastructure state. This means:

  • No manual kubectl applies in production environments
  • No configuration stored only in cluster state
  • Every change flows through Git with proper review and approval

The benefits are substantial: complete audit trails, easy rollbacks via Git history, and the ability to recreate entire environments from scratch using only your Git repositories.

Automated Synchronization

GitOps operators (like Argo CD or Flux) continuously monitor your Git repositories and automatically apply changes to your clusters. This pull-based model has significant security advantages over traditional push-based CD systems, as your clusters never need to expose APIs to external CI systems.

The reconciliation flow looks like this:

Developer --(git push)--> Git Repository
GitOps Operator --(monitors)--> Git Repository
GitOps Operator --(reconciles, detects drift, self-heals)--> Kubernetes Cluster
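
With Argo CD, for instance, this automated reconciliation is enabled per application through its sync policy. A minimal sketch (the repository URL, paths, and namespaces are placeholders, not from the original article):

```yaml
# Illustrative Argo CD Application with automated sync enabled
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/config-repo  # assumed config repo URL
    targetRevision: main
    path: environments/production
  destination:
    server: https://kubernetes.default.svc
    namespace: web
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes (drift) back to the declared state
```

With `selfHeal` enabled, the manual scale-to-5 example above would be reverted automatically on the next reconciliation.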

Repository Structure: Monorepo vs Polyrepo

One of your first critical decisions is how to organize your Git repositories. The two primary patterns are monorepo (single repository for everything) and polyrepo (separate repositories for different concerns).

For most production use cases, a polyrepo strategy with a dedicated configuration repository offers the best balance of flexibility, security, and maintainability.

Structure:

application-repos/          # Multiple application source code repositories
  ├── frontend-app/        # Contains app code, Dockerfile, CI pipeline
  ├── backend-api/         # Contains app code, Dockerfile, CI pipeline
  └── data-service/        # Contains app code, Dockerfile, CI pipeline

config-repo/               # Single GitOps configuration repository
  ├── base/                # Shared base configurations
  │   ├── frontend/
  │   ├── backend/
  │   └── data-service/
  ├── environments/
  │   ├── dev/             # Dev-specific overlays
  │   ├── staging/         # Staging-specific overlays
  │   └── production/      # Production-specific overlays
  └── clusters/
      ├── us-east-1/       # Cluster-specific configs
      └── eu-west-1/

Why polyrepo works:

  1. Clear separation of concerns: Application development teams focus on code; platform teams manage deployments
  2. Independent CI/CD pipelines: Application CI builds and tests code, pushes images, then updates the config repo
  3. Simplified access control: Developers can have read-only access to config repo while maintaining full control over their application repos
  4. Reduced blast radius: Issues in application repos don’t affect deployment configurations
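
The "updates the config repo" step from point 2 can be as small as rewriting a pinned image tag. A self-contained sketch of that step (paths and tags are illustrative; a real pipeline would typically use `kustomize edit set image`, then commit and push):

```shell
# Simulate the CI step that bumps a pinned image tag in the config repo.
mkdir -p /tmp/config-repo/environments/production
cd /tmp/config-repo/environments/production
printf 'images:\n  - name: frontend\n    newTag: v1.2.3\n' > kustomization.yaml

# Bump the pinned tag the way a CI job would after publishing a new image
sed -i 's/newTag: v1\.2\.3/newTag: v1.2.4/' kustomization.yaml

# Show the updated pin
grep newTag kustomization.yaml
```

Once this change is committed and merged, the GitOps operator rolls out the new tag.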

Environment Management: Directories Over Branches

The GitOps community strongly discourages using branches for different environments (dev, staging, production). Instead, use directories with overlays:

Why directories win:

  • Clear visualization: You can see all environment configurations side-by-side in a single view
  • Easier promotion: Promoting changes between environments is transparent—just update the image tag or configuration values
  • No merge conflicts: Branches for environments create constant merge headaches; directories avoid this entirely
  • Better tooling support: Kustomize and Helm naturally work with directory-based overlays

# Using Kustomize for environment-specific configuration
# environments/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base/frontend
patches:
  - path: replica-count.yaml
  - path: resource-limits.yaml
images:
  - name: frontend
    newTag: v1.2.3  # Production version pinned

Trunk-Based Development for GitOps

Adopt trunk-based development for your GitOps repositories. This means:

  1. One main branch (main or production) as your trunk
  2. Short-lived feature branches for changes
  3. Pull request workflow for all modifications
  4. Strict branch protection on the trunk

# Example workflow
git checkout -b feature/update-nginx-version
# Make changes to production overlay
git add environments/production/
git commit -m "feat: update nginx to 1.25 for security patch"
git push origin feature/update-nginx-version
# Create PR, get approvals, merge to main

Once merged to main, your GitOps operator automatically applies changes to the appropriate clusters. This keeps your workflow simple while maintaining safety through code review.

Secrets Management: Never Commit Plaintext

Managing secrets is one of GitOps’s biggest challenges. Git repositories aren’t designed for sensitive data, and Kubernetes Secrets are only base64-encoded (not encrypted). You have two main approaches:

Approach 1: External Secrets Operator (Recommended for Production)

Store secrets in a dedicated secrets manager (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault) and reference them from your GitOps repo. The External Secrets Operator pulls the actual values at runtime.

Benefits:

  • Secrets never touch Git
  • Centralized secret rotation
  • Better audit trails
  • Works naturally across multiple clusters

An ExternalSecret resource in your GitOps repo then references the stored value:

# gitops-repo/base/backend/external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: db-secret  # Creates this Kubernetes Secret
  data:
    - secretKey: username
      remoteRef:
        key: prod/database/credentials
        property: username
    - secretKey: password
      remoteRef:
        key: prod/database/credentials
        property: password
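
The secretStoreRef above points at a ClusterSecretStore, which tells the operator how to reach the backing store. A minimal AWS sketch (the region, service account name, and namespace are assumptions):

```yaml
# Illustrative ClusterSecretStore for AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1            # assumed region
      auth:
        jwt:
          serviceAccountRef:       # IRSA-style auth via a service account
            name: external-secrets-sa
            namespace: external-secrets-system
```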

Installation example (Helm):

helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets \
  external-secrets/external-secrets \
  -n external-secrets-system \
  --create-namespace

Approach 2: Sealed Secrets (For Simpler Use Cases)

Sealed Secrets encrypts your secrets so they’re safe to store in Git. The SealedSecrets controller in your cluster decrypts them at runtime.

# Install sealed secrets controller
kubectl apply -f \
  https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.24.0/controller.yaml

# Create and seal a secret
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=secret123 \
  --dry-run=client -o yaml | \
  kubeseal -o yaml > sealed-db-credentials.yaml

# Now safe to commit sealed-db-credentials.yaml to Git
git add sealed-db-credentials.yaml

Trade-offs:

  • ✅ Simpler setup than External Secrets
  • ✅ All configuration stays in Git
  • ❌ Key rotation is complex
  • ❌ Secrets are bound to specific clusters
  • ❌ If encryption key leaks, all secrets in Git are compromised

Recommendation: Use External Secrets Operator for production environments. Reserve Sealed Secrets for development environments or simpler deployments where the operational overhead of a secrets manager isn’t justified.

Progressive Delivery: Safe Production Rollouts

Deploying directly to 100% of production traffic is risky. Progressive delivery patterns gradually expose new versions while monitoring for issues.

Canary Releases

Deploy the new version alongside the current version and route a small percentage of traffic to it. Gradually increase traffic as confidence grows.

# Using Argo Rollouts for canary deployments
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: web-app
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        - setWeight: 20    # 20% to canary
        - pause: {duration: 10m}
        - setWeight: 50    # 50% to canary
        - pause: {duration: 10m}
        - setWeight: 80    # 80% to canary
        - pause: {duration: 5m}
      analysis:
        templates:
          - templateName: error-rate-check
  template:
    spec:
      containers:
        - name: app
          image: myapp:v2.0.0
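
The error-rate-check template referenced above must be defined separately. A sketch using the Prometheus provider (the address, query, and 5% threshold are assumptions, not values from the original article):

```yaml
# Illustrative AnalysisTemplate: fail the canary if the 5xx rate exceeds 5%
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate-check
spec:
  metrics:
    - name: error-rate
      interval: 1m
      failureLimit: 3                      # abort after 3 failed measurements
      successCondition: result[0] < 0.05   # keep error rate under 5%
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090  # assumed address
          query: |
            sum(rate(http_requests_total{job="web-app",status=~"5.."}[5m]))
            /
            sum(rate(http_requests_total{job="web-app"}[5m]))
```

If the analysis fails, Argo Rollouts aborts the canary and shifts traffic back to the stable version.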

Blue-Green Deployments

Run two identical environments (blue and green). Deploy to the inactive environment, test it, then switch traffic over atomically.

# Blue deployment (currently active)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
        - name: app
          image: myapp:v1.0.0

---
# Service pointing to blue
apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  selector:
    app: myapp
    version: blue  # Switch to 'green' when ready
  ports:
    - port: 80

Deploy the green version, verify it works, then update the service selector to version: green. Instant switchover with easy rollback capability.

Monitoring and Observability

Without proper monitoring, you won’t know when GitOps automation fails or when deployments cause issues.

Essential Monitoring Practices

  1. GitOps operator health: Monitor your Argo CD or Flux components
  2. Sync status alerts: Get notified when repositories fail to sync
  3. Drift detection: Alert when live state diverges from Git
  4. Deployment events: Track all automated changes

# Example Prometheus alerting rule
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-alerts
data:
  alerts.yaml: |
    groups:
      - name: argocd
        rules:
          - alert: ApplicationSyncFailed
            expr: increase(argocd_app_sync_total{phase="Error"}[10m]) > 0
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "ArgoCD sync failed for {{ $labels.name }}"
              description: "Application {{ $labels.name }} failed to sync"

Integration with Observability Stack

Connect your GitOps tools with your existing observability platform:

# Flux notification to Slack
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Alert
metadata:
  name: slack-notifications
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: GitRepository
      name: '*'
    - kind: Kustomization
      name: '*'

Common Pitfalls and How to Avoid Them

Pitfall 1: Programmatic Updates Create Git Conflicts

Problem: Multiple CI pipelines updating the same GitOps repo simultaneously cause race conditions and push conflicts.

Solution:

  • Use one GitOps repo per namespace or logical boundary
  • Implement retry logic with exponential backoff in your CI systems
  • Consider tools like Argo CD Image Updater that handle concurrent updates gracefully
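
The retry-with-exponential-backoff idea can be sketched in a few lines of shell. The fake_push function here only simulates a push that fails once; names and limits are illustrative:

```shell
#!/bin/sh
# Sketch: wrap a flaky operation (e.g. a git push to the config repo)
# in retry-with-exponential-backoff.
retry() {
  max=$1; shift
  delay=1; attempt=1
  while ! "$@"; do
    [ "$attempt" -ge "$max" ] && return 1
    sleep "$delay"
    delay=$((delay * 2))     # exponential backoff: 1s, 2s, 4s, ...
    attempt=$((attempt + 1))
  done
}

# Simulated push: fails on the first attempt, succeeds on the second.
rm -f /tmp/push-demo-ok
fake_push() {
  if [ -f /tmp/push-demo-ok ]; then
    echo "pushed"
  else
    touch /tmp/push-demo-ok
    return 1
  fi
}

retry 5 fake_push
```

In a real pipeline you would also re-fetch and rebase before each retry, since the conflict usually comes from another job having pushed first.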

Pitfall 2: Configuration Drift Goes Unnoticed

Problem: Manual changes to clusters aren’t detected quickly, undermining GitOps principles.

Solution:

  • Enable automatic sync in your GitOps operator (Argo CD’s auto-sync, Flux’s reconciliation)
  • Set aggressive sync intervals (every 3-5 minutes)
  • Configure alerts for OutOfSync states
  • Prevent manual cluster access with RBAC policies
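
With Flux, for example, the reconciliation interval and pruning are set on the Kustomization resource; a sketch (the names and path are assumptions):

```yaml
# Illustrative Flux Kustomization with a 3-minute reconciliation interval
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: production
  namespace: flux-system
spec:
  interval: 3m          # aggressive sync interval for fast drift correction
  prune: true           # remove resources deleted from Git
  sourceRef:
    kind: GitRepository
    name: config-repo   # assumed GitRepository resource
  path: ./environments/production
```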

Pitfall 3: No Clear Rollback Strategy

Problem: When deployments fail, teams panic because there’s no defined rollback process.

Solution:

# Rollback is just a Git revert
git revert HEAD
git push origin main
# GitOps operator automatically applies the previous state

Document your rollback procedures. For Argo CD:

# CLI rollback to previous revision
argocd app rollback myapp 123

Pitfall 4: Secrets Sprawl Across Repositories

Problem: As repositories proliferate, secrets are duplicated and become difficult to manage.

Solution:

  • Centralize secrets in one secrets manager
  • Use External Secrets Operator to reference them
  • Implement secret rotation policies
  • Document which secrets are used where

Pitfall 5: Inadequate Testing Before Production

Problem: Changes are deployed to production without sufficient validation.

Solution:

# Use preview environments for PRs
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: preview-envs
spec:
  generators:
    - pullRequest:
        github:
          owner: myorg
          repo: myapp
  template:
    metadata:
      name: 'preview-pr-{{number}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/myorg/myapp
        targetRevision: '{{head_sha}}'  # deploy the PR's commit
        path: manifests                 # path to the app's manifests (illustrative)
      destination:
        server: https://kubernetes.default.svc
        namespace: 'preview-pr-{{number}}'
Choosing Your GitOps Tool: Argo CD vs Flux

Both Argo CD and Flux are excellent CNCF projects, but they have different philosophies:

Choose Argo CD if:

  • You want a rich web UI for visualization and troubleshooting
  • Your team includes non-Kubernetes experts who need an intuitive interface
  • You need built-in RBAC and SSO integration
  • You prefer application-centric workflows

Choose Flux if:

  • You want a lightweight, CLI-driven experience
  • Your team is comfortable with Kubernetes-native CRDs
  • You need modular components you can pick and choose
  • You prioritize resource efficiency

Current landscape note (2024): Weaveworks, the company behind Flux, shut down in early 2024. However, Flux remains a CNCF graduated project with active community development. For new projects starting in 2024-2025, Argo CD offers a more stable commercial ecosystem, but Flux is still a solid choice for teams that prefer its architecture.

Conclusion

GitOps transforms infrastructure management from an error-prone manual process into a reliable, auditable, automated workflow. By following these best practices—using polyrepo structures, managing secrets externally, implementing progressive delivery, and avoiding common pitfalls—you’ll build GitOps systems that scale with your organization.

Key takeaways:

  1. Structure matters: Use a dedicated GitOps repo with directory-based environments
  2. Never compromise on secrets: Use External Secrets Operator or Sealed Secrets, never plaintext
  3. Roll out carefully: Implement canary or blue-green deployments for production changes
  4. Monitor everything: Set up alerts for sync failures and drift detection
  5. Keep it simple: Start with basics, add complexity only when needed

Next steps:

  • Set up a test GitOps workflow with Argo CD or Flux on a local cluster
  • Migrate one non-critical application to GitOps to learn the patterns
  • Establish team guidelines for PR reviews and approval workflows
  • Implement automated testing in your GitOps CI pipeline

Remember: GitOps is a journey, not a destination. Start small, learn from each deployment, and incrementally improve your processes. The investment in proper GitOps practices pays dividends in reliability, velocity, and peace of mind.

