<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sejoon Kim — Blog</title>
    <link>https://sejoonkim.me/articles</link>
    <description>DevOps, Kubernetes, AWS, and infrastructure automation.</description>
    <language>en</language>
    <item>
      <title>Migrating Production Ingress from nginx to Traefik Gateway API</title>
      <link>https://sejoonkim.me/articles/nginx-to-traefik-gateway-api-migration</link>
      <guid>https://sejoonkim.me/articles/nginx-to-traefik-gateway-api-migration</guid>
      <pubDate>Wed, 20 May 2026 00:00:00 GMT</pubDate>
      <description>CVE-2025-1974 started it. Four months later, three clusters ran Traefik v3 on Gateway API with zero downtime. The honest version: what broke in stage, the certificate deadlock, and the nine days we deliberately did nothing.</description>
    </item>
    <item>
      <title>CrowdSec WAF on Kubernetes</title>
      <link>https://sejoonkim.me/articles/crowdsec-waf-kubernetes</link>
      <guid>https://sejoonkim.me/articles/crowdsec-waf-kubernetes</guid>
      <pubDate>Wed, 14 Jan 2026 00:00:00 GMT</pubDate>
      <description>Needed a WAF for public APIs. Chose CrowdSec (open-source). Hit three integration issues: PostgreSQL namespace, client IP preservation, log collection. Documenting the fixes.</description>
    </item>
    <item>
      <title>Reducing Docker Image Sizes by 70%</title>
      <link>https://sejoonkim.me/articles/docker-image-optimization</link>
      <guid>https://sejoonkim.me/articles/docker-image-optimization</guid>
      <pubDate>Wed, 24 Dec 2025 00:00:00 GMT</pubDate>
      <description>Our Docker images were 800MB+. Builds were slow and image pulls took forever. Spent a day optimizing and got images down to 200MB using multi-stage builds and Alpine base images.</description>
    </item>
    <item>
      <title>Adding Nodes to Kubernetes Cluster When Traffic Grew</title>
      <link>https://sejoonkim.me/articles/adding-kubernetes-nodes</link>
      <guid>https://sejoonkim.me/articles/adding-kubernetes-nodes</guid>
      <pubDate>Wed, 17 Dec 2025 00:00:00 GMT</pubDate>
      <description>Traffic increased 40% over 3 months. Nodes were running at 75% CPU. Ordered 2 new Hetzner servers and added them to the cluster. Took about 4 hours from ordering to nodes serving traffic.</description>
    </item>
    <item>
      <title>DNS Lookups Were Timing Out Randomly in Kubernetes</title>
      <link>https://sejoonkim.me/articles/coredns-timeout-investigation</link>
      <guid>https://sejoonkim.me/articles/coredns-timeout-investigation</guid>
      <pubDate>Wed, 10 Dec 2025 00:00:00 GMT</pubDate>
      <description>Applications occasionally failed DNS lookups with 5-second timeouts. Checked CoreDNS logs, CPU usage, and the network; everything looked fine. Turned out to be conntrack table exhaustion on worker nodes.</description>
    </item>
    <item>
      <title>Automating SSH Key Rotation on Hetzner Servers</title>
      <link>https://sejoonkim.me/articles/ssh-key-rotation-automation</link>
      <guid>https://sejoonkim.me/articles/ssh-key-rotation-automation</guid>
      <pubDate>Wed, 03 Dec 2025 00:00:00 GMT</pubDate>
      <description>Security audit said our SSH keys hadn&apos;t been rotated in 18 months. Wrote a script to rotate keys across all Hetzner servers and update them in Azure Key Vault. Took 3 hours to build, runs in 5 minutes.</description>
    </item>
    <item>
      <title>Enforcing Pod Security Standards Broke Half Our Deployments</title>
      <link>https://sejoonkim.me/articles/pod-security-standards-enforcement</link>
      <guid>https://sejoonkim.me/articles/pod-security-standards-enforcement</guid>
      <pubDate>Wed, 26 Nov 2025 00:00:00 GMT</pubDate>
      <description>Enabled Pod Security Standards in Kubernetes. Immediately broke 6 out of 12 applications because they were running as root or using privileged containers. Spent 2 days fixing them all.</description>
    </item>
    <item>
      <title>Moving Terraform State from Local Files to Azure Storage</title>
      <link>https://sejoonkim.me/articles/terraform-state-azure-backend</link>
      <guid>https://sejoonkim.me/articles/terraform-state-azure-backend</guid>
      <pubDate>Wed, 19 Nov 2025 00:00:00 GMT</pubDate>
      <description>We&apos;d been storing Terraform state in Git (bad idea). Moved it to Azure Blob Storage with state locking. Migration took 30 minutes. Should have done this from the start.</description>
    </item>
    <item>
      <title>Debugging Random 502 Errors from NGINX Ingress</title>
      <link>https://sejoonkim.me/articles/nginx-ingress-502-debugging</link>
      <guid>https://sejoonkim.me/articles/nginx-ingress-502-debugging</guid>
      <pubDate>Wed, 12 Nov 2025 00:00:00 GMT</pubDate>
      <description>Users reported occasional 502 errors. Logs showed NGINX couldn&apos;t reach backend pods. Took a day to find the issue: pod readiness probes were too aggressive and were marking healthy pods as not ready.</description>
    </item>
    <item>
      <title>Adding Trivy Scans to Our CI Pipeline</title>
      <link>https://sejoonkim.me/articles/trivy-container-scanning-ci</link>
      <guid>https://sejoonkim.me/articles/trivy-container-scanning-ci</guid>
      <pubDate>Wed, 05 Nov 2025 00:00:00 GMT</pubDate>
      <description>Integrated Trivy into GitLab CI to scan container images for vulnerabilities before deployment. Found 47 high-severity issues we didn&apos;t know about. Some were fixable, some weren&apos;t.</description>
    </item>
    <item>
      <title>Automating PostgreSQL Backups to Azure Blob Storage</title>
      <link>https://sejoonkim.me/articles/postgres-backup-to-azure-blob</link>
      <guid>https://sejoonkim.me/articles/postgres-backup-to-azure-blob</guid>
      <pubDate>Wed, 29 Oct 2025 00:00:00 GMT</pubDate>
      <description>Set up daily PostgreSQL backups from our Kubernetes cluster to Azure Blob Storage. Uses pg_dump in a CronJob, with lifecycle policies handling retention. Cost is about €8/month for 30 days of backups.</description>
    </item>
    <item>
      <title>external-secrets Wasn&apos;t Syncing from Azure Key Vault</title>
      <link>https://sejoonkim.me/articles/azure-key-vault-sync-delay</link>
      <guid>https://sejoonkim.me/articles/azure-key-vault-sync-delay</guid>
      <pubDate>Wed, 22 Oct 2025 00:00:00 GMT</pubDate>
      <description>Secrets in Azure Key Vault were updated but pods kept using old values. Took 2 hours to figure out the sync interval setting and force a refresh. Notes on how external-secrets actually works.</description>
    </item>
    <item>
      <title>Hit Let&apos;s Encrypt Rate Limit While Testing cert-manager</title>
      <link>https://sejoonkim.me/articles/lets-encrypt-rate-limit-hit</link>
      <guid>https://sejoonkim.me/articles/lets-encrypt-rate-limit-hit</guid>
      <pubDate>Wed, 15 Oct 2025 00:00:00 GMT</pubDate>
      <description>Made a mistake while testing cert-manager configuration. Issued 20 certificates for the same domain in an hour. Got rate limited for a week. Notes on the staging environment and rate limits.</description>
    </item>
    <item>
      <title>Downsizing Hetzner Servers We Don&apos;t Need</title>
      <link>https://sejoonkim.me/articles/hetzner-server-rightsizing</link>
      <guid>https://sejoonkim.me/articles/hetzner-server-rightsizing</guid>
      <pubDate>Wed, 08 Oct 2025 00:00:00 GMT</pubDate>
      <description>Looked at actual CPU and memory usage across our Kubernetes nodes. Found we were paying for servers we barely used. Saved €120/month by switching to smaller machines.</description>
    </item>
    <item>
      <title>Upgrading a Self-Managed Kubernetes Cluster Without Managed Services</title>
      <link>https://sejoonkim.me/articles/self-managed-k8s-upgrade</link>
      <guid>https://sejoonkim.me/articles/self-managed-k8s-upgrade</guid>
      <pubDate>Wed, 01 Oct 2025 00:00:00 GMT</pubDate>
      <description>Moving from Kubernetes 1.28 to 1.29 on bare metal Hetzner servers. There is no managed control plane where you click &apos;upgrade&apos;, so we had to do it manually. Notes on what actually happened.</description>
    </item>
    <item>
      <title>Hetzner Network Issues and Why We Keep Backups Elsewhere</title>
      <link>https://sejoonkim.me/articles/hetzner-network-incident</link>
      <guid>https://sejoonkim.me/articles/hetzner-network-incident</guid>
      <pubDate>Wed, 24 Sep 2025 00:00:00 GMT</pubDate>
      <description>Hetzner&apos;s network had problems in their Falkenstein datacenter. Our services stayed up because we split workloads across regions and keep critical data in Azure.</description>
    </item>
    <item>
      <title>Kubernetes StatefulSet: A Deep Dive</title>
      <link>https://sejoonkim.me/articles/kubernetes-statefulset-deep-dive</link>
      <guid>https://sejoonkim.me/articles/kubernetes-statefulset-deep-dive</guid>
      <pubDate>Wed, 17 Sep 2025 00:00:00 GMT</pubDate>
      <description>Understanding StatefulSet internals, ordered pod management, persistent storage, and real-world use cases for stateful applications in Kubernetes.</description>
    </item>
    <item>
      <title>Migration Diary Part 2: Moving Logs from Grafana Cloud to Kubernetes</title>
      <link>https://sejoonkim.me/articles/logging-migration-loki-alloy</link>
      <guid>https://sejoonkim.me/articles/logging-migration-loki-alloy</guid>
      <pubDate>Tue, 09 Sep 2025 00:00:00 GMT</pubDate>
      <description>Setting up Loki and Alloy for log aggregation in our Kubernetes cluster. Learning what all those Loki components actually do.</description>
    </item>
    <item>
      <title>Migration Diary Part 1: Moving Metrics from Grafana Cloud to Kubernetes</title>
      <link>https://sejoonkim.me/articles/prometheus-grafana-setup-kubernetes</link>
      <guid>https://sejoonkim.me/articles/prometheus-grafana-setup-kubernetes</guid>
      <pubDate>Sun, 07 Sep 2025 00:00:00 GMT</pubDate>
      <description>Moving our monitoring from Grafana Cloud to self-hosted Prometheus and Grafana on Kubernetes. Turns out most apps already had metrics support; we just needed to enable it.</description>
    </item>
    <item>
      <title>Creating a Least-Privilege Monitoring User in Zalando Postgres Operator</title>
      <link>https://sejoonkim.me/articles/creating-monitoring-user-zalando-postgres</link>
      <guid>https://sejoonkim.me/articles/creating-monitoring-user-zalando-postgres</guid>
      <pubDate>Fri, 05 Sep 2025 00:00:00 GMT</pubDate>
      <description>How I created a monitoring-only user with minimal permissions in a GitOps-managed Postgres cluster.</description>
    </item>
    <item>
      <title>Zero-Downtime Helm App Upgrade in Production</title>
      <link>https://sejoonkim.me/articles/zero-downtime-nginx-upgrade-gitops</link>
      <guid>https://sejoonkim.me/articles/zero-downtime-nginx-upgrade-gitops</guid>
      <pubDate>Wed, 03 Sep 2025 00:00:00 GMT</pubDate>
      <description>How to upgrade a Helm-managed application in production with zero downtime using GitOps and the Kubernetes RollingUpdate strategy.</description>
    </item>
    <item>
      <title>Why I&apos;m Obsessed with Uptime: The Real Cost of Downtime</title>
      <link>https://sejoonkim.me/articles/the-real-cost-of-downtime</link>
      <guid>https://sejoonkim.me/articles/the-real-cost-of-downtime</guid>
      <pubDate>Fri, 29 Aug 2025 00:00:00 GMT</pubDate>
      <description>My journey into understanding why every millisecond matters in DevOps, and what the research taught me about building reliable systems.</description>
    </item>
    <item>
      <title>Managing Secrets in Kubernetes with External Secrets Operator</title>
      <link>https://sejoonkim.me/articles/kubernetes-external-secrets</link>
      <guid>https://sejoonkim.me/articles/kubernetes-external-secrets</guid>
      <pubDate>Sun, 24 Aug 2025 00:00:00 GMT</pubDate>
      <description>A comprehensive guide to implementing External Secrets Operator for secure secret management in Kubernetes clusters.</description>
    </item>
  </channel>
</rss>