Incremental IPv6 with Kubernetes

October 13, 2024 · 12 min read

TL;DR

Due to looming IP address exhaustion, we've been migrating my company's Kubernetes workloads to IPv6. While IPv6 has its sharp edges, AWS EKS's new IPv6-only mode and better OSS ecosystem support have made it possible to adopt incrementally.

Here's a bunch of tricks I've picked up in the process.

At my work, we've been struggling a bit over the past few years with decisions made (almost 10 years ago now) about our AWS network design. While we have a full class A private network (16,777,216 IPv4 addresses), we've managed to paint ourselves into the very sad corner of looming IP address exhaustion.

There's a few reasons:

Our integration with cell network carriers (to support our home security systems) requires a huge chunk of our IP space.
We chose a multi-account architecture in AWS with a flat IP space across all of our accounts. This means our IP space is fragmented across accounts, regions, and availability zones, making a lot of that address space effectively unusable.

Even with all of this, we might have been fine... until we went big on Kubernetes.

Kubernetes eats IPs for breakfast

Kubernetes has been a huge win for us. But it gobbles up IP addresses like Pac-Man with a tapeworm.

It's fairly straightforward math: in AWS EKS, with the VPC CNI integration (i.e. a network plugin for Kubernetes that allows it to integrate with AWS's networking APIs), here's what happens to all your IPs:

The EKS control plane requires at least 16 addresses (at least 6 per subnet)
Every node (EC2 instance) requires at least one address, but depending on your CNI settings, the CNI plugin can eagerly allocate additional addresses to keep "warm" (to speed up pod creation)
Every pod on a node gets its own IP address. This includes not only user workloads, but also every daemonset pod. In our cluster, we have about 8-10 daemonset pods per node.

This means, as we've migrated workloads to Kubernetes, we've increased the number of IPs we're using by roughly 10x.
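To make that math concrete, here's a rough back-of-the-envelope sketch in Python. Every number in it (node count, pods per node, warm addresses) is a made-up placeholder rather than a figure from our clusters, and the function name is purely illustrative.

```python
def ipv4_addresses_needed(nodes, workload_pods_per_node,
                          daemonset_pods_per_node, warm_ips_per_node):
    """Rough count of VPC IPv4 addresses a cluster's nodes consume.

    Each node takes one address for itself, one per pod it runs
    (workload and daemonset alike), plus whatever the CNI keeps warm.
    """
    per_node = (1 + workload_pods_per_node
                + daemonset_pods_per_node + warm_ips_per_node)
    return nodes * per_node


# Hypothetical cluster: 100 nodes, 10 workload pods and 9 daemonset pods
# each, with 5 warm addresses held per node.
total = ipv4_addresses_needed(nodes=100, workload_pods_per_node=10,
                              daemonset_pods_per_node=9, warm_ips_per_node=5)
print(total)  # 2500 addresses, versus 100 if only the nodes needed IPs
```

The exact multiplier depends heavily on pod density and CNI settings, so treat this as a way to plug in your own numbers rather than a prediction.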

This adds up quickly. We've had a few close calls with IPv4 exhaustion during high traffic events where we had to scramble to temporarily kill non-critical workloads to free up IPs, rebalance across availability zones, or provision new subnets to make sure customers weren't affected.

Actually, IPv6 is a thing

Unlike IPv4, IPv6 address space is so incomprehensibly large that it's effectively unlimited. For example, a typical IPv6 private subnet would have a /64 IPv6 CIDR, which is 18,446,744,073,709,551,616 addresses.

fun fact

Apparently a number that big is read as roughly 18.4 quintillion. Numbers this large can only be discussed using your best Carl Sagan voice.
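The address counts quoted in this post can be double-checked with Python's standard ipaddress module. The 2001:db8::/64 prefix below is just the reserved documentation range standing in for a real subnet.

```python
import ipaddress

# 10.0.0.0/8 is the private "class A" IPv4 range; 2001:db8::/64 is a
# documentation prefix used here as a stand-in for a real subnet.
ipv4_private = ipaddress.ip_network("10.0.0.0/8")
ipv6_subnet = ipaddress.ip_network("2001:db8::/64")

print(f"{ipv4_private.num_addresses:,}")  # 16,777,216
print(f"{ipv6_subnet.num_addresses:,}")   # 18,446,744,073,709,551,616

# How many entire 10.0.0.0/8 networks fit inside a single /64
print(ipv6_subnet.num_addresses // ipv4_private.num_addresses)  # 1099511627776
```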

IPv6 has been a standard for like 25 years, but is still not widely adopted (for a lot of reasons, including backward-incompatibility, lack of ecosystem support, and ISPs squabbling and dragging their feet).

It's legitimately really difficult to migrate a large distributed architecture like ours to IPv6, because, historically, it would require simultaneous changes across many different systems, along with some scary big-bang moments. It also requires reconsidering a lot of assumptions built into your network design and security strategy.

It's been hard to figure out how to untangle that knot.

Enter EKS IPv6 mode

Given the scarcity (and price) of public IPv4 addresses, and to support the increasing scale of its customers, AWS has been under a lot of pressure to provide more viable paths to adopting IPv6. In one of the smartest moves I've seen from them in a while, they've used Kubernetes's built-in IPv6 support to build a new IPv6 mode for EKS.

Here's the core of the hack: While each node continues to get an IPv4 address, pods get only IPv6 addresses.

Inside the cluster, all traffic is via IPv6, but traffic to and from the cluster gets NATed through the nodes' IPv4 addresses. From the perspective of anything outside the cluster, connections appear to be coming from the nodes' IPv4 addresses. This means only the software inside the cluster has to be modified to use IPv6.

info

Note that if a host outside the cluster is IPv6-enabled, pods may just communicate directly with it over IPv6, and bypass the IPv4 NAT.

This translation of IP version between inside and outside the cluster has allowed us to migrate our workloads incrementally, which has made the whole process much more tractable.

Migrating only EKS workloads, alone, looks like it's going to allow us to reduce IPv4 address usage significantly, perhaps enough to solve our IPv4 exhaustion without any further network changes. Even if not, it should buy us years of additional runway before we hit that point.

What does migrating an EKS cluster to IPv6 require?

Unfortunately, you can't enable IPv6 mode on an existing EKS cluster; you have to create a new cluster and migrate your workloads over. There have been a bunch of specific challenges around this (mostly just minutiae around Terraform wrangling and executing DNS cutovers), but now that we've found most of the corner cases, the process is pretty mechanical.

The bulk of the remaining work is around making any code or configuration changes necessary in the individual workloads to get them to bind to IPv6 addresses.

A few basics

I've been a programmer for like 30 years, and I had never done anything with IPv6 before this migration. There were a few embarrassingly basic things I had to learn about IPv6:

The IPv6 "all interfaces" address is ::, which is equivalent to 0.0.0.0 in IPv4.
The IPv6 loopback address is ::1, equivalent to 127.0.0.1 in IPv4.
URLs that use an IPv6 address as the hostname need the address enclosed in square brackets, e.g. http://[2001:db8::1]:8080 so the colons in the address don't get confused with the port delimiter.
Happy Eyeballs is an algorithm (implemented by most network clients) that allows apps (including browsers) to efficiently decide whether to use an IPv6 or IPv4 address when both are advertised via DNS.
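If you'd like to poke at the first three of these yourself, here's a small sketch using only the Python standard library. The URL and its /health path are made up for illustration.

```python
import ipaddress
from urllib.parse import urlsplit

# "::" is the unspecified ("all interfaces") address, like 0.0.0.0
print(ipaddress.ip_address("::").is_unspecified)       # True
print(ipaddress.ip_address("0.0.0.0").is_unspecified)  # True

# "::1" is loopback, like 127.0.0.1
print(ipaddress.ip_address("::1").is_loopback)         # True
print(ipaddress.ip_address("127.0.0.1").is_loopback)   # True

# The brackets let the parser tell the address apart from the port
url = urlsplit("http://[2001:db8::1]:8080/health")
print(url.hostname)  # 2001:db8::1
print(url.port)      # 8080
```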

Your OS and language probably support IPv6

One cool thing is that almost all modern OSes (Linux, macOS, Windows) support "dual-stack": they can listen on a port on both IPv6 and IPv4 from a single socket.

On top of this, most high-level programming languages (and their standard libraries) utilize this feature, so if you bind to the :: (all interfaces) address, you'll be able to listen on both IPv4 and IPv6 at the same time.

For example, in node.js:

const http = require('http');

const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World!\n');
});

// binds to port 8080 on all IPv6 and IPv4 interfaces by default!
server.listen(8080);

Or you can do it explicitly:

server.listen(8080, '::');

Here's the same basic thing in Go:

package main

import (
    "net"
)

func main() {
    // Binds to all IPv6 and IPv4 interfaces.
    // Note the square brackets around the address: they keep the colons
    // in the IPv6 address from being confused with the port delimiter.
    listener, err := net.Listen("tcp", "[::]:8080")
    if err != nil {
        panic(err)
    }
    defer listener.Close()

    // ...
}

The same thing is true in .NET, Python, Rust, Java and probably most other languages that aren't doing something weird in their networking implementation.

Of course, most languages also have lower level networking APIs that are IP version specific. If you're doing more complicated things with sockets, you may have a little more work to do.
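As an illustration of what the lower-level version can look like, here's a minimal Python sketch that opens a dual-stack listener explicitly and falls back to IPv4 when IPv6 isn't usable. The open_listener name is my own invention; socket.has_dualstack_ipv6() and socket.create_server() are standard library calls (Python 3.8+).

```python
import socket


def open_listener(port=0):
    """Listen on all interfaces, dual-stack when the platform allows it.

    Falls back to an IPv4-only socket when IPv6 is unavailable.
    Passing port=0 asks the OS for any free port.
    """
    if socket.has_dualstack_ipv6():
        try:
            # One AF_INET6 socket with IPV6_V6ONLY turned off, so IPv4
            # clients are accepted too (as IPv4-mapped IPv6 addresses).
            return socket.create_server(
                ("::", port), family=socket.AF_INET6, dualstack_ipv6=True
            )
        except OSError:
            pass  # IPv6 is compiled in but not usable on this host
    return socket.create_server(("0.0.0.0", port))


listener = open_listener()
family = "dual-stack IPv6" if listener.family == socket.AF_INET6 else "IPv4 only"
print(f"listening on port {listener.getsockname()[1]} ({family})")
listener.close()
```

The fallback is a design choice for the sketch. In a cluster where pods only get IPv6 addresses, failing loudly may be the better behavior.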

Unfortunately, not all apps use dual-stack by default

Even though IPv6 support is readily available in most OSes and languages, it's not always enabled by default in every application. This was particularly annoying for us, because we use a lot of OSS and third-party container images as mock dependencies for integration tests, and supporting IPv6 meant we had to add explicit configuration in a lot of places where we previously just used the defaults.

In most cases, the trick is finding the magic CLI arg, environment variable, or config file setting that controls the host to bind to, and setting it to ::.

warning

Some software (MongoDB, Redis) goes out of its way to make :: not be a dual-stack binding. In those cases, you have to configure both the IPv6 and IPv4 listeners separately.
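When hunting for the right setting, it helps to have a quick way to check which IP versions a service is actually accepting connections on. Here's a rough sketch of a probe you could run from inside the container. The probe function is hypothetical, and it only tests the loopback addresses, so a service bound to 127.0.0.1 alone would still show up as reachable over IPv4.

```python
import socket


def probe(port, timeout=1.0):
    """Report whether something accepts TCP connections on each loopback."""
    results = {}
    for label, host in (("IPv4", "127.0.0.1"), ("IPv6", "::1")):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[label] = True
        except OSError:
            # Refused, timed out, or this address family isn't available
            results[label] = False
    return results


# 6379 is the Redis port; swap in whichever service you're checking
print(probe(6379))
```

A service with a working dual-stack (or two-listener) setup should report True for both entries.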

IPv6 cheat sheet

Here's a bunch of examples of various apps I've had to learn how to get working with IPv6:

aws-load-balancer-controller

You don't need to configure the aws-load-balancer-controller itself any differently for IPv6, but when creating Ingresses that use it, they need to have the following annotations to support IPv6:

# Tells the controller to create target groups of pod IP(v6) addresses. 
# The "instance" target type won't work on IPv6.
alb.ingress.kubernetes.io/target-type: ip
# Tells the controller to create a load balancer with IPv6 enabled
alb.ingress.kubernetes.io/ip-address-type: dualstack

The nice thing about this is that the load balancer itself will listen on IPv4 addresses (in addition to IPv6 addresses), which means IPv4 clients won't even know the app has been migrated.

warning

If you're using external-dns to create Route53 entries for your load balancer Ingresses, keep in mind that it will create both A records (for the load balancer's IPv4 addresses) and AAAA records (for its IPv6 addresses). This will change the behavior of any IPv6-enabled clients making connections to that load balancer, such that they may prefer the load balancer's IPv6 addresses over its IPv4 addresses.

This may be fine, but it is one way in which the "only IPv6 inside the cluster" model leaks. For example, if you have security groups on the load balancer, you'll need to make sure you're adding IPv6 versions of any rules.
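To see what a dual-stack client would be offered for a given hostname, you can group the resolver's answers by IP version. This is a minimal sketch: addresses_by_family is a name I made up, and "localhost" is only a stand-in for your load balancer's DNS name.

```python
import socket
from collections import defaultdict


def addresses_by_family(hostname, port=443):
    """Group the addresses a client would get for hostname by IP version."""
    families = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    grouped = defaultdict(set)
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        hostname, port, type=socket.SOCK_STREAM
    ):
        if family in families:
            grouped[families[family]].add(sockaddr[0])
    return {name: sorted(addrs) for name, addrs in grouped.items()}


# "localhost" is a stand-in; use your load balancer's DNS name instead
print(addresses_by_family("localhost"))
```

If both an "IPv4" and an "IPv6" entry come back, IPv6-enabled clients may pick the IPv6 addresses, so any security group rules need to cover both.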

ingress-nginx

In the helm values for ingress-nginx, you need to set the ipFamilies value to include IPv6:

controller:
  service:
    ipFamilies:
      - IPv6

MongoDB

Mongo binds to IPv4 only by default. You can get it listening to IPv6/IPv4 (dual-stack) interfaces with the following command override:

mongod --ipv6 --bind_ip ::,0.0.0.0

Here's an example of a Kubernetes pod:

apiVersion: v1
kind: Pod
metadata:
  name: mongo
spec:
  containers:
  - name: mongo
    image: mongo:8
    command:
    - mongod 
    - --ipv6
    - --bind_ip 
    - "::,0.0.0.0"
    ports:
    - containerPort: 27017
      protocol: TCP

More info: https://www.mongodb.com/docs/manual/core/security-mongodb-configuration/

Redis

Redis binds to IPv4 only by default. You can change it to bind to all interfaces with the following command override:

redis-server --bind "0.0.0.0 ::"

Here's an example of a Kubernetes pod:

apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  containers:
  - name: redis
    image: redis:5
    command:
    - redis-server 
    - --bind
    - "0.0.0.0 ::"
    ports:
    - containerPort: 6379
      protocol: TCP

MariaDB

MariaDB 5.5+ already listens on :: by default, so no additional configuration is needed.

LocalStack

LocalStack currently doesn't support IPv6. However, I've opened a PR to add IPv6 support. If that PR gets merged, then you'll be able to use an IPv6 address in the GATEWAY_LISTEN env variable:

GATEWAY_LISTEN=[::]:4566

RabbitMQ

RabbitMQ listens on :: by default, so no additional configuration is needed.

warning

Note that while the rabbitmq:management image binds automatically to the main amqp port (5672) on IPv6, the management API (port 15672) does NOT bind to IPv6.

nginx

The nginx image's default config listens on both IPv4 and IPv6.

If you're authoring your own nginx.conf, you need to add listeners for IPv6 and IPv4 separately. Here's an example of binding port 3001 on both IPv6 and IPv4:

nginx.conf
listen       3001; # IPv4
listen  [::]:3001; # IPv6

OTEL collector

The OpenTelemetry Collector config accepts the :: (all interfaces) address any place you could specify an IP address. For example:

config:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "[::]:4317"

Other OTEL collector components will automatically use IPv6. For example, the Prometheus receiver correctly uses IPv6 pod addresses when scraping metrics.
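Bracketed host:port strings like the endpoint in the config above show up all over the place once you start using IPv6. Go's standard library has net.JoinHostPort for building them. If you're generating config from Python, a small helper along these lines does a similar job (join_host_port is my own name for it, not a library function):

```python
import ipaddress


def join_host_port(host, port):
    """Format host and port as "host:port", bracketing IPv6 literals."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_ipv6 = False  # a hostname or something else, leave it alone
    return f"[{host}]:{port}" if is_ipv6 else f"{host}:{port}"


print(join_host_port("::", 4317))         # [::]:4317
print(join_host_port("0.0.0.0", 4317))    # 0.0.0.0:4317
print(join_host_port("localhost", 4317))  # localhost:4317
```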

Jaeger

Jaeger already listens on :: by default, so no additional configuration is needed.

WireMock

WireMock already listens on :: by default, so no additional configuration is needed.

Gradio

Gradio binds to 127.0.0.1 by default. You can use the server_name property to set up an IPv6 binding in the launch() method in the Blocks object:

blocks.launch(inline=False, server_port=5112, share=False, server_name="[::]")

Uvicorn

Uvicorn will bind to all IPv4/6 interfaces if you set host='::' in the Config object:

ip_config = Config(app=_fastapi_server, host="::", port=8080)
return Server(ip_config)

More info

More IPv6 cheat sheet examples, please!

I'll be adding more IPv6/dual-stack configuration examples as I encounter them.

Do you have more? Leave them in the comments and I'll add them to the list!
