Production Ready Certificates on AKS with Cert Manager and Azure DNS

Too many times systems have gone down because of DNS misconfigurations or issues with certificates. With AKS, ExternalDNS and Cert-Manager it should never be DNS again; these are problems of the past.

It's always DNS. A sentence heard way too often. A well-deserved second place goes to the phrase "A certificate expired". Too many times systems go down because of DNS misconfigurations, issues with certificates or simply human error related to either subject. Running Azure Kubernetes Service? It should never be DNS again; these are problems of the past.

DNS and certificates are probably the two topics that have haunted me (and likely anyone in IT Operations) at least once during my career. And "once" is at the very low end of that spectrum. The thing is, with platforms such as Azure Kubernetes Service and everything that Azure and the Cloud Native ecosystem have to offer, the problem should never be DNS.. Or certificates..

Why? Let's say we're running our solution in Azure Kubernetes Service. That means.. Azure. And that means Azure DNS. What if we can connect the two? Well, it turns out that we can. Over the course of this post we will configure Azure Kubernetes Service to interact with Azure DNS and Let's Encrypt so that we achieve the following: deploy an internet-facing solution, automatically create its DNS record, and request and install a certificate (validated through an ACME challenge).

And that is easier than you would think. Let's take a look.

Technologies used

We are using several technologies for our setup:

  • Cert Manager to request, store and manage all parts of the certificate lifecycle
  • ExternalDNS to create, update and delete DNS records based on our Ingress configuration
  • Let's Encrypt to provide us with a certificate

Prerequisites

We are deploying a production ready configuration, so some prerequisites are required. Perhaps you stumbled upon this post with something already up and running and are looking to improve your existing Azure Kubernetes Service configuration to unicorn level, epic proportions. If not, these are the prerequisites you need in place before you can follow along.

  • Azure Kubernetes Service Cluster (Quick Start)
  • NGINX Ingress controller is deployed
  • Azure DNS and a Domain (Quick Start)
  • Solution/example deployed on AKS (Example)

Before we get started

Before we get started I would like to elaborate a bit on the usage of identities. Over the course of this post, we are using Workload Identities on Azure Kubernetes Service. Often documentation (including my own) goes through the process of setting these up relatively quickly, and there isn't always a full understanding of how these identities work and how everything ties together. It's not within the scope of this blog post to go into depth, but I find a basic explanation usually helps when troubleshooting issues related to permissions and identities. Why? Because we are tying a lot of things together here.

We will create managed identities in Azure for both ExternalDNS and Cert Manager. We are deploying separate identities because it is a best practice to keep them apart.

Each Managed Identity is then given permissions (RBAC) on resources in the Azure subscription. In our case: Azure DNS. This allows the Managed Identity to perform actions at the Azure level.

Then we create the federated identity credential, which establishes trust between a Kubernetes service account and the Managed Identity in Azure. It is what allows a workload running under that service account to authenticate as the Managed Identity.

For each workload we deploy (Cert Manager and ExternalDNS) we tell the workload to use the service account that is linked this way.
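
To make the chain a bit more concrete, here is a minimal sketch of what the Kubernetes side of the link looks like. The names "my-workload" and "my-namespace", the image and the client ID are placeholders I made up for illustration; the "azure.workload.identity/client-id" annotation and the "azure.workload.identity/use" label are the ones that also appear later in this post.

```yaml
# Sketch only: names, image and client ID are placeholders, not values from this post.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-workload          # must match the subject of the federated credential:
  namespace: my-namespace    # system:serviceaccount:my-namespace:my-workload
  annotations:
    azure.workload.identity/client-id: "<Client ID of the Managed Identity>"
---
apiVersion: v1
kind: Pod
metadata:
  name: my-workload
  namespace: my-namespace
  labels:
    azure.workload.identity/use: "true"   # opts the pod in to workload identity
spec:
  serviceAccountName: my-workload         # the workload runs under the service account above
  containers:
    - name: app
      image: <your image>
```

If the namespace or service account name differs from the subject on the federated credential, the workload will not be able to authenticate as the Managed Identity.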

If something is wrong, we need to look at any of these steps in the process:

  • The Managed Identity and its permissions
  • The Service Account in Kubernetes (is it pointing to the right Managed Identity, is it in the correct namespace?)
  • Is the workload you are deploying leveraging the correct service account?

I hope this helps when debugging configurations like the one described in this post.

Setting things up

We need to set up both Cert Manager and ExternalDNS. Because we will request a certificate from Let's Encrypt and validate it through a DNS-based ACME challenge, we want DNS in place first, so we will configure ExternalDNS before Cert Manager.

Installing and configuring ExternalDNS
Setting up ExternalDNS is pretty straightforward but we need to keep a couple of things in mind:

  • We need to interact with Azure DNS
  • That means ExternalDNS needs access to Azure DNS
  • That means an identity needs to be created

In my example I will be using Workload Identity, which in my opinion should be the standard go-to configuration for interacting with Azure resources from within AKS. If you have not enabled workload identity for your cluster you can do so by running:

$RG_AKS="<Resource Group that contains your AKS Cluster resource>"
$ClusterName="<Name of your cluster>"
az aks update --resource-group $RG_AKS --name $ClusterName --enable-oidc-issuer --enable-workload-identity

Next we will create the managed identity and give it permission to manage records in our Azure DNS zone. In my case the zone is called "cloudadventures.org" and lives in the Resource Group "rg-dns".

$RG_AKS="rg-aks-certmanager"
$ClusterName="aks-certmanager"
$DNSZone="cloudadventures.org"
$RG_DNS="rg-dns"

kubectl create ns external-dns

az aks update --resource-group $RG_AKS --name $ClusterName --enable-oidc-issuer --enable-workload-identity

$MINameExternalDNS="mi-externaldns"
az identity create --resource-group $RG_AKS --name $MINameExternalDNS

$MI_ExternalDNS_CLIENT_ID=az identity show --resource-group $RG_AKS --name $MINameExternalDNS --query "clientId" --output tsv
$DNSZone_ID=az network dns zone show -n $DNSZone -g $RG_DNS --query "id" -o tsv
$DNSZone_RG_ID=az group show -g $RG_DNS --query "id" -o tsv

az role assignment create --role "Reader" --assignee $MI_ExternalDNS_CLIENT_ID --scope $DNSZone_RG_ID

az role assignment create --role "DNS Zone Contributor" --assignee $MI_ExternalDNS_CLIENT_ID --scope $DNSZone_ID

We can then visually verify that the permissions have been set by checking the IAM (Access control) configuration of our Azure DNS Zone.

Now we need to set up our federated identity so that we can connect our managed identity to a Kubernetes Service account.

$OIDC_ISSUER_URL=az aks show -n $ClusterName -g $RG_AKS --query "oidcIssuerProfile.issuerUrl" -otsv

az identity federated-credential create --name $MINameExternalDNS --identity-name $MINameExternalDNS --resource-group $RG_AKS --issuer "$OIDC_ISSUER_URL" --subject "system:serviceaccount:external-dns:external-dns"

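# The service account must exist before it can be annotated; the manifest applied later in this post declares the same service account.
kubectl create serviceaccount external-dns --namespace "external-dns"

kubectl patch serviceaccount external-dns --namespace "external-dns" --patch "{""metadata"": {""annotations"": {""azure.workload.identity/client-id"": ""$MI_ExternalDNS_CLIENT_ID""}}}"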

With the identity created we can continue and deploy ExternalDNS to our cluster. For that we need to create a secret so that ExternalDNS is aware of the identity and has the information it needs to authenticate to Azure.

$subscriptionId = (az account show --query id -o tsv)
$tenantId = (az account show --query tenantId -o tsv)
$azureJsonPath = ".\azure.json"

$jsonContent = @"
{
  "tenantId": "$tenantId",
  "subscriptionId": "$subscriptionId",
  "resourceGroup": "$RG_DNS",
  "useWorkloadIdentityExtension": true,
  "userAssignedIdentityID": "$MI_ExternalDNS_CLIENT_ID"
}
"@

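# Write plain ASCII: Windows PowerShell 5.1 otherwise defaults to UTF-16, which ExternalDNS may fail to parse as JSON
$jsonContent | Out-File -FilePath $azureJsonPath -Encoding ascii -Force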

kubectl create secret generic azure-config-file --namespace "external-dns" --from-file .\azure.json

Time to deploy ExternalDNS. We can do so by applying the yaml below using kubectl, in the external-dns namespace we created earlier (for example with "kubectl apply -n external-dns -f <your file>.yaml"). In my example I am filtering for the domain cloudadventures.org specifically. This ensures ExternalDNS only triggers on Ingress configurations that use a host under cloudadventures.org. You can change this by modifying the argument "--domain-filter=cloudadventures.org".

apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns
rules:
  - apiGroups: [""]
    resources: ["services","endpoints","pods", "nodes"]
    verbs: ["get","watch","list"]
  - apiGroups: ["extensions","networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get","watch","list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
  - kind: ServiceAccount
    name: external-dns
    namespace: external-dns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: external-dns
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.15.0
          args:
            - --source=service
            - --source=ingress
            - --domain-filter=cloudadventures.org 
            - --provider=azure
            - --txt-prefix=externaldns-
          volumeMounts:
            - name: azure-config-file
              mountPath: /etc/kubernetes
              readOnly: true
      volumes:
        - name: azure-config-file
          secret:
            secretName: azure-config-file

And that is ExternalDNS set up. From now until the end of time, if there is a problem: It should never be DNS.
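
Because the deployment above also passes "--source=service", ExternalDNS does not only look at Ingress resources. As a rough sketch, a Service of type LoadBalancer can ask for a record through the ExternalDNS hostname annotation. The service name, selector and hostname below are made up for illustration and are not used anywhere else in this post.

```yaml
# Sketch only: "demo-lb", "app: demo" and the hostname are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: demo-lb
  annotations:
    # ExternalDNS reads this annotation and creates a record for the hostname,
    # as long as it matches the --domain-filter configured above.
    external-dns.alpha.kubernetes.io/hostname: demo.cloudadventures.org
spec:
  type: LoadBalancer
  selector:
    app: demo
  ports:
    - port: 80
      targetPort: 80
```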

We also wanted certificates, let's set up Cert Manager!

Cert Manager setup and configuration

The process for installing and configuring Cert Manager is similar to that of ExternalDNS. We will reuse some bits of code, but I have chosen the path of redundancy in case you decide to install only one or the other (Cert Manager or ExternalDNS).

Let's start with deploying Cert Manager. For that we need a couple of things:

  • Helm
  • values.yaml

Make sure you have the jetstack repo added to your Helm repositories and always update your repositories before installing something new 😄

helm repo add jetstack https://charts.jetstack.io
helm repo update

Now create a values.yaml file and save the file with the following contents:

podLabels:
  azure.workload.identity/use: "true"
serviceAccount:
  labels:
    azure.workload.identity/use: "true"

This will tell Cert Manager that we will be working with workload identity for authentication with Azure. Specifically with Azure DNS.

We can now install Cert Manager with the following command:

helm upgrade --install cert-manager jetstack/cert-manager `
  --namespace cert-manager `
  --create-namespace `
  --version v1.15.3 `
  --set crds.enabled=true `
  --values values.yaml

Give it a couple of seconds and you should be greeted with a note stating "cert-manager v1.15.3 has been deployed successfully!"

Now it's time to create the identity for Cert Manager, similar to our ExternalDNS Configuration.

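# Name for the Cert Manager managed identity (an assumed name, following the "mi-externaldns" pattern; pick your own)
$MINameCertManager="mi-certmanager"
az identity create --name $MINameCertManager --resource-group $RG_AKS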
$MI_CertManager_CLIENT_ID=az identity show --name $MINameCertManager --resource-group $RG_AKS --query 'clientId' -o tsv

$DNSZone="cloudadventures.org"
$RG_DNS="rg-dns"

$DNSZone_ID=az network dns zone show -n $DNSZone -g $RG_DNS --query "id" -o tsv
$DNSZone_RG_ID=az group show -g $RG_DNS --query "id" -o tsv

az role assignment create --role "Reader" --assignee $MI_CertManager_CLIENT_ID --scope $DNSZone_RG_ID
az role assignment create --role "DNS Zone Contributor" --assignee $MI_CertManager_CLIENT_ID --scope $DNSZone_ID

Next we create the federated identity that Cert Manager can then leverage from within AKS.

$OIDC_ISSUER_URL=az aks show -n $ClusterName -g $RG_AKS --query "oidcIssuerProfile.issuerUrl" -otsv
az identity federated-credential create --name "cert-manager" --identity-name "$MINameCertManager" --resource-group $RG_AKS --issuer "$OIDC_ISSUER_URL" --subject "system:serviceaccount:cert-manager:cert-manager"

We are almost there. We just created the federated credential that links the cert-manager service account in Kubernetes (created by the Helm chart) to the managed identity that holds the permissions on Azure DNS. All that is left to do now is to create a ClusterIssuer so that Cert Manager uses Let's Encrypt and leverages that identity to configure Azure DNS.

We do so by applying the following configuration with "kubectl apply -f .\ci-production.yaml".

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <Your e-mail address>
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
    - dns01:
        azureDNS:
          resourceGroupName: <Azure DNS Resource Group>
          subscriptionID: <Your Subscription ID>
          hostedZoneName: <Domain Name of your Azure DNS Zone>
          environment: AzurePublicCloud
          managedIdentity:
            clientID: <Client ID that comes from the variable $MI_CertManager_CLIENT_ID>
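
While you are still testing, it can be worth pointing a second ClusterIssuer at the Let's Encrypt staging endpoint, because the production endpoint enforces rate limits. The sketch below is the same configuration with only the server URL and the names changed; "letsencrypt-staging" is a name I picked for illustration. Certificates issued by staging are not trusted by browsers, so switch back to the production issuer once everything works.

```yaml
# Sketch only: identical to the production issuer apart from the name and the server URL.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: <Your e-mail address>
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
    - dns01:
        azureDNS:
          resourceGroupName: <Azure DNS Resource Group>
          subscriptionID: <Your Subscription ID>
          hostedZoneName: <Domain Name of your Azure DNS Zone>
          environment: AzurePublicCloud
          managedIdentity:
            clientID: <Client ID that comes from the variable $MI_CertManager_CLIENT_ID>
```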

Ingress Configuration and results

It is time to configure the ingress resource. If you have deployed the example provided in the prerequisites part of this post, you should be able to copy and paste the following yaml.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress-monthly
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-production
spec:
  ingressClassName: nginx
  rules:
  - host: unicorns.cloudadventures.org
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-svc
            port:
              number: 80
  tls:
    - hosts:
      - unicorns.cloudadventures.org
      secretName: unicorns-cloudadventures-tls

Apply the configuration with "kubectl apply -f .\ingress.yaml". This example should be similar to what you normally deploy, with the addition of the annotation "cert-manager.io/cluster-issuer: letsencrypt-production". The value of the annotation has to match the name of the ClusterIssuer, as it tells Cert Manager which ClusterIssuer to use. In our case that is the letsencrypt-production ClusterIssuer that contains our Azure DNS configuration.

Additionally, we have told ExternalDNS to look for any ingress configurations and specifically filter on the cloudadventures.org domain earlier in this post. Let's take a look at Azure DNS and see if our configuration works.

We can see that Cert Manager created TXT entries for our "unicorns" subdomain and we can also see that ExternalDNS created an A record for unicorns.cloudadventures.org.

It will take a couple of minutes, as the ACME challenge needs to complete, but we should then be able to see a secret with the name that we refer to in our ingress resource (unicorns-cloudadventures-tls).
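
While waiting, it helps to know what Cert Manager is working on. For an annotated ingress, Cert Manager creates a Certificate resource on our behalf, named after the secret. For the ingress in this post it should look roughly like the sketch below. This is my approximation of the generated resource, not output copied from a cluster; running "kubectl describe certificate unicorns-cloudadventures-tls" shows the real one including its status.

```yaml
# Sketch only: an approximation of the Certificate that Cert Manager generates for the ingress.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: unicorns-cloudadventures-tls
spec:
  secretName: unicorns-cloudadventures-tls   # the secretName from the tls section of the ingress
  dnsNames:
    - unicorns.cloudadventures.org           # the hosts from the tls section of the ingress
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer
    name: letsencrypt-production             # the ClusterIssuer named in the ingress annotation
```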

💡
You may be confronted with a secret of type "Opaque". This is most likely because you are too quick and the certificate request is still being processed. If this takes more than a couple of minutes, something else is up and inspecting the logs of Cert Manager will point you in the right direction.

Inspecting the secrets in the namespace we can see that a secret has been created.

And for the result:

Success, we have deployed our ingress resource and "automagically" Azure DNS was configured and a certificate was requested, issued and installed.

It no longer has to be DNS.. Or certificates for that matter.

Videos and Cloud Native Weekly

During Cloud Native Weekly we recorded a "10 minutes of tech" segment (part 1 and 2) showing a demo of the results. For those interested:

Cloud Native Weekly Cert Manager Part 1: https://www.youtube.com/live/aVUx2Rf00S4?si=MnS8-q2_Vh4-yNkj

Cloud Native Weekly Cert Manager Part 2: https://www.youtube.com/live/D4lhNEwA4Sk?si=Rlw-DHxBYO23ay2q