Using Docker Compose to deploy a Next.js app to a Linux App Service in Azure

When I was creating my sample app for deploying a Next.js app to Azure App Services I had instinctively reached for a Windows App Service because that was what I was familiar with, but I had a feeling that Linux App Service would be better suited for the project. I thought it would be better in terms of meeting the expectations of others if they were coming from other hosting providers where Linux is the default (or only) choice - so I wanted to try Linux App Service out, but I had no practical experience with it myself and I couldn't really find a lot of real world examples of people using it. Ultimately I decided to just stick with Windows as I knew that would work, but I wanted to come back and try Linux App Service at some point.

I recently found myself with the time and headspace to do just that, and I can say that I am really happy with Linux App Service (or more specifically Web App for Containers). There were some challenges along the way (aren't there always!), but I have switched my sample app over to using it and have no plans to go back to a non-container App Service at this point.

This article describes my goals and approach in switching over from Windows to Linux App Service, and concludes with why I don't plan on switching back.

Although this article is written with the example of deploying a Next.js app to a Linux App Service it could serve as an example for deploying any app to Linux App Service using Docker Compose.

Goals and research

First up was sketching out some goals for what I wanted to achieve:

Using my sample Next.js app as a test, switch out Windows App Service for Linux App Service in the hosting infrastructure, but retain the CDN, App Insights and deployment slot environment setup
Retain certain functions of IIS on Windows App Service for doing things such as gzip compression, and rewrites and redirects
Continue to use Azure Pipelines with Azure CLI and Bicep to automate provisioning of hosting infrastructure and automated deployment of the application to multiple environments

Next was doing some research for how I could achieve my goals:

Searching around the internet led me to the idea of using NGINX to handle the things mentioned above that IIS was handling and have that also act as a reverse proxy through to the Next application server
Next had documented a sample for creating a Docker image that looked like a useful starting point
I found a really great article by Steve Holgado that covers setting up a production-ready Next.js environment using NGINX and PM2 via Docker Compose and was a fantastic reference that gave me a big leg up when getting started (I don't know you Steve, but thank you!)
Microsoft's documentation on configuring multi-container apps proved that support for Docker Compose was available in Linux App Services and pointed out the current limitations (which would not be a blocker for me)
I found a couple of NGINX articles, on avoiding configuration mistakes and on tuning NGINX performance, that would help with creating a (hopefully!) solid NGINX server config, as this was the first time I had used NGINX
Docker's getting started with Docker Compose docs would be useful because I was still pretty green with Docker!

Based on my research it seemed that there were two broad steps for implementing the changes required to meet my goals:

Get the sample app running via Docker Compose locally
Get the sample app running via Docker Compose in a Linux App Service

Running the app via Docker Compose locally

Getting the app running via Docker Compose locally seemed like a logical first step because if it works locally then it should (in reality there were some challenges in Azure, which I will come to!) work anywhere.

Steve Holgado's article was a huge help here, but I ended up making some changes to meet my goals. The following sections describe how I ended up getting my Next.js app running via Docker Compose locally.

Dockerising Next.js

When Dockerising (I'm unsure on that "word" myself, but I've seen it used a few times so I'm gonna use it!) Next.js I used the Dockerfile from Next's own sample as the basis for my Dockerfile, but made a few changes:

Switched from yarn commands to npm commands
- I'm using npm in my sample repo because it's the lowest common denominator when it comes to Node package managers!
Used npm to install pm2, and use pm2-runtime as a process manager for our Next.js server
- I found conflicting guidance on whether pm2 is useful inside a Docker container (so weird not to find consensus on the internet, right?!) - Next don't use it (or anything similar) in their sample, but it was recommended in Steve Holgado's article, and I know Microsoft use it in a lot of their sample apps so I ended up adding it as it sounded like a "good to have" on balance
Created a script that would generate a JSON file containing the full Next config, which could be consumed by a custom server script
- I'm using a custom server script because I want to add server-side monitoring via Application Insights, but I will explain why I had to generate the JSON file in the next section!
Copy the entire node_modules folder rather than just the "standalone" node_modules
- This is regrettable as it inflates the size of the image when we want to keep it as small as possible, but I will also explain why I am doing this in the next section!

I also had to ensure that my .dockerignore file was configured appropriately so that I wouldn't end up with any unwanted files copied over to the Docker image during the build. This is the Dockerfile I ended up with for the Next app:

# Install dependencies only when needed
FROM node:16-alpine AS deps

# Check https://github.com/nodejs/docker-node/tree/b4117f9333da4138b03a546ec926ef50a31506c3#nodealpine to understand why libc6-compat might be needed.
# RUN apk add --no-cache libc6-compat

WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install --omit=dev

# Rebuild the source code only when needed
FROM node:16-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .

# Next.js collects anonymous telemetry data about general usage, which we opt out from
# https://nextjs.org/telemetry
ENV NEXT_TELEMETRY_DISABLED=1

RUN npm run build

# Production image, copy all the files and run next
FROM node:16-alpine AS runner
WORKDIR /app

# Install PM2 to manage node processes
RUN npm install pm2 --location=global

RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001

# Disable telemetry during runtime
ENV NEXT_TELEMETRY_DISABLED=1

# Automatically leverage output traces to reduce image size
# https://nextjs.org/docs/advanced-features/output-file-tracing
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static

# You only need to copy next.config.js if you are NOT using the default configuration
COPY --from=builder /app/next.config.js ./
COPY --from=builder /app/public ./public
COPY --from=builder /app/package.json ./package.json

# Override the standalone server with custom server, which requires the generated config file
COPY --from=builder /app/next.config.json ./
COPY --from=builder /app/server.js ./

# TODO: Standalone output is not including packages used by custom server.js
COPY --from=builder /app/node_modules ./node_modules

USER nextjs

EXPOSE 3000

ENV PORT=3000

CMD ["pm2-runtime", "node", "--", "server.js"]

To keep the size of the resulting Docker image as small as possible node:16-alpine is being used as a base image and a feature of Next.js called Output Tracing is used to reduce the size of the output from next build by analyzing imports and creating a trimmed down node_modules folder that acts as a smaller "standalone" deployment. This is all really great!

Unfortunately, I hit a couple of challenges with the Output Tracing feature:

The standalone output generates a custom server script for you, but there is no way to customise that script - for example, to add the code I require to implement server-side monitoring via App Insights
I couldn't find a way to get Next.js to include my custom server script when analysing imports for Output Tracing - this meant that the server would not run because it had dependencies that were missing from the standalone node_modules output

So I had to create some workarounds...

Using a custom server with Next.js Output Tracing

As mentioned, Next.js generates a custom server script for you as part of the Output Tracing standalone build, but there is no way to customise it. I had a custom server script already that was being used when deployed to the Windows App Service that included the customisations I wanted to make for logging and monitoring requests to the server via Application Insights. I thought I could just copy that over and overwrite the standalone server, but after giving it a go my app wouldn't run and I would just get application errors complaining about missing config values despite having deployed next.config.js alongside my app.

After comparing the generated standalone server script to my custom server script I could see that the full Next config was being inlined into the generated standalone server script, and without doing the same in my custom server script the app failed to run. It seems like the standalone app requires the config to be provided to the server when creating a NextServer instance, but doesn't load the config from next.config.js as I thought it would.

The challenge was how to generate the full Next config to supply to my custom server, and after a little bit of digging through Next's source code I found a function called loadConfig I could use to generate a full config that included customisations I have made in next.config.js and then write that to a JSON file like so:

// File: ./server/generate-config.js

const path = require('path')
const fs = require('fs')
const loadConfig = require('next/dist/server/config').default
const nextConfig = require('../next.config')

loadConfig('phase-production-build', path.join(__dirname), nextConfig).then(
  (config) => {
    config.distDir = './.next'
    config.configOrigin = 'next.config.js'

    fs.writeFileSync('next.config.json', JSON.stringify(config))
  }
)

My custom server script can then read the contents of the JSON file containing the full config and use it when creating a NextServer instance, and I can use it with the standalone build.
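The reading side can be sketched roughly as below. This is a minimal illustration and not the code from the sample repo: the readGeneratedConfig helper name is hypothetical, and the commented usage assumes the custom server passes the config to NextServer as conf in the same way the generated standalone server appears to, which relies on Next.js internals rather than a documented API.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: reads the JSON file written by generate-config.js
// and returns the parsed Next config object. Defaults to a file that sits
// next to this script.
function readGeneratedConfig(
  configPath = path.join(__dirname, 'next.config.json')
) {
  if (!fs.existsSync(configPath)) {
    throw new Error(
      `Missing ${configPath} - run "npm run config:generate" before starting the server`
    )
  }

  return JSON.parse(fs.readFileSync(configPath, 'utf8'))
}

// Assumed usage inside a custom server.js (left as a comment because it
// needs Next.js installed and depends on an internal module path):
//
//   const NextServer = require('next/dist/server/next-server').default
//   const conf = readGeneratedConfig()
//   const nextServer = new NextServer({ dir: __dirname, dev: false, conf })
//   const handler = nextServer.getRequestHandler()
```

Failing fast with a clear message when the file is missing is a design choice of this sketch; it makes a forgotten config:generate step easier to diagnose than a vague "missing config values" error at request time.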

Finally, I modified my scripts in package.json so that the build script would generate the config before running next build:

{
  "scripts": {
    "config:generate": "node ./server/generate-config.js",
    "dev": "next dev",
    "build": "npm run config:generate && npm run build:next",
    "build:next": "next build",
    "lint": "next lint",
    "start": "node server.js"
  }
}

Including custom server imports in Next.js Output Tracing

My custom server script imports dependencies from the applicationinsights npm package so that I can log and monitor server performance, metrics, errors etc through Application Insights, but I could not get Output Tracing to include the custom server script in its analysis and therefore these dependencies were missing from the standalone build output.

The Output Tracing documentation includes some usage caveats that describe how you can include or exclude files from the trace using unstable_includeFiles and unstable_excludeFiles page config props, but despite getting unstable_includeFiles to work it did not affect the output of Output Tracing.

I mean to open an issue with Next.js to ask for guidance on how to include my custom server script in Output Tracing, but in the meantime I have resigned myself to copying the entire node_modules folder into my build. This is not an ideal solution because it destroys the main benefit of the standalone build, which is to slim down node_modules, but I figured it was a "good enough" compromise until I can find a better solution.

Dockerising NGINX

Dockerising NGINX is pretty straightforward because the official nginx:alpine image does most of the work for you - you just have to provide your config. The key features to configure in my case were:

The reverse proxy to the Next app container
gzip compression
Security headers
Rewrite rules required by the application (I have one that is used for cache busting CDN requests when there is a new build deployed)

Apart from the reverse proxy Next.js is capable of handling all of these things itself, but in my opinion the application should not be burdened with these functions unless there is a good reason for it, which is why I wanted a server such as NGINX (there are other options) in front of my Next application.

As mentioned earlier I found some good resources to guide me in this because I had never used NGINX before:

I started by following Steve Holgado's guide, but decided not to use the static asset caching because that's what the CDN is taking care of
I then followed the NGINX configuration and performance tuning guides and incorporated anything that seemed relevant
I then found the NGINXConfig tool from DigitalOcean and incorporated a lot of stuff found in the generated configuration provided by that tool

As I was "stitching it all together" I also had to reference the NGINX docs a lot to look up what some of the configuration was doing, but what I ended up with is pretty solid (in my opinion!). I don't want to switch the focus from Docker to NGINX, but if you want to take a look at the config files you can find them in my sample repo.

This is the Dockerfile I ended up with:

# Based on the official NGINX Alpine image
FROM nginx:alpine

# Remove any existing config files
RUN rm /etc/nginx/conf.d/*
RUN rm /etc/nginx/nginx.conf

# Copy config files
COPY ./includes /etc/nginx/includes
COPY ./conf /etc/nginx/conf.d
COPY ./nginx.conf /etc/nginx/nginx.conf

# Expose the listening port
EXPOSE 8080

# Launch NGINX
CMD ["nginx", "-g", "daemon off;"]

Next up was getting the NGINX and Next app containers building and running via Docker Compose.

Docker Compose

I again found myself using Steve Holgado's article as a starting point, but made some changes in the way that environment variables would be handled:

I have variables that need to be available in process when next build runs, which means they need to be available to the Docker build process
I have variables that need to be available at runtime in the Docker container

I had been using a single .env.local file to manage all environment variables used by my Next.js app in my development environment so the most convenient way of providing these to Docker seemed to be to use the env_file option.

Environment variables defined in env_file are not visible during the build, however, so instead the builder stage of my Next Dockerfile copies the env files so that they are available. They are not then copied to the final runner stage, as this is where env_file will be available.

I'm aware that copying the env files in this manner may cause security issues if your env files contain secrets. A commonly suggested way to pass values for use at build time is to use args, but that seems like a potential maintenance headache to me as args must be specified individually. It is also worth knowing that Docker's documentation warns that build args remain visible in the image history, and points to BuildKit build secrets (docker build --secret) for anything sensitive.

My thoughts on the implications of this are that the files are only made available during an intermediate build stage that would be present on my development machine or (eventually) in my private Azure Container Registry, so if unauthorised people have access to those then I frankly have bigger problems. Runtime environment variables are also dealt with differently when we are in App Service (which I will get to) so env_file is not needed in production. Depending on the sensitivity of the secrets in question, or if I was pushing to a public registry, I would switch to passing build time values explicitly (via args, or build secrets for anything sensitive).

I don't mean to brush security concerns under the carpet, but this seems like a pragmatic trade-off for this sample app, and I know I have other options should I need them.

I also ended up splitting my .env.local into multiple files to take advantage of Next's Default Environment Variables feature so that it was easier to manage dev (used by next dev) vs production (used by next build) variables.

And this was the final docker-compose.yml:

version: '3'
services:
  nextjs:
    container_name: nextjs
    build:
      context: ./
    env_file:
      - ./.env
      - ./.env.production
      - ./.env.local
  nginx:
    container_name: nginx
    build:
      context: ./nginx
    ports:
      - 3001:8080

I can now run the app via Docker Compose locally by executing docker compose up and browsing to http://localhost:3001.

Running the app via Docker Compose in a Linux App Service

I already had a bunch of PowerShell and Bicep scripts and an Azure Pipeline that I was using to automate spinning up and deploying to resources in Azure, so this part was mostly about identifying what changes were required to those existing scripts to switch over from using a Windows App Service to a Linux App Service and my new Docker setup. The main changes were:

I would need to provision an Azure Container Registry using Bicep, which provides me with a private Docker container registry for my application's Docker images
I would need to change my Bicep deployment to provision a Linux App Service instead of a Windows App Service
The pipeline would need to use Docker Compose to build my application's images and push them to Azure Container Registry
The pipeline would need to instruct the App Service to pull new images from Azure Container Registry on successful build

Provisioning an Azure Container Registry with Bicep

This is pretty straightforward with Bicep:

resource containerRegistry 'Microsoft.ContainerRegistry/registries@2021-09-01' = {
  name: containerRegistryName
  location: location
  sku: {
    name: containerRegistrySkuName
  }
  properties: {
    adminUserEnabled: false
  }
}

The only thing of note really is the adminUserEnabled: false property - I will come back to this soon when talking about role assignments and setting up authorisation for the App Service to pull images from Azure Container Registry.

Provisioning a Linux App Service with Bicep

As with any App Service you have to define an App Service Plan as well as the App Service itself and any required settings. Below are essentially the Bicep resource definitions I ended up with, but just condensed a bit from my actual definitions so I can focus on the important details:

// Define the app service plan

resource appServicePlan 'Microsoft.Web/serverfarms@2020-12-01' = {
  name: appServicePlanName
  location: location
  kind: 'linux'
  sku: {
    name: skuName
    capacity: skuCapacity
  }
  properties: {
    reserved: true
  }
}

// Define the app service

resource appService 'Microsoft.Web/sites@2020-12-01' = {
  name: appServiceName
  location: location
  kind: 'app,linux,container'
  identity: {
    type: 'SystemAssigned'
  }
  tags: {
    'hidden-related:${appServicePlan.id}': 'empty'
  }
  properties: {
    serverFarmId: appServicePlan.id
    httpsOnly: true
    clientAffinityEnabled: false
    siteConfig: {
      http20Enabled: true
      minTlsVersion: '1.2'
      alwaysOn: true
      acrUseManagedIdentityCreds: true
      appCommandLine: ''
    }
  }
}

// Define app service settings

resource webAppConfig 'Microsoft.Web/sites/config@2020-12-01' = {
  // Using parent makes the settings deploy after the App Service exists
  parent: appService
  name: 'appsettings'
  properties: {
    DOCKER_IMAGE_NAME: containerImageName
    DOCKER_IMAGE_TAG: containerImageTag
    DOCKER_REGISTRY_SERVER: containerRegistryServer
    DOCKER_REGISTRY_SERVER_URL: containerRegistryServerUrl
    WEBSITES_ENABLE_APP_SERVICE_STORAGE: 'false'
    WEBSITES_PORT: '8080'
  }
}

There was a bit of trial and error involved in getting this working exactly as I wanted, but I will try to highlight some things to be aware of:

You have to specify the reserved: true property when creating the App Service Plan - if you don't you will get a Windows App Service Plan (despite having specified kind: 'linux')
I found conflicting examples with specifying kind on the App Service - app,linux,container was the first that worked for me, but app,linux should work too
Assigning permissions on the App Service so it can pull images from Container Registry is a bit of a faff!
- The "easiest" way to do it and what you may see in docs and tutorials is to use the built-in Container Registry "admin user" and add the credentials to the App Service as settings DOCKER_REGISTRY_SERVER_USERNAME and DOCKER_REGISTRY_SERVER_PASSWORD, but then you have admin credentials stored as plain text, which is... well, no thanks
- The method I settled on (though of course it's Azure so there are multiple ways of doing essentially the same thing) was using managed identities and role based access control - I'll cover how I setup the role assignments for this below
You may also see docs and tutorials for Linux App Service set a siteConfig property called linuxFxVersion in the resource definition. For use with Docker Compose (or "multi-container apps" in Azure nomenclature) this has to be set to COMPOSE|{a base64 encoded string containing the contents of your Docker Compose yaml file} (I am not joking, though I wish I was). Do not set this here, as it can and should (in my case at least) be set in the Azure Pipelines yaml using Azure CLI
- Setting it here could trigger App Service to pull images from Container Registry, which (again, in my case) I don't want to do at this point
The App Service settings shown above starting with DOCKER_ are not required, but I found them useful
- For reference - as far as I could see there was no way to get information from the Azure Portal about which container images are in use on your App Service or where they came from
- As outputs for the pipeline, because these values are required by the Docker Compose tasks that build and push images to the Container Registry - I'll expand on this later!
The WEBSITES_ENABLE_APP_SERVICE_STORAGE App Service setting is used to specify whether you want to mount an SMB share to the /home/ directory - my app doesn't need it, but it's good to know this is possible
The WEBSITES_PORT App Service setting configures the port that your container is listening on for HTTP requests
- In a multi-container app only one service can receive HTTP requests
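To make the linuxFxVersion format mentioned in the list above a bit more concrete, here is a small sketch of how that value can be built. The toComposeLinuxFxVersion function name, the registry host and the image tag are made up for illustration; only the COMPOSE| prefix followed by base64 comes from the description above.

```javascript
// Builds the value described above for a multi-container app: the literal
// prefix "COMPOSE|" followed by the base64-encoded compose file contents.
function toComposeLinuxFxVersion(composeYaml) {
  return `COMPOSE|${Buffer.from(composeYaml, 'utf8').toString('base64')}`
}

// Example compose content (hypothetical registry and tag)
const composeYaml = [
  "version: '3'",
  'services:',
  '  nginx:',
  '    image: example.azurecr.io/najspreview_nginx:1.0.0',
  '    ports:',
  '      - 8080:8080',
].join('\n')

const linuxFxVersion = toComposeLinuxFxVersion(composeYaml)

console.log(linuxFxVersion)
```

Decoding the part after the prefix with Buffer.from(value, 'base64') is a quick way to check what an App Service is currently configured with.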

Using managed identities and role based access control (RBAC) to allow App Service to pull images from Azure Container Registry

I wanted to use managed identities and RBAC to provide authorisation to my App Service to pull images from the Container Registry. The best documentation I could find on doing this provides its examples using Azure CLI, but I needed to grant the role assignments to my App Service via Bicep. It did at least give me some steer on what I needed to do if not how to do it, which is:

Enable the system-assigned managed identity on the App Service
Grant the managed identity the AcrPull role on the Container Registry
Configure the App Service to use the managed identity when pulling from the Container Registry

Step 1 is simple enough and can be seen in the App Service Bicep resource definition above - under identity you set type: 'SystemAssigned' and you're done.

I'll skip to step 3 as that is also simple and can be seen in the App Service Bicep resource definition above - under siteConfig you set acrUseManagedIdentityCreds: true.

Now back to step 2, which is a little trickier! It's simple enough to create the role assignment definition in Bicep:

// This is the ACR Pull Role Definition Id: https://docs.microsoft.com/en-us/azure/role-based-access-control/built-in-roles#acrpull
var acrPullRoleDefinitionId = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '7f951dda-4ed3-4680-a7ca-43fe172d538d')

resource appServiceAcrPullRoleAssignment 'Microsoft.Authorization/roleAssignments@2020-10-01-preview' = {
  scope: containerRegistry
  name: guid(containerRegistry.id, appService.id, acrPullRoleDefinitionId)
  properties: {
    principalId: appService.identity.principalId
    roleDefinitionId: acrPullRoleDefinitionId
    principalType: 'ServicePrincipal'
  }
}

However, when running this deployment in Azure Pipelines it will fail unless the service principal that executes the deployment (typically this is tied to a service connection in your Azure DevOps project) has permission to assign roles in RBAC. If you want to use the built-in roles for your service principal (I was assigning the "Contributor" role) this basically means that they need to be an "Owner", which is probably not what you want as it would be too privileged.

My solution for this was to create a custom role, "ContributorWithRBAC" in subscription scope. The definition of the ContributorWithRBAC role I created is identical to the built-in Contributor role, but with the Microsoft.Authorization/*/Write NotAction removed. I can then assign my custom role to the service principals that would execute my deployments.

This may still be too privileged for you (based on principle of least privilege) in which case you should create a role with as narrow scope as you require as long as it can assign roles in RBAC.
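As a rough illustration of what "Contributor without the Write NotAction" means, the sketch below derives the custom role definition from a list of Contributor's NotActions. The contributorNotActions list is an assumption (it is the NotActions used in this article plus the Write entry that gets removed), so check the current built-in definition before relying on it. The buildContributorWithRbacRole function name is hypothetical.

```javascript
// Assumed NotActions of the built-in Contributor role; verify against the
// current definition in your tenant before using this for real.
const contributorNotActions = [
  'Microsoft.Authorization/*/Delete',
  'Microsoft.Authorization/*/Write',
  'Microsoft.Authorization/elevateAccess/Action',
  'Microsoft.Blueprint/blueprintAssignments/write',
  'Microsoft.Blueprint/blueprintAssignments/delete',
  'Microsoft.Compute/galleries/share/action',
]

// Builds the custom role definition scoped to a single subscription
function buildContributorWithRbacRole(subscriptionId) {
  return {
    Name: 'ContributorWithRBAC',
    Description:
      'Same as built-in Contributor role, but allows you to assign roles in Azure RBAC.',
    Actions: ['*'],
    // Dropping the Write NotAction is what permits creating role assignments
    NotActions: contributorNotActions.filter(
      (action) => action !== 'Microsoft.Authorization/*/Write'
    ),
    DataActions: [],
    NotDataActions: [],
    AssignableScopes: [`/subscriptions/${subscriptionId}`],
  }
}

// Placeholder subscription id
const role = buildContributorWithRbacRole('00000000-0000-0000-0000-000000000000')

console.log(JSON.stringify(role, null, 2))
```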

In my project I have some PowerShell scripts that I use to initialise service connections and service principals (among other things) so I could script both the creation of the custom role and the assignments using Azure CLI. Below are the relevant parts of the scripts, condensed down from my actual scripts to highlight just the relevant details:

# Set the name of the custom role
$ContributorWithRbacRoleName = 'ContributorWithRBAC'

# This function will create or update the custom role on the provided $SubscriptionId
function Set-ContributorWithRbacRole {
    param(
        [Parameter(Mandatory=$true)]
        [string]$SubscriptionId
    )

    $Scope = "/subscriptions/$SubscriptionId"

    $Role = Get-AzRole -Scope $Scope -RoleName $ContributorWithRbacRoleName

    if ($Role) {
        # If the Role already exists there is nothing more to do

        return $Role
    }

    # We need to build this json string dynamically, but there are some gotchas, which is why it is like it is below...
    # See https://github.com/Azure/azure-cli/issues/16940#issuecomment-782546983
    $RoleDefinition = '{\"Name\": \"'+$ContributorWithRbacRoleName+'\", \"Description\": \"Same as built-in Contributor role, but allows you to assign roles in Azure RBAC.\", \"Actions\": [\"*\"], \"NotActions\": [\"Microsoft.Authorization/*/Delete\", \"Microsoft.Authorization/elevateAccess/Action\", \"Microsoft.Blueprint/blueprintAssignments/write\", \"Microsoft.Blueprint/blueprintAssignments/delete\", \"Microsoft.Compute/galleries/share/action\"], \"DataActions\": [], \"NotDataActions\": [], \"AssignableScopes\": [\"'+$Scope+'\"]}'

    $Role = (az role definition create --role-definition "$RoleDefinition" | ConvertFrom-Json)

    return $Role
}

# This function attempts to get a role with a matching $RoleName in the provided $Scope
function Get-AzRole {
    param(
        [Parameter(Mandatory=$true)]
        [string]$Scope,
        [Parameter(Mandatory=$true)]
        [string]$RoleName
    )

    $Role = (az role definition list --name $RoleName --scope $Scope --query '[0]' | ConvertFrom-Json)

    return $Role
}

# This function creates the role assignment by assigning the $Role to the $Assignee in the provided $Scope
function Set-AzRoleAssignment {
    param(
        [Parameter(Mandatory=$true)]
        [string]$Role,
        [Parameter(Mandatory=$true)]
        [string]$Assignee,
        [Parameter(Mandatory=$true)]
        [string]$Scope
    )

    $RoleAssignment = (az role assignment create `
    --role $Role `
    --assignee $Assignee `
    --scope $Scope `
    | ConvertFrom-Json)

    return $RoleAssignment
}



# Basic usage

## Create/update the custom role (assuming you have the id of the subscription you are creating the role on)

$Role = Set-ContributorWithRbacRole -SubscriptionId $SubscriptionId

## Assign the role to a service principal (assuming you have the id of the service principal and id of the resource group in scope)

$RoleAssignment = Set-AzRoleAssignment -Role $Role.roleName -Assignee $ServicePrincipalId -Scope $ResourceGroupId

With the infrastructure updates done it was time to turn my focus to the Pipeline.

Using Docker Compose in Azure Pipelines to build and push images to Azure Container Registry

This is also reasonably straightforward thanks to Microsoft providing a DockerCompose task, but it did require some additional effort to deal with the environment variables required by the build process.

I realise I am banging on about environment variables again, but bear with me! Also, if you're reading this and know of a better way to deal with environment variables at build time in Docker then feel free to raise an issue on the next-azure repo - I'm interested in hearing about other approaches!

As mentioned earlier my app is using Next's Default Environment Variables feature and the build process expects to have access to these .env files:

.env contains variables used in all environments
.env.production contains variables used in the production environment, and will override variables set in the above file should there be duplicate keys
.env.local contains variables used in the current environment, and will override variables set in either of the above two files should there be duplicate keys
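The override order described above can be illustrated with a small sketch. Next.js does this loading itself, so the code is only a model of the precedence, not something the app needs. The parseEnv and mergeEnv names and the variable names are hypothetical, and the parser is deliberately simplistic (no quoting, multiline values or variable expansion).

```javascript
// Minimal KEY=value parser: skips blank lines and # comments.
function parseEnv(text) {
  const result = {}

  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim()
    if (!trimmed || trimmed.startsWith('#')) continue

    const separatorIndex = trimmed.indexOf('=')
    if (separatorIndex === -1) continue

    const key = trimmed.slice(0, separatorIndex).trim()
    result[key] = trimmed.slice(separatorIndex + 1).trim()
  }

  return result
}

// Merges env file contents in order, so later files override earlier ones
function mergeEnv(...fileContents) {
  return Object.assign({}, ...fileContents.map(parseEnv))
}

const mergedEnv = mergeEnv(
  'API_URL=https://example.com\nLOG_LEVEL=info', // .env
  'LOG_LEVEL=warn', // .env.production
  'API_URL=https://preview.example.com' // .env.local
)

console.log(mergedEnv)
```

In this example LOG_LEVEL ends up as the .env.production value and API_URL as the .env.local value, matching the precedence in the list.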

The expectation is that .env and .env.production will never contain secrets or sensitive data and will be committed and available in the repo, but .env.local could contain overrides for the current environment and could contain secrets or sensitive data so will never be committed and available in the repo. So the problem to solve was how to generate an .env.local file in the pipeline that contained variables specific to the target environment and could keep secrets secret!

The approach I ended up taking was this:

I was already using variable groups to source variables for the target environment and surface them in the pipeline
Variable groups can source values from key vault for secrets or sensitive data
I already maintain an .env.template file in the repo that describes all environment variables used by the app
I found this EnvTransform task that can transform the .env.template file and generate an .env.local file by providing values for each key that matches a variable with a matching key name from the variable group
I can run the EnvTransform task prior to running the DockerCompose tasks so that Docker has everything it needs during the build

The relevant parts of the Azure Pipelines yaml look like this:

- task: EnvTransform@0
  displayName: 'Create env file for build'
  inputs:
    inputType: 'file'
    inputFile: '$(System.DefaultWorkingDirectory)/.env.template'
    outputFile: '$(System.DefaultWorkingDirectory)/.env.local'

- task: DockerCompose@0
  displayName: 'Build Next app container images'
  inputs:
    action: Build services
    azureSubscriptionEndpoint: '$(AzureServiceConnection)'
    azureContainerRegistry: '$(DOCKER_REGISTRY_SERVER)'
    projectName: '$(DOCKER_IMAGE_NAME)'
    dockerComposeFile: docker-compose.yml
    qualifyImageNames: true
    additionalImageTags: '$(DOCKER_IMAGE_TAG)'
    includeLatestTag: true

- task: DockerCompose@0
  displayName: 'Push Next app container image to container registry'
  inputs:
    action: Push services
    azureSubscriptionEndpoint: '$(AzureServiceConnection)'
    azureContainerRegistry: '$(DOCKER_REGISTRY_SERVER)'
    projectName: '$(DOCKER_IMAGE_NAME)'
    dockerComposeFile: docker-compose.yml
    qualifyImageNames: true
    additionalImageTags: '$(DOCKER_IMAGE_TAG)'
    includeLatestTag: true

The other thing to note from the above is that the DOCKER_REGISTRY_SERVER, DOCKER_IMAGE_NAME and DOCKER_IMAGE_TAG variables all come from the output of my Bicep deployment (you might recall that I also used them as App Service settings), but no matter where you source them from you need to be consistent with these values as they are also used in the next step to instruct the App Service to pull images from the Azure Container Registry:

DOCKER_REGISTRY_SERVER is the FQDN of the Container Registry where the images will be pushed
- For me this comes from the Bicep resource described earlier, specifically from containerRegistry.properties.loginServer
DOCKER_IMAGE_NAME ends up being used as the name of the repository in the registry that will contain your images
- I went with using an environment-specific resource prefix, which is basically a concatenation of a project id and target environment name e.g. najspreview, najsprod etc
- Something else to be aware of is that because we are using Docker Compose you will end up with a repository per service. For example, my docker-compose.yml describes two services named nextjs and nginx, so there are actually two repositories per environment e.g. najspreview_nextjs and najspreview_nginx. As far as I can tell from the docs there is no limit on the number of repositories allowed in a registry
DOCKER_IMAGE_TAG I have set to the built-in variable Build.BuildNumber
- Microsoft provide some recommendations on tagging and versioning images and I have gone for the "Unique tags" approach
- I also include the latest tag as a "Stable tag", but I use the "Unique tag" in the deployment
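
As a rough illustration of how these three values combine, the Python sketch below builds the fully qualified image names following the registry/project_service:tag pattern. The registry host and build number are made-up example values and the helper name is hypothetical.

```python
def qualified_image_name(registry: str, project: str, service: str, tag: str) -> str:
    """Compose a fully qualified image name: <registry>/<project>_<service>:<tag>."""
    return f"{registry}/{project}_{service}:{tag}"


# Example values only: the registry host and build number are invented
registry = "example.azurecr.io"   # DOCKER_REGISTRY_SERVER
project = "najspreview"           # DOCKER_IMAGE_NAME
build_tag = "20220701.1"          # DOCKER_IMAGE_TAG (Build.BuildNumber)

images = [
    qualified_image_name(registry, project, service, tag)
    for service in ("nextjs", "nginx")
    for tag in (build_tag, "latest")  # unique tag plus the stable "latest" tag
]
```

Each service ends up with its own repository, and each repository receives both the unique tag and the latest tag.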

Using Docker Compose to pull images from Container Registry to an App Service

Phew, the final step! The last step on this journey was to trigger a "deployment" to the App Service, that is, to instruct it to pull the latest images once they have been successfully built and pushed to the Container Registry.

My first attempt at this was to use the Azure Web App for Container task as this seemed the obvious choice, but I quickly ran into this issue. If you recall, when I was describing the changes to the Bicep resource definitions I mentioned that I am not setting a linuxFxVersion on the App Service, and for some reason this results in the task incorrectly trying to set windowsFxVersion when it executes (sad times). I tried some of the workarounds suggested in that GitHub issue, but there wasn't any way around it (more sad times).

My second attempt, and what eventually worked for me, was to use the Azure CLI task to execute an az webapp config container set command, which does correctly set the linuxFxVersion. If you recall from earlier, the linuxFxVersion for a multi-container app must be set to COMPOSE|{a base64 encoded string containing the contents of your Docker Compose yaml file}, but the nice thing about this Azure CLI command is that you can just pass it a Docker Compose yaml file and it will deal with encoding it and setting the value in the correct format.
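
For the curious, the value format described above can be sketched in a few lines of Python. This only illustrates the COMPOSE|{base64} shape; it is not how the Azure CLI is implemented, and the function name and sample yaml are assumptions.

```python
import base64


def compose_linux_fx_version(compose_yaml: str) -> str:
    """Build a linuxFxVersion-style value: 'COMPOSE|' + base64 of the yaml text."""
    encoded = base64.b64encode(compose_yaml.encode("utf-8")).decode("ascii")
    return f"COMPOSE|{encoded}"


# Tiny sample compose file, for illustration only
sample_yaml = "version: '3'\nservices:\n  nginx:\n    image: nginx\n"
fx_version = compose_linux_fx_version(sample_yaml)

# Decoding the part after the pipe gives back the original yaml
decoded = base64.b64decode(fx_version.split("|", 1)[1]).decode("utf-8")
```

Letting the CLI do this avoids having to handle the encoding in the pipeline at all.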

The Docker Compose file for the App Service needed to be different to the Docker Compose file used for the build however, because it needs to specify the exact images and tags to pull and the registry to pull them from. This is the file I ended up with:

version: '3'
services:
  nextjs:
    container_name: nextjs
    image: ${DOCKER_REGISTRY_SERVER}/${DOCKER_IMAGE_NAME}_nextjs:${DOCKER_IMAGE_TAG}
  nginx:
    container_name: nginx
    image: ${DOCKER_REGISTRY_SERVER}/${DOCKER_IMAGE_NAME}_nginx:${DOCKER_IMAGE_TAG}
    ports:
      - 8080:8080

Those DOCKER_ variables are the same variables used earlier in the Docker build and push steps, which are sourced from my Bicep deployment and made available as variables in the Pipeline. Docker Compose can substitute environment variables, and App Service makes its settings available as environment variables, so my hope was that because I had set settings with matching keys on the App Service, Docker Compose would be able to substitute these variables there. Sadly it does not. Instead I had to include another task in the Pipeline that would do the variable substitution (I already had a PowerShell script in my project that I could reuse for this purpose) and generate a final Docker Compose yaml file that could be passed to the Azure CLI command.
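
The substitution itself is simple. As a minimal sketch of the idea in Python (not the PowerShell script from my project, whose behaviour may differ), the following replaces ${NAME} tokens with supplied values; the function name, the example values, and the decision to leave unknown tokens untouched are assumptions.

```python
import re

# Matches ${NAME} style tokens, the same syntax Docker Compose uses
TOKEN_PATTERN = re.compile(r"\$\{(\w+)\}")


def replace_tokens(text: str, tokens: dict[str, str]) -> str:
    """Replace each ${NAME} with tokens[NAME]; unknown tokens are left as-is."""
    return TOKEN_PATTERN.sub(lambda m: tokens.get(m.group(1), m.group(0)), text)


# Example values only: the registry host and tag are invented
line = "image: ${DOCKER_REGISTRY_SERVER}/${DOCKER_IMAGE_NAME}_nextjs:${DOCKER_IMAGE_TAG}"
resolved = replace_tokens(line, {
    "DOCKER_REGISTRY_SERVER": "example.azurecr.io",
    "DOCKER_IMAGE_NAME": "najspreview",
    "DOCKER_IMAGE_TAG": "20220701.1",
})
```

Leaving unknown tokens in place makes a missing value easy to spot in the generated file, though failing the build instead would be an equally reasonable choice.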

The relevant tasks ended up looking like the below:

# Replace tokens in the app service docker compose file with relevant values for this build
# Locally docker compose CLI would do this for us, but app services does not expand these tokens
- pwsh: |
    . ../infra/Set-Tokens.ps1
    Set-Tokens `
      -InputFile docker-compose.appservice.yml `
      -OutputFile "$(DistDirectory)/docker-compose.appservice.yml" `
      -StartTokenPattern "\$\{" `
      -EndTokenPattern "\}" `
      -Tokens @{ `
        DOCKER_REGISTRY_SERVER="$(DOCKER_REGISTRY_SERVER)"; `
        DOCKER_IMAGE_NAME="$(DOCKER_IMAGE_NAME)"; `
        DOCKER_IMAGE_TAG="$(DOCKER_IMAGE_TAG)" `
      }
  workingDirectory: '$(System.DefaultWorkingDirectory)/.azure/web-app'
  displayName: 'Create App Service Docker Compose file'

# Update app service container settings to pull new images from the registry
# Using AzureCLI task instead of AzureWebAppContainer task because AzureWebAppContainer incorrectly sets `windowsFxVersion` instead of `linuxFxVersion`
# See https://github.com/microsoft/azure-pipelines-tasks/issues/14805
- task: AzureCLI@2
  displayName: 'Update web app container settings'
  inputs:
    azureSubscription: '$(AzureServiceConnection)'
    scriptType: pscore
    scriptLocation: inlineScript
    inlineScript: |
      $slotName = "$(WebAppSlotName)"
      if ($slotName) {
        # Deploy to slot
        az webapp config container set --name "$(WebAppName)" --resource-group "$(SharedResourceGroupName)" --multicontainer-config-file "$(Pipeline.Workspace)/dist/docker-compose.appservice.yml" --multicontainer-config-type COMPOSE --slot $slotName
      }
      else {
        # Deploy to production
        az webapp config container set --name "$(WebAppName)" --resource-group "$(SharedResourceGroupName)" --multicontainer-config-file "$(Pipeline.Workspace)/dist/docker-compose.appservice.yml" --multicontainer-config-type COMPOSE
      }

The full Azure Pipeline yaml can be seen in my next-azure repo.

Conclusion

When I look at the related commits I made to switch my sample app from targeting a Windows App Service to a Linux App Service, it was not that big of a change. What took the time, and what I have tried to describe above, was the learning and the trial and error required in getting to that final state.

The main challenge I always come up against when working with Azure is having to piece together information from the many docs, tutorials, quick starts, samples etc and navigate around unexpected behaviour, obstacles, and inconsistencies in the various APIs, CLIs, scripting languages etc on offer. It sometimes doesn't feel like a cohesive platform in the way that it is evolving, and whilst I do appreciate options and the breadth of services and tools on offer, I do sometimes yearn for the simplicity, speed and focus of Netlify, Vercel, Railway etc.

Maybe I am making a rod for my own back - maybe I am over complicating things - but with the work now done I do think the introduction of Docker into my sample app does simplify some things and brings benefits:

The behaviour and performance of the pipeline is more predictable as it eliminates the need to run npm install (or equivalent) both in the pipeline and on the App Service post-deploy, which can sometimes fail, or take much longer than you would expect
It opens up possibilities such as using tools like pnpm, which are not supported on Windows App Service due to lack of support for symlinks - essentially I think it provides more freedom of choice when using this example as a starting point for your app
You can more easily spin up the "production build" locally using Docker if you want to do some testing locally before pushing through the pipeline

Cost should also be considered, and depending on your requirements Linux App Services could be more or less expensive than Windows (the prices below are pay as you go prices for the West Europe region in July 2022).

Linux App Services are generally cheaper than Windows App Services, but the cost savings are not uniform (the Standard tier price is only slightly lower, but the Basic and Premium tier prices are much lower)
- Basic tier: Linux about $13 USD/month; Windows about $55 USD/month
- Standard tier: Linux about $69 USD/month; Windows about $73 USD/month
- Premium tier: Linux about $84 USD/month; Windows about $146 USD/month
There is an additional cost for the Container Registry, which at time of writing is:
- Basic tier: about $0.17 USD/day
- Standard tier: about $0.67 USD/day
- Premium tier: about $1.67 USD/day
The lowest App Service Plan SKU tier I can now use is a Basic tier B1, whereas previously I could deploy to the free F1 tier
- In reality though the performance of even the most basic Next.js app like the one in my sample app was terrible on F1 so that was never really a viable option for anything except a quick demo (if that!)
For a production app you would probably want to use a Standard or Premium tier anyway, regardless of whether you have a Windows or Linux App Service. So if you are going to production then the Linux App Service plus Container Registry is going to cost you slightly more at Standard tiers, but less at Premium tiers
- A Premium tier App Service with a Standard tier Container Registry is a sweet spot for me
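
To sanity check those comparisons, here is a quick back-of-the-envelope calculation in Python using the approximate prices listed above. It assumes a 30-day month for the Container Registry's daily price, and the names used are purely illustrative.

```python
# Approximate prices from the lists above (West Europe, July 2022), in USD
APP_SERVICE_MONTHLY = {
    "Basic": {"linux": 13, "windows": 55},
    "Standard": {"linux": 69, "windows": 73},
    "Premium": {"linux": 84, "windows": 146},
}
REGISTRY_DAILY = {"Basic": 0.17, "Standard": 0.67, "Premium": 1.67}
DAYS_PER_MONTH = 30  # assumption for converting the daily registry price


def linux_monthly_total(app_tier: str, registry_tier: str) -> float:
    """Linux App Service plus Container Registry, per month."""
    total = APP_SERVICE_MONTHLY[app_tier]["linux"] + REGISTRY_DAILY[registry_tier] * DAYS_PER_MONTH
    return round(total, 2)


# Standard App Service + Basic registry: roughly 74 vs 73 for Windows (slightly more)
standard_total = linux_monthly_total("Standard", "Basic")
# Premium App Service + Standard registry: roughly 104 vs 146 for Windows (less)
premium_total = linux_monthly_total("Premium", "Standard")
```

With these numbers the Premium App Service plus Standard registry combination comes out roughly $40 USD/month cheaper than a Windows Premium App Service on its own.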

So in summary, I think (for this kind of deployment) the Linux App Service provides some good benefits over Windows App Service and potentially cost savings for production apps, which is why I won't be switching back.