February 26, 2018

Optimize with Multi-Stage Dockerfile

Nádasdi Balázs
Head of Engineering

In May of last year, Docker 17.05 was released with a feature that is extremely useful but often left unused: the Multi-Stage Dockerfile. It lets you merge separate Dockerfiles into a single file. In the old days, with single-stage Dockerfiles, we had to maintain multiple files to keep the production image clean. The alternative was to deploy one huge image that contained everything needed to build the application.


Let's take a look at a simple example: a single-page website with a static index.html, one JavaScript file, and one CSS file. The example itself is clean and short, but a real project built this way can be huge.

Here is our project directory tree:

    docroot
    |-- Gemfile
    |-- Gemfile.lock
    |-- Rakefile
    |-- public
    |   `-- index.html
    `-- src
        |-- css
        |   |-- main.scss
        |   `-- modules
        |       `-- article.scss
        `-- js
            |-- boot.js
            |-- init.js
            `-- modules
                `-- article.js

Our HTML is just a simple empty document:

    <!DOCTYPE html>
    <html>
        <head>
            <meta charset="utf-8" />
            <meta name="viewport" content="width=device-width" />
            <title>Sample Project for Multi-Stage Docker build</title>
            <link rel="stylesheet" href="/app.css" />
            <script type="text/javascript" charset="utf-8" src="/app.js"></script>
        </head>
        <body>
            <div id='mainContent'></div>
        </body>
    </html>

It's nice and clean. Since this stands in for a huge project, we don't want to write a single CSS file and a single JavaScript file by hand, so we create a tidy directory structure and merge the files before we deploy the site. For CSS we use SCSS because it has many features that will come in handy later. To build the project we use simple rake tasks, so we need at least Ruby. For SCSS, we also have to install extra packages such as libffi.
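
The Rakefile is not reproduced in this post, so purely as an illustration, the JavaScript half of that merge step could look like the shell function below. The function name merge_js and the file order (modules first, then init.js, then boot.js) are assumptions made for this sketch, not taken from the project. The SCSS half would additionally need the sass compiler.

```shell
#!/bin/sh
# Hypothetical sketch of the merge step; the real project uses rake tasks.

# merge_js SRC_DIR OUT_FILE
# Concatenates all module files, then init.js, then boot.js (assumed order)
# into a single output file, creating the output directory if needed.
merge_js() {
    src_dir=$1
    out_file=$2
    mkdir -p "$(dirname "$out_file")"
    cat "$src_dir"/modules/*.js "$src_dir/init.js" "$src_dir/boot.js" > "$out_file"
}
```

Called from docroot as merge_js src/js public/app.js, this would write the app.js file that index.html loads.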

The old way

What did we do before Multi-Stage Dockerfiles?
Of course, we want the deploy image to be as small as possible, so we create two Dockerfiles: one to build the project and one to deploy it.
First, the one that builds our project:

    # Dockerfile.build
    FROM ruby:2.5.0-alpine3.7

    # Install dependencies for native extensions
    RUN apk add --no-cache build-base libffi-dev

    # This will be our application root folder
    WORKDIR /application

    # Copy all the content from docroot
    COPY docroot /application

    # Build our application
    RUN bundle install
    RUN bundle exec rake

Now we can create a much simpler image with pre-built artifacts:

    # Dockerfile
    FROM nginx:latest

    COPY ./build-artifact /usr/share/nginx/html

Wow! Nice and clean, but how do we deploy our site? What do we need to do to get the final image? Let's create a script to eliminate human error from the process:

    #!/bin/bash
    # build.sh

    # Build the "build-image"
    docker build -t yitsushi/myshinyproject:build -f ./Dockerfile.build .

    # Create a temporary container
    docker create --name temp_container yitsushi/myshinyproject:build

    # Extract build artifacts
    docker cp temp_container:/application/public ./build-artifact

    # Delete the temporary container
    docker rm -f temp_container

    # Build the final image
    docker build --no-cache -t yitsushi/myshinyproject:latest -f ./Dockerfile .

    # Delete the temporary build-artifact directory
    rm -rf ./build-artifact

It's ugly. We need a temporary image, a temporary container, and a temporary local directory. We also must not forget to clean up the environment afterwards, for example by removing the build-artifact directory.
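
One way to make that cleanup less error-prone is to let the shell handle it on exit. The sketch below wraps the same docker commands as build.sh in a function and registers a trap, so the temporary container and directory are removed even if a step fails. The function names build and cleanup and the three variables are made up for this example.

```shell
#!/bin/bash
# Sketch: the same steps as build.sh, with cleanup guaranteed on exit.
# -e aborts on the first failing command, -u rejects unset variables.
set -eu

IMAGE=yitsushi/myshinyproject
CONTAINER=temp_container
ARTIFACT_DIR=./build-artifact

cleanup() {
    # Errors are ignored because the container may not exist yet.
    docker rm -f "$CONTAINER" >/dev/null 2>&1 || true
    rm -rf "$ARTIFACT_DIR"
}

build() {
    # Run cleanup when the shell exits, whether the build succeeded or not.
    trap cleanup EXIT
    docker build -t "$IMAGE:build" -f ./Dockerfile.build .
    docker create --name "$CONTAINER" "$IMAGE:build"
    docker cp "$CONTAINER:/application/public" "$ARTIFACT_DIR"
    docker build --no-cache -t "$IMAGE:latest" -f ./Dockerfile .
}
```

Adding a final line that calls build would turn this into a replacement for build.sh.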

From here we simply execute our shell script:

    ❯ ./build.sh
    Sending build context to Docker daemon  9.665MB
    Step 1/6 : FROM ruby:2.5.0-alpine3.7
     ---> 308418a1844f
    Step 2/6 : RUN apk add --no-cache build-base libffi-dev
     ---> Using cache
     ---> 677a75453610
    Step 3/6 : WORKDIR /application
     ---> Using cache
     ---> 1ba87d6eae13
    Step 4/6 : COPY docroot /application
     ---> f415c072262a
    Step 5/6 : RUN bundle install
     ---> Running in ba26aed32021
    Fetching gem metadata from https://rubygems.org/...........
    Fetching rake 12.3.0
    Installing rake 12.3.0
    Using bundler 1.16.1
    Fetching ffi 1.9.18
    Installing ffi 1.9.18 with native extensions
    Fetching rb-fsevent 0.10.2
    Installing rb-fsevent 0.10.2
    Fetching rb-inotify 0.9.10
    Installing rb-inotify 0.9.10
    Fetching sass-listen 4.0.0
    Installing sass-listen 4.0.0
    Fetching sass 3.5.5
    Installing sass 3.5.5
    Bundle complete! 2 Gemfile dependencies, 7 gems now installed.
    Bundled gems are installed into `/usr/local/bundle`
    Removing intermediate container ba26aed32021
     ---> 5e60801a48d2
    Step 6/6 : RUN bundle exec rake
     ---> Running in c685fe9c164a
    Removing intermediate container c685fe9c164a
     ---> cb561d54cd0a
    Successfully built cb561d54cd0a
    Successfully tagged yitsushi/myshinyproject:build
    a82359e3b871dcbfa7acb1f4ba0f0a3d5b576e33d8ff6b56c1317d7799ef2148
    temp_container
    Sending build context to Docker daemon  10.75kB
    Step 1/2 : FROM nginx:latest
     ---> 3f8a4339aadd
    Step 2/2 : COPY ./build-artifact /usr/share/nginx/html
     ---> f1fab6d81151
    Successfully built f1fab6d81151
    Successfully tagged yitsushi/myshinyproject:latest

    ❯ docker images
    yitsushi/myshinyproject   latest              f1fab6d81151        9 minutes ago       108MB
    yitsushi/myshinyproject   build               cb561d54cd0a        9 minutes ago       247MB

    ❯ docker run --rm -p 8888:80 yitsushi/myshinyproject

With Multi-Stage Dockerfile

Where can Multi-Stage Dockerfiles help us? We can eliminate all the unnecessary temporary files and containers. But how? We create only one Dockerfile with multiple FROM statements:

    # Dockerfile, the new one
    FROM ruby:2.5.0-alpine3.7 AS builder

    # Install dependencies for native extensions
    RUN apk add --no-cache build-base libffi-dev

    # This will be our application root folder
    WORKDIR /application

    # Copy all the content from docroot
    COPY docroot /application

    # Build our application
    RUN bundle install
    RUN bundle exec rake

    # Here we start a new stage
    FROM nginx:latest

    COPY --from=builder /application/public /usr/share/nginx/html

What?! Yes, that's right: we can write a Dockerfile like this. We can define as many stages as we want, but keep in mind that this is not a way to build several images at the same time. In the end, only the last stage becomes the final image.
How do we build with this? We don't need our build.sh anymore, because it's that simple:

    ❯ docker build -t yitsushi/myshinyproject .

    ❯ docker images
    yitsushi/myshinyproject   latest              5ee4b57eb9e9        9 minutes ago       108MB

    ❯ docker run --rm -p 8888:80 yitsushi/myshinyproject
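
The earlier stages are still reachable when needed. Because the first stage is named builder, the --target flag of docker build can stop the build at that stage, which is handy for inspecting the build environment. A small sketch, in which the function name build_stage and the choice of using the stage name as the image tag are arbitrary:

```shell
#!/bin/bash
# build_stage STAGE
# Builds the Dockerfile in the current directory only up to the named stage
# and tags the result as yitsushi/myshinyproject:STAGE.
build_stage() {
    docker build --target "$1" -t "yitsushi/myshinyproject:$1" .
}
```

After build_stage builder, running docker run --rm -it yitsushi/myshinyproject:builder sh should open a shell inside the Ruby build environment.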

Hidden secret keys

What else can we do with Multi-Stage Dockerfiles? We can make a private SSH key available to the build stage, while the production image contains no private SSH key at all.
Imagine you have a private Go repository on GitHub, and one of its dependencies is private as well. For go get to work during the build, there are two alternatives:

  • Add a deployment key to your Docker image;
  • Pre-fetch specific dependencies (or all).

Now we can add our key in one stage, fetch every repository we need, and voilà, problem solved!

    # Dockerfile with private ssh key
    FROM golang:alpine AS build

    # The key arrives as a build argument
    ARG SSH_KEY

    # git and the ssh client are not part of the Alpine-based Go image
    RUN apk add --no-cache git openssh-client

    # Store the key as id_rsa: ssh uses that file by default,
    # so no ssh-agent (and no ssh-add) is needed
    RUN mkdir -p /root/.ssh
    RUN echo "${SSH_KEY}" > /root/.ssh/id_rsa
    RUN chmod 0600 /root/.ssh/id_rsa
    RUN ssh-keyscan github.com >> /root/.ssh/known_hosts

    ADD . /go/src/github.com/Yitsushi/myshinyproject
    RUN go install github.com/Yitsushi/myshinyproject

    # Final image
    FROM alpine:latest
    COPY --from=build /go/bin/myshinyproject /usr/local/bin/myshinyproject
    CMD ["/usr/local/bin/myshinyproject"]

The image we’ll receive at the end will be as clean and small as possible. The build is still short and clear.

    ❯ docker build \
        --build-arg SSH_KEY="$(cat ~/.ssh/cheppers_rsa)" \
        -t yitsushi/myshinyproject \
        -f Dockerfile .

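After a local build, we can deploy the final image anywhere without exposing our private key. The key does remain in the layers of the build stage on the build machine, so only the final image should be pushed.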
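
A quick sanity check is to search the layer history of the final image for the build argument. The helper below is only a sketch: the function name image_is_clean is made up, and it merely looks for the literal string SSH_KEY in the output of docker history, so treat it as a smoke test rather than a security audit.

```shell
#!/bin/bash
# image_is_clean IMAGE
# Succeeds (exit status 0) when no layer in the image's history mentions
# SSH_KEY, and fails (exit status 1) otherwise.
image_is_clean() {
    if docker history --no-trunc "$1" | grep -q 'SSH_KEY'; then
        return 1
    fi
    return 0
}
```

For example: image_is_clean yitsushi/myshinyproject && echo "no key in history".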

Update: As Kaji Bikash pointed out, Docker can't see files outside of the build context during the build, so pass the key itself as SSH_KEY instead of a path in SSH_KEY_PATH.


Docker gives us the opportunity to match our testing environment to our production environment as closely as possible. But does this mean that our production environment must contain all the dependencies needed to build our project? It does not, and the Multi-Stage Dockerfile is what makes that possible. The feature is already there, but as I see it, it is undeservedly underused. Perhaps the reason is that few resources are available on the topic, or perhaps it wasn't announced as loudly as some of the other features.

The source code is available on GitHub.
