Optimize with Multi-Stage Dockerfile
Last year in May, Docker 17.05 was released with a feature that is extremely useful but often left unused. What am I talking about? The Multi-Stage Dockerfile. It lets you merge separate Dockerfiles into a single file. In the old days, with single-stage Dockerfiles, we had to maintain multiple files to keep the production image clean. The alternative was to deploy a huge image that contained everything required to build the application.

Let's take a look at a simple example: a single-page website with a static index.html, a single JavaScript file, and a single CSS file. This example is clean and short, but a real project can be huge.
Here is our project directory tree:
docroot
|-- Gemfile
|-- Gemfile.lock
|-- Rakefile
|-- public
|   `-- index.html
`-- src
    |-- css
    |   |-- main.scss
    |   `-- modules
    |       `-- article.scss
    `-- js
        |-- boot.js
        |-- init.js
        `-- modules
            `-- article.js
Our HTML is just a simple empty document:
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width" />
    <title>Sample Project for Multi-Stage Docker build</title>
    <link rel="stylesheet" href="/app.css" />
    <script type="text/javascript" charset="utf-8" src="/app.js"></script>
  </head>
  <body>
    <div id='mainContent'></div>
  </body>
</html>
It's nice and clean. As the project grows, we don't want to write everything in a single CSS file and a single JavaScript file, so we create a tidy directory structure and merge the files before we deploy the site. For CSS we use SCSS because it has a lot of features that can be handy later. To build the project we use simple rake tasks, so we need at least Ruby. For SCSS, we also have to install extra packages such as libffi.
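To make the merge step concrete, here is a minimal shell sketch of what concatenating the JavaScript sources amounts to. It is not the project's Rakefile, which does the real work with rake tasks; the merge_js function name and the load order (boot.js, then the modules, then init.js) are assumptions made for illustration only.

```shell
#!/bin/bash
set -euo pipefail

# merge_js SRC_DIR OUT_FILE
# Concatenates the JavaScript sources into a single output file.
# Assumed order (hypothetical): boot.js, every file in modules/, init.js.
merge_js() {
  local src="$1" out="$2"
  mkdir -p "$(dirname "$out")"
  cat "$src/boot.js" "$src"/modules/*.js "$src/init.js" > "$out"
}
```

Run from inside docroot, `merge_js src/js public/app.js` would produce the app.js that index.html loads. The SCSS side needs a real compiler, which is why the build image has to carry Ruby and the native libraries.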
The old way
What did we do before Multi-Stage Dockerfiles?
Of course, we want the deploy image to be as small as possible, so we create two Dockerfiles: one for building the project and one for deploying it.
First, we build our project:
# Dockerfile.build
FROM ruby:2.5.0-alpine3.7

# Install dependencies for native extensions
RUN apk add --no-cache build-base libffi-dev

# This will be our application root folder
WORKDIR /application

# Copy all the content from docroot
COPY docroot /application

# Build our application
RUN bundle install
RUN bundle exec rake
Now we can create a much simpler image with pre-built artifacts:
# Dockerfile
FROM nginx:latest

COPY ./build-artifact /usr/share/nginx/html
Wow! Nice and clean but how can we deploy our site? What do we need to do to get the final image? Let's create a script so we can eliminate human error in the process:
#!/bin/bash
# build.sh

# Build the "build-image"
docker build -t yitsushi/myshinyproject:build -f ./Dockerfile.build .

# Create a temporary container
docker create --name temp_container yitsushi/myshinyproject:build

# Extract build artifacts
docker cp temp_container:/application/public ./build-artifact

# Delete the temporary container
docker rm -f temp_container

# Build the final image
docker build --no-cache -t yitsushi/myshinyproject:latest -f ./Dockerfile .

# Delete the temporary build-artifact directory
rm -rf ./build-artifact
It's ugly. We need a temporary image, a temporary container created from it, and a temporary local directory to hold the artifacts. We also have to remember to clean up afterwards: the container and the build-artifact directory both have to go.
From here we simply execute our shell script:
❯ ./build.sh
Sending build context to Docker daemon 9.665MB
Step 1/6 : FROM ruby:2.5.0-alpine3.7
---> 308418a1844f
Step 2/6 : RUN apk add --no-cache build-base libffi-dev
---> Using cache
---> 677a75453610
Step 3/6 : WORKDIR /application
---> Using cache
---> 1ba87d6eae13
Step 4/6 : COPY docroot /application
---> f415c072262a
Step 5/6 : RUN bundle install
---> Running in ba26aed32021
Fetching gem metadata from https://rubygems.org/...........
Fetching rake 12.3.0
Installing rake 12.3.0
Using bundler 1.16.1
Fetching ffi 1.9.18
Installing ffi 1.9.18 with native extensions
Fetching rb-fsevent 0.10.2
Installing rb-fsevent 0.10.2
Fetching rb-inotify 0.9.10
Installing rb-inotify 0.9.10
Fetching sass-listen 4.0.0
Installing sass-listen 4.0.0
Fetching sass 3.5.5
Installing sass 3.5.5
Bundle complete! 2 Gemfile dependencies, 7 gems now installed.
Bundled gems are installed into `/usr/local/bundle`
Removing intermediate container ba26aed32021
---> 5e60801a48d2
Step 6/6 : RUN bundle exec rake
---> Running in c685fe9c164a
Removing intermediate container c685fe9c164a
---> cb561d54cd0a
Successfully built cb561d54cd0a
Successfully tagged yitsushi/myshinyproject:build
a82359e3b871dcbfa7acb1f4ba0f0a3d5b576e33d8ff6b56c1317d7799ef2148
temp_container
Sending build context to Docker daemon 10.75kB
Step 1/2 : FROM nginx:latest
---> 3f8a4339aadd
Step 2/2 : COPY ./build-artifact /usr/share/nginx/html
---> f1fab6d81151
Successfully built f1fab6d81151
Successfully tagged yitsushi/myshinyproject:latest

❯ docker images
yitsushi/myshinyproject latest f1fab6d81151 9 minutes ago 108MB
yitsushi/myshinyproject build cb561d54cd0a 9 minutes ago 247MB

❯ docker run --rm -p 8888:80 yitsushi/myshinyproject
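One weakness of a script like build.sh is that a failing step leaves the temporary container and the build-artifact directory behind. The following is a sketch, not the original script, of how a `trap` could make the cleanup automatic. The function names and the overridable DOCKER variable are assumptions added here so the flow can be dry-run without a Docker daemon.

```shell
#!/bin/bash
set -euo pipefail

# Assumption: DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# Remove everything temporary; safe to call even if nothing was created.
cleanup() {
  "$DOCKER" rm -f temp_container >/dev/null 2>&1 || true
  rm -rf ./build-artifact
}

# Same four steps as build.sh, with cleanup registered up front.
build_site() {
  trap cleanup EXIT   # runs on success and on failure
  "$DOCKER" build -t yitsushi/myshinyproject:build -f ./Dockerfile.build .
  "$DOCKER" create --name temp_container yitsushi/myshinyproject:build
  "$DOCKER" cp temp_container:/application/public ./build-artifact
  "$DOCKER" build --no-cache -t yitsushi/myshinyproject:latest -f ./Dockerfile .
}
```

Calling `build_site` at the end of the script performs the build; with `DOCKER=echo` it only prints the commands. Even so, this is still a workaround for something a single Dockerfile should be able to express.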
With Multi-Stage Dockerfile
Where can Multi-Stage Dockerfiles help us? We can eliminate all the unnecessary temporary files and containers. But how? We create only one Dockerfile with multiple FROM statements:
# Dockerfile, the new one
FROM ruby:2.5.0-alpine3.7 as builder

# Install dependencies for native extensions
RUN apk add --no-cache build-base libffi-dev

# This will be our application root folder
WORKDIR /application

# Copy all the content from docroot
COPY docroot /application

# Build our application
RUN bundle install
RUN bundle exec rake

# Here we start a new stage
FROM nginx:latest

COPY --from=builder /application/public /usr/share/nginx/html
What?! Yes, that’s right: we can write a Dockerfile like this. We can define as many stages as we want, but keep in mind that this is not a "build several images at the same time" approach. In the end, only the last stage becomes the final image.
How can we build with this? We don't need our build.sh because it's that simple:
❯ docker build -t yitsushi/myshinyproject .

❯ docker images
yitsushi/myshinyproject latest 5ee4b57eb9e9 9 minutes ago 108MB

❯ docker run --rm -p 8888:80 yitsushi/myshinyproject
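Although only the last stage ends up in the final image, an earlier stage can still be built on its own with the `--target` flag of `docker build`, which is handy for debugging a failing build step. A minimal sketch follows; the build_stage function, the overridable DOCKER variable, and the per-stage tag name are assumptions made for this example.

```shell
#!/bin/bash
set -euo pipefail

# Assumption: DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# build_stage STAGE
# Stops the build at the named stage and tags the result with the stage name.
# The ":<stage>" tag is a hypothetical naming choice for this sketch.
build_stage() {
  local stage="$1"
  "$DOCKER" build --target "$stage" -t "yitsushi/myshinyproject:$stage" .
}
```

For the Dockerfile above, `build_stage builder` would produce an image containing Ruby, the gems, and the compiled assets, which can then be inspected with `docker run --rm -it yitsushi/myshinyproject:builder sh`.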
Hidden secret keys
What else can we do with Multi-Stage Dockerfiles? We can bring a private SSH key into the build stage and use it there, while the production image contains no private SSH key at all.
Imagine you have a private Go repository on GitHub, and one of its dependencies is private as well. For go get to reach them, there are two alternatives:
- Add a deployment key to your Docker image;
- Pre-fetch specific dependencies (or all).
Now we can add our key in a build stage, fetch all the repositories, and voilà, problem solved!
# Dockerfile with private ssh key
FROM golang:alpine as build

# git and the SSH tools are not included in golang:alpine
RUN apk add --no-cache git openssh-client

# Just add the key from build parameter
ARG SSH_KEY
RUN mkdir -p /root/.ssh
# Write the key where ssh looks by default (no agent runs during the build, so ssh-add is not an option)
RUN echo "${SSH_KEY}" > /root/.ssh/id_rsa
RUN chmod 0600 /root/.ssh/id_rsa
RUN ssh-keyscan github.com >> /root/.ssh/known_hosts

ADD . /go/src/github.com/Yitsushi/myshinyproject
RUN go install github.com/Yitsushi/myshinyproject

# Final image
FROM alpine:latest
COPY --from=build /go/bin/myshinyproject /usr/local/bin/myshinyproject
CMD ["/usr/local/bin/myshinyproject"]
The image we’ll receive at the end will be as clean and small as possible. The build is still short and clear.
❯ docker build \
    --build-arg SSH_KEY="$(cat ~/.ssh/cheppers_rsa)" \
    -t yitsushi/myshinyproject \
    -f Dockerfile .
Now we can deploy our image anywhere without exposing our private key after a local build.
Note that the build argument is SSH_KEY (the key itself) instead of SSH_KEY_PATH.
Conclusion
Docker gives us the opportunity to match our testing environment with our production environment as closely as possible. But does this mean that our production environment must contain all the dependencies needed to build our project? With Multi-Stage Dockerfiles, it doesn't have to. The feature is already there, but as I see it, it's undeservedly underutilized. Perhaps the reason is that there are few resources available on the topic, or perhaps it wasn't announced as loudly as some of the other features.
The source code is available on GitHub.