Notes about Dockerfiles: a complete Dockerfile instruction reference and some best practices.
Dockerfiles are text files that list instructions for the Docker daemon to follow when building a container image. When you execute the `docker build` command, the lines in your Dockerfile are processed sequentially to assemble your image in a layered fashion.
The basic Dockerfile syntax looks as follows:
# Comment
INSTRUCTION arguments
Instructions are keywords that apply a specific action to your image, such as copying a file from your working directory, or executing a command within the image.
By convention, instructions are usually written in uppercase. This is not a requirement of the Dockerfile parser, but it makes it easier to see which lines contain a new instruction.
Lines starting with a `#` character are interpreted as comments. They'll be ignored by the parser, so you can use them to document your Dockerfile.
FROM ubuntu:22.04
COPY . /app
RUN make /app
CMD python /app/app.py
In the example above, each instruction creates one layer:
- `FROM` creates a layer from the `ubuntu:22.04` Docker image.
- `COPY` adds files from your Docker client's current directory.
- `RUN` builds your application with `make`.
- `CMD` specifies what command to run within the container.
Note: When you run an image and create a container, you add a new writable layer, also called the container layer, on top of the underlying layers. All changes made to the running container, such as writing new files, modifying existing files, and deleting files, are written to this writable container layer (the writable layer is not persistent; read about that here).
Docker supports over 15 different Dockerfile instructions for adding content to your image and setting configuration parameters. We will discuss all of them.
Docker runs instructions in a Dockerfile in order. A Dockerfile must begin with a `FROM` instruction. (`ARG` is the only instruction that may precede `FROM` in the Dockerfile.)
FROM nginx:latest
- We use `FROM` to specify the base image we want to start from (it can also declare an alias for a multi-stage build).
- The `tag` value is optional. If you omit it, the builder assumes a `latest` tag by default. The builder returns an error if it cannot find the `tag` value.
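A minimal sketch combining a tag and a stage alias (the image tag and stage name here are arbitrary choices for illustration):

```dockerfile
# Explicit tag: omitting ":20-alpine" would make the builder assume ":latest"
# "AS build" gives this stage an alias for use in a multi-stage build
FROM node:20-alpine AS build
```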
MAINTAINER malidkha
- Sets the Author field of the generated images. (`MAINTAINER` is deprecated; prefer a label such as `LABEL maintainer="malidkha"`.)
LABEL name="malidkha"
LABEL email="malidkha.elmasnaoui@gmail.com"
LABEL description="This is a Dockerfile reference"
- Labels are key-value pairs that simply add custom metadata to your Docker images (such as version, release date...).
#shell form
RUN apt-get update -y && apt-get install -y procps
RUN chown -R malikdha:malikdha /var/log/nginx && \
chown -R malikdha:malikdha /etc/nginx
#exec form
RUN ["apt-get", "update", "-y"]
- `RUN` is used to run commands during the image build process (installing packages using apt, npm, composer...).
- Shell form: `RUN COMMAND` (run with `/bin/sh -c` by default)
- Exec form: `RUN ["<executable>", "<param1>", "<param2>"]`
- Change the shell in which to run the command: `RUN ["/bin/bash", "-c", "echo hello from bash"]`
#shell form
CMD nginx -g "daemon off;"
#exec form
CMD ["/usr/sbin/nginx", "-g", "daemon off;"]
- `CMD` executes a command when the container starts.
- Shell form: `CMD <command> <param1> <param2>` (run with `/bin/sh -c` by default)
- Exec form: `CMD ["executable", "param1", "param2"]`
- Change the shell in which to execute the command: `CMD ["/bin/bash", "-c", "echo hello from bash"]`
- `CMD` runs when the container starts; if several are listed, only the last one takes effect.
- We can override the `CMD` instruction using the `docker run` command. If we specify a `CMD` in our Dockerfile and a command on the `docker run` command line, the command line overrides the Dockerfile's `CMD` instruction.
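A sketch of this override behavior (the image tag `cmd-demo` is a hypothetical name used only for illustration):

```dockerfile
FROM ubuntu:22.04

# Default command, used when docker run is given no command
CMD ["echo", "hello from CMD"]

# docker build -t cmd-demo .
# docker run cmd-demo            -> prints "hello from CMD"
# docker run cmd-demo echo bye   -> the CLI command overrides CMD and prints "bye"
```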
#shell form
ENTRYPOINT nginx -g "daemon off;"
#exec form
ENTRYPOINT ["/usr/sbin/nginx", "-g", "daemon off;"]
- `ENTRYPOINT` specifies the command that will execute when the Docker container starts. If you don't specify an `ENTRYPOINT`, shell-form commands fall back to `/bin/sh -c`.
- Shell form: `ENTRYPOINT <command> <param1> <param2>` (run with `/bin/sh -c` by default)
- Exec form: `ENTRYPOINT ["executable", "param1", "param2"]`
- Change the shell in which to execute the command: `ENTRYPOINT ["/bin/bash", "-c", "echo hello from bash"]`
- `ENTRYPOINT` sets a default command that is not overridden by the CLI arguments of `docker run`; those arguments are appended to the entrypoint instead. `ENTRYPOINT` can still be overridden using the `--entrypoint COMMAND` flag with the `docker run` command.
ENTRYPOINT ["/usr/sbin/nginx"]
CMD ["-g", "daemon off;"]
- We can use both `CMD` and `ENTRYPOINT` together: `ENTRYPOINT` is executed first, with `CMD` supplying its default arguments.
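A sketch of how the two combine (the image name `demo` is hypothetical):

```dockerfile
FROM ubuntu:22.04

# ENTRYPOINT fixes the executable; CMD supplies its default arguments
ENTRYPOINT ["echo"]
CMD ["hello from defaults"]

# docker run demo                      -> echo "hello from defaults"
# docker run demo other words          -> echo other words (only CMD is replaced)
# docker run --entrypoint ls demo -l   -> the entrypoint itself is replaced
```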
SHELL ["executable", "parameters"]
#change default shell
SHELL ["/bin/bash", "-c"]
- The `SHELL` instruction allows the default shell used for the shell form of commands to be overridden. The default shell on Linux is `["/bin/sh", "-c"]`.
- The `SHELL` instruction can appear multiple times. Each `SHELL` instruction overrides all previous `SHELL` instructions and affects all subsequent instructions.
- All instructions that use a shell and come after a `SHELL` instruction will use the new shell.
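A common use of `SHELL` is switching to bash with `pipefail`, so that a failure anywhere in a `RUN` pipeline fails the build (a sketch; it assumes `curl` exists in the base image, and the URL is a placeholder):

```dockerfile
FROM ubuntu:22.04

# bash with pipefail for all subsequent shell-form RUN instructions
SHELL ["/bin/bash", "-o", "pipefail", "-c"]

# Without pipefail, a failed download would be masked by the exit status
# of the last command in the pipe
RUN curl -fsSL https://example.com/install.sh | bash
```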
EXPOSE 80/tcp
- `EXPOSE` is used to tell Docker that the container listens on the specified network ports at runtime. TCP is the default if the protocol is not specified.
- This instruction won't open any port, as we can't open a port on a Docker image; it just lets Docker know that the container will listen on the specified port at runtime.
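A sketch of the distinction between documenting a port and publishing it:

```dockerfile
FROM nginx:latest

# Documents that the container listens on 80/tcp; nothing is opened here
EXPOSE 80/tcp

# Publishing happens only at runtime:
# docker run -p 8080:80 <image>   maps host port 8080 to container port 80
# docker run -P <image>           publishes all EXPOSEd ports to random host ports
```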
#ARG Instruction
ARG NAME=malikdha
ARG NAME2
#ENV Instruction
ENV NAME malidkha
ENV NAME2=malidkha
- `ARG` defines build-time variables using key-value pairs. These variables are not accessible when the container is running.
- The `ARG` instruction defines a variable that users can pass at build time to the builder with the `docker build` command, using the `--build-arg <varname>=<value>` flag.
- `ENV` sets environment variables within the image, making them accessible both during the build process and while the container is running.
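A sketch of the build-time/runtime split (the variable names are illustrative):

```dockerfile
FROM ubuntu:22.04

# Build-time only: visible to RUN during the build, gone in the running container
ARG APP_VERSION=1.0.0

# Runtime: baked into the image and visible inside running containers
ENV APP_HOME=/opt/app

# An ARG value can be persisted into the image by assigning it to an ENV
ENV VERSION=$APP_VERSION

RUN echo "building version $APP_VERSION into $APP_HOME"

# docker build --build-arg APP_VERSION=2.0.0 -t arg-demo .
```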
#COPY Instruction
COPY /path/on/host/ /path/on/container/
#ADD Instruction
ADD /path/on/host/ /path/on/container/
- `COPY` is used to copy a file or folder from the host system into the Docker image.
- We can copy the file or folder while setting its ownership on the image, using `COPY --chown=<user>:<group>`.
- `ADD` is an advanced form of `COPY`: it copies new files, directories, or remote file URLs from the host filesystem into the Docker image. `ADD` supports 2 additional sources. First, you can use a URL instead of a local file or directory. Second, you can extract a tar file from the source directly into the destination.
- A valid use case for `ADD` is when you want to extract a local tar file into a specific directory in your Docker image.
- If you're copying local files into your Docker image, always use `COPY`, because it's more explicit.
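A sketch contrasting the two instructions (all paths, archive names, and URLs are placeholders):

```dockerfile
FROM ubuntu:22.04

# COPY with ownership set in one step
COPY --chown=www-data:www-data ./public /var/www/html

# ADD extracts a local tar archive directly into the destination
ADD vendor.tar.gz /opt/vendor/

# ADD can also fetch a remote URL (downloaded as-is, not extracted)
ADD https://example.com/config.json /etc/app/config.json
```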
WORKDIR /var/www/html
- `WORKDIR` is used to set the current working directory.
- The `WORKDIR` instruction can be used multiple times in a Dockerfile. If a relative path is provided, it will be relative to the path of the previous `WORKDIR` instruction.
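A sketch of how relative paths accumulate across multiple `WORKDIR` instructions:

```dockerfile
FROM ubuntu:22.04

WORKDIR /a
WORKDIR b
WORKDIR c

# The working directory is now /a/b/c
RUN pwd
```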
USER malidkha
- `USER` sets the user name or UID to use when running the image and for any `RUN`, `CMD`, and `ENTRYPOINT` instructions that follow it in the Dockerfile.
- If the user does not exist in the container, you may want to add the user first:
RUN groupadd -f malidkha && (id -u malidkha >/dev/null 2>&1 || useradd -g malidkha malidkha)
VOLUME ["/var/log/nginx", "/var/log/php"]
VOLUME /var/log/nginx /var/log/php
- `VOLUME` creates a mount point with the specified name and marks it as holding externally mounted volumes from the native host or other containers.
- Note: By design, the `VOLUME` directive in the Dockerfile only declares that a volume should be created for a specific path in the container at runtime. The name of the volume should never be dictated by an image, because that is something that should be defined at runtime.
ONBUILD RUN composer install --optimize-autoloader --no-dev
- `ONBUILD` registers a trigger instruction that does not run in the current build; it executes later, when the image is used as the base for another build.
- The `ONBUILD` instruction may not trigger `FROM`, `MAINTAINER`, or `ONBUILD` instructions.
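A sketch of the trigger mechanism (the base image name `malidkha/php-base` is hypothetical):

```dockerfile
# Base image, built and pushed as e.g. malidkha/php-base
FROM php:8.2-cli
ONBUILD COPY . /app
ONBUILD RUN composer install --optimize-autoloader --no-dev

# A downstream Dockerfile starting with
#   FROM malidkha/php-base
# runs both ONBUILD triggers right after its FROM, in the child's build context
```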
HEALTHCHECK <options> CMD <command>
HEALTHCHECK --interval=30s --timeout=3s \
CMD curl -f http://localhost/ || exit 1
#Disable any healthcheck inherited from the base image
HEALTHCHECK NONE
- `HEALTHCHECK` checks container health by running a command inside the container (it instructs Docker how to test a container to check that it is still working).
- Whenever a health check passes, the container becomes `healthy`. After a certain number of consecutive failures, it becomes `unhealthy`. You can use the `docker ps` command to check the status.
- The `<options>` that can appear are:
  - `--interval=<duration>` (default: `30s`); in practice this should be relatively long (e.g. `10m`)
  - `--timeout=<duration>` (default: `30s`)
  - `--retries=<number>` (default: `3`)
  - `--start-period=<duration>` (default: `0s`)
  - `--start-interval=<duration>` (default: `5s`)
- The health check will first run `interval` seconds after the container is started, and then again `interval` seconds after each previous check completes. If a single run of the check takes longer than `timeout` seconds, the check is considered to have failed. It takes `retries` consecutive failures of the health check for the container to be considered `unhealthy`.
- `start-period` provides initialization time for containers that need time to bootstrap. Probe failures during that period will not be counted towards the maximum number of retries. However, if a health check succeeds during the start period, the container is considered started and all consecutive failures will be counted towards the maximum number of retries.
- `start-interval` is the time between health checks during the start period.
- There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one, only the last `HEALTHCHECK` will take effect.
- The command's exit status indicates the health status of the container:
  - `0`: success - the container is healthy and ready for use
  - `1`: unhealthy - the container is not working correctly
  - `2`: reserved - do not use this exit code
- The first 4096 bytes of stdout and stderr from the `<command>` are stored and can be queried with `docker inspect`.
- When the health status of a container changes, a `health_status` event is generated with the new status.
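Putting the options together (a sketch; it assumes `curl` is available inside the image):

```dockerfile
FROM nginx:latest

# Probe every 30s; tolerate failures for the first 60s while the service boots;
# each probe must answer within 5s; 3 consecutive failures => unhealthy
HEALTHCHECK --interval=30s --timeout=5s --start-period=60s --retries=3 \
  CMD curl -f http://localhost/ || exit 1

# Check the result:
# docker inspect --format '{{.State.Health.Status}}' <container>
```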
Multi-stage builds allow us to create a Dockerfile that defines multiple stages for building an image. Each stage can have its own set of instructions and build context, which means that the resulting image can be optimized for size and performance. In a multi-stage build, each stage produces an intermediate image that is used as the build context for the next stage. The final stage produces the image that will be used to run the application.
The key advantage of using multi-stage builds is that it allows developers to reduce the size of the final image. By breaking down the build process into smaller stages, it becomes easier to remove unnecessary files and dependencies that are not needed in the final image. This can significantly reduce the size of the image, which can lead to faster deployment times and lower storage costs.
In addition to reducing image size, multi-stage builds can also improve build performance. By breaking down the build process into smaller stages, Docker can cache the intermediate images and reuse them if the source code or dependencies haven't changed. This can lead to faster builds and shorter development cycles. Read about docker best practices here
With multi-stage builds, you use multiple `FROM` statements in your Dockerfile. Each `FROM` instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don't want in the final image.
The following Dockerfile has two separate stages: one for building a binary, and another into which the binary is copied.
# Build executable stage
FROM golang
ADD . /app
WORKDIR /app
RUN go build
ENTRYPOINT /app/app
# Build final image
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /app/app .
CMD ["./app"]
The end result is a tiny production image with nothing but the binary inside. None of the build tools required to build the application are included in the resulting image.
How does it work? The second `FROM` instruction starts a new build stage with the `alpine` image as its base. The `COPY --from=0` line copies just the built artifact from the previous stage into this new stage. The Go SDK and any intermediate artifacts are left behind, and not saved in the final image.
By default, the stages aren't named, and you refer to them by their integer number, starting with `0` for the first `FROM` instruction. However, you can name your stages by adding an `AS <NAME>` to the `FROM` instruction.
Let's take a look at the previous example with the named build stages
# Build executable stage
FROM golang AS build
ADD . /app
WORKDIR /app
RUN go build
ENTRYPOINT /app/app
# Build final image
FROM alpine:latest as final
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=build /app/app .
CMD ["./app"]
When you build your image, you don't necessarily need to build the entire Dockerfile, including every stage. You can specify a target build stage. The following command assumes you are using the previous Dockerfile but stops at the stage named `build`:
$ docker build --target build -t test .
A few scenarios where this might be useful are:
- Debugging a specific build stage
- Using a `debug` stage with all debugging symbols or tools enabled, and a lean `production` stage
- Using a `testing` stage in which your app gets populated with test data, but building for production using a different stage which uses real data
When using multi-stage builds, you aren't limited to copying from stages you created earlier in your Dockerfile. You can use the `COPY --from` instruction to copy from a separate image, either using the local image name, a tag available locally or on a Docker registry, or a tag ID. The Docker client pulls the image if necessary and copies the artifact from there. The syntax is:
COPY --from=nginx:latest /etc/nginx/nginx.conf /nginx.conf
You can pick up where a previous stage left off by referring to it when using the `FROM` directive. For example:
FROM ubuntu AS base
RUN echo "base"
FROM base AS stage1
RUN echo "stage1"
FROM base AS stage2
RUN echo "stage2"
A `.dockerignore` is a configuration file that describes files and directories that you want to exclude when building a Docker image.
Usually, you put the Dockerfile in the root directory of your project, but there may be many files in the root directory that are not related to the Docker image, or that you do not want to include. `.dockerignore` is used to specify such unwanted files so they are not included in the Docker image.
The `.dockerignore` file is helpful to avoid inadvertently sending files or directories that are large or contain sensitive information to the daemon, and to avoid adding them to the image using the `ADD` or `COPY` instructions. These are some benefits of using such a file:
If you have frequently updated files (git history, test results, etc.) in your working directory, the build cache is invalidated every time you run `docker build`, so including such files in the context makes each build take a long time. You can prevent these files from invalidating the cache by listing them in `.dockerignore`.
By specifying files not to be included in the context in `.dockerignore`, the size of the image can be reduced. Reducing the size of the Docker image has the following advantages. These benefits are important because the more instances of a service you have, such as microservices, the more opportunities you have to exchange Docker images.
- Faster speed when doing `docker pull/push`.
- Faster speed when building Docker images.
Examples: it is recommended to add directories such as `node_modules`, `vendor`... to `.dockerignore` because of their large file size.
If a Docker image contains sensitive information such as credentials, it becomes a security issue. For example, uploading a Docker image containing credential files such as `.aws` or `.env` to a public repository such as Docker Hub will cause that credential information to be compromised.
Don't forget to exclude the `.git` directory at this point. If you have committed sensitive information in the past and have not erased it, it can cause serious problems. Git history is not required in Docker images, so be sure to include the directory in your `.dockerignore` file.
Before sending the Docker build context (the Dockerfile, the files you want to include in the Docker image, and anything else needed for the build) to the Docker daemon, the CLI looks for a file named `.dockerignore` in the root directory of the build context. If this file exists, the CLI excludes from the context any files or directories that match the patterns written in it. Therefore, the files and directories described in the `.dockerignore` file will not be included in the final built Docker image. An example `.dockerignore` file:
**/node_modules/
npm-debug.log
.env
.aws
.editorconfig
.git
.gitignore
LICENSE
README.md
src/vendor
**/*.swp
Dockerfile
docker-compose*