Basic - Part 2
Data persistence
Data Types
Different types of data can be stored in images and containers. While images are read-only, containers are read-write and can store data temporarily. In volumes, data can persist even after containers are stopped and deleted.
Data persistence problems
Each container has its own filesystem. When we remove a container (for example, one started with the --rm flag), its filesystem and all the data in it are deleted with it, so we cannot retrieve files from a removed container. However, data does persist across starting and stopping a container, since the container itself still exists.
The problem arises when we want to keep data from an application even after the container is removed, for example user accounts or files in a filesystem. Rebuilding the container from the same image will not bring the data back, because the data lived inside the (read-write) container and not in the read-only image.
Volumes can help with keeping data even if containers get removed.
Volumes
Volumes are mounted shares hosted on our local machine, and they solve the data-persistence problem: since volumes live outside the container, they still exist when the container is deleted. Containers can read from and write to volumes. Volumes are managed by Docker, which implies that we do not know the path on our host where the data is actually stored.
There exist two kinds of volumes:
Anonymous volumes
The volume and its data are deleted when the container is removed, so anonymous volumes do not help with data persistence.
An anonymous volume can be specified in the Dockerfile as follows:
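A minimal Dockerfile sketch (the base image and paths are illustrative, assuming the Node.js feedback app used throughout this section):

```docker
FROM node:14
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 80
# Anonymous volume: data written to /app/feedback survives container
# stop/start, but is deleted together with the container.
VOLUME [ "/app/feedback" ]
CMD [ "node", "server.js" ]
```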
Here, "/app/feedback" is a directory located inside the container; it is the directory that contains the data we want to save.
Named volumes
Volume and data persist even if the container is removed.
In both anonymous and named volumes, the data is stored in a directory on our host machine that we do not know; we cannot directly delete, edit, or inspect the data saved in volumes!
We can specify a named volume when running a container using the -v flag. When we stop this container, it will be removed automatically because the command contains the --rm flag.
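A sketch of such a command (the image name feedback-app is an assumption):

```bash
# "feedback" is the volume name; "/app/feedback" is the path inside
# the container. --rm removes the container when it stops,
# but the named volume survives.
docker run -d --rm -p 3000:80 -v feedback:/app/feedback feedback-app
```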
Listing existing volumes
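The existing volumes can be listed with:

```bash
docker volume ls
```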
However, we can observe that the volume containing our data did not disappear.
Running another container that specifies the same named volume allows us to retrieve our data.
Bind Mounts
Bind mounts are the solution when we need to store persistent data that must also be directly editable. For example, we would put our source code in a bind mount so that any change made to the source code is reflected in the app running inside the container, without rebuilding the image. Unlike volumes, bind mounts are managed by the developer: we specify the path on our host where the persistent data is stored.
The solution for data that need to be edited!
Creating a simple bind mount
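A sketch of a run command with a bind mount (the host path and image name are placeholders):

```bash
# Absolute host path on the left, container path on the right.
docker run -d --rm -p 3000:80 \
  -v /path/to/project:/app \
  feedback-app
```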
Run a container with multiples volumes and bind mounts
The following command will fail due to an overwriting issue
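A command of this shape triggers the problem (host path and image name are illustrative):

```bash
# Fails at runtime: the bind mount shadows everything in /app,
# including the node_modules folder created by "RUN npm install"
# during the image build.
docker run -d --rm -p 3000:80 \
  -v feedback:/app/feedback \
  -v /path/to/project:/app \
  feedback-app
```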
When we create a bind mount from files and folders on our host, we overwrite the files that were copied into the image by the Dockerfile instructions: everything copied during the build of the image is shadowed by the contents of the bind mount. This can cause an issue when running the container.
To solve this issue, we need to create an anonymous volume that prevents this overwriting between our host files and the files inside the container. Docker evaluates all volume paths, and when paths clash, the longest (most specific) internal path wins. For example, between /app and /app/node_modules, the second path wins and survives, ensuring it is not overwritten by the /app folder mounted from our host.
The /app/node_modules folder in the command below is created during the npm installation and contains all dependencies.
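A full command combining all three mounts (host path and image name are illustrative):

```bash
# 1) named volume: persists the feedback data
# 2) bind mount: live source code from the host
# 3) anonymous volume: protects /app/node_modules from being shadowed
docker run -d --rm -p 3000:80 \
  -v feedback:/app/feedback \
  -v /path/to/project:/app \
  -v /app/node_modules \
  feedback-app
```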
In the command above, the three -v flags do the following:
-v (first): persists the feedback entries that have already been created, even if the container is removed
-v (second): lets us change the source code (HTML/CSS) of the application and see the changes instantly on reload, without rebuilding the image
-v (third): prevents /app/node_modules from being overwritten when the bind mount is created. Without this volume we would run into missing-dependency errors, since the /app/node_modules folder does not exist on our host.
Shortcuts
To specify the current directory
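The shell can substitute the current directory into the bind-mount path (image name is illustrative):

```bash
# macOS / Linux
docker run -v "$(pwd)":/app feedback-app

# Windows (cmd)
docker run -v "%cd%":/app feedback-app
```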
Nodemon running in a container
nodemon is a very useful package during the Node.js development phase. By default, Node.js does not reflect our changes instantly when we modify server-side code: we have to restart the server. nodemon is an npm package that facilitates development by restarting the server automatically whenever our Node.js code changes.
To include nodemon in our project, we need to add the following lines to our package.json file:
We include the nodemon package in a devDependencies section of our package.json file. The start script tells npm to start our server using nodemon.
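A sketch of the relevant package.json sections (the version number and entry file are illustrative):

```json
{
  "scripts": {
    "start": "nodemon server.js"
  },
  "devDependencies": {
    "nodemon": "^2.0.4"
  }
}
```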
In the Dockerfile, we will replace the following instruction:
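For example, running the app through npm so the nodemon-based start script is used:

```docker
# Before: CMD [ "node", "server.js" ]
# After: start via npm, which runs the "start" script (nodemon)
CMD [ "npm", "start" ]
```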
Issue with nodemon using Docker on Windows with WSL2
For changes to be propagated into our container by nodemon, the source code needs to be located inside the WSL2 filesystem, not directly on the Windows filesystem.
Two options are possible: embrace WSL2 as our default development environment, or mount our Windows folder inside WSL.
A dirtier option that works is using nodemon's -L (legacy polling) flag.
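For instance, in the package.json start script (fragment):

```json
{
  "scripts": {
    "start": "nodemon -L server.js"
  }
}
```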
Read-Only Volumes
To prevent the application running inside the container from having write access to the files on our host.
By default, volumes are created with Read and Write accesses
Specify a Read-Only container
We can append :ro to the volume specification to make that volume read-only. If we add an additional volume with a path deeper in the hierarchy than /app, this second volume overrides the first one: the /app/temp volume keeps read and write permissions.
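A sketch of such a command (host path and image name are illustrative):

```bash
# The bind mount is read-only (:ro); the deeper anonymous volumes
# re-enable writes where the container must write.
docker run -d --rm -p 3000:80 \
  -v feedback:/app/feedback \
  -v /path/to/project:/app:ro \
  -v /app/temp \
  -v /app/node_modules \
  feedback-app
```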
.dockerignore
Like .gitignore, the .dockerignore file specifies which files and folders we do not want copied into the image by the COPY instruction in the Dockerfile.
Usually we want to avoid copying these files/folders, as they are not required to run our app:
```
.git
Dockerfile
/node_modules
```
ENV variables
Docker allows us to specify environment variables that can be used inside the container environment and set at run time.
These variables can be declared in the Dockerfile with the ENV instruction. In the Dockerfile below, we create the PORT environment variable and set its default value to 80. The variable is then reused in the EXPOSE instruction.
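The relevant Dockerfile lines might look like:

```docker
# Default value; can be overridden at run time with -e PORT=...
ENV PORT 80
EXPOSE $PORT
```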
Moreover, if we want the value of this environment variable to be set dynamically by the user when running the container, we read the environment variable in the source code of server.js.
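A sketch of the server.js side (assuming an Express-style app object):

```javascript
// Use the PORT environment variable, falling back to 80 if unset.
app.listen(process.env.PORT || 80);
```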
Running the container
The value of the environment variable can be specified using --env or -e followed by the key-value pair.
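For example (image name is illustrative; note that the published container port must match the new value):

```bash
docker run -d --rm -p 3000:8000 -e PORT=8000 feedback-app
```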
.env file
With multiple environment variables it is more convenient to use a .env file containing the key-value pairs. When running the container, we point to that file's location using --env-file <file_path>.
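Assuming a .env file in the project root containing lines such as PORT=8000, the container can be started with:

```bash
docker run -d --rm -p 3000:8000 --env-file ./.env feedback-app
```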
Build ARG
ARG can be used to build an image with specific configuration values. It is a build-time instruction.
ARG values are declared in the Dockerfile with the ARG instruction. Here, we set the default value of the DEFAULT_PORT variable to 80, then use the DEFAULT_PORT arg to set the PORT environment variable.
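The corresponding Dockerfile fragment might look like:

```docker
# Build-time argument with a default value, overridable via --build-arg
ARG DEFAULT_PORT=80
# Expose the build-time default as a run-time environment variable
ENV PORT $DEFAULT_PORT
EXPOSE $PORT
```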
An ARG value cannot be used in a runtime instruction (such as CMD) or inside the application source code; it exists only during the build.
Building the image
The utility of the ARG instruction is that, from the same Dockerfile, we can build two images with two different configurations, here using two different ports.
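For example (image names and tags are illustrative):

```bash
# Default configuration (DEFAULT_PORT=80)
docker build -t feedback-app:web .

# Alternate configuration, overriding the build argument
docker build -t feedback-app:dev --build-arg DEFAULT_PORT=8000 .
```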