Basic - Part 2
Data persistence
Data Types
Different types of data can be stored in images and containers. While images are read-only, containers are read-write and can store data temporarily. In volumes, data can persist even after containers are stopped and deleted.
Data persistence problems
Each container has its own filesystem. When we remove a container (for example, one started with the --rm flag), its filesystem and all the data in it are deleted with it, so we cannot retrieve files from a removed container. However, data does persist across starting and stopping a container, since the container itself still exists.
The problem arises when we want to keep data from an application even after the container is removed, for example user accounts or files in a filesystem. Rebuilding the container from the same image will not bring the data back, because the data lived inside the (read-write) container and not in the read-only image.
Volumes can help with keeping data even if containers get removed.
Volumes
Volumes are mounted shares hosted on our local machine, and they solve the data-persistence problem: since volumes live outside the container, they still exist when the container is deleted. Containers can read from and write to volumes. Volumes are managed by Docker, which implies that we do not know the path on our host where the data is actually stored.
There exist two kinds of volumes:
Anonymous volumes
The volume and its data are deleted when the container is removed, so anonymous volumes do not help with data persistence.
An anonymous volume can be specified in the Dockerfile as follows:
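A minimal Dockerfile sketch (the base image and paths are illustrative, assuming the Node.js feedback app used throughout this section):

```docker
FROM node:14
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 80
# Anonymous volume: data written to /app/feedback survives container
# stop/start, but is deleted together with the container.
VOLUME [ "/app/feedback" ]
CMD [ "node", "server.js" ]
```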
Here, "/app/feedback" is a directory located inside the container; it is the directory that contains the data we want to save.
Named volumes
Volume and data persist even if the container is removed.
In both anonymous and named volumes, the data is stored in a directory on our host machine that we do not know; we cannot directly delete, edit, or inspect the data saved in volumes!
We can specify a named volume when running a container using the -v flag. When we stop this container, it will be removed automatically because the command contains the --rm flag.
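A sketch of such a command (the image name feedback-app is an assumption):

```bash
# "feedback" is the volume name; "/app/feedback" is the path inside
# the container. --rm removes the container when it stops,
# but the named volume survives.
docker run -d --rm -p 3000:80 -v feedback:/app/feedback feedback-app
```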
Listing existing volumes
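The existing volumes can be listed with:

```bash
docker volume ls
```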
However, we can observe that the volume containing our data did not disappear.
Running another container that specifies the same named volume allows us to retrieve our data.
Bind Mounts
Bind mounts are the solution when we need to store persistent data that must also be directly editable. For example, we would put our source code in a bind mount so that any change made to the source code is reflected in the app running inside the container, without rebuilding the image. Unlike volumes, bind mounts are managed by the developer: we specify the path on our host where the persistent data is stored.
The solution for data that need to be edited!
Creating a simple bind mount
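A sketch of a run command with a bind mount (the host path and image name are placeholders):

```bash
# Absolute host path on the left, container path on the right.
docker run -d --rm -p 3000:80 \
  -v /path/to/project:/app \
  feedback-app
```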
Run a container with multiples volumes and bind mounts
The following command will fail due to an overwriting issue
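A command of this shape triggers the problem (host path and image name are illustrative):

```bash
# Fails at runtime: the bind mount shadows everything in /app,
# including the node_modules folder created by "RUN npm install"
# during the image build.
docker run -d --rm -p 3000:80 \
  -v feedback:/app/feedback \
  -v /path/to/project:/app \
  feedback-app
```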
When we create a bind mount from files and folders on our host, we overwrite the files that were copied into the image by the Dockerfile instructions: everything copied during the build of the image is shadowed by the contents of the bind mount. This can cause an issue when running the container.
To solve this issue, we need to create an anonymous volume that prevents this overwriting between our host files and the files inside the container. Docker evaluates all volume paths, and when paths clash, the longest (most specific) internal path wins. For example, between /app and /app/node_modules, the second path wins and survives, ensuring it is not overwritten by the /app folder mounted from our host.
The /app/node_modules folder in the command below is created during the npm installation and contains all dependencies.
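A full command combining all three mounts (host path and image name are illustrative):

```bash
# 1) named volume: persists the feedback data
# 2) bind mount: live source code from the host
# 3) anonymous volume: protects /app/node_modules from being shadowed
docker run -d --rm -p 3000:80 \
  -v feedback:/app/feedback \
  -v /path/to/project:/app \
  -v /app/node_modules \
  feedback-app
```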
In the command above, the three -v flags do the following:
-v (first): persists the feedback entries that have already been created, even if the container is removed
-v (second): lets us change the source code (HTML/CSS) of the application and see the changes instantly on reload, without rebuilding the image
-v (third): prevents /app/node_modules from being overwritten when the bind mount is created. Without this volume we would run into missing-dependency errors, since the /app/node_modules folder does not exist on our host.
Shortcuts
To specify the current directory
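The shell can substitute the current directory into the bind-mount path (image name is illustrative):

```bash
# macOS / Linux
docker run -v "$(pwd)":/app feedback-app

# Windows (cmd)
docker run -v "%cd%":/app feedback-app
```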
Nodemon running in a container
nodemon is a very useful package during the Node.js development phase. By default, Node.js does not reflect our changes instantly when we modify server-side code: we have to restart the server. nodemon is an npm package that facilitates development by restarting the server automatically whenever our Node.js code changes.
To include nodemon in our project, we need to add the following lines to our package.json file:
We include the nodemon package in a devDependencies section of our package.json file. The start script tells npm to start our server using nodemon.
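A sketch of the relevant package.json sections (the version number and entry file are illustrative):

```json
{
  "scripts": {
    "start": "nodemon server.js"
  },
  "devDependencies": {
    "nodemon": "^2.0.4"
  }
}
```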
In the Dockerfile, we will replace the following instruction:
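For example, running the app through npm so the nodemon-based start script is used:

```docker
# Before: CMD [ "node", "server.js" ]
# After: start via npm, which runs the "start" script (nodemon)
CMD [ "npm", "start" ]
```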
Issue with nodemon using Docker on Windows with WSL2
For changes to be propagated into our container by nodemon, the source code needs to be located inside the WSL2 filesystem, not directly on the Windows filesystem.
Two options are possible: embrace WSL2 as our default development environment, or mount our Windows folder inside WSL.
A dirtier option that works is using nodemon's -L (legacy polling) flag.
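For instance, in the package.json start script (fragment):

```json
{
  "scripts": {
    "start": "nodemon -L server.js"
  }
}
```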
Read-Only Volumes
To prevent the application running inside the container from having write access to the files on our host.
By default, volumes are created with Read and Write accesses
Specify a Read-Only container
We can append :ro to the volume specification to make that volume read-only. If we add an additional volume with a path deeper in the hierarchy than /app, this second volume overrides the first one: the /app/temp volume keeps read and write permissions.
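A sketch of such a command (host path and image name are illustrative):

```bash
# The bind mount is read-only (:ro); the deeper anonymous volumes
# re-enable writes where the container must write.
docker run -d --rm -p 3000:80 \
  -v feedback:/app/feedback \
  -v /path/to/project:/app:ro \
  -v /app/temp \
  -v /app/node_modules \
  feedback-app
```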
.dockerignore
Like .gitignore, the .dockerignore file specifies which files and folders we do not want copied into the image by the COPY instruction in the Dockerfile.
Usually we want to avoid copying these files/folders, as they are not required to run our app:
```
.git
Dockerfile
/node_modules
```
ENV variables
Docker allows us to specify environment variables that can be used inside the container environment and set at run time.
These variables can be declared in the Dockerfile with the ENV instruction. In the Dockerfile below, we create the PORT environment variable and set its default value to 80. The variable is then reused in the EXPOSE instruction.
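The relevant Dockerfile lines might look like:

```docker
# Default value; can be overridden at run time with -e PORT=...
ENV PORT 80
EXPOSE $PORT
```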
Moreover, if we want the value of this environment variable to be set dynamically by the user when running the container, we read the environment variable in the source code of server.js.
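A sketch of the server.js side (assuming an Express-style app object):

```javascript
// Use the PORT environment variable, falling back to 80 if unset.
app.listen(process.env.PORT || 80);
```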
Running the container
The value of the environment variable can be specified using --env or -e followed by the key-value pair.
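For example (image name is illustrative; note that the published container port must match the new value):

```bash
docker run -d --rm -p 3000:8000 -e PORT=8000 feedback-app
```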
.env file
With multiple environment variables it is more convenient to use a .env file containing the key-value pairs. When running the container, we point to that file's location using --env-file <file_path>.
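Assuming a .env file in the project root containing lines such as PORT=8000, the container can be started with:

```bash
docker run -d --rm -p 3000:8000 --env-file ./.env feedback-app
```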
Build ARG
ARG can be used to build an image with specific configuration values. It is a build-time instruction.
ARG values are declared in the Dockerfile with the ARG instruction. Here, we set the default value of the DEFAULT_PORT variable to 80, then use the DEFAULT_PORT arg to set the PORT environment variable.
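The corresponding Dockerfile fragment might look like:

```docker
# Build-time argument with a default value, overridable via --build-arg
ARG DEFAULT_PORT=80
# Expose the build-time default as a run-time environment variable
ENV PORT $DEFAULT_PORT
EXPOSE $PORT
```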
An ARG value cannot be used in a runtime instruction (such as CMD) or inside the application source code; it exists only during the build.
Building the image
The utility of the ARG instruction is that, from the same Dockerfile, we can build two images with two different configurations, here using two different ports.
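For example (image names and tags are illustrative):

```bash
# Default configuration (DEFAULT_PORT=80)
docker build -t feedback-app:web .

# Alternate configuration, overriding the build argument
docker build -t feedback-app:dev --build-arg DEFAULT_PORT=8000 .
```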