Lab 04 - Docker + VSCode
Dockerization + VSCode

1. Activity Identity
| Activity title | Introduction to Robotics |
|---|---|
| Topic | Docker / DevOps / IDE |
| Authors | Dominik Belter, Jakub Chudzinski, Marcin Czajka, Kamil Młodzikowski (Institute of Robotics and Machine Intelligence) |
| Target learners | Bachelor (Computer Science / IT, Robotics) |
| Estimated duration | 1.5 hours |
| Difficulty level | Beginner |
| FOSSBot environment | Linux workstation |
| Licence | CC BY 4.0 |
2. Learning Objectives and Competences
| ID | Learning outcome | Related competences | Assessment evidence |
|---|---|---|---|
| LO1 | Students will be able to pull images and start, inspect, enter and stop containers using basic Docker CLI commands (pull, run, ps, exec, stop, rm). | Knowledge of containerisation tools; selecting programming tools | Screenshot of docker ps and curl against a running container (Submission item 1) |
| LO2 | Students will be able to write a Dockerfile and build a custom image that packages a Python application together with its dependencies. | Selecting programming tools; using libraries for designing robot software components | Screenshot of the built image (Submission item 2) |
| LO3 | Students will be able to use docker-compose to run a multi-service setup and use VSCode Dev Containers to develop inside a container. | Selecting programming tools; integrating tooling for robot software development | Screenshots of docker compose ps -a and the VSCode dev container (Submission items 3 and 4) |
3. Prerequisites
A workstation running Linux with a working network connection.
Basic computer literacy: comfortable using a keyboard and mouse, opening applications, capturing screenshots.
Basic terminal skills (Lab 1 covers everything you need).
4. Required Material and Setup
| Category | Item | Version / Quantity | Notes |
|---|---|---|---|
| Hardware | Workstation | 1 per student | Any Linux PC. |
| Software | Docker Engine | pre-installed on the lab workstations | Lab 4 assumes you can run docker without sudo. |
| Software | VSCode + Dev Containers extension | pre-installed on the lab workstations | The extension is published as ms-vscode-remote.remote-containers. |
| Software | git | bundled with most Linux distributions | Used to clone the starter repository. |
| Starter code | fossbot-text-to-cmd | from GitHub | Contains the application you will containerise. Pull a fresh clone in Step 1. |
| Hardware | NVIDIA GPU + container toolkit (optional) | only used in Step 7 | Required only for the GPU bonus step. Skip if not available. |
5. Safety, Ethics and Accessibility Notes
The only risks in this lab are operational:
- docker run pulls images from public registries. Only run images you trust (the lab uses official images from Docker Hub).
- docker system prune and docker volume rm permanently delete data. Read every destructive command before pressing Enter.
- Bind mounts expose part of your host filesystem to the container. A misbehaving (or malicious) program inside the container can modify those files.
6. Scenario and Problem Statement
In Lab 3 you built a command-line application that translates
natural-language commands into wheel motor speeds. It runs locally
inside a venv with several Python dependencies
(scikit-learn, sentence-transformers,
torch). Distributing it to a colleague means asking them to
install the right Python version and the right libraries on their own
machine - a step that breaks more often than not in practice.
In this lab you will package the same application into a Docker image so that anyone with Docker installed can run it with a single command. You will then learn how to:
- Develop inside a container using VSCode Dev Containers, so your editor uses the container’s Python and libraries (no host-side venv needed).
- Compose multiple services together with docker-compose.
- Pass GPU access through to a container - the standard pattern for AI workloads.
7. Lab Workflow
| Phase | Student action | Expected output | Time |
|---|---|---|---|
| 1. Setup | Verify Docker, clone the starter | docker --version works; starter cloned | 5 min |
| 2. Concepts | Read about how containers differ from VMs | Working mental model of containers | 10 min |
| 3. First container | Run hello-world and an interactive Ubuntu shell | Two containers run successfully | 10 min |
| 4. Build image | Write a Dockerfile for the text-to-cmd app | A built image runs the application | 15 min |
| 5. Volumes & bind mounts | Mount input / output directories into the container | Container reads and writes host files | 15 min |
| 6. docker-compose | Wire two services together with a compose file | docker compose up starts everything | 10 min |
| 7. GPU passthrough (optional) | Run a container with --gpus all | nvidia-smi works inside the container | 5 min |
| 8. VSCode Dev Containers | Create .devcontainer/devcontainer.json and reopen in container | VSCode runs inside the image | 15 min |
| 9. Bonus: run on the FOSSBot (optional) | Ship the image to the robot and run it there | The classifier produces JSON on the robot | 5 min |
| 10. Cleanup | Remove containers, images, starter directory | Clean /tmp and Docker state | 3 min |
| 11. Reflection | Answer the analysis questions | Short answers | 2 min |
8. Step-by-Step Instructions
Step 1 - Environment preparation
💡 Lab workstation credentials. Every workstation in the lab uses the same local account: username put, password lrm.
- Log in to your lab workstation and open a terminal (Ctrl+Alt+T on Ubuntu).
- Clean up state from any previous lab session. Remove leftover screenshots, any starter directory from a previous run, and any Docker artifacts that this lab will (re)create. This matches what Step 10 at the end of the lab tears down, so if the previous user ran their cleanup properly most parts will be no-ops:
docker compose -p fossbot-text-to-cmd down --volumes 2>/dev/null; \
docker rm -f myweb 2>/dev/null; \
docker image rm fossbot-text-to-cmd:latest fossbot-text-to-cmd:gpu 2>/dev/null; \
docker image rm $(docker images --filter "reference=vsc-fossbot-text-to-cmd*" -q) 2>/dev/null; \
rm -rf ~/Pictures/Screenshots /tmp/fossbot-text-to-cmd /tmp/host-output /tmp/host-input.txt
The ; chains the sub-commands so each one runs even if a previous one had nothing to remove, and 2>/dev/null silences the “no such container / no such image” messages on a fresh workstation.
- Verify that Docker is installed and that you can use it without sudo:
docker --version
docker info
Both commands should print useful output without asking for a password. The first prints the Docker version; the second dumps a summary of the running daemon, including how many images and containers are currently on this workstation.
If Docker is not installed (reference only - the lab workstations come with it)
Follow the official Ubuntu install guide. The headline steps are:
# Remove any older versions
sudo apt remove docker docker-engine docker.io containerd runc
# Install prerequisites
sudo apt update
sudo apt install -y ca-certificates curl gnupg lsb-release
# Add Docker's official GPG key and repository
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Allow your user to run docker without sudo
sudo usermod -aG docker $USER
# Log out and back in for the group change to take effect.
- Clone the starter repository into /tmp:
cd /tmp
git clone https://github.com/LRMPUT/fossbot-text-to-cmd.git
cd fossbot-text-to-cmd
💡 Tip: This is the same repository used in Lab 3. If you completed Lab 3, the classifiers in src/ are exactly the application we will containerise. If you skipped Lab 3 or did not finish, copy the reference solutions on top of the skeleton:
cp _solutions/classifier_sklearn.py src/classifier_sklearn.py
cp _solutions/classifier_st.py src/classifier_st.py
Expected result: docker --version
prints a version string, docker info runs without errors,
and your prompt is inside the cloned fossbot-text-to-cmd/
directory.
Step 2 - How Docker actually works
This step is a short conceptual read - no commands to run yet. The goal is to give you a working mental model of what a container is, what it is not, and why this matters in practice.
Key terms
- Image: a packaged filesystem snapshot plus metadata (entrypoint, default command, environment variables). It is read-only and reusable.
- Container: a running instance of an image. You can start, stop and remove containers; they leave the image untouched.
- Layer: an image is built up from stacked filesystem layers. Each instruction in a Dockerfile produces one layer. Layers are cached and shared between images.
- Registry: a server that stores images (e.g. Docker Hub at https://hub.docker.com).
Containers vs virtual machines
Both let you run “another system” on top of your host, but they are optimised for different things and have different trade-offs. Neither is universally better.
| Aspect | Virtual Machine | Docker container |
|---|---|---|
| What is virtualised | The whole computer, including its own kernel | Just the userspace - applications and libraries |
| Guest OS | Any OS (Linux, Windows, BSD, …) regardless of host | Same family as host - on Linux you run Linux containers |
| Isolation strength | Strong - hypervisor enforces separate kernels and memory | Process-level - all containers share the host kernel |
| Resource overhead | Higher - each VM boots and runs its own OS | Lower - no kernel to boot, no driver stack to load |
| Configuration | OS installer + manual setup, or a pre-built image | A short text recipe (Dockerfile) |
| Persistence model | VM keeps its disk and state across reboots | Containers are short-lived by default; persistent data lives in volumes |
| Networking | Each VM gets its own virtual network adapter | Containers share the host kernel’s network stack, with namespaces for isolation |
Where Docker has a clear advantage over virtual machines
- Describe an environment as a short text recipe. A Dockerfile is a readable list of commands; the same recipe always builds the same image. With a VM you usually install an OS and click through configuration screens, then snapshot the result - much harder to keep in version control or to diff between two versions.
- Share and reuse parts of an environment. Docker images are made of layers; if two images share a base layer it is stored once. You can pull a 200 MB image even if it conceptually contains “all of Ubuntu” because the Ubuntu layer is reused. VM images are monolithic disks - you copy the whole thing every time.
- Start a fresh, isolated environment in milliseconds. A container is just a process tree with its own filesystem view; there is no kernel to boot. That makes it cheap to create a new container per test, per build, per pull request.
- Run many lightweight instances side by side. Because containers share the host kernel and start fast, you can run tens or hundreds on a single workstation - one per microservice, one per worker, one per CI job. Trying the same with VMs would exhaust RAM.
- Distribute applications as a single artefact. A
Docker image bundles the app, the Python version, the libraries, the
system packages and the configuration. Anyone with Docker can run it
with one command - no
pip install, no “works on my machine” problems.
In one sentence: a VM virtualises the machine; a container packages the application’s environment.
What is actually inside an Ubuntu image?
When you docker pull ubuntu:24.04 you get the Ubuntu
userspace - the filesystem layout, bash,
apt, glibc, all the standard utilities. You do
not get the Linux kernel. Containers share the kernel
of the host.
That is why containers start in milliseconds: there is no kernel to boot.
Concrete consequence: an Ubuntu container running on top of Ubuntu 24.04 sees the host’s kernel:
docker run --rm ubuntu:18.04 uname -a
# Linux ...something... 6.8.0-117-generic ... (the host's kernel, not 18.04's)
The bash and apt inside the container come
from Ubuntu 18.04, but uname reports the host’s kernel
version. We will run this command for real in Step 3.
Docker on Windows and macOS
Docker containers are a Linux feature - they rely on Linux kernel facilities (namespaces, cgroups). So how can Docker also run on Windows and macOS?
- Windows: Docker Desktop uses WSL2 (Windows Subsystem for Linux 2), which is a thin Linux VM with its own kernel. All your containers run inside that hidden Linux environment.
- macOS: Docker Desktop runs a tiny Linux VM (HyperKit or similar).
- Linux: Docker runs natively, with no virtualisation layer in between.
This means an ubuntu:24.04 image runs the same
on every host, but on Windows and macOS there is an extra
virtualisation hop. You pay a small performance and disk-space cost on
those systems.
What if my image and my host use different Ubuntu versions?
Suppose your host runs Ubuntu 24.04 (kernel 6.x) and you run an
ubuntu:18.04 container.
- The image gives you Ubuntu 18.04 userspace - the bash, apt and libraries you would have had on Ubuntu 18.04.
- The kernel is still the host’s 6.x kernel.
This works because the Linux kernel exposes a stable, backward-compatible system call interface. Programs compiled for kernel 4.x normally still run on kernel 6.x. The rare exceptions are programs that depend on very old, removed system calls.
The reverse direction (a newer image on an
older host kernel - for example
ubuntu:24.04 on a host with kernel 4.x) sometimes works but
is riskier. The rule of thumb: the host kernel should be at
least as new as the kernel the image was built for.
In practice the common case - running an older or equal-age userspace on a modern host kernel - works freely. This is one of the most useful Docker features: you can run “Ubuntu 18.04” or “Debian 12” containers on any modern Linux host without installing a second OS.
Why we care for this course
Robotics projects pile up dependencies fast: a specific Python version, a specific OpenCV build, ROS 2, CUDA, PyTorch. Containers let you freeze those dependencies into an image, share it with collaborators or copy it onto the robot, and reproduce the same environment everywhere. In the rest of this lab you will do exactly that for the text-to-cmd application from Lab 3.
Expected result: You can answer in your own words:
“what is the difference between a container and a virtual machine?”,
“what is inside a Docker image?” and “why does an
ubuntu:18.04 container run on a 24.04 host?”. No
screenshots to take in this step.
Step 3 - Your first container
Time to use Docker. You will run two small containers, learn the
basic lifecycle commands (run, ps,
exec, stop, rm) and verify the
claim from Step 2 that a container uses the host’s kernel.
- Run the canonical “hello world” container. This is the simplest possible check that Docker works end to end:
docker run hello-world
The first time you run it, Docker reports that the image is not available locally and pulls it from Docker Hub. Then it starts a container that prints a short message and exits. The image (hello-world) is only a few kilobytes - the message is the entire application.
- See what just happened. List the containers Docker remembers:
docker ps # currently running containers - probably empty
docker ps -a # all containers, including ones that have exited
docker ps -a should show one entry: the hello-world container with status Exited (0). Containers stick around after they finish so you can inspect logs or restart them. Remove the leftover with:
docker rm <CONTAINER_ID>
(use the first few characters of the ID - Docker accepts unique prefixes).
💡 Tip: Add --rm to docker run to auto-delete the container as soon as it exits, for one-off commands:
docker run --rm hello-world
- Start an interactive Ubuntu shell. This pulls a real Ubuntu image (~80 MB) and drops you into a bash prompt inside it:
docker run -it --rm ubuntu:24.04 bash
  - -i keeps STDIN open
  - -t allocates a pseudo-TTY (so the shell behaves normally)
  - --rm auto-removes the container on exit
  - ubuntu:24.04 is the image (ubuntu is the name, 24.04 is the tag)
  - bash is the command to run inside the container
Your prompt should change to something like
root@<container_id>:/#. You are now inside the
container as root.
- Look around inside the container. Try a few commands:
ls /
cat /etc/os-release # confirms the userspace - "Ubuntu 24.04.x LTS"
dpkg -l | wc -l # very small package count - this is a minimal Ubuntu
uname -a # prints the HOST's kernel version, not the image's
exit # leaves the container; --rm deletes it
The uname -a result is the proof that containers share the host kernel: you are “inside Ubuntu 24.04” but the kernel version matches whatever your workstation runs.
- Try an older Ubuntu to see the cross-version effect. Repeat the experiment with an older image:
docker run --rm ubuntu:18.04 bash -c "cat /etc/os-release | head -2 && uname -a"
The first two lines of /etc/os-release should say Ubuntu 18.04. uname -a still reports your host kernel. You just ran an Ubuntu 18.04 userspace on top of your modern kernel without installing a second OS.
- Run something useful in the background. Start a small web server container in detached mode:
docker run -d --name myweb -p 8088:80 nginx:alpine
  - -d runs the container detached (returns immediately, container keeps running)
  - --name myweb gives the container a memorable name
  - -p 8088:80 maps port 80 inside the container to port 8088 on the host
  - nginx:alpine is a small image (a few tens of MB) with the nginx web server
Check that it is running, then verify the web server responds:
docker ps # should show myweb, status "Up ..."
curl http://localhost:8088 # nginx welcome page (HTML)
📸 Capture for submission: screenshot the terminal showing the docker ps output (including myweb) together with the curl http://localhost:8088 output, while the container is still running.
💡 Tip: If you see an error like address already in use, another program on your workstation is already listening on that host port. Pick a different port (e.g. -p 8089:80) and re-run. Don’t forget to docker rm myweb first if the previous attempt left a stopped container behind.
- Enter the running container to look around without stopping it:
docker exec -it myweb sh
# inside the container:
ls /usr/share/nginx/html # nginx default site files
exit
- Stop and remove the container when you are done:
docker stop myweb
docker rm myweb
docker ps -a # confirm it is gone
Expected result: You have run containers from three different images (hello-world, ubuntu in versions 24.04 and 18.04, and an nginx web server), used the lifecycle commands run, ps, exec, stop and rm, and confirmed that the kernel inside an Ubuntu container is your host’s kernel.
Step 4 - Build a custom image
So far you have run images that other people published. In this step you will package the text-to-cmd application from Lab 3 into your own Docker image.
What is a Dockerfile?
A Dockerfile is a text recipe that tells
docker build how to construct an image, one step at a time.
Each instruction creates a new layer on top of the
previous one. The same recipe always builds the same image, so the file
can live in your repository alongside the code.
The instructions you will use here:
| Instruction | What it does |
|---|---|
| FROM <image> | Start from an existing image (your base). Every Dockerfile starts with FROM. |
| WORKDIR <path> | Set the working directory inside the image. Equivalent to cd for later layers. |
| COPY <src> <dst> | Copy files from your host (the build context) into the image. |
| RUN <command> | Execute a command inside the image at build time. Its output is a new layer. |
| CMD ["arg", ...] | Default command that is run when someone does docker run without overriding it. |
Write the Dockerfile
Make sure you are inside the starter directory, then create a file
called Dockerfile at its top level:
cd /tmp/fossbot-text-to-cmd
nano Dockerfile
Type (or paste) the following content:
# Start from a small official Python image (Debian slim + Python 3.12)
FROM python:3.12-slim
# All subsequent paths are relative to /app inside the image
WORKDIR /app
# Install Python dependencies first, in two layers, so Docker can cache them
# Layer 1: CPU-only PyTorch (much smaller than the default CUDA build)
RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu
# Layer 2: the rest of the requirements
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code and the dataset
COPY src/ src/
COPY data/ data/
# Default command if someone just runs `docker run <image>`
CMD ["python", "-m", "src.text_to_wheels", "--help"]
Save (Ctrl+O, Enter) and exit (Ctrl+X).
💡 Tip: Why install torch first and then requirements.txt? Because the heaviest layer (the PyTorch download) almost never changes, so Docker can re-use it from cache on every subsequent build. The requirements.txt layer is smaller and will rebuild only when you change that file. We will look at caching again at the end of this step.
Build the image
docker build -t fossbot-text-to-cmd:latest .
- -t <name>:<tag> tags the image with a name (fossbot-text-to-cmd) and a version tag (latest).
- The . at the end is the build context - the directory whose contents are sent to the Docker daemon. Files outside this directory cannot be COPYed.
The first build takes a few minutes - most of the time is spent
downloading the CPU-only PyTorch wheel and the other Python packages.
Docker prints one line per Dockerfile instruction; you should see seven
=> [step] lines.
Inspect what you built
docker images fossbot-text-to-cmd # the image and its size
docker history fossbot-text-to-cmd:latest # the layers, newest first (top) to oldest (bottom)
The image is around 1.5 GB - most of it is PyTorch and its dependencies. The docker history output shows the layers and the size each one added.
Run the image
Run the default CMD (which prints the CLI help):
docker run --rm fossbot-text-to-cmd:latest
Now override the default command and process the sample input file that lives inside the image:
docker run --rm fossbot-text-to-cmd:latest \
python -m src.text_to_wheels \
--input data/examples/basic.txt \
--output /tmp/result.json \
--classifier sklearn
The output JSON is written to /tmp/result.json inside the container. Because we did not mount any host directory, the file disappears with the container - that is what Step 5 is going to fix.
Layer caching - rebuild to see it work
Run the build again without changing anything:
docker build -t fossbot-text-to-cmd:latest .
This time it finishes in seconds. Docker recognised that every instruction had the same inputs as before and reused all cached layers.
Now edit src/wheel_mapping.py (for example change
0.5 to 0.6 in the forward
action), then rebuild:
docker build -t fossbot-text-to-cmd:latest .
Only the layers from COPY src/ src/ onwards rebuild - the heavy RUN pip install layers stay cached because the files they depend on (requirements.txt, the index URL) did not change. Once one layer is rebuilt, every layer after it is rebuilt too. That is why the order of instructions in a Dockerfile matters: things that change often and are cheap to rebuild go at the bottom, things that rarely change and are expensive to rebuild go at the top.
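The cache behaviour described here can be imitated with a toy model. The sketch below is not Docker's real algorithm: it assumes that each layer's cache key is a hash of the previous key, the instruction text and, for COPY, the copied content. The file contents in before and after are made-up placeholders for the build context.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction (simplified model, not Docker's real one)."""
    keys = []
    parent = ""
    for instruction in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())        # a layer depends on everything below it
        h.update(instruction.encode())   # ... and on the instruction text
        words = instruction.split()
        if words[0] == "COPY":
            # COPY <src> <dst>: the copied content is part of the key
            h.update(file_contents[words[1]].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(instructions, before, after):
    """List the instructions whose cache key differs between two builds."""
    old = layer_keys(instructions, before)
    new = layer_keys(instructions, after)
    return [ins for ins, a, b in zip(instructions, old, new) if a != b]


DOCKERFILE = [
    "FROM python:3.12-slim",
    "WORKDIR /app",
    "RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu",
    "COPY requirements.txt ./",
    "RUN pip install --no-cache-dir -r requirements.txt",
    "COPY src/ src/",
    "COPY data/ data/",
]

# Hypothetical stand-ins for the files in the build context
before = {"requirements.txt": "scikit-learn\n", "src/": "speed = 0.5", "data/": "examples"}
after = {**before, "src/": "speed = 0.6"}  # only the source code changed

print(rebuilt_steps(DOCKERFILE, before, after))
# ['COPY src/ src/', 'COPY data/ data/']
```

In this model, changing requirements.txt instead would invalidate four layers (from COPY requirements.txt onwards), which is why the torch install sits above it.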
Expected result: The terminal shows the running
container printing CLI help, and a second run that finishes successfully
and would have written /tmp/result.json inside the
container. docker images lists
fossbot-text-to-cmd with a tag of latest.
📸 Capture for submission: screenshot the terminal showing the last few lines of docker build (with the line naming to docker.io/library/fossbot-text-to-cmd:latest) followed by docker images fossbot-text-to-cmd.
Step 5 - Volumes and bind mounts
In the previous step you wrote a JSON result to
/tmp/result.json inside the container -
and it disappeared with the container the moment it exited. In real use
you almost always want one of two things instead:
- Bind mount: take a directory or a file on your host and make it visible inside the container at a chosen path. The container reads and writes the same files you can read and write on the host.
- Volume: a directory managed by Docker, living
somewhere under
/var/lib/docker/. You give it a name, mount it into one or more containers, and Docker handles where the bytes actually live.
Roughly:
| Feature | Bind mount | Volume |
|---|---|---|
| Where it lives | A path on your host that you choose | Managed by Docker, hidden under /var/lib/docker/ |
| Created by | -v <absolute_host_path>:<container_path> - Docker sees a path on the left and binds it | docker volume create <name>, or implicitly by -v <volume_name>:<container_path> - Docker sees a bare name on the left and uses a managed volume |
| Best for | Sharing source code or data with the container, editing files on the host | Persistent state between container runs (databases, model caches) |
| Survives | As long as you do not delete the host directory | Until you docker volume rm it |
| Downsides | Tied to a host path, less portable | Less convenient to inspect from the host |
In this step you will use both.
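The "path on the left versus bare name on the left" rule from the table can be written down as a small function. This is a simplified sketch, not Docker's actual parser: classify_mount is a hypothetical helper, it only handles Linux-style source:target[:options] strings, and the ./, ../ and ~ prefixes are included because Compose accepts them as host paths.

```python
def classify_mount(spec):
    """Classify the argument of -v as 'bind', 'volume' or 'anonymous'.

    Simplified illustration: Linux-style paths only, options are ignored.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path was given: Docker creates an anonymous volume
        return "anonymous"
    source = parts[0]
    if source.startswith(("/", "./", "../", "~")):
        # The left side looks like a host path
        return "bind"
    # A bare name on the left refers to a Docker-managed volume
    return "volume"


for spec in [
    "/tmp/host-output:/output",
    "text-to-cmd-output:/output",
    "./compose-output:/output",
    "/tmp/host-input.txt:/input.txt:ro",
]:
    print(f"{spec:40} -> {classify_mount(spec)}")
```

The :ro suffix in the last example mounts the file read-only; it does not change whether the mount is a bind mount or a volume.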
Bind mount: read and write host files from the container
- Create an output directory on the host and remember its path:
mkdir -p /tmp/host-output
- Run the application with a bind-mounted output directory so the JSON ends up on the host:
docker run --rm \
-v /tmp/host-output:/output \
fossbot-text-to-cmd:latest \
python -m src.text_to_wheels \
--input data/examples/basic.txt \
--output /output/sklearn_basic.json \
--classifier sklearn
  - -v <host_path>:<container_path> is the bind-mount flag. Both paths must be absolute.
  - /tmp/host-output on your workstation is mounted at /output inside the container.
  - The application writes /output/sklearn_basic.json from its point of view - which is /tmp/host-output/sklearn_basic.json on the host.
- Confirm the file is on the host:
ls /tmp/host-output/
cat /tmp/host-output/sklearn_basic.json | head -10
- Bind a single input file to override the dataset baked into the image. Create your own input on the host:
cat > /tmp/host-input.txt <<'EOF'
forward
turn left
halt
EOF
Then mount it into the container as the input file:
docker run --rm \
-v /tmp/host-input.txt:/input.txt \
-v /tmp/host-output:/output \
fossbot-text-to-cmd:latest \
python -m src.text_to_wheels \
--input /input.txt \
--output /output/custom_result.json \
--classifier st
Check the result:
cat /tmp/host-output/custom_result.json
The container processed YOUR file even though it was never copied into the image. Bind mounts are how you give a containerised application its data without rebuilding.
Volume: state managed by Docker
A volume is useful when you want persistent state that is not tied to a specific host path - for example a cache that several containers should share, or model files you do not want to re-download on every container start.
- Create a named volume:
docker volume create text-to-cmd-output
docker volume ls
docker volume inspect text-to-cmd-output
The inspect output shows the on-disk location (under /var/lib/docker/volumes/). You normally do not touch that path directly - you just refer to the volume by name.
- Use the volume by mounting it the same way as a bind mount, but with the volume name on the left side of the colon:
docker run --rm \
-v text-to-cmd-output:/output \
fossbot-text-to-cmd:latest \
python -m src.text_to_wheels \
--input data/examples/basic.txt \
--output /output/in_volume.json \
--classifier sklearn
- The result is in the volume, not on a host path you chose. Run a second throwaway container to read it back:
docker run --rm \
-v text-to-cmd-output:/output \
fossbot-text-to-cmd:latest \
cat /output/in_volume.json
The same volume mounted into two different container runs gave you persistent state without leaving any visible trace in your home directory.
- Remove the volume when you are done with it (the file inside disappears with it):
docker volume rm text-to-cmd-output
docker volume ls
Expected result: cat /tmp/host-output/sklearn_basic.json prints valid JSON, cat /tmp/host-output/custom_result.json shows the predictions for your own three-line input file, and docker volume ls no longer lists text-to-cmd-output after you removed it.
Step 6 - docker-compose
Up to now you have started containers one at a time with long
docker run commands. Real applications usually consist of
several services running together (a frontend + an API
+ a database, for example), and even single-service apps benefit from
having their run configuration written down so you do not
have to remember the right flags every time.
That is what Compose is for: a YAML file describes
one or more services, and docker compose up starts them all
with their volumes, environment variables and dependencies wired up
correctly.
Write the compose file
Create a file called docker-compose.yml at the top of
the starter directory (same place as the Dockerfile):
cd /tmp/fossbot-text-to-cmd
nano docker-compose.yml
Paste in the following:
services:
  basic-sklearn:
    image: fossbot-text-to-cmd:latest
    volumes:
      - ./compose-output:/output
    command: >
      python -m src.text_to_wheels
      --input data/examples/basic.txt
      --output /output/basic_sklearn.json
      --classifier sklearn
  basic-st:
    image: fossbot-text-to-cmd:latest
    volumes:
      - ./compose-output:/output
    command: >
      python -m src.text_to_wheels
      --input data/examples/basic.txt
      --output /output/basic_st.json
      --classifier st
What this says:
- services: is the top-level key. Each entry below it defines one container that Compose will manage.
- Each service has a name (basic-sklearn, basic-st) and reuses the image you built in Step 4.
- Both services bind-mount the same host directory ./compose-output at /output inside the container. The directory is created automatically if it does not exist yet.
- command: overrides the image’s default CMD. The > makes YAML fold the next indented lines into one string, so the long invocation stays readable.
💡 Tip: Even though the YAML key is volumes:, the entry ./compose-output:/output is a bind mount - the left side starts with ./, which Compose treats as a host path. A bare name like mydata:/output would refer to a managed volume that must also be declared in a top-level volumes: section. Same rule as for docker run -v from Step 5.
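To see what the folded command: value amounts to, the following sketch imitates the folding by hand. The fold helper is a hypothetical stand-in for the YAML > rule in its simplest case (equally indented lines, no blank lines), and shlex.split stands in for shell-style word splitting; that Compose splits a string command into words in this way is an assumption here.

```python
import shlex


def fold(lines):
    """Mimic YAML '>' folding for simple, equally indented lines:
    single line breaks become spaces and one trailing newline is kept."""
    return " ".join(line.strip() for line in lines) + "\n"


# The four lines below the 'command: >' key of the basic-sklearn service
command_lines = [
    "python -m src.text_to_wheels",
    "--input data/examples/basic.txt",
    "--output /output/basic_sklearn.json",
    "--classifier sklearn",
]

folded = fold(command_lines)
print(repr(folded))

# Shell-style splitting turns the single string into an argument list
argv = shlex.split(folded)
print(argv)
```

The practical consequence is that the multi-line YAML and the one-line docker run invocation from Step 5 describe the same program and arguments.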
Run everything with one command
docker compose up
Compose pulls or reuses the image, creates the two containers, starts them in parallel and streams their stdout to your terminal, each line prefixed with the service name. The containers run, write their JSON files, and exit. Compose returns control once both services are done.
Check that both result files landed on the host:
ls compose-output/
cat compose-output/basic_sklearn.json | head -10
cat compose-output/basic_st.json | head -10
Inspect what Compose did
The containers it just ran are now stopped but still listed:
docker compose ps # services managed by this compose project
docker compose ps -a # including the ones that exited
You can also re-run them without recreating from scratch:
docker compose up # re-runs anything that has changed
Tear down
docker compose down
This stops and removes the containers and the default network Compose
created for them. The image stays on disk; the
./compose-output directory and its files also stay (it is a
host bind mount).
Switch the storage to a named volume and chain in reader services
Now redo the same exercise but with a managed volume
instead of a host directory, and add two extra services that consume the
JSON files that the first two services produce. This shows three things
at once: how to declare a top-level volume, how depends_on
orders services, and how containers share data through a volume without
anything appearing on the host filesystem.
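As a rough picture of how depends_on orders the four services, the dependency graph can be sorted with Python's standard graphlib module. This is only a sketch of the ordering idea, not how Compose is implemented: the dictionary mirrors the depends_on entries of the compose file, and grouping services into "waves" is a simplification, since each reader really waits only for its own producer to exit successfully.

```python
from graphlib import TopologicalSorter

# service -> services it depends on (mirrors depends_on in the compose file)
depends_on = {
    "basic-sklearn": [],
    "basic-st": [],
    "reader-sklearn": ["basic-sklearn"],
    "reader-st": ["basic-st"],
}

sorter = TopologicalSorter(depends_on)
sorter.prepare()

waves = []
while sorter.is_active():
    # Services whose dependencies have all finished can start now
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    print(f"wave {len(waves)}: {ready}")
    sorter.done(*ready)  # pretend they completed successfully

# wave 1: ['basic-sklearn', 'basic-st']
# wave 2: ['reader-sklearn', 'reader-st']
```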
Replace the contents of docker-compose.yml with the
skeleton below and complete the TODO sections yourself:
services:
  basic-sklearn:
    image: fossbot-text-to-cmd:latest
    volumes:
      - text-to-cmd-output:/output
    command: >
      python -m src.text_to_wheels
      --input data/examples/basic.txt
      --output /output/basic_sklearn.json
      --classifier sklearn
  basic-st:
    image: fossbot-text-to-cmd:latest
    volumes:
      - text-to-cmd-output:/output
    command: >
      python -m src.text_to_wheels
      --input data/examples/basic.txt
      --output /output/basic_st.json
      --classifier st
  reader-sklearn:
    image: fossbot-text-to-cmd:latest
    # TODO 1: mount the named volume at /output (same as the producers above).
    # Docs + example: https://docs.docker.com/reference/compose-file/services/#short-syntax-5
    volumes:
    # TODO 2: write the command that prints the sklearn JSON file to stdout.
    # Docs + example: https://docs.docker.com/reference/compose-file/services/#command
    command:
    depends_on:
      basic-sklearn:
        condition: service_completed_successfully
  reader-st:
    image: fossbot-text-to-cmd:latest
    # TODO 3: same volume mount as in reader-sklearn.
    volumes:
    # TODO 4: same idea as TODO 2 but for the ST result file.
    command:
    depends_on:
      basic-st:
        condition: service_completed_successfully
# TODO 5: declare the named volume that all four services mount above.
# Docs + example: https://docs.docker.com/reference/compose-file/volumes/
volumes:
Hint - reference solution
  reader-sklearn:
    image: fossbot-text-to-cmd:latest
    volumes:
      - text-to-cmd-output:/output
    command: cat /output/basic_sklearn.json
    depends_on:
      basic-sklearn:
        condition: service_completed_successfully
  reader-st:
    image: fossbot-text-to-cmd:latest
    volumes:
      - text-to-cmd-output:/output
    command: cat /output/basic_st.json
    depends_on:
      basic-st:
        condition: service_completed_successfully
volumes:
  text-to-cmd-output:
Run it:
docker compose up
Each reader’s output is streamed to your terminal prefixed with the
service name, so you see the contents of both JSON files printed inline.
Nothing is written to the host: compose-output/ (if the directory still
exists from the earlier run) gets no new files, and docker volume ls
lists the new text-to-cmd-output volume (Compose prefixes the name with
the project name, typically fossbot-text-to-cmd_text-to-cmd-output).
📸 Capture for submission: after docker compose up of the second compose file finishes, capture a screenshot of docker compose ps -a (showing all four services with Exited (0)) together with the prompt where you ran it. The JSON content does not need to be in the screenshot - the Exited (0) state of all four services is what proves the pipeline ran end-to-end.
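If you prefer a machine-checkable version of the same evidence, `docker compose ps -a --format json` prints the service table as JSON. The sketch below is an assumption-laden helper, not part of the lab: it assumes the records carry `Service`, `State` and `ExitCode` fields, and it accepts both layouts that Compose releases have used (a single JSON array, or one JSON object per line). Check the field names against your Compose version before relying on it.

```python
import json


def parse_compose_ps(text: str) -> list[dict]:
    """Parse `docker compose ps -a --format json` output (JSON array or one object per line)."""
    text = text.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def failed_services(rows: list[dict]) -> list[str]:
    """Return services that are not in state 'exited' with exit code 0 (assumed field names)."""
    return [
        row.get("Service", "?")
        for row in rows
        if row.get("State") != "exited" or row.get("ExitCode") != 0
    ]


# Made-up sample in the one-object-per-line layout.
sample = (
    '{"Service": "basic-sklearn", "State": "exited", "ExitCode": 0}\n'
    '{"Service": "reader-st", "State": "exited", "ExitCode": 1}'
)
print(failed_services(parse_compose_ps(sample)))  # ['reader-st']
```

An empty list corresponds to the screenshot showing `Exited (0)` for every service.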
Tear down everything, including the volume this time:
docker compose down --volumes
docker volume ls
The volume is gone, the containers are gone, the JSON files that
lived in the volume are gone. The image and the host
./compose-output from the earlier exercise are
untouched.
Expected result: After the first compose file
ls compose-output/ shows basic_sklearn.json
and basic_st.json with valid JSON content. After the second
compose file the readers stream the JSON to your terminal during
docker compose up, no new files appear in
compose-output/, and docker volume ls lists
text-to-cmd-output until you tear it down with
--volumes.
Step 7 - GPU passthrough
By default a container cannot see the host GPU - the Docker process
is isolated from /dev/nvidia* devices and from the
userspace driver libraries. To make the GPU visible inside the container
you need two things on the host:
- An NVIDIA driver installed and working (nvidia-smi runs from the host shell).
- The NVIDIA Container Toolkit installed and registered as a Docker runtime.
Tip: If you do not have an NVIDIA GPU on your machine (AMD/Intel only, or a non-Linux host without GPU passthrough configured), read through this step but skip the commands - the rest of the lab does not depend on a working GPU.
1. Verify the host has everything in place:
    nvidia-smi
    docker info | grep -i "runtime"
   The first command must print a table with your GPU, driver version, and CUDA version. The second command must list nvidia among the available runtimes (alongside the default runc). If nvidia-smi works but docker info does not list nvidia, you are missing the NVIDIA Container Toolkit - install it from the official guide and rerun.
2. Run a CUDA base image without the --gpus flag and try to call nvidia-smi from inside:
    docker run --rm nvidia/cuda:12.5.0-base-ubuntu22.04 nvidia-smi
   The command fails, typically because nvidia-smi cannot be found at all: the binary belongs to the host’s driver stack rather than to the image, and neither it nor the driver libraries (libnvidia-ml.so.1) nor the /dev/nvidia* device nodes are visible inside the container - the container is isolated from the host’s GPU stack until something injects them.
3. Add the --gpus all flag and rerun the same image:
    docker run --rm --gpus all nvidia/cuda:12.5.0-base-ubuntu22.04 nvidia-smi
   Now nvidia-smi runs inside the container and prints the same table you saw on the host - GPU model, driver version, CUDA version, memory, and the (empty) process list of the container. The container sees the GPU because the NVIDIA Container Toolkit injected nvidia-smi, the driver libraries and the device nodes at startup.
4. Compose has its own syntax for the same thing. The equivalent of --gpus all in a docker-compose.yml service is:
    services:
      gpu-job:
        image: nvidia/cuda:12.5.0-base-ubuntu22.04
        command: nvidia-smi
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: all
                  capabilities: [gpu]
   You can replace count: all with count: 1 to expose a single GPU, or replace count with device_ids: ["0", "2"] to pick specific GPUs by index (count and device_ids are mutually exclusive). No need to test this snippet now - the syntax is just for your reference.
5. The fossbot-text-to-cmd image is CPU-only - rebuild a GPU variant. Our Dockerfile installs PyTorch from --index-url https://download.pytorch.org/whl/cpu (Step 4), which is the CPU-only build. Even with --gpus all, sentence-transformers would still run on CPU because the installed PyTorch does not have CUDA support compiled in. GPU passthrough is two-sided: the host must expose the GPU (toolkit + --gpus all), and the image must be built with a GPU-capable framework. Build a GPU variant of the image and verify it actually uses CUDA:
   a. Open the Dockerfile and change only the PyTorch install line to use a CUDA wheel index. Pick a CUDA version supported by your driver from https://pytorch.org/get-started/locally/ - cu121 is a safe default for recent drivers. Example:
       RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cu121
   b. Build the GPU variant under a separate tag so the CPU image you already have stays usable:
       docker build -t fossbot-text-to-cmd:gpu .
      This download is significantly larger than the CPU build (~2 GB) and the build will take a few minutes.
   c. Before running the app, verify PyTorch inside the new image sees the GPU:
       docker run --rm --gpus all fossbot-text-to-cmd:gpu \
         python -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('Device:', torch.cuda.get_device_name(0))"
      You should see CUDA available: True and your GPU model.
   d. Now run the actual sentence-transformer classifier on GPU. Go back to Step 5 task 2 (the bind-mounted docker run that produced /tmp/host-output/sklearn_basic.json) and modify that command so it:
      - uses the GPU image you just built (different tag than :latest),
      - exposes the GPU to the container (the flag you learned in task 3),
      - runs the st classifier instead of sklearn (sklearn does not use PyTorch, so GPU would not help),
      - writes the output to a new filename so it does not overwrite the CPU result.
      Write the modified command yourself and run it. The output JSON should match the structure of the CPU run from Step 5 (same actions, same wheel speeds for the same inputs). To prove the GPU was actually used, open a second terminal before launching the run and start a continuous monitor:
          nvidia-smi -l 1 # refresh every 1 s; Ctrl+C to stop
          # or: watch -n 0.5 nvidia-smi
      Then trigger the docker run in the first terminal. While the container is running you should see a python process appear in the Processes: section of nvidia-smi, with a few hundred MB of GPU memory used. The process disappears as soon as the container exits.
      Hint - reference solution
          docker run --rm --gpus all \
            -v /tmp/host-output:/output \
            fossbot-text-to-cmd:gpu \
            python -m src.text_to_wheels \
            --input data/examples/basic.txt \
            --output /output/st_gpu.json \
            --classifier st
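The Compose GPU reservation from task 4 has two mutually exclusive ways to select devices: `count` or `device_ids`. As a small illustration, the sketch below builds the `deploy` fragment for either choice and rejects requests that set both or neither. The helper name `gpu_reservation` is hypothetical, and the fragment is printed as JSON only to avoid a YAML dependency.

```python
import json


def gpu_reservation(count=None, device_ids=None) -> dict:
    """Build the deploy fragment of a Compose service for NVIDIA GPUs (illustrative helper)."""
    if (count is None) == (device_ids is None):
        # Compose accepts only one of the two fields at a time.
        raise ValueError("set exactly one of count or device_ids")
    device = {"driver": "nvidia", "capabilities": ["gpu"]}
    if count is not None:
        device["count"] = count  # "all" or a number of GPUs
    else:
        device["device_ids"] = [str(index) for index in device_ids]
    return {"deploy": {"resources": {"reservations": {"devices": [device]}}}}


print(json.dumps(gpu_reservation(count="all"), indent=2))
print(json.dumps(gpu_reservation(device_ids=[0, 2]), indent=2))
```

The first call mirrors the snippet from task 4; the second corresponds to `device_ids: ["0", "2"]`.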
Expected result: nvidia-smi runs
successfully inside the nvidia/cuda:... container when
--gpus all is passed, and fails or is missing when it is
not. The output of the in-container nvidia-smi matches the
host’s nvidia-smi for driver/CUDA version and GPU model.
The fossbot-text-to-cmd:gpu image prints
CUDA available: True (task 5c) and the GPU run from task 5d
produces an output JSON identical in structure to the CPU run, with a
python process visible in nvidia-smi on the
host while the container is running.
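Comparing the host and in-container tables by eye works, but the same comparison can be scripted. `nvidia-smi --query-gpu=name,driver_version,memory.used --format=csv,noheader` prints one comma-separated line per GPU; the sketch below parses two such outputs and compares model and driver version. The sample strings are made up, and the CUDA version is left out because it is not part of this query.

```python
import csv
import io

# Field order must match the --query-gpu list used on the command line.
FIELDS = ("name", "driver_version", "memory.used")


def parse_gpu_csv(text: str) -> list[dict]:
    """Parse `nvidia-smi --query-gpu=name,driver_version,memory.used --format=csv,noheader`."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if record:
            rows.append(dict(zip(FIELDS, (value.strip() for value in record))))
    return rows


def same_gpu_stack(host_text: str, container_text: str) -> bool:
    """True if both outputs list the same GPU models and driver versions, in the same order."""
    def identity(rows):
        return [(row["name"], row["driver_version"]) for row in rows]
    return identity(parse_gpu_csv(host_text)) == identity(parse_gpu_csv(container_text))


# Made-up sample values, for illustration only.
host_output = "NVIDIA GeForce RTX 3060, 535.161.07, 10 MiB\n"
container_output = "NVIDIA GeForce RTX 3060, 535.161.07, 312 MiB\n"
print(same_gpu_stack(host_output, container_output))  # True
```

Memory usage is deliberately ignored in the comparison, since it differs while the classifier is running.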
Step 8 - VSCode Dev Containers
Up to now every container has been a runtime sandbox
- you build an image, the container runs the application once, exits,
and you never edit code from inside it. A dev container
turns the picture inside out: VSCode itself runs as a thin client on the
host, but the workspace, the Python interpreter, the debugger, and every
command you type in the integrated terminal live inside
the container. Editing a .py file feels the same as editing
it on the host, except the runtime that executes it is the one from your
Dockerfile. This guarantees that your dev environment
matches whatever runs in production - no “works on my machine”.
You drive everything from a single configuration file:
.devcontainer/devcontainer.json. VSCode reads it, builds
(or reuses) the image, mounts your project folder into the container as
the workspace, attaches an editor server inside, and finally drops you
into a VSCode window that looks identical to a normal one - only the
bottom-left status bar shows Dev Container: ... to remind
you where you are.
Install the Dev Containers extension in VSCode (publisher: Microsoft, extension ID ms-vscode-remote.remote-containers). Either click the Extensions icon and search for “Dev Containers”, or run from a host terminal:
    code --install-extension ms-vscode-remote.remote-containers
💡 Tip: If you already have Microsoft’s Remote Development extension pack installed (ms-vscode-remote.vscode-remote-extensionpack), you do not need to install Dev Containers separately - the pack bundles it along with Remote-SSH, Remote-Tunnels and WSL. The pack is heavier but useful if you also work over SSH or inside WSL.
Create .devcontainer/devcontainer.json at the root of fossbot-text-to-cmd with the contents below. The three things this config does:
- builds the dev container image from the existing Dockerfile you wrote in Step 4 (no second Dockerfile),
- installs the Microsoft Python extension automatically the first time the container is created,
- gives the container a recognisable name shown in the VSCode status bar.
{
  // Human-readable name shown in the VSCode status bar when the
  // container is open. Anything would work; we match the project folder.
  "name": "fossbot-text-to-cmd",

  // Build the dev container from the project's Dockerfile.
  // "../Dockerfile" - this devcontainer.json sits in .devcontainer/, so ".."
  // points one level up to the project root where the Dockerfile lives.
  // ".." - build context = project root, so the Dockerfile's COPY steps see
  // the whole project (src/, data/, requirements.txt, ...).
  "build": {
    "dockerfile": "../Dockerfile",
    "context": ".."
  },

  // Extensions installed inside the container on first open. The string is
  // the extension ID in the form "publisher.name" (visible on the
  // Marketplace page or in the Extensions panel), not the display name.
  // ms-python.python = official Microsoft Python extension (syntax,
  // debugger, linting). Add more IDs if you want them.
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python"
      ]
    }
  }
}
Reopen the folder in the container. Make sure VSCode is open on fossbot-text-to-cmd. Then open the Command Palette (Ctrl+Shift+P on Linux/Windows, Cmd+Shift+P on macOS) and run Dev Containers: Reopen in Container. VSCode reuses the cached image layers from your previous docker build so the first start should take seconds rather than minutes. A progress notification in the bottom-right shows what is happening; you can click show log to watch the actual build steps.
When it finishes, the bottom-left of the window shows Dev Container: fossbot-text-to-cmd.
Open the integrated terminal (Ctrl+`). The shell prompt now comes from inside the container - you are no longer on the host. Verify:
    python --version
    pwd
    ls
    python -m src.text_to_wheels --help
You should see Python 3.12 (the one from python:3.12-slim, not the system Python from your host), a working directory matching workspaceFolder (not set in this config, so the default /workspaces/fossbot-text-to-cmd applies), the project files, and the --help output of the CLI.
Edit a file from the VSCode editor and see the change from the in-container terminal. Open src/text_to_wheels.py in VSCode, add a print("hello from devcontainer") line at the top of the file, save it (Ctrl+S), then in the container terminal run:
    python -m src.text_to_wheels --help | head -3
The print appears. The host folder is bind-mounted into the container by VSCode, so edits propagate immediately in both directions.
What survives a rebuild? Your workspace files live on the host (bind-mounted into the container at /workspaces/fossbot-text-to-cmd), but anything you install inside the container with pip or apt lives in the container’s writable layer and is erased when the container is rebuilt. Verify both halves:
- In the container terminal, install a package that is not in requirements.txt:
    pip install requests
    python -c "import requests; print(requests.__version__)"
  It works.
- Rebuild the container: Command Palette → Dev Containers: Rebuild Container. Wait for VSCode to reload.
- In the new container terminal, retry the import:
    python -c "import requests" # ModuleNotFoundError - the pip install is gone
  Your workspace files (src/, data/, requirements.txt…) are untouched - they live on your host, the rebuild only replaces the container’s filesystem.
The pip install only affected the container’s writable layer and disappeared with the rebuild. To make a package permanent, add it to requirements.txt and rebuild - the RUN pip install -r requirements.txt step in your Dockerfile will pick it up.
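A quick way to predict whether a package will survive the next rebuild is to check whether requirements.txt declares it. The sketch below is a simplified parser written for this lab, not a pip API: it handles plain `name==version` style lines, skips comments and pip options, and normalises names the way PyPI does. The sample requirements text is hypothetical. Packages pulled in as dependencies of declared ones, or installed by other Dockerfile lines such as the separate torch install, also survive but are not detected here.

```python
import re


def normalise(name: str) -> str:
    """PEP 503 style normalisation: runs of '-', '_' and '.' become '-', lower-cased."""
    return re.sub(r"[-_.]+", "-", name).lower()


def declared_packages(requirements_text: str) -> set[str]:
    """Return normalised package names declared directly in a requirements file."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # blank line, comment, or pip option such as -r / --index-url
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalise(match.group(0)))
    return names


def survives_rebuild(package: str, requirements_text: str) -> bool:
    """True if the package is declared directly, so the Dockerfile reinstalls it."""
    return normalise(package) in declared_packages(requirements_text)


# Hypothetical file contents, for illustration only.
sample_requirements = "scikit-learn==1.5.0\nsentence-transformers>=3.0  # embeddings\n"
print(survives_rebuild("requests", sample_requirements))      # False
print(survives_rebuild("scikit_learn", sample_requirements))  # True
```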
Expected result:
Dev Container: fossbot-text-to-cmd is visible in the
bottom-left of the VSCode window. The integrated terminal runs Python
from the container image, sees the project files, and reflects edits
made in the VSCode editor instantly. After a rebuild, in-container
pip installs are gone but workspace edits remain.
📸 Capture for submission: screenshot of the VSCode window showing (1) Dev Container: fossbot-text-to-cmd in the bottom-left status bar, (2) the integrated terminal with the output of python --version and python -m src.text_to_wheels --help, (3) src/text_to_wheels.py open in an editor tab.
💡 Tip: devcontainer.json has a second mode - instead of build.dockerfile you can use dockerComposeFile + service to point at a compose file and name one of its services as your dev environment. The other services in the same compose file come up alongside.
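As a rough illustration of that second mode, the sketch below generates a Compose-based devcontainer.json. `dockerComposeFile`, `service` and `workspaceFolder` are real devcontainer.json properties, but the service name `dev` and the `/workspace` path are placeholders: the named service has to exist in your compose file and has to mount the project at that path itself, because VSCode does not add the workspace mount automatically in this mode.

```python
import json


def compose_devcontainer(name: str, compose_file: str, service: str, workspace_folder: str) -> dict:
    """Build a Compose-based devcontainer.json as a dict (illustrative helper)."""
    return {
        "name": name,
        "dockerComposeFile": compose_file,    # path relative to .devcontainer/
        "service": service,                   # compose service VSCode attaches to
        "workspaceFolder": workspace_folder,  # folder VSCode opens inside that service
        "customizations": {"vscode": {"extensions": ["ms-python.python"]}},
    }


config = compose_devcontainer(
    name="fossbot-text-to-cmd",
    compose_file="../docker-compose.yml",
    service="dev",                  # hypothetical service name
    workspace_folder="/workspace",  # hypothetical mount point
)
print(json.dumps(config, indent=2))
```

Note that there is no `build` key here: the image, or its build section, comes from the compose file instead.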
Step 9 - Optional bonus: run the container on the FOSSBot
If your FOSSBot:v2 has Docker installed, you can ship the image you built on the workstation to the robot and run the classifier on the robot itself - no rebuild, just transfer.
The lab FOSSBot:v2 platform is a Raspberry Pi 5 (8 GB RAM, ARM64) running Ubuntu Server 24.04 LTS. SSH into it as:
- hostname: fossbotrpi1.local
- user: admin
- password: !F055b0t
The fossbot-text-to-cmd:latest image you built in Step 4 is the CPU variant,
which is exactly what you want here: small, fast to transfer, and
self-contained.
Architecture caveat: docker save | docker load copies the bytes as-is; it does not cross-compile. Your workstation is x86_64 and the lab Pi is ARM64 - if you ship the workstation build straight to the robot, the image will load but fail to start with exec format error. Rebuild the image for ARM64 before shipping:
If you completed Step 7 task 5a, revert the change in /tmp/fossbot-text-to-cmd/Dockerfile. Find the line:
    RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cu121
and change cu121 back to cpu:
    RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu
The CUDA wheel only exists for x86_64, and the FOSSBot has no NVIDIA GPU anyway.
Register cross-platform emulators on the workstation (one-time, uses QEMU under the hood):
    docker run --privileged --rm tonistiigi/binfmt --install all
Build for ARM64 and reuse the same :latest tag:
    docker buildx build --platform linux/arm64 -t fossbot-text-to-cmd:latest --load .
The build runs through QEMU emulation and is significantly slower than a native build - expect 10-30 minutes for the torch and sentence-transformers wheels.
Ship the image to the robot as a single pipe - docker save writes a tarball to stdout, docker load on the other end reads it from stdin:
    docker save fossbot-text-to-cmd:latest | ssh admin@fossbotrpi1.local docker load
Run the classifier on the robot, writing the JSON to a host path on the robot:
    ssh admin@fossbotrpi1.local "mkdir -p /tmp/out && docker run --rm \
      -v /tmp/out:/output \
      fossbot-text-to-cmd:latest \
      python -m src.text_to_wheels \
      --input data/examples/basic.txt \
      --output /output/basic_sklearn.json \
      --classifier sklearn"
--rm makes the container delete itself once the run finishes. The JSON stays on the robot under /tmp/out/basic_sklearn.json.
Drive the wheels. Feed the JSON into your robot driver from Lab 2 - each entry’s wheels: {left, right} field maps directly to the speeds the driver expects. The mapping itself is in src/wheel_mapping.py so you can read out the speed for any predicted action.
Remove the image from the robot when you no longer need it (the workstation copy stays untouched):
ssh admin@fossbotrpi1.local docker rmi fossbot-text-to-cmd:latest
The full lifecycle - build once on the workstation, ship as a tarball, run on the robot, clean up - is the same pattern you would use to deliver a containerised application to any machine without a registry.
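The architecture caveat from this step can be checked before you spend time on the transfer. `docker image inspect --format '{{.Architecture}}' fossbot-text-to-cmd:latest` reports the image architecture in Docker's naming (amd64, arm64), while `uname -m` on the target reports the kernel's naming (x86_64, aarch64). The sketch below maps one to the other; the helper names are made up and the table covers only the two architectures used in this lab.

```python
# `uname -m` style names -> Docker architecture names (only the cases used in this lab).
_MACHINE_TO_DOCKER_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def docker_arch(machine: str) -> str:
    """Translate a machine name as printed by `uname -m` into Docker's architecture name."""
    try:
        return _MACHINE_TO_DOCKER_ARCH[machine.lower()]
    except KeyError:
        raise ValueError(f"unmapped machine type: {machine!r}") from None


def can_run(image_arch: str, target_machine: str) -> bool:
    """True if an image of image_arch runs natively on a host reporting target_machine."""
    return image_arch == docker_arch(target_machine)


print(can_run("amd64", "aarch64"))  # False: workstation build shipped to the Pi
print(can_run("arm64", "aarch64"))  # True: image rebuilt with --platform linux/arm64
```

A `False` here is the situation that ends in `exec format error` on the robot.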
Step 10 - Cleanup
Leave the dev container (if Step 8 is still open): in VSCode open the Command Palette and run Dev Containers: Reopen Folder Locally. The window reloads as a normal VSCode window on the host.
Tear down everything you created in this lab in one chained command. From any directory:
    docker compose -p fossbot-text-to-cmd down --volumes 2>/dev/null; \
    docker rm -f myweb 2>/dev/null; \
    docker image rm fossbot-text-to-cmd:latest fossbot-text-to-cmd:gpu 2>/dev/null; \
    docker image rm $(docker images --filter "reference=vsc-fossbot-text-to-cmd*" -q) 2>/dev/null; \
    rm -rf /tmp/fossbot-text-to-cmd /tmp/host-output /tmp/host-input.txt
The five sub-commands tear down (in order): the Compose stack and its named volume from Step 6, the standalone myweb container from Step 3, the two project images you built (:latest and the optional :gpu from Step 7), the VSCode-built dev container image (tagged vsc-fossbot-text-to-cmd-...), and the host directories.
Expected result:
docker images | grep fossbot-text-to-cmd returns nothing,
docker ps -a | grep myweb returns nothing, and
/tmp/fossbot-text-to-cmd no longer exists. The shared base
images (python:3.12-slim, nginx:alpine, …) are
still on disk for the next student.
9. Analysis Questions
Answer each in 3-5 sentences. Refer to specific commands, files or observations from the lab where relevant.
- Bind mounts vs named volumes. What is the single key difference between a bind mount and a named volume that decides which one to use? Give one realistic situation for each.
After attempting it yourself, you may review the suggested answer
The key difference is portability. A bind mount is
tied to the host filesystem - your compose file or
docker run -v only works on machines where that exact host
path exists. A named volume is referred to by name; Docker creates it
locally on whichever machine runs the workload, so the same compose file
works the same on dev, staging and production. Bind mount fits cases
where the host path is the point - a dev workspace you edit live, an
output folder you cat from the host. Named volume fits
cases where the workload should run unchanged on any machine - a
database’s storage, a shared model cache between containers.
- Layer caching. Look at the order of instructions in the Dockerfile from Step 4: FROM, WORKDIR, RUN pip install torch, COPY requirements.txt, RUN pip install -r requirements.txt, COPY src/, COPY data/. If you moved COPY src/ above RUN pip install -r requirements.txt, how would the rebuild behaviour change the next time you only edited one .py file inside src/? Why?
After attempting it yourself, you may review the suggested answer
Editing a .py file would invalidate the cache for
COPY src/, and Docker invalidates every layer
after a changed one - so the expensive
pip install -r requirements.txt would re-run on every
rebuild. Rule: put stable + expensive layers (dependencies)
before frequently-changing + cheap layers (source
code).
- Containers vs virtual machines. Step 2 listed concrete advantages of containers over VMs, but containers are not always the better choice. Name one task where a full VM is genuinely the better option despite slower boot time and larger size, and explain why containers are not enough there.
After attempting it yourself, you may review the suggested answer
A VM is needed whenever you require a different OS kernel from the host, kernel-level changes, or stronger isolation. Examples: running Windows on a Linux host, experimenting with kernel modules, or running untrusted code where the hypervisor boundary matters. Containers share the host kernel and cannot help with any of those.
- GPU passthrough is two-sided. For a container to actually use the host GPU, two things must be set up. Name both sides and say who controls each - the image author or the person running the container.
After attempting it yourself, you may review the suggested answer
- Host side - NVIDIA driver + NVIDIA Container Toolkit + --gpus all on docker run. Controlled by the person running the container.
- Image side - the framework inside must be a GPU-capable build (e.g. PyTorch installed from the cu121 wheel index, not the cpu index). Controlled by the image author through the Dockerfile.
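The layer-caching answer above can be replayed with a toy model. The sketch treats the Dockerfile as an ordered list of instructions and applies the one rule from the answer: the first instruction whose input changed, and every instruction after it, is rebuilt. This is a deliberate simplification of Docker's build cache, and the function name `rebuilt_layers` is made up.

```python
def rebuilt_layers(instructions: list[str], changed: set[str]) -> list[str]:
    """Return the instructions that re-run: the first changed one and everything after it."""
    for index, instruction in enumerate(instructions):
        if instruction in changed:
            return instructions[index:]
    return []  # nothing changed, everything comes from the cache


original = [
    "FROM", "WORKDIR", "RUN pip install torch", "COPY requirements.txt",
    "RUN pip install -r requirements.txt", "COPY src/", "COPY data/",
]
# Same instructions, with COPY src/ moved above the requirements install.
reordered = [
    "FROM", "WORKDIR", "RUN pip install torch", "COPY requirements.txt",
    "COPY src/", "RUN pip install -r requirements.txt", "COPY data/",
]

edited = {"COPY src/"}  # you touched one .py file inside src/
print("original :", rebuilt_layers(original, edited))
print("reordered:", rebuilt_layers(reordered, edited))
```

With the original order only the two cheap COPY layers re-run; with the reordered file the expensive `RUN pip install -r requirements.txt` is in the rebuilt list as well.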
10. Submission Requirements
1. A screenshot of docker ps showing the myweb container alongside the curl http://localhost:8088 output from Step 3.
2. A screenshot of the final docker build lines (with the naming to docker.io/library/fossbot-text-to-cmd:latest line) and docker images fossbot-text-to-cmd from Step 4.
3. A screenshot of docker compose ps -a with all four services in Exited (0) from Step 6.
4. A screenshot of the VSCode dev container window from Step 8 showing the Dev Container: ... status bar, the integrated terminal, and an open editor tab on src/text_to_wheels.py.
5. Short answers (3-5 sentences each, as requested in Section 9) to the four analysis questions.
11. References and Open Licence
- Docker official documentation - https://docs.docker.com/
- VSCode Dev Containers documentation - https://code.visualstudio.com/docs/devcontainers/containers
- Compose file specification - https://docs.docker.com/compose/compose-file/
The Creative Commons Attribution 4.0 International (CC BY 4.0) license allows users to share, copy, distribute, and adapt the work, even for commercial purposes, as long as proper credit is given to the original creator.
EU funding disclaimer
Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Education and Culture Executive Agency (EACEA). Neither the European Union nor EACEA can be held responsible for them.