MLOPS Tutorial- Automating Workflow Of CI/CD for Dockerized Flask App Using Github Action

TL;DR

A two-job GitHub Actions workflow enforces CI quality by running pytest before any Docker image is built or published.

Briefing

A complete CI/CD workflow for a Dockerized Flask app is built using GitHub Actions, with automated unit testing, Docker image creation, and publishing to Docker Hub. The core payoff is that every push to the main branch triggers a pipeline that validates the code with pytest, builds a fresh container image from a Dockerfile, and then pushes that image to Docker Hub using credentials stored as GitHub secrets—eliminating manual steps.

The setup starts with a minimal Flask application (an endpoint returning “Hello World”) and a matching pytest suite. Unit tests are organized so pytest automatically discovers them: test files are named starting with “test_”, and the example test imports the Flask `app` object and asserts both the HTTP status code (200) and the response body. Dependencies are captured in a `requirements.txt`, and a `.gitignore` prevents the local virtual environment folder (created via `conda create -p venv python=3.10` and activated with `conda activate venv/`) from being committed.
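A minimal sketch of the app and its test, combined into one listing for brevity. In the project layout described above they would presumably live in `app.py` and `test_app.py`, with the test doing `from app import app`; the function names `hello` and `test_hello` are illustrative assumptions. The `app.run(...)` call that `python app.py` needs is left out so the listing can be imported and tested without starting a server.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    # Single endpoint returning the plain-text body the test checks for.
    return "Hello World"


def test_hello():
    # Flask's built-in test client issues the request without starting a server.
    client = app.test_client()
    response = client.get("/")
    assert response.status_code == 200
    # response.data is bytes, so it is compared against a bytes literal.
    assert response.data == b"Hello World"
```

When the app is later run inside a container, it would typically be started with `app.run(host="0.0.0.0", port=5000)` under an `if __name__ == "__main__":` guard, because Flask's default bind address of `127.0.0.1` is not reachable through a published container port.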

Next comes containerization. A Dockerfile uses a slim Python base image (`python:3.9-slim`), sets a working directory, copies the app code into the container, installs Flask so the image contains what is needed to run the app, exposes port 5000, and starts the service with `python app.py`. This Dockerfile is essential because the CI job later builds the image from it on a Linux runner.

The GitHub Actions pipeline is defined in a YAML workflow (named `cicd` in the example) under `.github/workflows/`. It triggers on pushes and pull requests targeting the `main` branch. Two jobs structure the automation: **build and test** runs on `ubuntu-latest`, checks out the repository, sets up Python 3.9, installs dependencies (including pytest), and executes `pytest` to validate the unit tests. A second job, **build and publish**, depends on the first job via `needs: build-and-test`, then sets up Docker Buildx, logs into Docker Hub, builds the Docker image, and pushes it.
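The Dockerfile and workflow described above could look roughly like the two files this small scaffolding script writes. It is a sketch under stated assumptions: the action versions (`actions/checkout@v4`, `actions/setup-python@v5`, `docker/setup-buildx-action@v3`, `docker/login-action@v3`, `docker/build-push-action@v5`), the file name `cicd.yml`, the `<image-name>` placeholder, and the `scaffold` helper are illustrative and not taken from the tutorial.

```python
from pathlib import Path

# Dockerfile as described in the text; installing only Flask mirrors the example.
DOCKERFILE = """\
FROM python:3.9-slim
WORKDIR /app
COPY . /app
RUN pip install flask
EXPOSE 5000
CMD ["python", "app.py"]
"""

# Workflow as described in the text. Action versions and <image-name>
# are assumptions; replace the placeholder with the real repository name.
WORKFLOW = """\
name: cicd

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.9"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt pytest
      - name: Run tests
        run: pytest

  build-and-publish:
    needs: build-and-test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - uses: docker/build-push-action@v5
        with:
          context: .
          file: ./Dockerfile
          push: true
          tags: ${{ secrets.DOCKER_USERNAME }}/<image-name>:latest
"""

FILES = {
    "Dockerfile": DOCKERFILE,
    ".github/workflows/cicd.yml": WORKFLOW,
}


def scaffold(repo_root: str) -> list[Path]:
    """Write the Dockerfile and workflow under repo_root and return their paths."""
    written = []
    for relative, content in FILES.items():
        target = Path(repo_root) / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

Calling `scaffold(".")` from the repository root would create both files. The `needs: build-and-test` line is what keeps the publish job from starting until the test job has succeeded.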

Docker Hub authentication is handled securely. The workflow reads `DOCKER_USERNAME` and `DOCKER_PASSWORD` from GitHub Secrets. The value of `DOCKER_PASSWORD` is a Personal Access Token generated on Docker Hub with read/write/delete permissions, then stored in GitHub under repository secrets. During the push step, the image tag follows the Docker Hub naming convention: `<docker-username>/<image-name>:latest`. The transcript renders the example tag as `krishn 06/sl flask test app:latest`; the spaces are a transcription artifact, since Docker image references cannot contain spaces.
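To make the naming convention concrete, here is a small sketch that assembles a `<docker-username>/<image-name>:<tag>` reference and rejects names Docker would not accept. The helper `image_ref` and the sample names are hypothetical, and the regular expression is a simplified approximation of Docker's repository-name rules (lowercase letters and digits, separated by `.`, `_`, `__`, or hyphens), not the full grammar.

```python
import re

# Simplified approximation of one path component of a Docker repository name.
_COMPONENT = re.compile(r"[a-z0-9]+(?:(?:\.|_|__|-+)[a-z0-9]+)*")


def image_ref(username: str, image: str, tag: str = "latest") -> str:
    """Build '<username>/<image>:<tag>' after a basic validity check."""
    for part in (username, image):
        if not _COMPONENT.fullmatch(part):
            raise ValueError(f"invalid name component: {part!r}")
    return f"{username}/{image}:{tag}"
```

For example, `image_ref("alice", "flask-app")` returns `alice/flask-app:latest`, while a name containing spaces or uppercase letters raises `ValueError`.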

A practical troubleshooting moment occurs when the Docker build fails with “failed to solve… no such file or directory” because the YAML didn’t specify the Dockerfile path. Adding `file: ./Dockerfile` (or the correct Dockerfile location) fixes the build. After successful runs, the published image can be pulled locally and executed with `docker run -p 5000:5000 <image>:latest`, returning “Hello World” from the container.

Finally, an optional third job is demonstrated that builds the Docker image standalone (without publishing) to confirm the Docker build is correct. The overall result is a developer workflow where code changes automatically trigger testing and container publishing, and a failing test run or a broken Docker build stops a bad image from reaching Docker Hub.

Cornell Notes

The workflow automates CI/CD for a Dockerized Flask app using GitHub Actions. On every push or pull request to the `main` branch, a **build and test** job runs on `ubuntu-latest`, sets up Python 3.9, installs dependencies, and executes `pytest` to validate the Flask endpoint behavior. A dependent **build and publish** job then builds a Docker image from the Dockerfile and pushes it to Docker Hub, using `DOCKER_USERNAME` and `DOCKER_PASSWORD` stored as GitHub Secrets (with the password coming from a Docker Hub Personal Access Token). The pipeline ensures only code that passes tests produces and publishes a new container image. A common failure—Dockerfile not found in the build step—is resolved by specifying the Dockerfile path in the YAML.

How does the pipeline ensure code quality before publishing a Docker image?

It splits work into two jobs: **build and test** and **build and publish**. The publish job uses `needs: build-and-test`, so Docker image building/pushing won’t run unless pytest passes. The test suite is discovered by pytest conventions: test files start with `test_`, and the example test imports the Flask `app` and asserts `response.status_code == 200` and `response.data == b'Hello World'` (Flask returns the body as bytes, matching the endpoint behavior).

What file and naming conventions make pytest automatically find unit tests in this setup?

pytest finds test modules and test functions by naming patterns. By default it collects files matching `test_*.py` or `*_test.py`; the transcript emphasizes the `test_` prefix. Inside those files it collects functions whose names start with `test`. The example uses a test function that calls `app.test_client().get('/')` and then uses `assert` checks.
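The default discovery rules can be mimicked in a few lines. This is only an illustration of the naming patterns, not how pytest itself is implemented; the helper names `is_test_module` and `is_test_function` are made up for this sketch.

```python
from fnmatch import fnmatchcase

# pytest's default file patterns (the `python_files` setting).
DEFAULT_FILE_PATTERNS = ("test_*.py", "*_test.py")


def is_test_module(filename: str) -> bool:
    """Return True if the file name matches one of pytest's default patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in DEFAULT_FILE_PATTERNS)


def is_test_function(name: str) -> bool:
    """Return True if the function name has pytest's default `test` prefix."""
    return name.startswith("test")
```

Under these rules `test_app.py` is collected and `app.py` is not, which is why the example keeps the application and its tests in separately named files.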

What does the Dockerfile do, and why is it central to the CI/CD pipeline?

The Dockerfile defines how the Flask app runs inside a container. It uses `python:3.9-slim`, sets `WORKDIR /app`, copies the project files into the image, installs required Python packages (the example installs Flask), exposes port `5000`, and starts the app with `python app.py`. GitHub Actions later builds the image from this Dockerfile; if the Dockerfile path isn’t correctly referenced in the YAML, the build fails.

How are Docker Hub credentials handled securely in GitHub Actions?

The workflow stores credentials as GitHub repository secrets: `DOCKER_USERNAME` and `DOCKER_PASSWORD`. The password is not the Docker Hub account password; it’s a Docker Hub Personal Access Token generated under account settings. In the workflow, the Docker login step reads these secrets so the pipeline can authenticate and push images without exposing credentials in the YAML.

Why did the Docker build fail once, and what fixed it?

The build failed with an error indicating it couldn’t find the Dockerfile (“failed to solve… no such file or directory”). The cause was that the YAML didn’t specify the Dockerfile location. Adding the Dockerfile path (e.g., `file: ./Dockerfile`) to the Docker build/push configuration corrected the build so the image could be created and pushed.

How can someone verify the published image actually works after CI/CD runs?

After the pipeline pushes to Docker Hub, the image can be pulled locally with `docker pull <username>/<image>:latest`. Then it can be run with `docker run -p 5000:5000 <username>/<image>:latest`. Opening `localhost:5000` should return the Flask response (“Hello World”), confirming the containerized app behaves as expected.
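The manual browser check can also be scripted. The sketch below assumes the container is already running via the `docker run -p 5000:5000 ...` command above and that the endpoint returns exactly `Hello World`; the function name `check_endpoint` and its defaults are hypothetical.

```python
from urllib.request import urlopen


def check_endpoint(
    url: str = "http://localhost:5000/",
    expected: bytes = b"Hello World",
    timeout: float = 5.0,
) -> bool:
    """Return True if the URL answers 200 with exactly the expected body.

    urlopen raises URLError when nothing is listening and HTTPError for
    4xx/5xx responses, so those cases surface as exceptions.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.status == 200 and response.read() == expected
```

Calling `check_endpoint()` after starting the container gives a quick pass/fail signal without opening a browser.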

Review Questions

In the GitHub Actions workflow, what mechanism prevents the Docker image from being published when unit tests fail?
Which pytest naming conventions are required for the unit tests to be discovered and executed by the CI job?
What YAML parameter must be set to avoid Docker build errors when the Dockerfile path isn’t automatically detected?

Key Points

1. A two-job GitHub Actions workflow enforces CI quality by running pytest before any Docker image is built or published.
2. pytest discovery relies on test file naming (files starting with `test_`) and test functions that assert expected Flask responses.
3. The Dockerfile containerizes the Flask app using `python:3.9-slim`, exposes port 5000, and launches with `python app.py`.
4. Docker Hub publishing uses GitHub Secrets (`DOCKER_USERNAME`, `DOCKER_PASSWORD`) populated from a Docker Hub Personal Access Token.
5. The publish job depends on the test job via `needs`, preventing broken builds from reaching Docker Hub.
6. Docker build failures like “Dockerfile not found” are fixed by explicitly specifying the Dockerfile path in the YAML (e.g., `file: ./Dockerfile`).
7. Verification is done by pulling the pushed image and running it locally with port mapping to confirm the endpoint returns “Hello World.”

Highlights

The workflow publishes to Docker Hub only after unit tests pass, enforced by `needs: build-and-test` between jobs.

Docker Hub authentication is handled through GitHub Secrets using a Personal Access Token rather than a raw password.

A common CI/CD failure—Dockerfile path missing—was resolved by adding the Dockerfile location to the Docker build step.

Topics

Mentioned

Krish Naik
CI/CD