Define your CI/CD pipeline with Argo Workflows

Zubair Haque
6 min read · Mar 19, 2023


Argo Workflows is an open-source, container-native workflow engine for orchestrating your CI/CD jobs on Kubernetes. It’s implemented as a Kubernetes Custom Resource Definition (CRD), which enables contributors to create custom API objects to extend the capabilities of Kubernetes in a compliant manner.

But why Argo Workflows?

Argo Workflows is designed to run on top of Kubernetes rather than on other platforms such as VMs or standalone cloud services. Let’s take a minute to highlight the benefits and drawbacks of using Kubernetes as the platform for your workflows.

Argo Workflows is implemented as a Kubernetes Custom Resource Definition (CRD), which enables you to:

  • Define Kubernetes workflows in which each step runs in its own container.
  • Model workflows as directed acyclic graphs (DAGs) to capture dependencies between steps (see the sketch after this list).
  • Run compute-intensive data processing or machine learning tasks quickly and easily.
  • Natively run CI/CD pipelines on Kubernetes without configuring a complex application development solution.
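
For example, dependencies can be modeled with a DAG template. Here is a minimal, hypothetical sketch of a diamond-shaped graph, where B and C run in parallel after A, and D runs once both have finished (all task and template names are illustrative):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-example-
spec:
  entrypoint: main
  templates:
  - name: main
    dag:
      tasks:
      - name: A
        template: echo
      - name: B
        dependencies: [A]
        template: echo
      - name: C
        dependencies: [A]
        template: echo
      - name: D
        dependencies: [B, C]
        template: echo
  # a trivial leaf template reused by every task
  - name: echo
    container:
      image: alpine:3.18
      command: [echo, "step done"]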

CI/CD using Argo Workflows

Argo Workflows is the right fit for your needs if you require:

  • Resilience to handle container crashes and failures.
  • Autoscaling options that can manage large numbers of workflows simultaneously.
  • Comprehensive enterprise features such as Role-Based Access Control (RBAC) and Single Sign-On (SSO).

You may want to avoid using Argo Workflows if:

  • You anticipate heavy YAML maintenance: complexity can grow with the number of workflows and infrastructure requirements, although Argo offers templating features to help manage this.
  • Your team does not have experience with containers and Kubernetes.
  • You need a full enterprise setup, which involves managing a large number of configuration options.

Understanding the Key Components of Argo Workflows

The core concept of Argo Workflows is the Workflow CRD: a live Kubernetes object that both defines the workflow to execute and stores its state as it runs:

  • The workflow.spec contains a list of templates and an entrypoint, which names the first template to execute.
  • Templates are defined like functions and come in several types, such as container, script, and resource templates.

The container template is the most commonly used. The script template lets you define a script inline and execute it. The resource template creates or modifies Kubernetes resources, while the suspend template pauses workflow execution, either for a defined duration or until it is resumed manually through the Argo UI. Templates are composed with steps, which use nested lists to run in sequence or in parallel, and their behavior can be customized through various settings.
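
As an illustration, a script template embeds the script inline. The following is a minimal sketch (the template name and script body are invented for this example):

# generates a random number and prints it to stdout,
# which Argo captures as the step's output result
- name: random-number
  script:
    image: python:3.11
    command: [python]
    source: |
      import random
      print(random.randint(1, 100))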

Installation

To get started, download the Argo Workflows CLI binary to your machine (here we use the macOS amd64 build; pick the binary that matches your platform from the releases page):

$ curl -sLO https://github.com/argoproj/argo-workflows/releases/download/v3.4.5/argo-darwin-amd64.gz

Now we need to decompress the file we downloaded with the command above:

$ gunzip argo-darwin-amd64.gz

Once that is done, we need to modify the permissions of the file so it can be run as a program:

$ chmod +x argo-darwin-amd64

Next, move the file from the current directory to the /usr/local/bin directory (you may need sudo for this):

$ mv ./argo-darwin-amd64 /usr/local/bin/argo

This will allow you to execute the argo command from anywhere in the terminal. To verify that Argo Workflows was installed successfully, run the following command:

$ argo version

The output will display the installed version number, confirming that everything is good to go.

Running your sample application

In this section, we will deploy a workflow for a simple Python app using Argo Workflows. We’ll start by defining a build step, then create a workflow that tests and deploys the app. We’ll run all of this on a local minikube cluster, which you can start with the following command:

$ minikube start

Once minikube is up, create the argo namespace and then apply the following Kubernetes manifest, which creates the various Kubernetes resources Argo Workflows needs to do its magic:
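
$ kubectl create namespace argo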

$ kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.4.5/install.yaml

We’ve got everything we need to get started. Our cluster now has:

  • The Custom Resource Definitions (CRDs) needed to create and run workflows.
  • RBAC roles and bindings for secure, controlled access to Kubernetes resources.
  • A ConfigMap that configures the workflow controller, which is responsible for executing workflows.
  • A PriorityClass that gives the Argo workflow controller higher priority, ensuring it has the resources it needs to run in the cluster.

A key thing to note here is that Argo Workflows defines your CI/CD workflow in a YAML file that specifies the following:

  • steps: one or more steps to run, such as build, test, and deploy.
  • dependencies: any dependencies needed to run your workflow.
  • parameters: input parameters that customize your workflow’s behavior and can be passed in when triggering it, making workflows flexible and reusable (see the sketch after this list).
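
For instance, a workflow-level parameter can be declared with a default and referenced from a template. This is a minimal, hypothetical sketch:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: param-example-
spec:
  entrypoint: greet
  arguments:
    parameters:
    # default value, overridable at submission time
    - name: message
      value: "hello"
  templates:
  - name: greet
    container:
      image: alpine:3.18
      command: [echo, "{{workflow.parameters.message}}"]

You can override the default when you trigger the workflow, for example with argo submit param-example.yaml -p message="hi there".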

Now that we understand what Argo Workflows can do, we can start creating and managing our workflows as code. Let’s create a Python 3 application workflow with three steps (building, testing, and deploying). First, create a file named python-app.yaml inside your app directory and add the workflow definition:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: python-app
spec:
  entrypoint: python-app
  templates:
  - name: python-app
    steps:
    - - name: build
        template: build
    - - name: test
        template: test
    - - name: deploy
        template: deploy
  - name: build
    container:
      image: python:3.11
      command: [python]
      args: ["-c", "print('build')"]
  - name: test
    container:
      image: python:3.11
      command: [python]
      args: ["-m", "unittest", "discover", "-s", "/app/tests"]
      volumeMounts:
      - name: test-volume
        mountPath: /app/tests
    volumes:
    - name: test-volume
      hostPath:
        path: /path/to/tests
  - name: deploy
    container:
      image: python:3.11
      command: [python]
      args: ["-c", "print('deploy')"]

In this example, the Workflow CRD defines the CI/CD pipeline, and each step in the template has a specific purpose:

  • The build step stands in for building the image with the latest changes; in this demo it runs a Python 3.11 container and simply prints build.
  • The test step mounts a volume with test files and runs unit tests with Python’s unittest library.
  • The deploy step runs the Python container and prints deploy. In a real pipeline, this step would push the tested image to a container registry, such as AWS ECR or Harbor, and then deploy it to the production environment (a hypothetical sketch follows this list).
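
As a rough sketch of what a more realistic deploy template could look like, the step might run kubectl to roll out the newly pushed image. The registry URL, deployment name, and container name below are all hypothetical:

- name: deploy
  container:
    image: bitnami/kubectl:1.26
    command: [kubectl]
    # points the running deployment at the image that was just built and tested
    args: ["set", "image", "deployment/python-app", "app=registry.example.com/python-app:latest"]

In practice, the workflow’s ServiceAccount would also need RBAC permissions to update the deployment.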

To apply this file, use the kubectl apply command with the -f flag followed by the path to the YAML file:

$ kubectl apply -f python-app.yaml

This will create an Argo Workflow named python-app along with its templates and steps. Now, let’s execute the argo get command to retrieve information about the workflow, including its status and progress:

$ argo get python-app
Name: python-app
Namespace: default
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Succeeded
Conditions:
PodRunning False
Completed True
Created: Mon Mar 20 16:23:06 -0500 (37 seconds ago)
Started: Mon Mar 20 16:23:06 -0500 (37 seconds ago)
Finished: Mon Mar 20 16:23:36 -0500 (7 seconds ago)
Duration: 30 seconds
Progress: 3/3
ResourcesDuration: 10s*(1 cpu),10s*(100Mi memory)

STEP            TEMPLATE    PODNAME                      DURATION  MESSAGE
 ✔ python-app   python-app
 ├───✔ build    build       python-app-build-3831382643  4s
 ├───✔ test     test        python-app-test-461837862    4s
 └───✔ deploy   deploy      python-app-deploy-988129288  4s

Argo Workflows benefits

Argo Workflows is a powerful tool for managing and automating complex workflows in a containerized deployment environment. It offers several benefits, including:

  • Easy version control and modification of workflows as code.
  • Integration with Argo Events for triggering workflows based on events.
  • Automatic retries and error handling for reliable, robust workflow execution (see the sketch after this list).
  • Seamless integration with Kubernetes and the surrounding cloud-native ecosystem.
  • Support for parallel and sequential execution, enabling faster and more efficient workflows.
  • Extensive monitoring and logging capabilities, including integration with Prometheus and Grafana.
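
For example, retries are configured per template with a retryStrategy. Here is a minimal sketch, using a contrived command that always fails so the retry behavior is visible:

- name: flaky-step
  retryStrategy:
    limit: "3"           # retry up to three times
    retryPolicy: Always  # retry on both failures and errors
  container:
    image: alpine:3.18
    command: [sh, -c, "exit 1"]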

Conclusion

In conclusion, if your DevOps team is looking to simplify the management and automation of complex containerized workflows, Argo Workflows is an essential tool. When deciding whether to adopt it, weigh your team’s cloud and Kubernetes experience, its size, and your growth targets; most teams will land somewhere in the middle. If you’re building platforms on top of Kubernetes, Argo Workflows and Tekton CD can be great tools for platform engineers to build abstractions and hide the pain of maintaining YAML files from dev teams.
