Mastering GitLab CI/CD: From Beginner to Pro in Pipeline Creation
In the world of modern software development, terms like DevOps, CI/CD, and GitLab frequently arise as essential components of code deployment. This guide seeks to clarify what CI/CD entails and how to construct a CI/CD pipeline in GitLab from the ground up.
Prerequisites: Familiarity with Git and basic Git commands.
Understanding CI/CD
CI/CD stands for Continuous Integration and Continuous Delivery (or Continuous Deployment). It is a strategy for delivering regular code updates to users by automating the phases of the software development lifecycle.
In essence, it automates all stages, such as integrating code, testing, linting, and deployment. This approach aims to reduce human error during these processes and significantly accelerate the integration and deployment of code. With CI/CD practices, the days of extensive pre-deployment planning are behind us; now, code can be deployed to production multiple times a day, showcasing the efficiency of this method.
Important Note: CI/CD is not a technology but rather a methodology or set of guidelines aimed at enhancing and automating the software delivery process.
What is GitLab?
GitLab is a comprehensive web-based DevOps platform that allows professionals to carry out various project-related activities, including project planning, source code management, monitoring, and security.
Building Your First CI/CD Pipeline in GitLab
Let's dive into creating your first CI/CD pipeline in GitLab while exploring its various concepts.
- Create a GitLab Account: To begin, you will need a GitLab account to set up a project and a CI/CD pipeline. GitLab offers a free account option, which is adequate for our needs.
- Create a Project/Repository: A project is essential for storing files, planning work, and collaborating on code. Log into GitLab, navigate to "Create New Project," and select "Create blank project."
- Clone the Project: Developers typically write code on their local machines before pushing it to the server. Start by cloning the project to your local machine, create a branch, and push your changes back to the GitLab server. If you're new to Git, I recommend reviewing the "Must Know Git Commands" article.
- Pipelines: Pipelines are the fundamental components of CI/CD, consisting of a series of elements, each performing specific tasks. Below is an example of a pipeline that starts with process 'A' and ends with process 'C'.
In GitLab, pipelines are defined in the .gitlab-ci.yml file located in the root directory of the repository. This YAML file specifies the pipeline that will run when changes are made to the code in the repository, detailing the stages, tasks, and their execution order.
- Stages: Stages can be viewed as processes that execute specific jobs when code changes occur in the repository. They determine the order in which jobs or groups of jobs are executed. GitLab includes five default stages: .pre, build, test, deploy, and .post. You can also define additional stages in the .gitlab-ci.yml file. The .pre stage is the first to execute, while .post is the last.
stages:
  - build
  - run_python
  - test
  - deploy
  - post_testing
In this example, we have added two new stages: run_python and post_testing, establishing their execution order as build >> run_python >> test >> deploy >> post_testing.
- Jobs: Jobs are the individual tasks executed within a stage when a code change is detected in the repository. They serve as the foundational elements of CI/CD pipelines. If no specific stage is assigned to a job, it defaults to the test stage.
Jobs are defined in the .gitlab-ci.yml file, beginning with the job name, followed by the stage it belongs to, and then the tasks specified under the script keyword.
stages:
  - build
  - run_python
  - test
  - deploy
  - post_testing
echo_ap:
  stage: build
  script:
    - echo 'Hey, I am in the build stage'
In this instance, we have created a job named echo_ap within the build stage that outputs the message 'Hey, I am in the build stage.' Clone your project repository, create a local branch, and add the above code to the .gitlab-ci.yml file at the root of your repository.
After committing your code and pushing the changes to the GitLab server, the CI/CD pipeline will trigger and execute the echo_ap job.
We have successfully established our first CI/CD pipeline in GitLab!
Note: Jobs that belong to the same stage run in parallel, provided enough runners are available to pick them up.
Next, let's explore another CI/CD pipeline. Create a Python file named myname.py and add the print statement print('Hey, I am a Python script').
stages:
  - build
  - run_python
  - test
  - deploy
  - post_testing
echo_ap:
  stage: build
  script:
    - echo 'Hey, I am in the build stage, My name is echo_ap'
echo_kp:
  stage: build
  script:
    - echo 'Hey, I am in the build stage, My name is echo_kp'
print_py:
  stage: run_python
  script:
    - python3 myname.py
Here, we have created three jobs. The two jobs (echo_ap, echo_kp) in the build stage will run simultaneously, while the print_py job in the run_python stage will execute only after all build stage jobs have completed.
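The ordering rule described above can be illustrated with a small Python sketch. This is a toy model written for this article, not GitLab's actual scheduler, and the function name plan_pipeline is made up for the illustration. It groups jobs by stage in the order the stages are declared and falls back to the test stage when a job names no stage.

```python
def plan_pipeline(stages, jobs, default_stage="test"):
    """Return [(stage, [job names])] in the order the stages are declared.

    Toy model only: the jobs listed together for a stage are the ones that
    may run in parallel; the next stage starts after they have all finished.
    """
    plan = {stage: [] for stage in stages}
    for name, config in jobs.items():
        stage = config.get("stage", default_stage)
        if stage not in plan:
            raise ValueError(f"job {name!r} uses undeclared stage {stage!r}")
        plan[stage].append(name)
    # Stages without any jobs are left out of the plan.
    return [(stage, names) for stage, names in plan.items() if names]


stages = ["build", "run_python", "test", "deploy", "post_testing"]
jobs = {
    "echo_ap": {"stage": "build"},
    "echo_kp": {"stage": "build"},
    "print_py": {"stage": "run_python"},
}

for stage, names in plan_pipeline(stages, jobs):
    print(f"{stage}: {', '.join(names)}")
# build: echo_ap, echo_kp
# run_python: print_py
```

The test, deploy, and post_testing stages do not appear in the output because no job is assigned to them.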
- Artifacts: By default, jobs operate independently, meaning they do not share data. Artifacts are the outputs generated by jobs, such as files, directories, binaries, and dependencies. Files that a job produces are discarded when the job finishes unless they are declared as artifacts in the .gitlab-ci.yml.
Create two Python files in the root directory:
# write.py
print('Hey, I am a Python script')

# read.py
with open('ap.txt', "r") as f:
    r = f.read()
print(r)
Next, update your .gitlab-ci.yml with the following code and push your changes:
stages:
  - build
  - run_python
  - test
  - deploy
  - post_testing
write_py:
  stage: build
  script:
    - python3 write.py >> ap.txt
  artifacts:
    paths:
      - ap.txt
    expire_in: 1 day
read_py:
  stage: run_python
  script:
    - python3 read.py
In the write_py job, the output from write.py is saved in ap.txt, which is then read by the read_py job. The artifacts keyword specifies which files to retain and for how long: paths lists the files to keep, and expire_in sets how long GitLab stores them before they are deleted.
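If ap.txt is not declared as an artifact, the read_py job will normally not find it, and read.py stops with a FileNotFoundError. A slightly more defensive variant of read.py could look like the sketch below. The helper names read_artifact and main are illustrative and not part of the original script.

```python
from pathlib import Path


def read_artifact(path="ap.txt"):
    """Return the file's text, or None when the file does not exist."""
    artifact = Path(path)
    if not artifact.is_file():
        return None
    return artifact.read_text()


def main(path="ap.txt"):
    """Print the artifact and return an exit code (0 = ok, 1 = missing)."""
    content = read_artifact(path)
    if content is None:
        print(f"{path} not found; is it listed under artifacts:paths in write_py?")
        return 1
    print(content, end="")
    return 0
```

To use this as read.py, end the file with raise SystemExit(main()). A non-zero exit code makes the job fail, so a missing artifact shows up as a failed job with a readable message instead of a traceback.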
- GitLab Runner: Runners are lightweight agents responsible for executing CI/CD jobs. When a CI/CD pipeline is initiated, GitLab reads the .gitlab-ci.yml file and hands each job to an available runner, which clones the repository and runs the job's script. GitLab Runner is open source and written in Go, and it runs as a process on the machine that executes the jobs.
By default, GitLab provides shared runners, but you can also configure your own runners within your infrastructure. The topic of GitLab runners could be explored further, and I plan to write a dedicated article on it.
You can find information about the shared runners in the settings of your repository under: Repository >> Settings >> CI/CD >> Runners
- Variables: Variables hold information that can be accessed throughout the code. They are often used to store sensitive data such as passwords, tokens, keys, and host information, which should not be disclosed in the code.
There are two methods to define and utilize variables in GitLab CI/CD pipelines:
1. Predefined CI/CD Variables: GitLab provides a set of predefined variables that can be directly utilized in jobs without additional specification, e.g., CI_COMMIT_BRANCH (the name of the commit branch) and CI_JOB_NAME (the name of the job). A comprehensive list of these variables is available in the GitLab documentation.
2. Custom Variables: You can create custom CI/CD variables tailored to specific needs, which can be defined in the .gitlab-ci.yml file under the variables keyword.
To illustrate the use of CI/CD variables, create a local branch in your terminal, update the .gitlab-ci.yml with the following code, and push your changes to the GitLab server:
stages:
  - build
  - run_python
  - test
  - deploy
  - post_testing
variables:
  NAME: 'Khusboo'
  MESSAGE: 'I love Data'
echo_variables_global:
  script:
    - echo $NAME
    - echo $MESSAGE
echo_variables_local:
  variables:
    NAME: 'Ayush'
  script:
    - echo $NAME
    - echo $MESSAGE
echo_variables_predefined:
  script:
    - echo $CI_COMMIT_BRANCH
    - echo $CI_JOB_NAME
In this example, we establish two global variables, NAME: 'Khusboo' and MESSAGE: 'I love Data', along with three jobs in the test stage (the default stage in GitLab CI/CD).
The job echo_variables_global will display the variables declared in the top-level variables section, which sits outside any job. This method of variable declaration is referred to as global variable declaration in CI/CD.
Check the logs for the echo_variables_global job.
The job echo_variables_local will override the global variable NAME: 'Khusboo' with NAME: 'Ayush', as we have declared the same variable at the job level. This behavior mirrors local variable declarations in programming languages, where local variables take precedence over global ones.
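The same precedence rule can be written as a dictionary merge in Python. This is only an analogy to show which value wins, not how GitLab resolves variables internally, and the function name effective_variables is made up for the illustration.

```python
def effective_variables(global_vars, job_vars=None):
    """Merge global and job-level variables; job-level values win."""
    merged = dict(global_vars)      # start from the global declarations
    merged.update(job_vars or {})   # job-level entries override them
    return merged


global_vars = {"NAME": "Khusboo", "MESSAGE": "I love Data"}

print(effective_variables(global_vars))
# {'NAME': 'Khusboo', 'MESSAGE': 'I love Data'}
print(effective_variables(global_vars, {"NAME": "Ayush"}))
# {'NAME': 'Ayush', 'MESSAGE': 'I love Data'}
```

MESSAGE is untouched in the second call because the job only redeclares NAME.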
In the echo_variables_predefined job, we echo the predefined GitLab variables CI_COMMIT_BRANCH and CI_JOB_NAME. For instance, my commit branch name may be test, and my job name is echo_variables_predefined. Check the logs to ensure the job displayed the correct values.
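CI/CD variables reach the job as environment variables, so a Python script started by a job can read them as well. The sketch below assumes it is called from a job's script section; the function name describe_job and the "<not set>" placeholder are illustrative. The fallback matters because CI_COMMIT_BRANCH is not set in every pipeline type, for example in tag pipelines.

```python
import os


def describe_job(env=None):
    """Build a one-line summary from GitLab's predefined variables."""
    env = os.environ if env is None else env
    # .get() avoids a KeyError when the script runs outside a pipeline.
    branch = env.get("CI_COMMIT_BRANCH", "<not set>")
    job = env.get("CI_JOB_NAME", "<not set>")
    return f"job {job} on branch {branch}"


print(describe_job())
```

Run on a local machine, where neither variable exists, the script prints the placeholder for both values instead of failing.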
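Sensitive values such as passwords and tokens should not be written into the .gitlab-ci.yml file at all. Instead, define them as CI/CD variables in the project settings (informally often called secrets), where they can also be masked so that they do not appear in job logs. You can define them by navigating to: Project Settings >> CI/CD >> Variables >> Add variable.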
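A variable added in the project settings is exposed to the job as an environment variable, just like the ones declared in the YAML file. Below is a minimal sketch of how a Python script run by a job might read one. DEPLOY_TOKEN is a hypothetical variable name, and require_secret and redact are helper names invented for this example.

```python
import os


def require_secret(name, env=None):
    """Return the variable's value, or fail loudly if it is missing or empty."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"CI/CD variable {name} is not set")
    return value


def redact(value, visible=2):
    """Hide all but the first few characters before logging a secret."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


# Example with a stand-in environment; in a job you would omit the env argument.
token = require_secret("DEPLOY_TOKEN", {"DEPLOY_TOKEN": "abc123"})
print(redact(token))
# ab****
```

Failing early when the variable is missing makes a misconfigured project easier to diagnose than a later authentication error would be.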
Summary: This article aimed to introduce the foundational concepts of the GitLab CI/CD pipeline, making it accessible for those new to the CI/CD landscape to understand and create basic pipelines in GitLab.
What's Next? Utilize your newfound knowledge to build stages, rearrange their order, create new stages, and develop various job types for automated testing, code linting, requirement readiness, etc. Assign jobs appropriately to their respective stages and deploy code to the server as per your project requirements.
If you found this article helpful, please clap and follow my Medium account (datageeks.medium.com). For any questions regarding this topic, feel free to leave a comment.