---
title: 'Git Pipelines'
author: 'frank.jung@marlo.com.au'
date: '6 February 2019'
output:
  html_document: default
---
Git has become the de facto standard for version control, and this has given
rise to many vendors hosting Git repositories. Each vendor provides core Git
functionality such as branching, pull requests and project membership. There is
now growing competition to also provide facilities for Continuous Integration /
Continuous Delivery (CI/CD). This is supported by pipelines: extensible suites
of tools to build, test and deploy source code. Even data hosting sites like
Kaggle now support pipelines.
This article provides a brief summary of some pipeline features from three
popular hosting sites: GitLab, Bitbucket and GitHub. The markdown for this
article is used to demonstrate the pipeline features of each of these Git
repositories.
The features being tested include pipeline configuration, build triggers,
custom build environments and artifact publishing. The workflow is the same in
each case: commit a change, let the pipeline render this markdown to HTML using
pandoc and Make, then archive the rendered page as a build artifact.
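The project's Makefile itself is not reproduced in this article. As a rough
guide only, a minimal render rule might look like the following sketch; the
pandoc flags and use of the repository's article.css stylesheet are
assumptions, not taken from the actual Makefile:

```makefile
# Hypothetical sketch of the render rule; the real Makefile may differ.
TARGET := README.html

$(TARGET): README.md article.css
	pandoc --standalone --css article.css --from markdown --to html5 --output $@ $<

.PHONY: clean
clean:
	rm -f $(TARGET)
```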
Here are other source repositories offering pipelines that you may also like
to try:
GitLab pipelines are a well-integrated tool. CI/CD pipelines are easily
accessed from the sidebar:

Viewing jobs gives you a pipeline history:

The YAML configuration, `.gitlab-ci.yml`, looks like this:

```yaml
image: conoria/alpine-pandoc

variables:
  TARGET: README.html

stages:
  - build

before_script:
  - apk update
  - apk add make

render:
  stage: build
  script:
    - make $TARGET
  artifacts:
    paths:
      - $TARGET
```
Where:

- `image` - specifies a custom Docker image from Docker Hub
- `variables` - defines a variable to be used in all jobs
- `stages` - declares the jobs to run
- `before_script` - commands to run before all jobs
- `render` - the name of a job associated with a stage; jobs of the same stage are run in parallel
- `stage` - associates a job with a stage
- `script` - commands to run for the job
- `artifacts` - paths to objects to archive; these can be downloaded if the job succeeds

What this pipeline configuration does is prepare an Alpine pandoc image,
install make, then run the render job which makes the given target.

GitLab is easy to configure and easy to navigate. There are many other
features, including scheduling pipelines and configuring jobs by branch. One
feature that I have used on Maven / Java projects is caching the `.m2`
directory. This speeds up the build: rather than starting from a completely new
environment on each build, you can leverage previously cached artefacts. GitLab
also provides a clear-cache button on the pipeline page.
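As a rough sketch only (the cache key and paths are illustrative assumptions
for a Maven project, not part of this article's pipeline), such a cache might
be configured in `.gitlab-ci.yml` like this:

```yaml
# Illustrative only: cache the local Maven repository between pipeline runs.
# Maven must also be pointed at this path, e.g. with
# MAVEN_OPTS: "-Dmaven.repo.local=.m2/repository"
cache:
  key: "$CI_COMMIT_REF_SLUG"   # keep one cache per branch
  paths:
    - .m2/repository
```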
GitLab also provides additional services that can be integrated with your
project, for example Jira tracking, Kubernetes and Prometheus monitoring.
The example is publicly available here.
Bitbucket's configuration is similar to that of GitLab. The pipeline and
settings are easily navigated to using the sidebar.

The pipeline configuration is similar, but there are important differences.
Below is the configuration file `bitbucket-pipelines.yml`:
```yaml
pipelines:
  branches:
    master:
      - step:
          name: render
          image: conoria/alpine-pandoc
          trigger: automatic
          script:
            - apk update && apk add make curl
            - export TARGET=README.html
            - make -B ${TARGET}
            - curl -X POST --user "${BB_AUTH_STRING}" "https://api.bitbucket.org/2.0/repositories/${BITBUCKET_REPO_OWNER}/${BITBUCKET_REPO_SLUG}/downloads" --form files=@"${TARGET}"
```
Here the pipeline will be triggered automatically on commits to the master
branch. A Docker image can be defined at the level of the pipeline step.
Variables can be defined and read from the Bitbucket settings page, which is
useful for recording secrets that you don't want exposed in your source code.
However, internal script variables are set via the script language, which here
is Bash. Finally, in order for the build artefacts to be preserved after the
pipeline completes, you can publish them to a downloads location. This requires
that a secure variable be configured, as described here. If you don't, the
pipeline workspace is purged on completion.
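The `BB_AUTH_STRING` used in the upload step is one such secured variable. As
an assumption based on Bitbucket's app-password scheme (not something shown in
this project), it typically takes the form `username:app-password`, so the
upload could be exercised locally like this:

```sh
# Illustrative only, not from this project: BB_AUTH_STRING is a secured
# repository variable, typically <bitbucket-username>:<app-password>.
export BB_AUTH_STRING="user:app-password"   # replace with real values
export BITBUCKET_REPO_OWNER="user"          # set automatically in CI
export BITBUCKET_REPO_SLUG="repo"           # set automatically in CI
curl -X POST --user "${BB_AUTH_STRING}" \
  "https://api.bitbucket.org/2.0/repositories/${BITBUCKET_REPO_OWNER}/${BITBUCKET_REPO_SLUG}/downloads" \
  --form files=@"README.html"
```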

Pipeline build performance is very good: this entire step took only around 11
seconds to complete. One limitation is that the free account is limited to 50
build minutes per month and 1 GB of storage.
Having to configure repository settings externally and manually has some
benefits. The consequence, though, is that there are settings that are not
recorded in your project. A nice consequence of being able to customise the
Docker image at the step level is that your build and test steps can use
different images. This is great if you want to trial your application on a
production-like image.
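As an illustration (image names and commands here are hypothetical, not from
this project), a two-step pipeline might build in a full build image and test
in a slimmer, production-like one:

```yaml
# Hypothetical sketch: per-step Docker images in bitbucket-pipelines.yml.
pipelines:
  default:
    - step:
        name: build
        image: maven:3-jdk-8          # full build environment
        script:
          - mvn --batch-mode package
        artifacts:
          - target/*.jar              # pass the jar to the next step
    - step:
        name: test
        image: openjdk:8-jre          # slimmer, production-like runtime
        script:
          - java -jar target/app.jar  # exercise the artifact as deployed
```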
When you create a GitHub repository, there is an option to include Azure
Pipelines. However, this is not integrated into GitHub directly; it is
configured under Azure DevOps. Broadly, the steps to set up a pipeline are:

Builds can be run and managed from the Azure DevOps dashboard. There appears to
be no way to manually trigger a build from the GitHub repository, though if you
commit, it will happily trigger a build for you. But, again, you need to be on
the Azure DevOps dashboard to monitor the pipeline steps.

The following YAML configuration uses an Azure-provided Ubuntu 16.04 image. The
choice of images is limited, but they are maintained and their installed
packages are kept up to date. There are many pre-installed packages.
```yaml
trigger:
  - master

pool:
  vmImage: 'Ubuntu-16.04'

steps:
  - script: |
      sudo apt-get install pandoc
    displayName: 'install_pandoc'
  - script: |
      make -B README.html
    displayName: 'render'
  - powershell: gci env:* | sort-object name | Format-Table -AutoSize | Out-File $env:BUILD_ARTIFACTSTAGINGDIRECTORY/environment-variables.txt
  - task: PublishBuildArtifacts@1
    inputs:
      pathtoPublish: '$(System.DefaultWorkingDirectory)/README.html'
      artifactName: README
```
If the package you need is not installed, you can install it yourself, provided
it is available in the Ubuntu package repositories. The default user profile is
not root, so installation requires sudo.
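For example, a slightly more defensive variant of the install step (an
illustrative assumption, not this project's actual configuration) would refresh
the package index and answer prompts automatically:

```yaml
# Illustrative variant of the install step, not from this project.
- script: |
    sudo apt-get update
    sudo apt-get install --yes pandoc
  displayName: 'install_pandoc'
```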

Finally, to provide the generated artefacts as a downloadable archive, you need
to invoke the specific PublishBuildArtifacts task, as described here.

Azure is fast, as it uses images that Microsoft builds and hosts. The above
job, which installs pandoc and renders this page as HTML, takes only around 1
minute. I found the biggest negative of Azure Pipelines to be its poor
integration with the GitHub dashboard. Instead, you are strongly encouraged to
manage pipelines from the Azure DevOps dashboard.
Git pipelines will not suit every circumstance. There are, however, clear
advantages to using a hosted pipeline: it ensures that your project builds
somewhere other than your laptop, and it removes the cost of provisioning and
maintaining your own build infrastructure. The pipeline configuration also
augments your project's documentation for build, test and deployment: it is an
independent, executable description of your project that explicitly lists its
dependencies. This could be of great benefit to projects where time constraints
limit one's ability to prepare an environment.
This project can be viewed from these Git repositories: