As soon as your pipeline grows a little bit beyond basic, it's common to get into a situation where a lot of the commands you run in each job are the same.

So how can you reduce this duplication? And should you?

In general, a bit of duplication is OK because it keeps each job readable on its own, but as the pipeline grows, the repetition becomes harder to maintain.

Let's take a look at 3 possible strategies using a simple example where we'll extract the echo 'prepare' command, so it's no longer duplicated in every job.

build:
  script:
    - echo 'prepare'
    - echo 'build'

test:
  script:
    - echo 'prepare'
    - echo 'test'

Use a default before_script

If your command is something you want to run before all jobs, all the time, then the easiest way is to simply extract the common code as a before_script key at the beginning of the file:

before_script:
  - echo 'prepare'

build:
  script:
    - echo 'build'

test:
  script:
    - echo 'test'

While this does not give you a lot of flexibility, it's a very quick and straightforward way to reuse commands.
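
Newer GitLab versions also let you put the same commands under the default keyword, and any job can still replace them by declaring its own before_script. The sketch below assumes a hypothetical deploy job that needs a different preparation step:

default:
  before_script:
    - echo 'prepare'

build:
  script:
    - echo 'build'

# Hypothetical job: its own before_script replaces the default one entirely
deploy:
  before_script:
    - echo 'prepare for deploy'
  script:
    - echo 'deploy'

Here build runs echo 'prepare' first, while deploy only runs echo 'prepare for deploy' before its script.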

Use YAML anchors

Anchors are YAML's way of reusing code - you can think of them a little bit like functions.

You define a block of configuration once and name it with an anchor (&name). Then you can reference it anywhere else in the same file with an alias (*name).

Let's see an example:

# Create an anchor called `prepare_step`
.prepare_step: &prepare_step
  - echo 'prepare'

build:
  script:
    # Insert the anchored commands using the `*prepare_step` alias
    - *prepare_step
    - echo 'build'

test:
  script:
    # And reuse the same commands here
    - *prepare_step
    - echo 'test'

You'll notice that .prepare_step starts with a dot. This is intentional: GitLab ignores top-level keys that start with a dot, so it never tries to run .prepare_step as a job, and it behaves like a "template".
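
A related YAML feature is the << merge key, which copies the keys of an anchored hash into another hash. It only applies to hashes, not to lists such as script. A minimal sketch, assuming a hypothetical .job_defaults template with made-up image and variable values:

# Hypothetical template holding a hash of shared settings
.job_defaults: &job_defaults
  image: alpine:latest
  variables:
    PREPARE_MODE: 'fast'

build:
  # Copy `image` and `variables` from the anchored hash into this job
  <<: *job_defaults
  script:
    - echo 'build'

Keep in mind that the merge is shallow: if build declared its own variables key, it would replace the template's variables rather than combine with them.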

Anchors can be tricky to master, so if you want to learn more, I found this article to be helpful, and of course, the official GitLab docs on anchors.

Use extends

While anchors can be quick to get started with, they do have their downsides. The main one I've encountered is that you can't use anchors to reuse code across several files - they only work within the same file.

If you want to reuse code across several files, then you can use the extends keyword. This works similarly to inheritance - you define a "template" job that other jobs can then extend.
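
As a rough sketch of the cross-file case, the template can live in its own file and be pulled in with the include keyword. The file path /ci/templates.yml is a hypothetical name chosen for illustration:

# File: ci/templates.yml (hypothetical path)
.prepare_step:
  before_script:
    - echo 'prepare'

# File: .gitlab-ci.yml
include:
  - local: '/ci/templates.yml'

build:
  extends: .prepare_step
  script:
    - echo 'build'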

Some things to keep in mind regarding extends:

  • if a key holds a hash, the template's and the job's values get merged
  • if a key holds an array, the job's value overrides the template's

.prepare_step:
  before_script:
    - echo 'prepare'

build:
  extends:
    - .prepare_step
  script:
    - echo 'build'

test:
  extends:
    - .prepare_step
  script:
    - echo 'test'
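
To make the two merging rules above concrete, here is a small sketch. The variable names PREPARE_MODE and BUILD_TARGET are made up for illustration:

.prepare_step:
  variables:
    PREPARE_MODE: 'fast'
  before_script:
    - echo 'prepare'

build:
  extends: .prepare_step
  variables:
    # Hash: merged with the template, so PREPARE_MODE is still set
    BUILD_TARGET: 'release'
  before_script:
    # Array: replaces the template's before_script entirely
    - echo 'custom prepare'
  script:
    - echo 'build'

# Effective `build` job after merging:
# build:
#   variables:
#     PREPARE_MODE: 'fast'
#     BUILD_TARGET: 'release'
#   before_script:
#     - echo 'custom prepare'
#   script:
#     - echo 'build'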

Here are the GitLab docs on the extends keyword and some details on its merging strategy.