
Git: An opinionated tutorial

Part 4: Branches

In our previous lesson we saw the basics of Git - how to keep track of changes, how to share our work, and how to solve conflicts.

The way that Git manages to do all of this is by recording our changes in commits, each of which captures how our files have changed since the previous version. We further said that "we can visualize our commits as a chain going back to our very first file", which is true but only up to a point.

If Git were a chain, it would be a strange one: our chain would start looking normal, but soon it would start splitting into two or more sub-chains of different length that once in a while come together. How to navigate these paths is what we call branching, and it’s the topic of this lesson.

What is a branch?

Branching as a whole is a difficult concept to understand. It is technically simple: the idea that one commit can split into several (as we’ve seen before) and that multiple commits can be merged into one. The confusion comes when we start talking about why we are doing the splitting/merging.

With what we have learned up until now, our commit chain will naturally split when two people work in parallel and will merge when their work comes together. This is perfectly fine, and most regular Git projects don't need anything more complex than this. But once your team starts growing, this style of work may no longer be enough.

The concept of "I’d like to keep my work apart from everyone else’s" is what Git calls a branch. The term is supposed to remind us of trees, but I encourage you not to do that: branches in a real tree rarely merge back into the trunk, while the ones in Git do that all the time.

Let's create a branch

Up until now in this tutorial we have been working inside a single branch called master (or main in some versions). This branch is created by default, and every time we saw something about master it was Git letting us know on which branch we are working. Since we never left the master branch, we could safely ignore it until now.

So let’s create a new branch. If we go back to our example from the last lesson, this is what our recipe looked like:

Poached egg
===========
Ingredients:
  * 1 egg
  * 1 spoonful mustard
  * Salt

Instructions:
  * Bring the water to a boil and then lower the temperature to 80°C.
  * Add salt and mustard to the almost-boiling water.
  * Submerge the egg for 3-4 minutes, and then carefully take it out.
  * Add salt to taste.

And if we look at the top of the commit chain until now, it looks like this:

The last commit in the chain is what we call the HEAD. It is the latest commit so far in this branch and it marks the point where our next commit will be inserted. Git updates it automatically. You can see it in the output of git log, where we see that HEAD points to the latest commit in the master branch, as expected:

$ git log
commit 8f5076549c4e361a3271de815d3945acaab0d258 (HEAD -> master, origin/master)
Merge: 0e5ca24 9b39da1
Author: Richard Lance 
Date:   Wed Sep 8 08:01:12 2021 +0200

    Resolves mustard conflict

commit 0e5ca243ac27f3f8ae59bd8adec997be5a115d32
Author: Richard Lance 
Date:   Wed Sep 8 07:49:53 2021 +0200

    Adds details about water temperature
    
(... rest of the log omitted ...)

In addition to master, we also have an extra branch called origin/master. This one is simply telling us where the master branch in our central GitLab repository was the last time we pulled.

Let’s say Sarah wants to reduce the amount of salt in the recipe. Since she doesn't want to bother Richard with her changes until she's done (maybe it is impossible to make good poached eggs without salt after all), she starts a new branch called no_salt:

$ git branch no_salt

It may look like nothing happened, but git log knows better, and the description of the current commit now acknowledges that no_salt exists:

$ git log
commit 8f5076549c4e361a3271de815d3945acaab0d258 (HEAD -> master, origin/master, no_salt)
Merge: 0e5ca24 9b39da1
Author: Richard Lance 
Date:   Wed Sep 8 08:01:12 2021 +0200

    Resolves mustard conflict

commit 0e5ca243ac27f3f8ae59bd8adec997be5a115d32
Author: Richard Lance 
Date:   Wed Sep 8 07:49:53 2021 +0200

    Adds details about water temperature
    
(... rest of the log omitted ...)
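If you'd like to convince yourself that a new branch is nothing more than a second label on an existing commit, here is a small sketch you can run in a throwaway directory. The sandbox setup (temporary folder, one-line recipe, made-up e-mail address) is my own assumption and not part of the recipes repository:

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

# Create the branch without switching to it.
git branch no_salt

# Both names resolve to the very same commit hash.
master_commit=$(git rev-parse master)
no_salt_commit=$(git rev-parse no_salt)
echo "master:  $master_commit"
echo "no_salt: $no_salt_commit"

# 'git branch' alone does not move us anywhere.
current=$(git rev-parse --abbrev-ref HEAD)
echo "current branch: $current"
```

The two hashes should be identical, which is the same thing git log is telling us when it lists both names next to a single commit.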

As expected, the no_salt branch was created in the commit marked with the HEAD label. But we are not done - we have created the branch, but we are not using it yet. If we were to commit something right now, it would still be added to the master branch because we have not yet told Git to use the new branch. To switch to the no_salt branch we need the git checkout command:

$ git checkout no_salt
Switched to branch 'no_salt'

And now even git log agrees with us: the HEAD is now pointing to the no_salt branch, and all commits from here on will be added to this new branch.

$ git log
commit 8f5076549c4e361a3271de815d3945acaab0d258 (HEAD -> no_salt, origin/master, master)
Merge: 0e5ca24 9b39da1
Author: Richard Lance 
Date:   Wed Sep 8 08:01:12 2021 +0200

    Resolves mustard conflict

commit 0e5ca243ac27f3f8ae59bd8adec997be5a115d32
Author: Richard Lance 
Date:   Wed Sep 8 07:49:53 2021 +0200

    Adds details about water temperature
    
(... rest of the log omitted ...)
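Creating a branch and then switching to it is such a common pair of steps that Git offers a shortcut: git checkout -b does both at once. The sketch below tries it in a throwaway sandbox (the setup and the e-mail address are assumptions of mine). On Git 2.23 and newer, git switch -c should do the same job, but I'm sticking to git checkout here because that's what this tutorial uses.

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

# Create no_salt AND switch to it in a single step.
git checkout -q -b no_salt

# Ask Git which branch HEAD is pointing to now.
current=$(git rev-parse --abbrev-ref HEAD)
echo "current branch: $current"
```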

Moving on, let’s remove the salt from our recipe and commit that change:

Poached egg
===========
Ingredients:
  * 1 egg
  * 1 spoonful mustard

Instructions:
  * Bring the water to a boil and then lower the temperature to 80°C.
  * Add mustard to the almost-boiling water.
  * Submerge the egg for 3-4 minutes, and then carefully take it out.

$ git commit -a -m "Removes the salt"
[no_salt 7a3c7ce] Removes the salt
 1 file changed, 1 insertion(+), 3 deletions(-)

In previous commits Git would have shown us master before the commit hash, but now it’s telling us that this commit belongs to the no_salt branch. And git log confirms it:

$ git log
commit 7a3c7ce27b767bb66211e9d964cef60e9bb8f897 (HEAD -> no_salt)
Author: Sarah Arleen 
Date:   Wed Sep 8 08:29:42 2021 +0200

    Removes the salt

commit 8f5076549c4e361a3271de815d3945acaab0d258 (origin/master, master)
Merge: 0e5ca24 9b39da1
Author: Richard Lance 
Date:   Wed Sep 8 08:01:12 2021 +0200

    Resolves mustard conflict

(... rest of the log omitted ...)

The HEAD commit (that is, the one where our next commit will be inserted) is pointing to the latest commit in the no_salt branch. The master branch, on the other hand, has remained intact. If Sarah pushes her changes now and Richard pulls them, he wouldn’t get a conflict. In the previous lesson both Sarah and Richard were working on the same branch master, but Sarah is now working on the branch no_salt. This means that all of her commits are being kept separate and conflict-free.
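One way to check which commits live only on one branch is the two-dot range of git log: master..no_salt means "commits reachable from no_salt but not from master". Here's a minimal sketch in a throwaway sandbox, where the file contents and the e-mail address are placeholders I made up:

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

# Sarah's work happens on its own branch.
git checkout -q -b no_salt
printf 'Poached egg (no salt)\n' > recipe.txt
git commit -q -a -m "Removes the salt"

# Commit subjects that exist on no_salt but not on master ...
only_on_no_salt=$(git log --format=%s master..no_salt)
# ... and the other way around (nothing, since master has not moved).
only_on_master=$(git log --format=%s no_salt..master)

echo "only on no_salt: $only_on_no_salt"
echo "only on master:  $only_on_master"
```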

We can always jump back to the master branch, though:

$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.

And if we use git log we can confirm that we are exactly where we wanted to be. A small caveat: since a Git repo can have lots of branches, git log only shows us those that it thinks are important for the work we are doing right now. Therefore we need the extra parameter --branches if we want to see all branches:

$ git log --branches
commit 7a3c7ce27b767bb66211e9d964cef60e9bb8f897 (no_salt)
Author: Sarah Arleen 
Date:   Wed Sep 8 08:29:42 2021 +0200

    Removes the salt

commit 8f5076549c4e361a3271de815d3945acaab0d258 (HEAD -> master, origin/master)
Merge: 0e5ca24 9b39da1
Author: Richard Lance 
Date:   Wed Sep 8 08:01:12 2021 +0200

    Resolves mustard conflict

(... rest of the log omitted ...)
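When the full log gets too long to read, a more compact view can help. The sketch below combines --branches with --oneline, --graph and --decorate in a throwaway sandbox (the setup and e-mail address are my own assumptions); treat it as one possible way of looking at the same information, not as the only one.

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

# One extra commit on no_salt, then back to master.
git checkout -q -b no_salt
printf 'Poached egg (no salt)\n' > recipe.txt
git commit -q -a -m "Removes the salt"
git checkout -q master

# One line per commit, with branch labels and a small ASCII graph.
# --decorate is spelled out because labels are hidden by default
# when the output does not go to a terminal.
graph=$(git log --oneline --graph --decorate --branches)
echo "$graph"
```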

But let's stick to the no_salt branch. Now that all of the changes are done, it's time to push. And for a completely new branch, you might find yourself with the same error message that Sarah now has:

$ git checkout no_salt
Switched to branch 'no_salt'

$ git push
fatal: The current branch no_salt has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin no_salt

Git is complaining about something that at first makes no sense: it wants to push our changes, but it doesn't know where to. The reason for this is that Git allows us to keep branches in multiple repositories, which is an advanced feature we don't need for this tutorial and, more generally, in daily life. And because it's such a rarely-used feature, Git already gives us the command we are most likely to use next, namely, "push this branch to the same repo we have been using all this time":

$ git push --set-upstream origin no_salt
Username for 'https://gitlab.com': sarah
Password for 'https://sarah@gitlab.com': 
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 277 bytes | 277.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
remote: 
remote: Create a pull request for 'no_salt' on GitLab by visiting:
remote:      https://gitlab.com/sarah/recipes/pull/new/no_salt
remote: 
To https://gitlab.com/sarah/recipes.git
 * [new branch]      no_salt -> no_salt
Branch 'no_salt' set up to track remote branch 'no_salt' from 'origin'

That's a lot of output right there. Let's divide it into three parts:

  1. The first part is composed of the same standard messages we've been getting until now.
  2. The lines starting with remote: tell us how to create a pull request. We'll talk about this in Part 5.
  3. The last part is letting us know that our local branch now lives in the central repository too, which is exactly what we wanted.
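If you want to double-check what "set up to track" means without touching GitLab, the following sketch uses a local bare repository as a stand-in for the central one. Everything about the setup (the folder names central.git and work, the e-mail address) is an assumption made for the example:

```shell
# A local bare repository plays the role of the central GitLab repo.
base=$(mktemp -d)
git init -q --bare "$base/central.git"
git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
git remote add origin "$base/central.git"

printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"
git push -q --set-upstream origin master

# New branch with one commit, pushed for the first time.
git checkout -q -b no_salt
printf 'Poached egg (no salt)\n' > recipe.txt
git commit -q -a -m "Removes the salt"
git push -q --set-upstream origin no_salt

# Ask Git which remote branch no_salt is now tied to.
upstream=$(git rev-parse --abbrev-ref 'no_salt@{upstream}')
echo "no_salt tracks: $upstream"
```

Once the upstream is recorded, a plain git push from the no_salt branch should work without extra arguments.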

Let's now go back to Richard, who suddenly realizes that we haven't cracked the eggs yet - we were so focused on discussing water temperatures that we ended up making boiled eggs instead! Since he's in the master branch, he barely even notices our changes when he pulls:

$ git pull
Username for 'https://gitlab.com': richard
Password for 'https://richard@gitlab.com': 
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 3 (delta 1), reused 3 (delta 1), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://gitlab.com/sarah/recipes
 * [new branch]      no_salt    -> origin/no_salt
Already up to date.

Poached egg
===========
Ingredients:
  * 1 egg
  * 1 spoonful mustard
  * Salt

Instructions:
  * Bring the water to a boil and then lower the temperature to 80°C.
  * Add salt and mustard to the almost-boiling water.
  * Crack the egg into a saucer.
  * Slip the egg into the water for 3-4 minutes, and then carefully take it out.
  * Add salt to taste.

$ git commit -a -m "Cracks the egg"
[master ec07fd3] Cracks the egg
 1 file changed, 7 insertions(+), 7 deletions(-)

Since Richard's newest commit has absolutely nothing to do with Sarah's last commit, they can both commit, push, and pull in parallel without the other one even noticing that something has changed.

Now that you know how to make a branch and how to switch between them, we need to talk about two opposite concepts: how to delete a branch you no longer need, and how to merge branches for changes you want to keep. We’ll see how to do this in the context of specific goals:

Deleting a branch

Imagine that you realize that salt cannot be removed from the recipe, and you decide to give up on the whole idea. You can either leave your branch where it is, or you can delete the branch entirely by switching to a different branch (you cannot delete the branch you are currently on) and running the command git branch -D:

$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.

$ git branch -D no_salt
Deleted branch no_salt (was 7a3c7ce).

The message simply lets us know that the branch no longer exists, and it reminds us what the last commit in that branch was. Note that this will delete the branch but not your commits - those will live forever in the ether, invisible to everyone except for those who know what to look for.

Note: this is not exactly true. There are ways in which you can remove unused commits, but this tutorial doesn't cover them because deleting commits can easily lead to a corrupted repository. If you want to know more, Part 5 talks more about this in the "Rewriting history" section.
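The hash that git branch -D prints is your way back to those "invisible" commits. As a rough sketch (throwaway sandbox, made-up file contents and e-mail address), this is how one could check that the commit is still there after deleting the branch, and how to hang a branch label on it again:

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

git checkout -q -b no_salt
printf 'Poached egg (no salt)\n' > recipe.txt
git commit -q -a -m "Removes the salt"
lost=$(git rev-parse no_salt)                # remember the hash before deleting

git checkout -q master
git branch -q -D no_salt

# The branch label is gone ...
if git rev-parse -q --verify refs/heads/no_salt >/dev/null; then
    branch_exists=yes
else
    branch_exists=no
fi
# ... but the commit object itself is still stored in the repository.
object_type=$(git cat-file -t "$lost")
echo "branch exists: $branch_exists, object type: $object_type"

# Recreate the branch on top of the old commit.
git branch no_salt "$lost"
restored=$(git rev-parse no_salt)
```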

Merging a branch I: merge into master

On the other hand, maybe Sarah does manage to remove all the salt from the recipe and is now ready to merge her changes (that is, her branch) back into the main code. She can do this by switching to the master branch and using the command git merge:

$ git checkout master

$ git merge no_salt
Auto-merging recipe.txt
CONFLICT (content): Merge conflict in recipe.txt
Automatic merge failed; fix conflicts and then commit the result.

By now, we have seen this story play out a thousand times: we are bringing commits together, they cause a conflict, and therefore we need to resolve it and commit the result. But notice that this conflict only happens now at the end of the branch's life. Sarah could have added a dozen commits to her branch and she would only need to worry about conflicts this one time.
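If you'd like to reproduce this kind of conflict safely, here is a minimal sketch in a throwaway sandbox. The one-line recipe and the e-mail address are simplifications of mine; the point is only that both branches change the same line and that git merge then reports the file as unmerged:

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Add salt and mustard to the water.\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

# Sarah changes the line on her branch ...
git checkout -q -b no_salt
printf 'Add mustard to the water.\n' > recipe.txt
git commit -q -a -m "Removes the salt"

# ... while the same line also changes on master.
git checkout -q master
printf 'Add salt and mustard to the almost-boiling water.\n' > recipe.txt
git commit -q -a -m "Adds details about water temperature"

# The merge stops with a non-zero exit status when there is a conflict.
if git merge no_salt >/dev/null 2>&1; then
    merge_result=clean
else
    merge_result=conflict
fi

# List the files that still need to be resolved by hand.
conflicted=$(git diff --name-only --diff-filter=U)
echo "merge: $merge_result, unresolved: $conflicted"
```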

With all of her changes already in the master branch, all that Sarah needs to decide is whether she wants to keep her branch around or not. If she thinks she's done with this little side-experiment forever, she can use the same command git branch -D as before. Or even better, she can use git branch -d instead, as the lowercase -d deletes a branch only if all of its changes have been merged, guaranteeing that she doesn't accidentally lose any work.
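The difference between the two flags is easy to try out. In this sketch (throwaway sandbox, placeholder file contents and e-mail address) the lowercase -d is attempted twice: once before the branch has been merged and once after.

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

git checkout -q -b no_salt
printf 'Poached egg (no salt)\n' > recipe.txt
git commit -q -a -m "Removes the salt"
git checkout -q master

# First attempt: no_salt has a commit that master lacks, so -d refuses.
if git branch -d no_salt >/dev/null 2>&1; then
    before_merge=deleted
else
    before_merge=refused
fi

# Merge the branch (nothing changed on master, so no conflict here).
git merge -q no_salt

# Second attempt: everything is merged, so -d goes through.
if git branch -d no_salt >/dev/null 2>&1; then
    after_merge=deleted
else
    after_merge=refused
fi
echo "before merge: $before_merge, after merge: $after_merge"
```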

Merging a branch II: merge from master

Here's a slightly more complex case. In the previous paragraphs we assumed that Sarah was working on changes that she eventually intended to merge back into master. But what if she wanted to keep her no_salt branch forever, as the place where she keeps a salt-free version of every recipe?

With Git, this is perfectly fine - no one says that a branch has to be closed. All Sarah needs to do is to pay attention to the master branch and, whenever a new recipe is added, bring those changes into her no-salt version and replace the salt. If she wants to bring Richard's egg cracking update into her no-salt branch the commands are exactly the same as for merging into master, only with the names of the branches reversed:

$ git checkout no_salt

$ git merge master

After solving any potential conflicts, Sarah's branch has all of her branch-only changes plus everything that was added to the master branch. In our example, that means that she has a salt-free version of the recipe plus the correction that Richard added in parallel. And Richard hasn't noticed a thing - since he's only interested in the master branch, he doesn't even need to know about all the changes in the parallel branch.
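To see that this merge only goes one way, the sketch below adds a second recipe on master and then merges master into no_salt. The file omelette.txt, its contents and the e-mail address are invented for the example; git merge-base --is-ancestor is used here simply as a yes/no check for "does this branch already contain that one?".

```shell
# Throwaway sandbox so nothing touches a real repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from this tutorial
git config user.name "Sarah Arleen"
git config user.email "sarah@example.com"    # hypothetical address
printf 'Poached egg\n' > recipe.txt
git add recipe.txt
git commit -q -m "Adds recipe"

# Sarah's long-lived salt-free branch.
git checkout -q -b no_salt
printf 'Poached egg (no salt)\n' > recipe.txt
git commit -q -a -m "Removes the salt"

# Meanwhile, a new recipe lands on master (hypothetical file).
git checkout -q master
printf 'Omelette\n' > omelette.txt
git add omelette.txt
git commit -q -m "Adds omelette recipe"

# Bring master's changes into no_salt; --no-edit accepts the default message.
git checkout -q no_salt
git merge -q --no-edit master

# no_salt now contains everything from master ...
if git merge-base --is-ancestor master no_salt; then has_master=yes; else has_master=no; fi
# ... but master knows nothing about the salt-free changes.
if git merge-base --is-ancestor no_salt master; then master_has_no_salt=yes; else master_has_no_salt=no; fi
echo "no_salt contains master: $has_master"
echo "master contains no_salt: $master_has_no_salt"
```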

Branching is a powerful tool, and like all powerful tools it offers plenty of ways for you to shoot yourself in the foot. Whenever that happens, keep the following tips in mind.

First, make sure that you know exactly what is being merged into what. The graphs we've seen all along this tutorial are a very useful abstraction - if you are dealing with merge conflicts you don't understand, quickly sketching the branches and commits involved can be of help.

And second, remember that Git offers many mechanisms for rolling back your changes. The next (and final) lesson will show some common commands for dealing with errors.

Part 4 quick review


git branch
Creates a new branch starting on the current commit.
git checkout
Switches to another branch.
git merge (II)
In addition to being used to merge two sets of changes on the same branch when pulling changes, this command can also be used to merge two branches into one.
git branch -d
Deletes an existing branch, but only if all of its changes have already been merged.
git branch -D
Deletes an existing branch, even if some of its changes have not been merged.

The road ahead

If you have reached this point, congratulations! You now know more than enough Git for your daily work, and then some more. If you are a complete beginner I suggest you stop here, make your own repository, and start playing around with the concepts we've seen until now. Alternatively, Learn Git Branching is an interactive website that guides you through the concepts we've seen here plus some more complex ones.

The next and final lesson is a brief introduction to some concepts you'll find when talking to more experienced programmers: rewriting history (how to modify commits) and traversing the graph (how to add new commits on top of work you did last month). We won't be seeing them in detail, but at least you'll get an idea of what those things are about.