Part 5: Advanced concepts

As the title implies, Part 5 is less of a lesson and more of a collection of Git recipes for how to do specific things.

Pull requests
Checking old versions of your repository
Undoing mistakes
Rewriting old commits

Pull requests

One objection you may have to Git is that everyone can do whatever they want with the repository. Sure, if Richard messes something up, Sarah can always go back to a previous commit and fix it, but if you have hundreds of developers (or, even worse, if you accept anonymous contributions) this approach simply won't do.

One popular solution to this problem is the pull request. The basic idea is as follows:

You want to collaborate with someone. Because you are not part of their team, they give you read-only access to their repository.
You make a clone of their repository with the git clone command and immediately create a branch where you'll work on your feature/improvement/bugfix.
You work on your local version until you are happy and ready to share your code.
You push your branch to your own hosted copy of the repository (often called a fork), since you cannot write to theirs, and create a pull request.
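The steps above can be sketched on a single machine. This is only an illustration under stated assumptions: the directory names upstream and myfork.git, the remote name mine, the branch name fix-typo and the file notes.txt are all made up, and two local repositories stand in for the collaborator's repository and for your own hosted copy. The pull request itself is opened in the hosting service's web interface, so it does not appear as a command.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the collaborator's repository (you can only read from it).
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream -c user.name="Sarah" -c user.email="sarah@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Stand-in for your own hosted copy, the one you are allowed to push to.
git init -q --bare myfork.git

# Clone their repository and immediately create a branch for your work.
git clone -q upstream local
cd local
git config user.name "Richard"
git config user.email "richard@example.com"
git checkout -q -b fix-typo

# Work locally until you are happy with the result.
echo "The corrected text" > notes.txt
git add notes.txt
git commit -q -m "Fixes a typo in the notes"

# Push the branch to your own repository. The pull request is then
# created from this branch in the web interface.
git remote add mine "$work/myfork.git"
git push -q mine fix-typo
```

In a real project the clone URL and the push URL would point to a hosting service rather than to local directories, but the sequence of commands stays the same.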

A pull request is nothing more than a notification to your collaborator letting them know that you'd like your branch to be merged with their code. The web interface will show what your changes are, and they can either accept your request (and merge it), reject it, or ask for more information.

That's all there is to it. Some development teams do all of their work via pull requests, while others have a core group of developers working on the master branch and a wide array of collaborators who send requests.

Revisiting the past

Now that we know about commits and branches, we can revisit the strategy we saw in Part 1 for going back in time. If we'd like to take our project back to a specific point in time, all we need is the commit or tag of that version and the git checkout command. We have been using this command to jump across branches, but we can use it to jump across commits too:

$ git checkout v1.0
Note: checking out 'v1.0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 2045685 Lowers the required temperature

That's a lot of output, even for Git. So let's go step-by-step:

The first message tells us that we are jumping to tag v1.0 or, equivalently, commit 2045685. You probably remember that we created this tag in Part 1.
The second message is warning us that the HEAD label is not at the top of the graph anymore. It's also telling us something important: that if we don't create a branch, any change we make here will be lost (we'll explain why soon).
The third paragraph explains how to create a branch - if we want to make changes and not lose them, this is our chance.
The last line tells us the current commit and its description - we already knew this, but it's good to get confirmation anyway.

If you are just looking at an older version with no intention of updating the code, then there's not much else to do. You can open whichever files you want to check and, once you are done, come back to the top of the commit graph with the git checkout command:

$ git checkout master
Previous HEAD position was 2045685 Lowers the required temperature
Switched to branch 'master'
Your branch is up to date with 'origin/master'.

But what if you really wanted to make a change on top of an old commit? Well, here we need to talk some more about what a branch is. As far as Git is concerned, a branch is nothing more than a label that points to a specific commit and gets updated automatically, just like the HEAD label. When you delete a branch, all you do is remove this one label. The commits are still there, but you have no way of reaching them other than knowing exactly what their commit hashes are.

When we jumped to the v1.0 tag, we jumped to a point of the graph that has no branch label attached to it. This is what Git calls a detached HEAD situation. If you make a commit at this point of the graph, the resulting commit will simply dangle off the graph with no branch pointing to it. It would be exactly the same as a commit belonging to a deleted branch. Therefore, if you want to add a commit on top of the v1.0 commit and intend to keep it, you first need to make a branch that allows you to come back to this commit in the future.
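Here is a minimal sketch of that last piece of advice. It builds a throwaway repository so that it can run anywhere; the branch name v1.0-fixes, the later commit messages and the user details are assumptions made for the example, and only the v1.0 tag and its commit message come from the session shown earlier.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# A tiny history: a tagged commit followed by newer work on master.
git commit -q --allow-empty -m "Lowers the required temperature"
git tag v1.0
git commit -q --allow-empty -m "Later work on master"

# Jump to the tag. HEAD now points at a commit instead of a branch.
git checkout -q v1.0
git symbolic-ref -q HEAD || echo "HEAD is detached"

# Create a branch before committing, so the new commit stays reachable.
git checkout -q -b v1.0-fixes
git commit -q --allow-empty -m "Fix on top of v1.0"

# Both lines of work now have a label pointing at them.
git log --oneline --graph --all
```

Had the commit been made before creating the branch, running git checkout -b v1.0-fixes right afterwards would still have rescued it, which is what the "now or later" in Git's message refers to.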

In general, you can use git checkout to jump to any point of the graph. The most direct application of this command is for you to check older versions of your files, but there are some specialized uses for this ability that you can look up.

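One final trick: you can use HEAD~2 to refer to the commit that's two steps behind HEAD (and HEAD~3, HEAD~4, and so on). This notation is very useful when you want to go a couple of versions back without having to check their specific commit hashes. And if you don't want to count jumps by yourself, git log --oneline lists the commits behind HEAD in order, newest first. The git reflog command is related but different: it lists the positions that HEAD itself has recently been at, written HEAD@{1}, HEAD@{2}, and so on.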
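As a quick illustration of the HEAD~N notation, the sketch below creates four empty commits in a throwaway repository and then looks two steps back. The commit messages and user details are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Four commits in a row, oldest first.
for n in 1 2 3 4; do
    git commit -q --allow-empty -m "Commit $n"
done

# HEAD is "Commit 4", HEAD~1 is "Commit 3", HEAD~2 is "Commit 2".
git log -1 --format=%s HEAD~2

# The full list, newest first, in case you prefer to count by eye.
git log --oneline
```

The same notation works anywhere Git expects a commit, so git checkout HEAD~2 would jump to that commit directly.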

Undoing all kinds of mistakes

Because there are all kinds of mistakes one can make, here's a list of the most common ones and how to resolve them:

You ruined a file and would like to go back to the committed version: git checkout -- <file>
You committed a file and then realized that you forgot an important detail, or maybe you typed the wrong commit message. Either way, as long as you haven't pushed your changes, you can update the last commit with the git commit --amend command.
You committed a file only to realize that you made a mistake. If you haven't pushed this change yet, you can use the git reset command in several ways:

git reset --soft HEAD~1 will bring your repository to the state it was in immediately before you typed git commit. Your files would still contain the modifications you tried to undo in the first place and the same files would be in the staging area. Therefore, typing git commit again would bring you back where you started. (Without a target such as HEAD~1, git reset points at the current commit and undoes no commit at all.)
git reset --mixed HEAD~1 does the same as before, with the difference that it also removes all files from the staging area: if you typed git commit, Git would reply that there are no changes to be committed. But if you typed git add <file> followed by git commit, you'd be back where you started. This is the mode Git uses when you give no flag.
git reset --hard HEAD~1 does the same as --mixed, only it also throws away everything that changed after the second-to-last commit. This one is good if you'd like to start from a completely clean state, but keep in mind that it's the only one of the three that actually deletes the changes you made to your files.
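The recipes in this list can be tried safely in a throwaway repository. The sketch below is one way to do it; the file name recipe.txt, its contents, the commit messages and the user details are assumptions made for the example, and HEAD~1 is used as the reset target to mean "the commit before the last one".

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "boil for 10 minutes" > recipe.txt
git add recipe.txt
git commit -q -m "Adds the recipe"

# Ruined file: go back to the version Git knows about.
echo "oops" > recipe.txt
git checkout -- recipe.txt

# Wrong commit message: rewrite the last (unpushed) commit.
git commit -q --amend -m "Adds the first recipe"

# A commit we regret, undone in three different ways.
echo "boil for 100 minutes" > recipe.txt
git commit -q -am "Mistaken change"

git reset -q --soft HEAD~1     # commit undone, change still staged
git status --short
git commit -q -m "Mistaken change"

git reset -q --mixed HEAD~1    # commit undone, change no longer staged
git status --short
git commit -q -am "Mistaken change"

git reset -q --hard HEAD~1     # commit undone, change deleted from disk
cat recipe.txt
```

Comparing the two git status --short lines shows the difference between --soft and --mixed: the status letter moves from the first column (staged) to the second one (not staged).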

If you committed your changes and pushed them, you have two choices:

git revert <commit hash> will create a new commit that undoes the changes introduced by the given commit. This is not a "true" undo because the "wrong" version of the file is still in there, sandwiched between two commits, but it's better than the alternative.
If no one has pulled your changes yet, you can use git reset <commit hash> followed by git push -f. The -f flag means "force", and it tells the central repository to ignore all safety checks and accept your current version. Note that if someone has already pulled your changes this will break their repository, so use this command with care or, even better, not at all.
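To see why git revert is the safer of the two choices, here is a small sketch in a throwaway repository. The file name, its contents, the commit messages and the user details are made up for the example. No history is removed: the mistake stays in the log and a new commit cancels it out.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "bake at 180 degrees" > recipe.txt
git add recipe.txt
git commit -q -m "Adds the recipe"

# The mistake. Imagine that it has already been pushed.
echo "bake at 1800 degrees" > recipe.txt
git commit -q -am "Raises the temperature"

# Create a new commit that applies the opposite change.
# --no-edit accepts the default commit message without opening an editor.
git revert --no-edit HEAD

# Three commits: the original, the mistake, and its revert.
git log --oneline
cat recipe.txt
```

Because the existing commits are left untouched, the result can be pushed with a plain git push and nobody who already pulled the mistake is affected.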

If you are completely lost and don't know what to do, you can always follow the advice on this XKCD strip.

For more tips, the appropriately-named website Oh Shit, Git!?! has a list of recipes for what to do in the most common difficult situations.

Rewriting history

The last example from above brings us to a complex and controversial topic: rewriting history. This tutorial avoids rewriting history like the plague, and we only bring it up in case you ever find the term "rebase" in the wild and wonder what it's all about.

Some Git users are not happy with having every single change take up space in their repository's history. And some of them are not happy about having so many branches when they'd prefer to have a single, linear history. Both of these "problems" can be solved with the git rebase command:

git rebase master will take the commits of your current branch, detach them from the commit where the branch started, and replay them on top of the latest commit of the master branch. In this way, it looks as if you had only branched off after the latest work on master, and the history stays linear.
git rebase -i <starting commit hash> will open an editor listing all the commits that come after the specified commit up until the HEAD commit, and let you decide what to do with each one. Marking commits as "squash" compresses them into a single commit, and all intermediate steps disappear from the branch's history.
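The first of these two commands can be tried without risk in a throwaway repository. The sketch below covers only git rebase master, because the interactive variant needs an editor; the branch name feature, the file names, the commit messages and the user details are assumptions made for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "Base commit"

# Work on a branch...
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# ...while master moves on in the meantime.
git checkout -q master
echo "main" > main.txt
git add main.txt
git commit -q -m "More work on master"

# Replay the feature commit on top of the latest master commit.
git checkout -q feature
git rebase -q master

# The history is now a straight line: Base, More work, Feature work.
git log --oneline --graph
```

Note that "Feature work" gets a new commit hash after the rebase. That changed hash is exactly what makes rebasing commits that were already pushed so disruptive for everyone else.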

Rewriting history like this is dangerous: once you modify your local history it will no longer match that of your remote repository. And even though you can force the remote repository to accept your changes with the git push -f command, you will still wreck the repositories of anyone who pulled your changes in between. Use with extreme caution!

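One final tip: if you want to delete commits that are no longer used because they belong to a branch that was deleted, you can save some hard drive space with the git gc --prune=now command. Git only deletes commits that nothing refers to anymore, and that includes the entries listed by git reflog, so recently visited commits may survive until those entries expire. Just make sure that absolutely no one is using those commits!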

Where to go from here

With this tutorial coming to an end, here's a final set of tips to help you in your day-to-day work with Git: