Basics like initializing (a repository), staging and committing files aren’t explained here; they simply make sense; no ‘Aha!’s there. Moving references, branching and merging, coupled with Git’s arcane command names [1], are the confusing parts.

Basics

  • Git is a distributed VCS; each repo can act as both server and client
  • Honestly, git (sub)commands are just graph-manipulating commands
  • Every repository’s history is a graph; each commit is a node with edges to its parent(s) [2]
    • Git diagrams often have arrows backwards (←) for this reason
  • Git stores snapshots, not differences, i.e. the entire contents of each file, as a blob
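To see the node/edge and snapshot ideas for yourself, here’s a minimal sketch. It assumes a throwaway directory from mktemp, a made-up demo identity and a branch pinned to the name master; the variable names are only illustrative.

```shell
# Throwaway repository so nothing real is touched
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used in this post
git config user.name "Demo"
git config user.email "demo@example.com"

echo "one" > file.txt
git add file.txt && git commit -q -m "first"
echo "two" >> file.txt
git commit -q -am "second"

# A commit object records a tree (the snapshot) and its parent edge(s)
git cat-file -p HEAD

# The edge from the second commit leads back to the first (root) commit
parent=$(git rev-parse HEAD^)
first=$(git rev-list --max-parents=0 HEAD)

# The blob holds the entire file as of this commit, not a diff
blob_content=$(git cat-file -p HEAD:file.txt)
```

The cat-file output lists a tree line and a parent line; the blob for file.txt carries both lines of the file even though the second commit only added one.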

“finally figuring out that git commands are strangely named graph manipulation commands – creating/deleting nodes, moving around pointers” – Kent Beck

  • Nodes of the graph are created by your commits
  • Nodes are never really deleted in the traditional sense; they’re made unreachable (see below)

Reachability

      A---B---C
     /
D---E---F---G
     \
      H---I

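An important (linked-list) concept that applies to Git (too)

If the first node (the head) of a linked list is lost, the list, too, is lost. Likewise, a commit that no reference leads to, directly or through its descendants, is unreachable.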
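A rough sketch of losing and regaining reachability, assuming a throwaway repo and illustrative names (topic, rescued): deleting the only branch that leads to a commit makes it unreachable, and adding a new reference brings it back.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo a > f && git add f && git commit -q -m "E"
git checkout -q -b topic
echo b >> f && git commit -q -am "A"
lost=$(git rev-parse HEAD)

git checkout -q master
git branch -q -D topic          # removes the only reference leading to "A"

# No branch, tag or HEAD leads to the commit any more...
reachable_before=$(git rev-list --all | grep -c "$lost" || true)
# ...yet the object itself still exists until garbage collection prunes it
git cat-file -e "$lost" && still_exists=yes

# Nail it down again with a new reference
git branch rescued "$lost"
reachable_after=$(git rev-list --all | grep -c "$lost" || true)
```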

References

“References make commits reachable” – Think like a Git

  • Plainly, references are “meaningful” names to some commits
    • They facilitate easy git-speak with your friends/colleagues 😜
    • Branches and tags are references too
  • Creating a branch reference is a way to “nail down” part of the graph that you want to return to later (reachability)
  • References are just files, named after the reference, containing a 40-character hexadecimal commit ID
    • They’re specific to a single repository
    • Remote references are local, remote-tracking references to a commit in a remote repository [5]
  • There’re many more ways of referring to commits: man gitrevisions is your friend
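You can peek at these files directly. A small sketch, assuming the default “files” ref storage, SHA-1 object IDs and refs that haven’t been packed yet (Git may later move them into .git/packed-refs); topic and v0 are made-up names.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo a > f && git add f && git commit -q -m "first"
git branch topic
git tag v0                       # lightweight tag: a plain ref to the commit

commit=$(git rev-parse HEAD)
branch_file=$(cat .git/refs/heads/topic)   # one line: the commit ID
tag_file=$(cat .git/refs/tags/v0)
head_file=$(cat .git/HEAD)                 # HEAD itself names a branch when attached
echo "$head_file"
```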

Commands Affecting Refs

These are the primary subcommands that allow you to move refs directly:

  • commit
  • merge
  • rebase
  • reset

Subcommands that move remote and remote-tracking refs:

  • fetch
  • push

Commands like pull, cherry-pick, … work atop these.
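Here’s a hedged sketch of two of these moving refs: commit advancing a branch, and fetch advancing a remote-tracking ref. The directory layout (origin, clone) under a mktemp directory is just an assumption for the demo.

```shell
work=$(mktemp -d)
git init -q "$work/origin" && cd "$work/origin"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo a > f && git add f && git commit -q -m "C1"

git clone -q "$work/origin" "$work/clone"

# commit: the branch HEAD is attached to advances to the new commit
before=$(git rev-parse master)
echo b >> f && git commit -q -am "C2"
after=$(git rev-parse master)

# fetch: the clone's remote-tracking ref catches up with origin's master,
# while the clone's own master stays where it was
cd "$work/clone"
tracking_before=$(git rev-parse origin/master)
git fetch -q origin
tracking_after=$(git rev-parse origin/master)
local_master=$(git rev-parse master)
```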

Checkout vs Reset

Before getting into the details, here’s the gist

checkout mostly operates on the working tree, while reset operates on the index.

To understand both commands, you first need to understand HEAD [6]. Most people know about the working tree and staging area but not HEAD.

HEAD references the currently checked out commit; your working tree will mostly be from this snapshot – the commit pointed to by HEAD. Pro Git summarizes this nicely

HEAD will be the parent of the next commit that is created.

Checkout

git checkout HEAD -- file

When you checkout a file from HEAD, you get a clean copy of file from the commit HEAD is pointing to; this replaces your working tree copy (and the staged copy, if any). HEAD is just a convenient choice here; you can replace it with any ref. If you omit it, the file is restored from the index (the stage) instead.
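A small sketch of the two sources a file can be restored from, assuming a throwaway repo; the file contents are chosen only to tell the three copies apart.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo committed > file && git add file && git commit -q -m "first"
echo staged > file && git add file     # the index now holds "staged"
echo scratch > file                    # the working tree holds "scratch"

git checkout -- file                   # no commit given: restore from the index
from_index=$(cat file)

git checkout HEAD -- file              # restore from the commit HEAD points to
from_head=$(cat file)

# Checking out a path from a commit also refreshes its index entry,
# so nothing is left staged afterwards
staged_after=$(git diff --cached --name-only)
```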

git checkout topic

When you checkout a branch (reference to a commit/node) e.g. topic, HEAD will be set to its tip commit and hence the entire working tree, not just a file, will be from the commit that branch is pointing to.

Reset

Plainly, reset moves HEAD around: it’s used to move HEAD to a given commit. There’re different flavours of doing this, depending on what happens to the index [7] and working tree (--hard, --soft, --mixed, …), but the crux is to move HEAD.

But isn’t that what checkout does too? Yes, but with a difference. Quoting Pro Git, with my emphasis

reset will […] move what HEAD points to. This isn’t the same as changing HEAD itself (which is what checkout does); reset moves the branch that HEAD is pointing to [8].

Caveat: with reset, HEAD moves the branch reference along with it, only if it’s attached.
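To make the flavours of reset concrete, here’s a minimal sketch comparing --soft, --mixed and --hard from the same starting point. It assumes a throwaway two-commit repo; the variable names are illustrative.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo old > file && git add file && git commit -q -m "C1"
echo newer > file && git commit -q -am "C2"
c2=$(git rev-parse HEAD)

# --soft: only HEAD (and its branch) move; index and working tree keep C2's content
git reset -q --soft HEAD~1
soft_staged=$(git diff --cached --name-only)
soft_tree=$(cat file)

# --mixed (the default): the index is reset too; the working tree is untouched
git reset -q --hard "$c2"
git reset -q --mixed HEAD~1
mixed_staged=$(git diff --cached --name-only)
mixed_unstaged=$(git diff --name-only)
mixed_tree=$(cat file)

# --hard: index and working tree are both reset to the target commit
git reset -q --hard "$c2"
git reset -q --hard HEAD~1
hard_tree=$(cat file)
hard_status=$(git status --porcelain)
```

In all three cases HEAD ends up at C1; what differs is whether C2’s change is left staged, left unstaged or thrown away.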

Detached HEAD

Whoa! Slow down there, cowboy. Before talking about detached, what’s the attached state of HEAD? We already know that HEAD is just a reference to a commit. Say this commit also has another reference pointing to it: a branch name.

When HEAD is moved by reset, if it’s attached to a branch, that reference too will move with HEAD.

C1 <-- C2 <-- C3 <-- C4 <-- C5 <-- master
                             ^
                             |
                            HEAD

git reset --hard C3

This would move both HEAD and master to C3 [9]. HEAD would continue to be attached. Had HEAD not been attached, only HEAD would have moved, leaving master behind; that is the detached HEAD state [10].

In its detached state, HEAD refers to a specific commit as opposed to referring to a named branch. Like Git’s diagnostic message says, it’s useful to poke around and inspect the code base at a particular commit. Making a new commit now would mean a commit only pointed to by HEAD.
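The attached and detached cases can be compared side by side. A sketch, assuming a throwaway repo with five commits like the diagram above, and using git checkout --detach to detach HEAD in place:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

for n in 1 2 3 4 5; do
  echo "$n" >> file && git add file && git commit -q -m "C$n"
done
c3=$(git rev-parse HEAD~2)
c2=$(git rev-parse HEAD~3)

# Attached: reset drags master along with HEAD
git reset -q --hard "$c3"
master_after_attached=$(git rev-parse master)
attached_to=$(git symbolic-ref HEAD)

# Detached: reset moves HEAD alone and leaves master where it was
git checkout -q --detach
git reset -q --hard "$c2"
head_after_detached=$(git rev-parse HEAD)
master_after_detached=$(git rev-parse master)
```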

There’re a couple of ways to identify if HEAD is detached. git status’s very first line will tell you:

> git status
On branch master
> git status
# HEAD detached at 847fe59

Another way is to use git log; I learnt from this actually.

> git log --oneline -5
847fe59 (HEAD -> master) Initial commit
> git log --oneline -5
847fe59 (HEAD, master) Initial commit

Notice that when HEAD is attached, you see an arrow (→) pointing to the branch it’s attached to. However, in the detached state they’re listed as independent items.
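For scripts, a third way (not covered above) is git symbolic-ref, which exits non-zero when HEAD isn’t a symbolic ref, i.e. when it’s detached. A minimal sketch; head_state is a hypothetical helper name.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo a > f && git add f && git commit -q -m "Initial commit"

# Hypothetical helper: prints "attached" or "detached" for the current repo
head_state() {
  if git symbolic-ref -q HEAD >/dev/null; then
    echo attached
  else
    echo detached
  fi
}

state_on_branch=$(head_state)
git checkout -q --detach
state_detached=$(head_state)
echo "$state_on_branch, then $state_detached"
```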

Attaching/Detaching HEAD

How do we attach or detach HEAD to a reference? Both are done with checkout, but with a subtle difference. To attach HEAD, you’d checkout a branch by its name

> git checkout master

When you checkout a commit using anything other than a branch name, you’d detach HEAD e.g. commit ID, HEAD~1, branch~3, HEAD@{5}, HEAD^^, etc. Since it wouldn’t know what to associate HEAD with, Git detaches HEAD. When you want to inspect the code base at a particular commit that has no name other than its commit ID, this is what you normally do.

> git checkout 847fe59

Here, it doesn’t matter if this commit has other branch references to it. Since you referred to it using the raw commit ID, Git takes it as a cue to detach HEAD.
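A quick sketch of that cue, assuming a throwaway single-commit repo: the same commit is checked out by raw ID, by branch name and by an expression. git rev-parse --abbrev-ref HEAD prints the branch name when attached and plain HEAD when detached.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo a > f && git add f && git commit -q -m "Initial commit"
id=$(git rev-parse master)

git checkout -q "$id"        # raw commit ID: detaches, although master points here too
by_id=$(git rev-parse --abbrev-ref HEAD)

git checkout -q master       # plain branch name: attaches
by_name=$(git rev-parse --abbrev-ref HEAD)

git checkout -q master~0     # same commit again, but not a plain branch name: detaches
by_expr=$(git rev-parse --abbrev-ref HEAD)
```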

Practise

I highly recommend playing around in Visualizing Git with checkout, reset; also get your hands dirty with the whole attach/detach business. Here’s a small snippet to get you started; see what happens as each command gets executed:

git commit
git commit
git commit
git commit
git commit
# create topic branch and checkout; HEAD now attached to topic
git checkout -b topic
# move HEAD one commit behind topic; this will also move topic with HEAD
git reset topic~1
# detach HEAD!
git checkout HEAD~2
# attach to master
git checkout master
# move back master by 3
git reset master~3
# move master forward/backward with commit ID
git reset f08ad6

Rebase

rebase seems to have a scary reputation on the web, with good reason of course. It’s infamous for rewriting history; something your teammates mightn’t take kindly to. However, when you’re doing this only locally, within your repo, before pushing, it’s a great tool.

The crux of a rebase: given a subgraph’s root node, rebase changes its parent pointer from one node to another; thereby rebasing the entire subgraph to a new parent.

Take note, a commit is not just its contents but also includes its parent(s). So any kind of rebase entails — since the parent/lineage is changed — a change of commit ID for the same commit contents.
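Here’s a sketch of that ID change, assuming a throwaway repo with a one-commit topic branch. It uses git patch-id as a stand-in for “same commit contents”: the patch stays identical while the commit ID and parent change.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "C1"
git checkout -q -b topic
echo feature > t.txt && git add t.txt && git commit -q -m "T"
git checkout -q master
echo more > m.txt && git add m.txt && git commit -q -m "C2"

old=$(git rev-parse topic)
old_patch=$(git show "$old" | git patch-id | cut -d' ' -f1)

# Re-parent topic's commit from C1 onto C2
git checkout -q topic
git rebase -q master

new=$(git rev-parse topic)
new_patch=$(git show "$new" | git patch-id | cut -d' ' -f1)
new_parent=$(git rev-parse topic^)
master_tip=$(git rev-parse master)
```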

Interactive rebase (rebase -i) is quite useful. I frequently use it to amend (not just the recent commit), fix, reword, edit, drop or squash commits. During an interactive rebase, one can even create multiple commits as usual and continue with the rebase; things will be taken care of! This is normal when dividing a commit into smaller parts.

pull = fetch + merge, or rebase? 🤔

When pulling from a remote branch, you might know that your changes are unrelated to the ones coming down. In this case, to avoid a merge commit and have a linear commit history, you’d pass --rebase to override the default integration strategy of pull: merge.

git pull --rebase origin master

git pull is just git fetch followed by git merge, which creates a new merge commit unless a fast-forward is possible. git pull --rebase, however, is git fetch followed by git rebase; it fetches the remote’s commits and then replays your local commits atop the fetched branch’s tip. This works if there’re no conflicts; otherwise you’ve to resolve them as you’d normally. The resolution (changes) becomes a part of whichever of your commits the rebase halted at; you’d end up rewriting your commit. However, you don’t have to force push your changes to the remote since the rewriting only touched your local, unpushed commits. Rewriting (commit) history, as long as it is not public, is OK 😉
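A sketch of the linear history this gives. The setup is an assumption for the demo: a bare origin.git seeded from a scratch repo, plus two clones named alice and bob with unrelated changes.

```shell
work=$(mktemp -d)
git init -q "$work/seed" && cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "C1"
git clone -q --bare "$work/seed" "$work/origin.git"

git clone -q "$work/origin.git" "$work/alice"
git clone -q "$work/origin.git" "$work/bob"

# Bob publishes a commit first
cd "$work/bob"
git config user.name "Bob" && git config user.email "bob@example.com"
echo bob > bob.txt && git add bob.txt && git commit -q -m "bob's change"
git push -q origin master
bob_tip=$(git rev-parse HEAD)

# Alice has an unrelated local commit, then pulls with rebase
cd "$work/alice"
git config user.name "Alice" && git config user.email "alice@example.com"
echo alice > alice.txt && git add alice.txt && git commit -q -m "alice's change"
git pull -q --rebase origin master

merges=$(git rev-list --merges --count HEAD)   # no merge commit was created
total=$(git rev-list --count HEAD)
parent_of_tip=$(git rev-parse HEAD^)           # Alice's commit now sits atop Bob's
git log --oneline
```

Alice can now push without forcing, since only her unpushed commit was rewritten.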

A counter point to pull-with-rebase: if you want logical separation of a set of commits, say for a completely new feature, then rebase — which makes them inline, muddled with unrelated history — isn’t the right tool; use merge instead.

“Use git pull --rebase when your changes do not deserve a separate branch” seems to be the appropriate answer to “when should I git pull --rebase?”

See Also

I get surprised by Git commands every now and then; I document the obscure but useful ones!

Learn by Doing

See try.github.io for good DIY resources.

References

  1. Think like a Git
  2. Pro Git
  3. Git Tutorials by Atlassian
  4. Git Ready


  1. Magit – Git porcelain for Emacs – shields me mostly but knowing them helps. ↩︎

  2. Merge commits have more than one parent. ↩︎

  3. Not to be confused with git clean which removes untracked files from the working tree. ↩︎

  4. git reflog shows these otherwise unreachable commits. You’ve time until git gc is run to make a commit reachable by adding a reference to it. ↩︎

  5. Remote-tracking branches (origin/master) are different from remote branches (origin master); former is local, updated by fetching from the latter. ↩︎

  6. Case-sensitive! HEAD will be the parent of a new commit in working tree, while a branch’s head means its tip; see the glossary. ↩︎

  7. I thought index is empty until something’s staged. However, Pro Git clarifies that index actually has “all the file contents that were last checked out into your working directory!” Don’t believe me? Try git ls-files -s. You’ve to grok this to get why git reset --mixed works the way it does. ↩︎

  8. Pro Git is explaining reset’s internals here, so it may sound like it won’t move HEAD but only the branch, but rest assured that it moves both. ↩︎

  9. Using C3 for readability; substitute with proper commit ID. ↩︎

  10. Refer man git-checkout; §DETACHED HEAD details it with nice ASCII art ✨. ↩︎