Rewriting Git History With Confidence: A Guide

As a developer, you work with Git all the time. Did you ever get to a point where you said: “Uh-oh, what did I just do?” This post will give you the tools to rewrite history with confidence.

Notes Before We Start

I also gave a live talk covering the contents of this post. If you prefer a video (or wish to watch it alongside reading), you can find it here.

I am working on a book about Git! Are you interested in reading the initial versions and providing feedback? Send me an email: gitting.things@gmail.com

Recording Changes in Git

Before understanding how to undo things in Git, you should first understand how we record changes in Git. If you already know all the terms, feel free to skip this part.

It is very useful to think about Git as a system for recording snapshots of a filesystem in time. A Git repository has three “states” or “trees”: the working dir, the index, and the repository.

Usually, when we work on our source code, we work from a working dir. A working dir(ectory) (or working tree) is any directory in our file system that has a repository associated with it. It contains the folders and files of our project and also a directory called .git. I described the contents of the .git folder in more detail in a previous post.

After you make some changes, you may want to record them in your repository. A repository (in short: repo) is a collection of commits, each of which is an archive of what the project’s working tree looked like at a past date, whether on your machine or someone else’s. A repository also includes things other than our code files, such as HEAD, branches, etc.

In between, we have the index or the staging area; these two terms are interchangeable. When we check out a branch, Git populates the index with all the file contents that were last checked out into our working directory and what they looked like when they were originally checked out. When we use git commit, the commit is created based on the state of the index.

So, the index, or the staging area, is your playground for the next commit. You can work and do whatever you want with the index, add files to it, remove things from it, and then only when you are ready, you go ahead and commit to the repository.

Time to get hands-on 🙌🏻

Use git init to initialize a new repository. Write some text into a file called 1.txt.

Out of the three tree states described above, where is 1.txt now? In the working tree, as it hasn’t yet been introduced to the index. In order to stage it, to add it to the index, use git add 1.txt.

Now, we can use git commit to commit our changes to the repository.

You created a new commit object, which includes a pointer to a tree describing the entire working tree. In this case, it’s gonna be only 1.txt within the root folder. In addition to a pointer to the tree, the commit object includes metadata, such as timestamps and author information.

For more information about the objects in Git (such as commits and trees), check out my previous post. (Yes, “check out”, pun intended 😇)

Git also tells us the SHA-1 value of this commit object. In my case, it was c49f4ba (which are only the first 7 characters of the SHA-1 value, to save some space). If you run this command on your machine, you would get a different SHA-1 value, as you are a different author; also, you would create the commit on a different timestamp.

When we initialize the repo, Git creates a new branch (named main in this post; the default name depends on your Git version and the init.defaultBranch setting). And a branch in Git is just a named reference to a commit. So by default, you have only the main branch. What happens if you have multiple branches? How does Git know which branch is the active branch?

Git has another pointer called HEAD, which points (usually) to a branch, which then points to a commit. By the way, under the hood, HEAD is just a file. It includes the name of the branch with some prefixes.

Time to introduce more changes to the repo by creating another commit.
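Before moving on, the steps so far can be replayed in a scratch repository. This is a minimal sketch rather than the exact session from the post: the temporary directory, the placeholder identity, and the file content are assumptions, and the branch is named main explicitly because the default name depends on your Git configuration.

```shell
set -eu

# Throwaway repository (the location is an assumption for this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the initial branch main
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "hello" > 1.txt
git status --porcelain       # "?? 1.txt": the file exists only in the working tree
git add 1.txt
git status --porcelain       # "A  1.txt": the file is now staged in the index
git commit -q -m "Commit 1"

# HEAD is a file naming the active branch; the branch is a reference to a commit.
cat .git/HEAD                # with the default ref storage: ref: refs/heads/main
git rev-parse --short main   # abbreviated SHA-1 of the commit; yours will differ
head_ref="$(git symbolic-ref HEAD)"
```

The two git status calls show the file moving from the working tree into the index, and the last lines show that HEAD names a branch while the branch resolves to a commit.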
So let’s create a new file, 2.txt, and add it to the index, as before.

Now, it’s time to use git commit. Importantly, git commit does two things:

First, it creates a commit object, so there is an object within Git’s internal object database with a corresponding SHA-1 value. This new commit object also points to the parent commit. That is the commit that HEAD was pointing to when you wrote the git commit command.

Second, git commit moves the pointer of the active branch, in our case main, to point to the newly created commit object.

Undoing the Changes

To rewrite history, let’s start with undoing the process of introducing a commit. For that, we will get to know the command git reset, a super powerful tool.

git reset --soft

So the very last step you did before was to git commit, which actually means two things: Git created a commit object and moved main, the active branch. To undo this step, use the command git reset --soft HEAD~1.

The syntax HEAD~1 refers to the first parent of HEAD. Say I had more than one commit in the commit-graph: “Commit 3” pointing to “Commit 2”, which is, in turn, pointing to “Commit 1”. And say HEAD was pointing to “Commit 3”. You could use HEAD~1 to refer to “Commit 2”, and HEAD~2 would refer to “Commit 1”.

So, back to the command: git reset --soft HEAD~1

This command asks Git to change whatever HEAD is pointing to. (Note: In the diagrams below, I use *HEAD for “whatever HEAD is pointing to”). In our example, HEAD is pointing to main. So Git will only change the pointer of main to point to HEAD~1. That is, main will point to “Commit 1”.

However, this command did not affect the state of the index or the working tree. So if you use git status you will see that 2.txt is staged, just like before you ran git commit.

What about git log? It will start from HEAD, go to main, and then to “Commit 1”. Notice that this means that “Commit 2” is no longer reachable from our history.

Does that mean the commit object of “Commit 2” is deleted? 🤔 No, it’s not deleted. It still resides within Git’s internal object database. If you push the current history now, by using git push, Git will not push “Commit 2” to the remote server, but the commit object still exists on your local copy of the repository.

Now, commit again, and use the commit message of “Commit 2.1” to differentiate this new object from the original “Commit 2”.

Why are “Commit 2” and “Commit 2.1” different? Even if we used the same commit message, and even though they point to the same tree object (of the root folder consisting of 1.txt and 2.txt), they still have different timestamps, as they were created at different times.

In the drawing above, I kept “Commit 2” to remind you that it still exists in Git’s internal object database. Both “Commit 2” and “Commit 2.1” now point to “Commit 1”, but only “Commit 2.1” is reachable from HEAD.

git reset --mixed

It’s time to go even further back and undo more. This time, use git reset --mixed HEAD~1 (note: --mixed is the default switch for git reset).

This command starts the same as git reset --soft HEAD~1. Meaning it takes the pointer of whatever HEAD is pointing to now, which is the main branch, and sets it to HEAD~1, in our example “Commit 1”.

Next, Git goes further, effectively undoing the changes we made to the index. That is, changing the index so that it matches the current HEAD, the new HEAD after setting it in the first step.

If we ran git reset --mixed HEAD~1, it means HEAD would be set to HEAD~1 (“Commit 1”), and then Git would match the index to the state of “Commit 1”. In this case, it means that 2.txt will no longer be part of the index.

It’s time to create a new commit with the state of the original “Commit 2”. This time we need to stage 2.txt again before creating it.

git reset --hard

Go on, undo even more!
Go ahead and run git reset --hard HEAD~1.

Again, Git starts with the --soft stage, setting whatever HEAD is pointing to (main), to HEAD~1 (“Commit 1”). So far so good.

Next, moving on to the --mixed stage, matching the index with HEAD. That is, Git undoes the staging of 2.txt.

It is time for the --hard step, where Git goes even further and matches the working dir with the state of the index. In this case, it means removing 2.txt also from the working dir.

(Note: git reset --hard only removes files that Git tracks. A file that was never staged or committed is untracked, so it would be left in the file system; this detail isn’t really important in order to understand git reset though.)

So to introduce a change to Git, you have three steps. You change the working dir, then the index (the staging area), and then you commit a new snapshot with those changes. To undo these changes:

If we use git reset --soft, we undo the commit step.
If we use git reset --mixed, we also undo the staging step.
If we use git reset --hard, we also undo the changes to the working dir.

Real-Life Scenarios!

Scenario #1

So in a real-life scenario, write “I love Git” into a file (love.txt), as we all love Git 😍. Go ahead, stage and commit this as well.

Oh, oops! Actually, I didn’t want you to commit it. What I actually wanted you to do is write some more love words in this file before committing it.

What can you do? Well, one way to overcome this would be to use git reset --mixed HEAD~1, effectively undoing both the committing and the staging actions you took.

So main points to “Commit 1” again, and love.txt is no longer a part of the index. However, the file remains in the working dir. You can now go ahead, and add more content to it.

Go ahead, stage and commit your file. Well done 👏🏻 You got this clear, nice history of “Commit 2.4” pointing to “Commit 1”.

We now have a new tool in our toolbox, git reset 💪🏻

This tool is super, super useful, and you can accomplish almost anything with it.
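The three reset modes described above can be compared side by side in a scratch repository. This is a minimal sketch under stated assumptions: the temporary directory, the placeholder identity, and the file contents are made up for illustration, and the commit messages only mirror the naming used in this post.

```shell
set -eu

# Throwaway repository (location and identity are assumptions for this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.com"

echo "one" > 1.txt; git add 1.txt; git commit -q -m "Commit 1"
echo "two" > 2.txt; git add 2.txt; git commit -q -m "Commit 2"

# --soft: only the branch pointer moves; 2.txt stays staged.
git reset --soft HEAD~1
soft_status="$(git status --porcelain)"     # "A  2.txt"
git commit -q -m "Commit 2.1"

# --mixed: the branch pointer moves and the index is reset; 2.txt stays on disk.
git reset -q --mixed HEAD~1
mixed_status="$(git status --porcelain)"    # "?? 2.txt"
git add 2.txt; git commit -q -m "Commit 2.2"

# --hard: pointer, index, and working dir are all reset; tracked 2.txt is removed.
git reset -q --hard HEAD~1
hard_status="$(git status --porcelain)"     # empty

printf 'soft:  %s\nmixed: %s\nhard:  %s\n' "$soft_status" "$mixed_status" "$hard_status"
```

Each mode undoes one more layer than the previous one: the commit, then the staging, then the change in the working dir.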
git reset is not always the most convenient tool to use, but it’s capable of solving almost any rewriting-history scenario if you use it carefully. For beginners, I recommend using only git reset for almost any time you want to undo in Git. Once you feel comfortable with it, it’s time to move on to other tools.

Scenario #2

Let us consider another case. Create a new file called new.txt; stage and commit.

Oops. Actually, that’s a mistake. You were on main, and I wanted you to create this commit on a feature branch. My bad 😇

There are two most important tools I want you to take from this post. The second is git reset. The first and by far more important one is to whiteboard the current state versus the state you want to be in.

Comparing the current state with the desired state for this scenario, you will notice three changes:

main points to “Commit 3” (the blue one) in the current state, but to “Commit 2.4” in the desired state.
The feature branch doesn’t exist in the current state, yet it exists and points to “Commit 3” in the desired state.
HEAD points to main in the current state, and to feature in the desired state.

If you can draw this and you know how to use git reset, you can definitely get yourself out of this situation. So again, the most important thing is to take a breath and draw this out.

Observing the drawing above, how do we get from the current state to the desired one? There are a few different ways of course, but I will present one option only for each scenario. Feel free to play around with other options as well.

You can start by using git reset --soft HEAD~1. This would set main to point to the previous commit, “Commit 2.4”.

Peeking at the current-vs-desired diagram again, you can see that you need a new branch, right? You can use git switch -c feature for it or git checkout -b feature (which does the same thing). This command also updates HEAD to point to the new branch.

Since you used git reset --soft, you didn’t change the index, so it currently has exactly the state you want to commit. How convenient! You can simply commit to the feature branch.

And you got to the desired state 🎉

Scenario #3

Ready to apply your knowledge to additional cases? Add some changes to love.txt, and also create a new file called cool.txt. Stage them and commit.

Oh, oops, actually I wanted you to create two separate commits, one with each change 🤦🏻 Want to try this one yourself?

You can undo the committing and staging steps with git reset --mixed HEAD~1.

Following this command, the index no longer includes those two changes, but they’re both still in your file system. So now, if you only stage love.txt, you can commit it separately, and then do the same for cool.txt.

Nice 😎

Scenario #4

Create a new file (new_file.txt) with some text, and add some text to love.txt. Stage both changes, and commit them.

Oops 🙈🙈 So this time, I wanted it to be on another branch, but not a new branch, rather an already-existing branch.

So what can you do? I’ll give you a hint. The answer is really short and really easy. What do we do first?

No, not reset. We draw. That’s the first thing to do, as it would make everything else so much easier. So draw the current state, and then the desired state.

How do you get from the current state to the desired state, and what would be easiest? One way would be to use git reset as you did before, but there is another way that I would like you to try.

First, move HEAD to point to the existing branch (for example, with git switch existing).

Intuitively, what you want to do is take the changes introduced in the blue commit, and apply these changes (“copy-paste”) on top of the existing branch. And Git has a tool just for that.

To ask Git to take the changes introduced between this commit and its parent commit and just apply these changes on the active branch, you can use git cherry-pick. This command takes the changes introduced in the specified revision and applies them to the active commit. It also creates a new commit object, and updates the active branch to point to this new object.

You can specify the SHA-1 identifier of the commit, but you could also use git cherry-pick main, as the commit whose changes we are applying is the one main is pointing to.

But we don’t want these changes to exist on the main branch. git cherry-pick only applied the changes to the existing branch. How can you remove them from main?

One way would be to switch back to main, and then use git reset --hard HEAD~1.

You did it! 💪🏻

Note that git cherry-pick actually computes the difference between the specified commit and its parent, and then applies it to the active commit. This means that sometimes, Git won’t be able to apply those changes as you may get a conflict, but that’s a topic for another post.

Also, note that you can ask Git to cherry-pick the changes introduced in any commit, not only commits referenced by a branch.

We have acquired a new tool, so we have git reset as well as git cherry-pick under our belt.

Scenario #5

Okay, so another day, another repo, another problem. Create a commit, and push it to the remote server.

Um, oops 😓… I just noticed something. There is a typo there. I wrote This is more tezt instead of This is more text. Whoops. So what’s the big problem now? I pushed, which means that someone else might have already pulled those changes.

If I override those changes by using git reset, as we’ve done so far, we will have different histories, and all hell might break loose. You can rewrite your own copy of the repo as much as you like until you push it.

Once you push the change, you need to be very certain no one else has fetched those changes if you are going to rewrite history.

Alternatively, you can use another tool called git revert. This command takes the commit you’re providing it with and computes the diff from its parent commit, just like git cherry-pick, but this time, it computes the reverse changes. So if in the specified commit you added a line, the reverse would delete the line, and vice versa.

git revert creates a new commit object, which means it’s an addition to the history. By using git revert, you didn’t rewrite history. You admitted your past mistake, and this commit is an acknowledgment that you made a mistake and now you fixed it.

Some would say it’s the more mature way. Some would say it’s not as clean a history as you would get if you used git reset to rewrite the previous commit. But this is a way to avoid rewriting history.

You can now fix the typo and commit again.

Your toolbox is now loaded with a new shiny tool, revert.

Scenario #6

Get some work done, write some code, and add it to love.txt. Stage this change, and commit it.

I did the same on my machine, and I used the Up arrow key on my keyboard to scroll back to previous commands, and then I hit Enter, and… Wow.

Whoops. Did I just use git reset --hard? 😨

What actually happened? Git moved the pointer to HEAD~1, so the last commit, with all of my precious work, is not reachable from the current history. Git also unstaged all the changes from the staging area, and then matched the working dir to the state of the staging area.

That is, everything matches this state where my work is… gone.

Freak out time. Freaking out.

But, really, is there a reason to freak out? Not really… We’re relaxed people. What do we do? Well, intuitively, is the commit really, really gone? No. Why not? It still exists inside the internal database of Git.

If I only knew the SHA-1 value that identifies this commit, I could restore it. I could even undo the undoing, and reset back to this commit.

So the only thing I really need here is the SHA-1 of the “deleted” commit. So the question is, how do I find it? Would git log be useful?

Well, not really. git log would go to HEAD, which points to main, which points to the parent commit of the commit we are looking for. Then, git log would trace back through the parent chain, which does not include the commit with my precious work.

Thankfully, the very smart people who created Git also created a backup plan for us, and that is called the reflog.

While you work with Git, whenever you change HEAD, which you can do by using git reset, but also other commands like git switch or git checkout, Git adds an entry to the reflog.

Running git reflog lists these entries, and there we find our commit! It’s the one starting with 0fb929e.

We can also relate to it by its “nickname”, HEAD@{1}. So just as Git uses HEAD~1 to get to the first parent of HEAD, and HEAD~2 to refer to the grandparent of HEAD (the first parent of its first parent) and so on, Git uses HEAD@{1} to refer to the first reflog parent of HEAD, that is, where HEAD pointed to in the previous step.

We can also ask git rev-parse to show us its value.

Another way to view the reflog is by using git log -g, which asks git log to actually consider the reflog.

The newest reflog entry, just as HEAD, points to main, which points to “Commit 2”. But the parent of that entry in the reflog points to “Commit 3”.

So to get back to “Commit 3”, you can just use git reset --hard HEAD@{1} (or the SHA-1 value of “Commit 3”).

And now, if we run git log, “Commit 3” is back in the history.

We saved the day! 🎉👏🏻

What would happen if I used this command again, and ran git reset --hard HEAD@{1}? Git would set HEAD to where HEAD was pointing before the last reset, meaning to “Commit 2”. We can keep going all day.

Looking at our toolbox now, it’s loaded with tools that can help you solve many cases where things go wrong in Git.

With these tools, you now better understand how Git works. There are more tools that would allow you to rewrite history (specifically, git rebase), but you’ve already learned a lot in this post. In future posts, I will dive into git rebase as well.

The most important tool, even more important than the five tools listed in this toolbox, is to whiteboard the current situation vs the desired one.
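As a recap of Scenario #6, the reflog rescue can be rehearsed safely in a scratch repository before you ever need it for real. This is a minimal sketch under stated assumptions: the temporary directory, the placeholder identity, and the file contents are invented for illustration, and the commit messages only mirror the naming used in this post.

```shell
set -eu

# Throwaway repository (location and identity are assumptions for this sketch).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Author"
git config user.email "author@example.com"

echo "I love Git" > love.txt
git add love.txt; git commit -q -m "Commit 2"
echo "precious work" >> love.txt
git add love.txt; git commit -q -m "Commit 3"
lost="$(git rev-parse HEAD)"          # remembered here only to verify the rescue

git reset -q --hard HEAD~1            # the accidental command
git log --oneline                     # "Commit 3" is no longer listed

git reflog -n 2                       # newest entry first
found="$(git rev-parse 'HEAD@{1}')"   # where HEAD pointed before the reset

git reset -q --hard "$found"          # undo the undoing
restored="$(git rev-parse HEAD)"
git log --oneline                     # "Commit 3" is back
```

Quoting HEAD@{1} keeps the shell from interpreting the braces, which matters in some shells.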
Trust me: whiteboarding will make every situation seem less daunting and the solution clearer.

Learn More About Git

In general, my YouTube channel covers many aspects of Git and its internals; you are welcome to check it out (pun intended 😇).

About the Author

Omer Rosenbaum is the CTO and Co-Founder of Swimm, a devtool that helps developers and their teams manage knowledge about their codebase with up-to-date internal documentation. Omer is the founder of Check Point Security Academy and was the Cyber Security Lead at ITC, an educational organization that trains talented professionals to develop careers in technology.

Omer has an MA in Linguistics from Tel Aviv University and is the creator of the Brief YouTube Channel.

First published here