Lesson goal: make another few commits in the repo; learn about the HEAD, index, and working directory.
git status to check which files have been modified or staged
git add to stage changes for the next commit
git commit to make each commit
git log shows the history of commits made.
In this lesson we're going to look at different ways to make new commits. All of them follow the same pattern. First we stage the changes we want to add to the next commit, that is, tell Git to get ready to commit them. Then we actually make the new commit.
Let's look at where we left off last time. (If you are just starting, download the repo from
the end of lesson 2 here.) In the repo, type
    $ git status
    On branch master
    nothing to commit, working directory clean
We see that currently there's "nothing to commit" because we've made no changes since the last commit.
Let's fix that. In months.txt, every month is abbreviated except June and July. Modify months.txt so that "June" is replaced with "Jun" and "July" with "Jul". It should look like this:
Jan Feb Mar Apr May Jun Jul Aug Sept Oct Nov Dec
git status shows:
    $ git status
    On branch master
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   months.txt

    no changes added to commit (use "git add" and/or "git commit -a")
months.txt has been modified since the last commit. To make a new commit that saves these changes, we need to stage the modified version of months.txt:

    $ git add months.txt
    $ git status
    On branch master
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
            modified:   months.txt
The status message has changed slightly, indicating that months.txt is ready to be committed. To make the commit, run git commit. This will open a text editor like last time. Give a brief one-line description of the commit we're about to make, like:
    Abbreviate June and July in months.txt
    # Please enter the commit message for your changes. Lines starting
    # with '#' will be ignored, and an empty message aborts the commit.
    # On branch master
    # Changes to be committed:
    #       modified:   months.txt
    #
Save the message and exit the text editor to get a message like:
    [master e8e99aa] Abbreviate June and July in months.txt
     1 file changed, 2 insertions(+), 2 deletions(-)
telling us that we changed 2 lines in one file. (Git sees a change to a line as deleting the old line and inserting a new one, so in this case "2 insertions, 2 deletions" corresponds to modifying two lines.)
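To see this "one deletion plus one insertion per modified line" accounting for yourself, here is a minimal sketch that repeats the edit in a throwaway repository. The temporary directory, the placeholder identity, and the use of sed to make the edit are assumptions for illustration only; in the lesson you would edit months.txt by hand in the tutorial repo.

```shell
# Work in a throwaway repository so the lesson repo is untouched.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, sandbox only
git config user.email "demo@example.com"

# One month per line, with June and July still spelled out.
printf '%s\n' Jan Feb Mar Apr May June July Aug Sept Oct Nov Dec > months.txt
git add months.txt
git commit -q -m "Add months.txt"

# Abbreviate the two long names (any editor would do; sed is used here).
sed -e 's/^June$/Jun/' -e 's/^July$/Jul/' months.txt > months.tmp
mv months.tmp months.txt

# --numstat prints: <lines added> <lines deleted> <file>
git diff --numstat
```

The last command should report 2 added and 2 deleted lines for months.txt, matching the summary Git prints after the commit.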
At this point, let's look at the log to see the two commits we've made:
    $ git log
    commit 09685d8b8ed5125ba9a0be61eb73a91c03cab82d
    Author: Josh Laughner <firstname.lastname@example.org>
    Date:   Mon Jul 8 23:02:40 2019 -0700

        Abbreviate June and July in months.txt

    commit e8e99aa950bdf3f357ebb1acf2137720928fe1ca
    Author: Josh Laughner <email@example.com>
    Date:   Mon Jul 8 23:02:32 2019 -0700

        Added daysofweek.txt and months.txt
For now, don't worry too much about the details of the log. We'll go through it in more detail in Lesson 4.
For this next option, let's modify both files. In months.txt, change "Sept" to "Sep" so that all the months use three-letter abbreviations. In daysofweek.txt, change all the days to use three-letter abbreviations:
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Sun Mon Tue Wed Thu Fri Sat
Let's also create a third file, which we won't commit, but will use to demonstrate something. I did:
git status now shows that we have two files with changes and one "untracked" file:
    $ git status
    On branch master
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   daysofweek.txt
            modified:   months.txt

    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            hello.txt
As far as Git is concerned, untracked files barely exist. If someone "clones" your repository (we'll discuss in Part III, but basically this is using Git to make a new copy of a repo), they won't get any of the untracked files.
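A small sketch of that behavior, using a throwaway pair of directories: one repo with a committed file and an untracked file, and a clone made from it. The directory names original and copy and the placeholder identity are assumptions for this example, not part of the lesson repo.

```shell
# Sandbox holding a source repo and its clone.
demo=$(mktemp -d)
cd "$demo"
git init -q original
cd original
git config user.name "Demo User"          # placeholder identity, sandbox only
git config user.email "demo@example.com"

# One committed (tracked) file ...
printf 'Sun\nMon\n' > daysofweek.txt
git add daysofweek.txt
git commit -q -m "Add daysofweek.txt"

# ... and one file that is never added, so it stays untracked.
printf 'hello\n' > hello.txt

# Clone the repo and list what the clone received.
cd "$demo"
git clone -q original copy
ls copy
```

Only daysofweek.txt should appear in copy; hello.txt exists only in the original working directory.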
For this example, let's say we want to commit the changes to months.txt and daysofweek.txt but not start tracking the new file, hello.txt. We could stage only the files we want with git add months.txt daysofweek.txt, but if we had dozens of files this would get tedious. We can't use git add *.txt or git add . now, because that would also add hello.txt. Instead, we can use git add -u:
    $ git add -u
    $ git status
    On branch master
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
            modified:   daysofweek.txt
            modified:   months.txt

    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            hello.txt
daysofweek.txt and months.txt were staged, but not hello.txt. The -u flag tells git add to stage changes to any already tracked files. Go ahead and commit with git commit, and try your hand at writing your own commit message.
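If you want to experiment with git add -u without touching the tutorial repo, the following sketch reproduces the situation in a scratch directory. The temporary path, the placeholder identity, and the one-line file contents are assumptions made for this example.

```shell
# Throwaway sandbox (the path is arbitrary).
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, sandbox only
git config user.email "demo@example.com"

# One tracked file with an initial commit.
printf 'Sept\n' > months.txt
git add months.txt
git commit -q -m "Add months.txt"

# Modify the tracked file and create a new, untracked one.
printf 'Sep\n' > months.txt
printf 'hello\n' > hello.txt

# Stage changes to tracked files only.
git add -u

# Short status: "M " in the first column means staged, "??" means untracked.
git status --short
```

The short status should list months.txt as staged and hello.txt as still untracked.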
One more way to make commits: you can skip git add by doing git commit -a. This is the equivalent of git add -u immediately followed by git commit; the -a flag stands for "all", meaning "go ahead and commit all pending changes." To test it out, let's modify daysofweek.txt to put Sunday at the end:
    $ git status
    On branch master
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
            modified:   daysofweek.txt
            modified:   months.txt

    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   daysofweek.txt

    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            hello.txt

    $ git commit -a
    [master 4cc58be] Put Sunday at the end of the week
     2 files changed, 8 insertions(+), 8 deletions(-)
Now if we look at the log we see:
    $ git log
    commit 4cc58beaa6c7c2fea045210a482bd28ed768a223
    Author: Josh Laughner <firstname.lastname@example.org>
    Date:   Mon Jul 8 23:22:43 2019 -0700

        Put Sunday at the end of the week

    commit 50be39ace1af0c9b332dabdc5877bfde092984ed
    Author: Josh Laughner <email@example.com>
    Date:   Mon Jul 8 23:22:24 2019 -0700

        Changed all days and months to 3 letter abbreviations

    commit 09685d8b8ed5125ba9a0be61eb73a91c03cab82d
    Author: Josh Laughner <firstname.lastname@example.org>
    Date:   Mon Jul 8 23:02:40 2019 -0700

        Abbreviate June and July in months.txt

    commit e8e99aa950bdf3f357ebb1acf2137720928fe1ca
    Author: Josh Laughner <email@example.com>
    Date:   Mon Jul 8 23:02:32 2019 -0700

        Added daysofweek.txt and months.txt
One flag to
git commit that I rarely use outside of tutorials is
-m. This lets you
pass the commit message as the next value on the command line. For example, I could have
made the above commit in one line with
git commit -a -m 'Put Sunday at the end of the week'.
Note that the message will just be the one line, and it needs to be in quotes.
I usually don't use this in practice - it precludes writing more detailed messages. But for tutorials it's useful because everything can be done on the command line, so the commands are clearer.
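Here is a rough, self-contained sketch of the one-line form in a scratch repository. It also shows that -a only picks up files Git already tracks. The temporary directory, the placeholder identity, and the two-line daysofweek.txt are assumptions for illustration.

```shell
# Throwaway sandbox (the path is arbitrary).
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, sandbox only
git config user.email "demo@example.com"

# Start with one tracked file.
printf 'Sun\nMon\n' > daysofweek.txt
git add daysofweek.txt
git commit -q -m "Add daysofweek.txt"

# Change the tracked file and add an untracked one.
printf 'Mon\nSun\n' > daysofweek.txt
printf 'hello\n' > hello.txt

# Stage tracked changes and commit them in a single command.
git commit -q -a -m 'Put Sunday at the end of the week'

git log --oneline      # two commits, newest first
git status --short     # hello.txt is still untracked
```

After the commit, the only thing left in the status should be the untracked hello.txt.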
If you want to go on to Lesson 3 and start learning about diffs and more about the log, feel free. The rest of this will be more about good practice, so come back once you're feeling comfortable with the basics and want to refine your usage.
I almost never use git commit -a. I prefer git add -u followed by git commit. I consider it good practice to verify what you're about to commit before making the commit, so I like staging my changes, reviewing them with git status (and git diff --cached, see Lesson 4), then committing. I prefer the additional control.
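As a sketch of what that review step checks, the example below stages only one of two modified files so that the staged and unstaged sets differ. The scratch directory, the placeholder identity, and the one-line file contents are assumptions for this example.

```shell
# Throwaway sandbox (the path is arbitrary).
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, sandbox only
git config user.email "demo@example.com"

# Two tracked files.
printf 'June\n' > months.txt
printf 'Sunday\n' > daysofweek.txt
git add months.txt daysofweek.txt
git commit -q -m "Add months.txt and daysofweek.txt"

# Modify both, but stage only one of them.
printf 'Jun\n' > months.txt
printf 'Sun\n' > daysofweek.txt
git add months.txt

# What the next commit would contain:
git diff --cached --name-only      # months.txt

# What would be left out of it:
git diff --name-only               # daysofweek.txt
```

Dropping --name-only shows the full line-by-line changes instead of only the file names.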
So far our commit messages have been very short, because we haven't been making very many changes. That's fine for this demo repo, but in practice your commit messages are an important part of making a maintainable repo. The messages are the highest level information available when you're trying to understand how the repo evolved over time, so it pays to make them informative.
The first widely accepted rule for formatting your commit messages is:
The first line is crucial because it gets used by itself often to describe your commit.
git log --oneline
for instance prints just that first line for each commit.
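To make the role of the first line concrete, this sketch commits with a multi-line message and then prints the history in one-line form. The scratch repo, the placeholder identity, the message body, and the use of git commit -F to read the message from a file are all choices made for this example.

```shell
# Throwaway sandbox (the path is arbitrary).
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, sandbox only
git config user.email "demo@example.com"

printf 'Jun\nJul\n' > months.txt
git add months.txt

# Write a message with a summary line, a blank line, and a body.
msg=$(mktemp)
cat > "$msg" <<'EOF'
Abbreviate June and July in months.txt

All other months already use short names, so the two long
names are shortened for consistency.
EOF

# -F reads the commit message from a file instead of opening an editor.
git commit -q -F "$msg"

# Only the abbreviated hash and the summary line are printed here.
git log --oneline
```

Plain git log on the same repo prints the body as well, which is why the summary line has to make sense when read alone.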
The second formatting rule is:
When you print git log, it doesn't do any text wrapping. Keeping lines short helps them fit in standard-size terminal windows.
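If you would like a mechanical check for overlong lines, something like the following could work. check_message_width is a hypothetical helper written for this example, not a Git command, and the default limit of 72 characters is an assumption borrowed from the upper bound in the message template later in this lesson.

```shell
# Hypothetical helper: read a commit message on stdin and report any line
# longer than the given width (default 72, an assumed limit).
check_message_width() {
    awk -v max="${1:-72}" '
        length($0) > max {
            printf "line %d is %d characters (limit %d)\n", NR, length($0), max
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Example with a short message passed in directly.
printf '%s\n' "Short summary" "" "A short body line." | check_message_width \
    && echo "all lines fit"
```

Inside a repository, piping git log -1 --format=%B into the helper would check the most recent commit message.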
The third generally accepted rule is:
It's not worth the time to rehash exactly what changed in the files themselves, because we can go
back and see that with commands like
git diff and
git log -p. It's more useful to explain the big picture
of what change was made and why that change was necessary. Let's look at one of my
actual commits from a project in my Ph.D.:
    Number of user-friendliness improvements

    * namelist.input.template: added the scale_nei_emiss and scale_closest_year options to the chem namelist. These are implemented in commits a263d29 through 52cf5d6 of the WRF-Chem-R2SMH repo (https://github.com/CohenBerkeleyLab/WRF-Chem-R2SMH)
    * PREPINPUT/precheck now checks that mozbcFile has been set i.e. is not a zero-length string; if it is, then the error message for "Cannot find data/$mozbcFile" is very confusing as it looks like its saying that it cannot find the data link.
    * PREPINPUT/prepnei_convert: convert_emiss error messages now correctly give the convert_emiss exit code rather than the real.exe exit code.
    * Many files: modified or added messages before `exit 1` calls to explicitly indicate that the program is aborting.
Some background: this is a commit from a wrapper I built around the WRF-Chem model, which simulates atmospheric chemistry, to simplify preparing the inputs needed by the model. Let's break down the good and bad in this message (and I promise to only be a little biased).
"Number of user-friendliness improvements" - conveys the overall purpose of this commit, which is to make the program more user-friendly.
"namelist.input.template: ..." - this bullet could be better. If you look at the
actual changes for the
namelist.input.template file, you'll see that it really is
just adding those two options with their default values. In this case the change is so
small that the line between "what" and "how" blurs, but saying why those options
were added would have been more helpful.
"PREPINPUT/precheck..." - this bullet point does a good job of focusing on what/why.
The change to this file was to check that the variable
mozbcFile actually got a value.
Why? If not, then when the program checked to see that it pointed to an actual file, it
would not, but the error message would be confusing because the problem isn't that the
file doesn't exist, it's that the path to the file was never set.
"PREPINPUT/prepnei_convert..." - says what but not why, though in this case the why is implied - the wrong exit code was being given, which (if you know about exit codes in bash) means that the user would get the wrong information about why the program failed.
"Many files: ..." - this tells us what (added messages before programs exit) and why (to make it clear the program is exiting). This is particularly helpful because in the commit itself you'll see this sort of change in a bunch of files, so it helps to understand that all those changes are working towards one goal.
I find when you're starting out it's good to have a more structured recommendation for how to format your commit messages. So here's what I recommend you use as you start out:
    Summary line, < 50 char if possible, definitely <= 72

    [optional] Additional paragraph(s) expanding on what the commit does and why it was necessary. These should deal with the commit as a whole.

    NEW FILES:
    * Itemize any new files here, for example:
    * daysofweek.txt - a list of days of week added to enforce standard US weekday order.
    * Still focus on why the new files were added and what purpose they serve rather than how they work.
    * If no new files added, skip this section.

    MODIFIED FILES:
    * Itemize any modified files here, for example:
    * months.txt: changed so that all months use standard 3-letter abbreviations.
    * Again, focus on what (high-level) changes were made to each file and why they were necessary.
    * Skip this section if no files modified, though that will be rare.

    RENAMED FILES:
    * Enumerate any renames, e.g.:
    * daysofweek.txt -> days_of_week.txt
    * If there was a reason for the rename, mention it
    * Skip this section if no files renamed

    DELETED FILES:
    * Enumerate deleted files
    * Skip if no files were deleted
Early on, I think having a "form" like this to fill out helps structure your ideas and makes sure you cover everything important. As you get more experience writing (and reading) commit messages, you will probably find yourself relaxing this strict format.
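If retyping the form gets tedious, Git can pre-fill the commit editor from a file through its commit.template setting. The sketch below is one possible setup, not something this lesson's repo uses: the scratch repository, the file name commit-template.txt, its location inside .git, and the shortened template text are all assumptions for illustration.

```shell
# Throwaway sandbox (the path is arbitrary).
demo=$(mktemp -d)
cd "$demo"
git init -q

# A shortened copy of the form, stored under .git so it is never tracked.
cat > "$demo/.git/commit-template.txt" <<'EOF'
Summary line, < 50 char if possible, definitely <= 72

NEW FILES:

MODIFIED FILES:

RENAMED FILES:

DELETED FILES:
EOF

# Tell Git (for this repo only) to start each commit message from that file.
git config commit.template "$demo/.git/commit-template.txt"

# Print the configured path to confirm the setting took effect.
git config commit.template
```

With this in place, git commit opens the editor with the form already filled in. Git aborts the commit if you save the template without editing it.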
Some other people's thoughts on formatting commit messages:
First the practical advice. When you're first starting out, I recommend that you force yourself to make commits when:
As you get more experience working with Git, you'll find yourself looking back through the history with moments of "Argh, why didn't I break this up better?" Embrace those moments as learning opportunities.
Now, the Platonic ideal is that you would make a commit for each conceptual change. Put another way, each commit should have a purpose:
This requires a little bit of planning ahead, but really just requires you to be conscious about your goals when you start modifying your code. Assuming you start from a state where there are no pending changes (the "working directory is clean"), ask yourself: "What am I trying to accomplish with this change?" That will probably be the first line of your commit message once it's done.
As you work on the code, ask yourself about every 15-30 minutes "What goal am I working towards now?" If it has changed since you started, imagine you were explaining the changes you made to someone else who uses the code. Are both your original and current goal close enough that you would explain them together? If so, keep working. If not, finish the first goal, make sure it works, then commit those changes.