Git

Why Version Control?

You’re working on a team project and need to make edits to reports and code. You wait for a team member to make a change and then email you back another copy. There has to be a better way…

“Version control is the lab notebook of the digital world: it’s what professionals use to keep track of what they’ve done and to collaborate with other people. Every large software development project relies on it, and most programmers use it for their small jobs as well. And it isn’t just for software: books, papers, small data sets, and anything that changes over time or needs to be shared can and should be stored in a version control system.” – Version Control with Git

Understanding Git

What better way to understand git than to check out git itself? This might take a while…

[command:]
git clone https://github.com/git/git

We’ll be working inside the git/ directory, with our working state set to v2.23.0.

[command:]
cd git
git reset --hard v2.23.0

Git’s Object Model: Content-Addressable Data Store.

  • Every object has a SHA-1 hash: 40 hex characters.
  • Given 40 hex characters, we can find the unique object with that hash.
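
To make “content-addressable” concrete, here is a minimal sketch. It assumes git is installed and works in a throwaway repository created with mktemp (the directory and the sample content are illustrative, not part of the git/ checkout). It stores some bytes, then finds them again using only the hash.

```shell
set -e
# Throwaway repository so we do not touch the git/ checkout.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Hash some content and write it into the object store (-w).
oid=$(printf 'hello\n' | git hash-object -w --stdin)
echo "object id: $oid"

# Look the object up again using nothing but its hash.
git cat-file -t "$oid"    # type of the object
git cat-file -p "$oid"    # its content

# Hashing the same bytes again gives the same id.
oid2=$(printf 'hello\n' | git hash-object --stdin)
[ "$oid" = "$oid2" ] && echo "same content, same id"
```

The id depends only on the content (plus a small type/size header), which is why identical files are stored once no matter how many commits contain them.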

Let’s examine a single commit.

[command:]
git log -1 --abbrev=40

Object Types: Blobs, Trees, Commits

We will use the git cat-file command to help us search for objects inside the store. If we provide git with a partial hash, it will attempt to find a unique match, and if it is unable to, it will provide a list of those that did match.

[command:]
git cat-file -p 5fa0
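
The prefix lookup can also be tried in isolation. This is a small sketch in a throwaway repository (the content string is made up for illustration); with only one object in the store, any prefix of at least four hex characters is unique.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Store one object and keep its full id.
full=$(printf 'partial hash demo\n' | git hash-object -w --stdin)

# Take the first 7 hex characters as a prefix (git needs at least 4).
short=$(printf '%s' "$full" | cut -c1-7)

# rev-parse expands a unique prefix back to the full id.
resolved=$(git rev-parse --verify "$short")
echo "$short -> $resolved"

# cat-file accepts the prefix directly.
git cat-file -p "$short"
```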

Blobs

Let’s examine a blob object. A blob contains file contents.

[command:]
git cat-file -p 5fa073a885

Trees

Let’s examine a tree object. A tree contains folder contents.

[command:]
git cat-file -p 5fa02bff4e

[Figure: example representation of the folder contents contained by a tree]
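
If you would like to build a tree yourself, the following sketch may help. It assumes a throwaway repository and two made-up files (README.md and src/run.sh); git write-tree turns whatever is staged into a tree object without creating a commit.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

mkdir src
echo 'Hello!' > README.md
echo 'echo hi' > src/run.sh
git add .

# Write the staged content out as a tree object (no commit needed).
tree=$(git write-tree)
git cat-file -t "$tree"

# Each entry: mode, type, hash, name. The src folder is itself a tree.
git cat-file -p "$tree"
```

Note that the tree holds names and modes, while the blobs it points to hold only file contents.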

Commits

Perhaps one of the most important types of objects in the object model is the commit. A commit contains several things:

  • A root tree
  • A list of parent commits
  • A commit message
  • An author name, email, and time.
  • A committer name, email, and time.

Let’s examine an example commit.

[command:]
git cat-file -p 5fa00a4dcf
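
The same fields show up in any repository. Here is a minimal sketch that makes two commits in a throwaway repository and prints the second one raw; the identity "Example User" and the file contents are placeholders chosen for illustration.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
# Hypothetical identity, set only for this throwaway repository.
git config user.name "Example User"
git config user.email "user@example.com"

echo 'Hello!' > README.md
git add README.md
git commit -q -m "first commit"
first=$(git rev-parse HEAD)

echo 'More text' >> README.md
git commit -q -am "second commit"

# Raw commit object: tree, parent, author, committer, blank line, message.
git cat-file -p HEAD

# The parent line of the second commit names the first commit.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
root=$(git cat-file -p HEAD | sed -n 's/^tree //p')
echo "parent: $parent"
echo "root tree: $root"
```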

We can examine the commit graph (but only the first part!).

[command:]
PAGER='head -n 80' git log --graph --oneline

Diffs

Diffs are not part of the object model!

Commits are NOT diffs

Instead, each commit records a full snapshot (its root tree), and git computes diffs on demand by comparing the trees of two commits. Even information such as file renames is not stored in the object store; it is inferred each time a diff is calculated.

Let’s examine a diff.

[command:]
git diff --raw v2.22.0 v2.23.0
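
To see that a rename is inferred rather than stored, consider this sketch in a throwaway repository (a.txt and b.txt are made-up file names). The same two commits are compared twice, once with rename detection turned off and once with it turned on.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line one\nline two\n' > a.txt
git add a.txt
git commit -q -m "add a.txt"

git mv a.txt b.txt
git commit -q -m "rename a.txt to b.txt"

# Both names refer to the very same blob in the object store.
old_blob=$(git rev-parse HEAD~1:a.txt)
new_blob=$(git rev-parse HEAD:b.txt)

# Without rename detection: one delete plus one add.
plain=$(git diff --no-renames --name-status HEAD~1 HEAD)
# With rename detection (-M): git infers the rename from matching content.
renamed=$(git diff -M --name-status HEAD~1 HEAD)
echo "$plain"
echo "$renamed"
```

Nothing in either commit says “renamed”; the two trees simply list the same blob under different names, and git diff works out the rest.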

Merkle Trees

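Git stores trees and blobs as a Merkle tree: each tree refers to its child trees and blobs by hash, so a tree’s own hash depends on everything beneath it. This keeps the representation compact and git operations fast, because unchanged files and folders are shared between commits and two trees can be compared just by comparing their hashes.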

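
A small experiment can show the sharing at work. This sketch assumes a throwaway repository with two made-up folders, docs/ and src/; only a file under src/ changes between the two commits.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

mkdir docs src
echo 'notes' > docs/notes.txt
echo 'version one' > src/main.txt
git add .
git commit -q -m "first snapshot"

echo 'a longer version two' > src/main.txt
git commit -q -am "change one file in src"

# Tree ids before and after the change.
docs_before=$(git rev-parse HEAD~1:docs)
docs_after=$(git rev-parse HEAD:docs)
src_before=$(git rev-parse HEAD~1:src)
src_after=$(git rev-parse HEAD:src)
root_before=$(git rev-parse 'HEAD~1^{tree}')
root_after=$(git rev-parse 'HEAD^{tree}')

echo "docs: $docs_before -> $docs_after (unchanged, shared)"
echo "src:  $src_before -> $src_after"
echo "root: $root_before -> $root_after"
```

The change ripples upward from the blob to src/ to the root tree, while docs/ keeps its hash and is reused as-is by the second commit.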

Branches

Branches are simply pointers to commits. Tags are pointers to anything (commits, trees, blobs).

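
Here is a short sketch of both claims in a throwaway repository. The branch name feature and the tag names on-commit, on-tree, and on-blob are invented for this example.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

echo 'Hello!' > README.md
git add README.md
git commit -q -m "initial commit"

# A new branch is just another name for the current commit.
git branch feature
head_commit=$(git rev-parse HEAD)
feature_commit=$(git rev-parse feature)
echo "HEAD:    $head_commit"
echo "feature: $feature_commit"

# Lightweight tags can point at a commit, a tree, or a blob.
git tag on-commit HEAD
git tag on-tree 'HEAD^{tree}'
git tag on-blob HEAD:README.md
for t in on-commit on-tree on-blob; do
  echo "$t -> $(git cat-file -t "$t")"
done
```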

Move between branches with git switch

git switch is a new command introduced in git v2.23.0. It takes over the branch-switching role of git checkout and deliberately does less. Primarily, git switch will:

  • Change HEAD to point to a new branch.
  • Update the working directory to match the commit’s tree.

We can switch our branch to the maintenance branch.

[command:]
git switch maint

Let’s confirm.

[command:]
git status

We can return to the main branch.

[command:]
git switch master
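
Both effects of git switch can be observed in a small sandbox. This sketch assumes git 2.23 or newer and uses a throwaway repository; the branch name topic and the file notes.txt are made up for illustration.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

echo 'base' > notes.txt
git add notes.txt
git commit -q -m "base"
start=$(git symbolic-ref --short HEAD)   # master or main, depending on config

# Create a branch and switch to it (-c), then commit there.
git switch -q -c topic
echo 'topic work' >> notes.txt
git commit -q -am "work on topic"
on_topic=$(wc -l < notes.txt | tr -d ' ')

# Switching back moves HEAD and rewrites the working tree.
git switch -q "$start"
on_start=$(wc -l < notes.txt | tr -d ' ')
echo "HEAD -> $(git symbolic-ref --short HEAD); lines: $on_start (was $on_topic on topic)"
```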

Practice: Creating a Repo

Let’s try the basics. Let’s create a new local git repository.

Create a new directory (Basics) and file (README.md).

Basics/README.md
[file:]
# Project 0
Hello!

We are going to create a new git repository, but maybe not the way you’ve done it before. In the next set of commands, we will be working inside the Basics/ directory.

This will create a new .git directory to store commits and other objects.

[command:]
cd Basics
git init

We can quickly inspect the contents of the .git directory and its object store.

[command:]
ls -l .git
echo "objects:"
ls -l .git/objects

Before adding a file to the repository, it must first be staged.

[command:]
git add README.md

We will commit our staged changes into the repository.

[command:]
git commit -m "initial commit"
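
To connect this back to the object model, the sketch below repeats the same steps in a throwaway repository (so your Basics/ directory is left alone) and lists every object that the first commit created. The identity is a placeholder.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

printf '# Project 0\nHello!\n' > README.md
git add README.md
git commit -q -m "initial commit"

# List every object reachable from the commit, with its type.
git rev-list --objects --all | while read -r oid path; do
  echo "$(git cat-file -t "$oid") $oid $path"
done

count=$(git rev-list --objects --all | wc -l | tr -d ' ')
echo "objects: $count"
```

One file and one commit give three objects: the blob for README.md, the root tree that names it, and the commit that points at the tree.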

Nice work!

Stage, unstage, and discard changes

Changes flow from our working tree, to the staging index, and into the repository.

Exercise: Execute the following sets of steps in any order you wish. Observe what happens to the working tree and index by running the git status step.

Update the README.md and stage our change.

[command:]
echo " Update: $(date)" >> README.md
cat README.md
git add README.md

View the current state of our working tree and index.

[command:]
git status

Unstage file (remove from index), but keep changes in working tree.

[command:]
git restore --staged README.md

Discard changes in the working tree (we will lose our work!). This restores both the index and the working tree to the version of the file in the latest commit (HEAD).

[command:]
git restore --source=HEAD --staged --worktree README.md
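
The three states can also be traced in one scripted run. This sketch assumes git 2.23 or newer and a throwaway repository; it uses git status --porcelain, whose first column describes the index and whose second column describes the working tree.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

echo 'Hello!' > README.md
git add README.md
git commit -q -m "initial commit"

# Modify and stage: the change is recorded in the index.
echo 'Update' >> README.md
git add README.md
staged=$(git status --porcelain)        # "M  README.md"

# Unstage: the change remains, but only in the working tree.
git restore --staged README.md
unstaged=$(git status --porcelain)      # " M README.md"

# Discard: index and working tree both go back to HEAD.
git restore --source=HEAD --staged --worktree README.md
clean=$(git status --porcelain)         # empty: nothing to report

printf '[%s]\n[%s]\n[%s]\n' "$staged" "$unstaged" "$clean"
```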

Remotes

While having a local git repository is cool, we should connect it to a remote repository. Right now, we have no place to git push to…

Remote operations

  • Get new data: git fetch <remote> [branch]
  • Upload your data: git push <remote> <branch>
  • Get new data and merge it into your current branch: git pull <remote> <refspec>

Hot Take: Avoid git pull on large repositories! You may want to handle merges yourself into your target branch instead of having git mess with your working tree.
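
One way to handle the merge yourself is to fetch first and then merge explicitly. The sketch below is self-contained and needs no network: a local bare repository stands in for GitHub, and the directory names alice and bob, as well as the identity, are made up for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Hypothetical identity, supplied through environment variables.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q --bare remote.git          # stands in for GitHub
git init -q alice
cd alice
echo 'Hello!' > README.md
git add README.md
git commit -q -m "initial commit"
branch=$(git symbolic-ref --short HEAD)
git remote add origin "$work/remote.git"
git push -q -u origin "$branch"

# A second clone pushes a new commit to the remote.
cd "$work"
git clone -q remote.git bob
cd bob
echo 'Hello GitHub!' >> README.md
git commit -q -am "edit from another clone"
git push -q origin "$branch"

# Back in the first repo: fetch only updates origin/<branch>.
cd "$work/alice"
git fetch -q origin
before=$(git rev-parse HEAD)
remote_tip=$(git rev-parse "origin/$branch")

# Merge on our own terms; --ff-only refuses anything but a fast-forward.
git merge -q --ff-only "origin/$branch"
after=$(git rev-parse HEAD)
echo "before: $before"
echo "after:  $after"
```

After git fetch, the working tree is untouched, so there is time to inspect origin/<branch> (for example with git log or git diff) before deciding how to merge.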

Exercise: Let’s open a terminal and perform the following steps.

Windows:

[command:]
start bash

Mac (on Linux, open a terminal from your desktop environment instead):

[command:]
open -a "Terminal" .

  1. Create a repo on GitHub (If you are an NCSU student, use GitHub Enterprise: https://github.ncsu.edu).

  2. Follow the instructions on GitHub to add a remote url to an existing git repository. Basically, you need to run something like: git remote add origin https://github.com/<user>/<repo>.git

  3. Push your changes to GitHub. Verify you can see your updated README.md!

  4. On GitHub, edit the README.md to say “Hello GitHub!”. Commit the changes on GitHub. Now you have changes in your remote (origin) that are missing from your local copy.

  5. Run git pull and verify you now have the updated changes.

Git Branching Playground

Manipulating the commit graph can get quite complicated! This interactive visualization is very useful for getting a deeper understanding of how operations such as branches, merges, cherry-picking, and more work!

We will solve the “Introduction Sequence” levels in:
http://pcottle.github.io/learnGitBranching/

Git Configuration and Security

If you want to make sure your commits are properly linked to your GitHub account, make sure you have configured your computer to have your name and email filled out.

$ git config --global user.name "FirstName LastName"
$ git config --global user.email email@example.com
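
To check that the settings take effect without touching your global configuration, you could try them in a throwaway repository first. This sketch uses --local, which writes to that repository’s own .git/config; the name and email are the same placeholders as above.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# --local writes to this repository's .git/config instead of ~/.gitconfig.
git config --local user.name "FirstName LastName"
git config --local user.email "email@example.com"

echo 'Hello!' > README.md
git add README.md
git commit -q -m "initial commit"

# The commit records the configured identity as its author.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

GitHub links commits to an account by the email address, so the value recorded here should match an email registered on your account.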

You might also consider an authentication strategy. If you’re being asked to log in every time you pull from or push to your remote repository, you might want to enable caching of your credentials. For example, you could use:

git config --global credential.helper store

However, the store helper saves your credentials in plain text on your computer. There are other platform-specific credential helpers that you can use to store your credentials more securely. It is also possible to generate personal access tokens that you can use to authenticate instead of a password.

An alternative approach is to use SSH keys. In this case, you have a public/private key pair, with the public key stored on GitHub. You then use a different URL pattern for commands such as git clone: instead of the https:// prefix, you use git@github.com:user/repo.git.
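
For an existing repository, moving from HTTPS to SSH only means changing the URL stored for the remote. A minimal sketch in a throwaway repository follows; "user" and "repo" are placeholders rather than a real repository, and no network access takes place.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# "user" and "repo" are placeholders, not a real repository.
git remote add origin https://github.com/user/repo.git
https_url=$(git remote get-url origin)

# Point the same remote at the SSH form of the URL.
git remote set-url origin git@github.com:user/repo.git
ssh_url=$(git remote get-url origin)

echo "$https_url -> $ssh_url"
```

Pushing over the SSH URL still requires a key pair whose public key has been added to your GitHub account.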

If you are interested in exploring this option, see these guides on GitHub: