How to Use GitPython to Manage Git Repositories

I know what you're thinking: we usually use Git to track changes to our Python code, so what do I mean by using GitPython to manage Git repositories? I recently needed to automate a Git workflow that involved pulling the latest changes from a Git repository, creating a branch, making some changes, viewing the diff, committing, and then pushing my branch back to the remote repository.

Doing this repeatedly was time-consuming, and I figured there must be a way to automate this. With Python, virtually anything is possible. I found a Python library called 'GitPython' that does exactly this. So, let's get to it.

What Is GitPython?

GitPython is a Python library for working with Git repositories. It lets you drive Git from Python code, so you can automate tasks such as commits, branches, and pushes without typing commands at the command line.

For example, you can use it to pull the latest updates from a repository, create new branches, and commit your changes. It also provides a way to view diffs, so you can see what has changed between commits. You can install GitPython with a simple pip command.

pip install gitpython

Can I Not Use Subprocess?

You might ask why you should bother with this when you can use Python's subprocess module to run normal Git commands. Absolutely, you can, but at some point, you may need to put some logic into your workflow. For example, before creating a new branch, you may want to check if the branch already exists locally, or you might need to check if you have any uncommitted changes before proceeding.

GitPython provides various methods to handle these checks. For example, you can use the is_dirty() method to check for uncommitted changes, rather than relying on raw CLI outputs. This can simplify your code and make it easier to manage complex conditions within your Git workflow.

Here Is an Example

For this example, I'm going to use a repo located in /Users/suresh/Documents/code/campus-network that I've already cloned. I'm going to edit one of the files called requirements.txt within this repo just to show you how it works.

First, we import the necessary modules and define a function to append a line to a file within the repo. We then set up variables for our new branch and the repository directory.

import sys
from git import Repo

def edit_file(path):
    # Append a line to the given file
    with open(path, 'a') as fh:
        fh.write("\nadding new line")

new_branch = 'dummy'
repo_dir = "/Users/suresh/Documents/code/campus-network"
repo = Repo(repo_dir)

try:
    repo.git.checkout('main')
    repo.remotes.origin.pull()
except Exception as e:
    print(f'Error during pull: {e}')
    sys.exit(1)

The script starts by checking out the main branch and pulling the latest changes to ensure we have the most recent version. If there’s an issue during the pull, it prints an error message and exits to avoid further complications.

# Checkout or create the new branch
if new_branch not in repo.branches:
    repo.git.checkout('-b', new_branch)
else:
    repo.git.checkout(new_branch)

edit_file(f'{repo_dir}/requirements.txt')

Next, the script checks if the new branch already exists. If it doesn't, it creates a new branch; otherwise, it simply switches to it. We then call our previously defined function to edit a file, in this case, adding a new line to requirements.txt.

# Check for modifications to tracked files (index vs. working tree)
modified_files = [item.a_path for item in repo.index.diff(None)]
if modified_files:
    print('Modified Files:', modified_files)
    for path in modified_files:
        print(f'Diff for {path}:')
        print(repo.git.diff(path))

    # Stage, commit, and push changes
    repo.index.add(modified_files)
    repo.index.commit('another new change')

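    try:
        # push() can report a rejected push in its return value instead of
        # raising; recent GitPython releases provide raise_if_error() for that
        repo.remotes.origin.push(new_branch).raise_if_error()
    except Exception as e:
        print(f'Error during push: {e}')
        sys.exit(1)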

After modifying the file, the script checks for any changes in the repository. It lists modified files and displays the differences for each. If there are modifications, it stages these changes, commits them with a message, and attempts to push the commit to the remote repository. If the push fails, the script handles the error by printing a message and exiting.

Uncommitted Changes

In GitPython, you can check if you have any uncommitted changes by using the is_dirty() method of the Repo class. By default, this method returns True if tracked files have modified or staged changes. Untracked files are ignored unless you pass untracked_files=True.

# Check for uncommitted changes
if repo.is_dirty():
    print("You have uncommitted changes")
else:
    print("No uncommitted changes")

Discarding the Changes

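You can also discard your changes without committing them and even delete the branch you just created. Keep in mind that reset --hard only restores tracked files, and branch -d refuses to delete a branch whose commits have not been merged (Git needs -D to force that). Here are the methods you can use for that.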

repo.git.reset('--hard')
repo.git.checkout('main')
repo.git.branch('-d', new_branch)