6,222 Introduction to Programming

In-class Exercise: Git and GitHub for Collaboration

Author

Dr. Aurélien Sallin

Published

March 6, 2026

Overview

Goal: By the end of this exercise, you and your partner will share a GitHub repository, each working on your own branch, with a merged pull request (PR) in the history. This is the workflow we would like you to use for your group project in this course; you can also use it for group projects in R, Python, or any other programming language.

Work in pairs. Decide who is Student A (the repository owner) and who is Student B (the collaborator).


Phase 2: Set Up the Environment

In this part, we will set up the environment and generate a dataset using a simple Python script. The script generate_data.py creates a CSV file with GDP data for a few countries:

import pandas as pd

data = pd.DataFrame({
    "country": ["Switzerland", "Germany", "France", "Italy", "Spain"],
    "gdp_bn_usd": [800, 4000, 2800, 2100, 1400],
    "year": 2023
})

data.to_csv("gdp_data.csv", index=False)
print("File written: gdp_data.csv")
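
As a quick sanity check after running the script, the following minimal sketch reads the CSV back with Python's standard csv module and reports anything unexpected. The names check_gdp_csv and EXPECTED_COLUMNS are hypothetical helpers introduced for illustration, and the expected columns assume the Phase 2 version of generate_data.py shown above.

```python
import csv
from pathlib import Path

# Columns written by the Phase 2 version of generate_data.py, in order.
# (Assumption: adjust this list if the script gains more columns later.)
EXPECTED_COLUMNS = ["country", "gdp_bn_usd", "year"]


def check_gdp_csv(path, expected_rows=5):
    """Return a list of problems found in the CSV (an empty list means OK)."""
    problems = []
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # The header row must match the expected column names exactly.
        if reader.fieldnames != EXPECTED_COLUMNS:
            problems.append(f"unexpected columns: {reader.fieldnames}")
        rows = list(reader)
    # One data row per country is expected.
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    return problems
```

Calling check_gdp_csv("gdp_data.csv") in the repository folder should return an empty list if the file was generated as described.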

Student A

Create a virtual environment and add the data script

  1. In the repository, create a virtual environment and install pandas:
uv init --python 3.13
uv add pandas
  2. Download generate_data.py (available on Canvas) and copy it into collab-exercise-week03.

  3. Using the terminal or a text editor, create a .gitignore that excludes the virtual environment. In the terminal, you can use echo like this:

echo ".venv/
__pycache__/
*.pyc" > .gitignore

The .gitignore file should now look like this:

.venv/
__pycache__/
*.pyc

Note: > creates the file (or overwrites it if it already exists). >> (used later) appends to an existing file.

  4. Commit and push:
git add .
git commit -m "Add virtual environment and data generation script"
git push
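
The difference between > and >> mentioned in the note above corresponds to the write ("w") and append ("a") file modes in Python. The sketch below illustrates this in a temporary folder so that it does not touch your real .gitignore; the helper names overwrite and append are hypothetical and only serve the illustration.

```python
import tempfile
from pathlib import Path


def overwrite(path, text):
    """Like `echo "text" > path`: create the file or replace its contents."""
    with Path(path).open("w", encoding="utf-8") as handle:
        handle.write(text + "\n")


def append(path, text):
    """Like `echo "text" >> path`: add the text to the end of the file."""
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(text + "\n")


# Demonstration in a throwaway folder (not in your repository).
with tempfile.TemporaryDirectory() as folder:
    demo = Path(folder) / ".gitignore"
    overwrite(demo, ".venv/\n__pycache__/\n*.pyc")  # first creation
    append(demo, "gdp_data.csv")                     # later addition
    lines = demo.read_text(encoding="utf-8").splitlines()

print(lines)  # ['.venv/', '__pycache__/', '*.pyc', 'gdp_data.csv']
```

Using overwrite a second time would discard the earlier entries, which is why the later step that adds gdp_data.csv uses >> rather than >.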

Student B

Sync the environment and generate the data

  1. Pull Student A’s changes and sync the virtual environment:
git pull
uv sync

Note: uv sync recreates the virtual environment from uv.lock, reproducing exactly the environment that Student A created. You don’t need to run uv init yourself.

  2. Run the data generation script, either directly with uv run or from VS Code by selecting the right Python interpreter (as in the first exercise session):
uv run python generate_data.py
  3. gdp_data.csv is now generated locally. Remember from the lecture: generated files should not be tracked. Add it to .gitignore and push:
echo "gdp_data.csv" >> .gitignore
git add .gitignore
git commit -m "Ignore generated CSV file"
git push

Student A

Pull the changes

git pull

Checkpoint: At this point, both students have generate_data.py and .gitignore. The CSV is generated locally but not tracked by Git. The .gitignore file is in sync.


Phase 3: Branch and Pull Request

Student B

Create a branch and extend the dataset

  1. Create a new branch:
git switch -c add-population-data
  2. Open generate_data.py and add a population_mn column:
data = pd.DataFrame({
    "country": ["Switzerland", "Germany", "France", "Italy", "Spain"],
    "gdp_bn_usd": [800, 4000, 2800, 2100, 1400],
    "population_mn": [8.7, 84.4, 68.2, 59.0, 47.4],
    "year": 2023
})
data.to_csv("gdp_data.csv", index=False)
  3. Commit and push the branch:
git add generate_data.py
git commit -m "Add population data to analysis"
git push -u origin add-population-data
  4. Open a Pull Request on GitHub:
    • GitHub will show a banner: “add-population-data had recent pushes”. Click Compare & pull request.
      [Screenshot: Compare and pull request]
    • Title: Add population data to GDP analysis
    • Write a short description of what you changed and why
    • Click Create pull request
      [Screenshot: Open pull request]

Student A

Review the pull request

  1. Open the Pull Request on GitHub and go to the Files changed tab. It gives you an overview of all changes made on the branch that is proposed for merging. Check that only generate_data.py was modified.

[Screenshot: Files changed tab]
  2. Leave a review comment requesting one more change:

“Great work! Before we merge, can you add a gdp_per_capita_usd column? It should be GDP in billions of USD divided by population in millions, scaled to USD per person.”

Student B

Address the review

Add the column to generate_data.py, before the data.to_csv(...) line so that the new column is written to the CSV:

data["gdp_per_capita_usd"] = data["gdp_bn_usd"] / data["population_mn"] * 1000

Commit and push. The PR updates automatically:

git add generate_data.py
git commit -m "Add GDP per capita column"
git push

Student A

Approve and merge

  1. Check the updated Files changed tab and verify the formula is correct
  2. Approve and click Merge pull request
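
One way to verify the formula during the review is to compare the shortcut (billions divided by millions, times 1000) with the fully written-out unit conversion. The sketch below does this in plain Python with the values from the exercise; the function names per_capita_shortcut and per_capita_full are hypothetical and not part of generate_data.py.

```python
import math

countries = ["Switzerland", "Germany", "France", "Italy", "Spain"]
gdp_bn_usd = [800, 4000, 2800, 2100, 1400]
population_mn = [8.7, 84.4, 68.2, 59.0, 47.4]


def per_capita_shortcut(gdp_bn, pop_mn):
    """Formula used in the exercise: billions / millions * 1000."""
    return gdp_bn / pop_mn * 1000


def per_capita_full(gdp_bn, pop_mn):
    """Same quantity with every unit written out: USD / persons."""
    return (gdp_bn * 1e9) / (pop_mn * 1e6)


for country, gdp, pop in zip(countries, gdp_bn_usd, population_mn):
    shortcut = per_capita_shortcut(gdp, pop)
    full = per_capita_full(gdp, pop)
    # Both routes must agree (up to floating-point rounding).
    assert math.isclose(shortcut, full)
    print(f"{country}: {shortcut:,.0f} USD per person")
```

The factor 1000 is simply 1e9 / 1e6. For Switzerland, for example, 800 / 8.7 * 1000 gives roughly 91,954 USD per person.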

Both students sync main:

git switch main
git pull

Checkpoint: Student B’s changes are in main. Both students are in sync. The full collaborative workflow is complete.


Ensuring that your feature/working branch is up-to-date with the main branch (self-study)

When working in a team, the main branch changes frequently. Before continuing your work (or opening a pull request), you should update your feature branch to align with main.

There are two safe and common ways to do this.

Note

This part is more advanced and not required for the course. It is, however, an important workflow to understand and master for any collaborative project.

Option 1: Merge main into your feature branch

Make sure you are on your feature / working branch

git switch <feature-branch>

# Download the newest changes from GitHub. This updates origin/main (your local record of the remote main branch) but does not change your own branch yet.
git fetch origin

# Merge main into your branch. Now your branch contains all changes from main
git merge origin/main

# If there are conflicts, Git will tell you. Fix them, then run:
git add .
git commit

Option 2: Rebase your branch onto main

This keeps the commit history linear and clean, but is slightly more advanced.

git switch <feature-branch>

# Download the newest changes from GitHub. This updates origin/main (your local record of the remote main branch) but does not change your own branch yet.
git fetch origin

# Rebase onto main: replay your branch's commits on top of the latest main
git rebase origin/main

# If there are conflicts, Git will tell you. Fix them, then run:
git add .
git rebase --continue

# repeat (Fix -> git add -> git rebase --continue) until Git says something like "Successfully rebased and updated <feature-branch>."
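
To see how the two options differ in the resulting history, here is a deliberately simplified toy model in Python. It represents a history as a plain list of commit labels, which is an assumption made for illustration only: real Git stores a graph of commits, and the functions merge and rebase below are hypothetical and do not call Git.

```python
def merge(feature, main):
    """Toy merge: keep the feature commits as they are, bring in the
    commits from main that are missing, and add one merge commit."""
    new_from_main = [c for c in main if c not in feature]
    return feature + new_from_main + ["merge origin/main"]


def rebase(feature, main):
    """Toy rebase: start from main and replay the feature-only commits
    on top. The prime (') marks that the replayed commits are new
    commits, not the original ones."""
    own_commits = [c for c in feature if c not in main]
    return main + [c + "'" for c in own_commits]


# The feature branch was created after commit B; main has since gained C.
main = ["A", "B", "C"]
feature = ["A", "B", "X", "Y"]

print(merge(feature, main))   # ['A', 'B', 'X', 'Y', 'C', 'merge origin/main']
print(rebase(feature, main))  # ['A', 'B', 'C', "X'", "Y'"]
```

In this sketch, merging adds an extra merge commit and keeps your original commits, while rebasing produces a linear history in which your commits are recreated on top of main.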

Summary

You have practised the core collaborative Git workflow:

Step                            Command(s)
Clone an existing repo          git clone
Ignore generated files          .gitignore
Sync a virtual environment      uv sync
Get a collaborator’s changes    git pull
Work on a branch                git switch -c, git push -u origin <branch>
Contribute via PR               GitHub web interface

Note

From now on, this is the workflow you will use for your group project: each team member works on their own branch and contributes changes via pull requests.