import pandas as pd
import matplotlib.pyplot as plt
Exercise 0: Warm-Up
Let’s practice the core concepts of classes with a simple example. A class is a blueprint for creating objects. Each object created from a class is called an instance and can hold data (stored as attributes) and perform actions (defined as methods). We will create a Country class that stores some basic facts about a country and can report them.
Create a class called Country. Write the __init__ method so that it takes three arguments — name, population (in millions), and gdp_per_capita (in USD) — and assigns them as instance attributes.
Add a method describe(self) that prints a short summary of the country, for example: “Switzerland: population 8.7 million, GDP per capita $87,000”.
Add a method total_gdp(self) that returns the total GDP (population times GDP per capita).
Solution: Country Class
class Country:
    def __init__(self, name, population, gdp_per_capita):
        self.name = name
        self.population = population
        self.gdp_per_capita = gdp_per_capita

    def describe(self):
        """Print a short summary of the country."""
        print(f"{self.name}: population {self.population} million, "
              f"GDP per capita ${self.gdp_per_capita:,}")

    def total_gdp(self):
        """Return total GDP in millions of USD."""
        return self.population * self.gdp_per_capita
1. Each argument is stored as an instance attribute using self. This makes the data accessible to all methods of the class and to anyone who holds a reference to the object.
2. A method is just a function defined inside the class that always takes self as its first argument. self refers to the specific instance the method is called on, so self.name gives the name of this country, not any other.
3. Methods can compute and return values just like regular functions, but they have access to all the instance’s attributes via self.
Test your code: create two instances for countries of your choice, call describe() on each, and print their total GDPs.
Test your code
# Create two instances
switzerland = Country(name='Switzerland', population=8.7, gdp_per_capita=87000)
germany = Country(name='Germany', population=84.0, gdp_per_capita=54000)

# Call describe() on each
switzerland.describe()
germany.describe()

# Compare total GDPs
print(f"\nTotal GDP — Switzerland: ${switzerland.total_gdp():,.0f} million")
print(f"Total GDP — Germany: ${germany.total_gdp():,.0f} million")
Switzerland: population 8.7 million, GDP per capita $87,000
Germany: population 84.0 million, GDP per capita $54,000
Total GDP — Switzerland: $756,900 million
Total GDP — Germany: $4,536,000 million
Exercise 1: The Solow Growth Model
Before we write any code, let’s review the economics. The Solow model describes how capital accumulation, population growth, and technological progress impact economic growth over time.
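A central equation in the Solow model is the law of motion for capital per worker:
\[k_{t+1} = \frac{sAk_t^{\alpha} + (1-\delta)k_t}{1+n}\]
Here \(k_t\) is capital per worker, \(s\) the savings rate, \(A\) productivity, \(\alpha\) the capital share, \(\delta\) the depreciation rate, and \(n\) the population growth rate.
The equation describes how the capital stock evolves from one period to the next. Output per worker is \(Ak_t^\alpha\), so savings generate \(sAk_t^\alpha\) in new investment. Tomorrow’s capital \(k_{t+1}\) is then what remains of today’s capital after depreciation, \((1-\delta)k_t\), plus this new investment, all divided by \((1+n)\) to adjust downward for population growth, since capital is measured per worker.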
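As a quick numerical illustration of the law of motion, here is a minimal sketch of a single step written as a plain function. The helper name next_capital and the parameter values are assumptions chosen for illustration; the function is not part of the exercise itself.

```python
def next_capital(k, n=0.05, s=0.25, delta=0.1, alpha=0.3, A=2.0):
    """Apply the law of motion once and return next period's capital per worker."""
    investment = s * A * k**alpha        # savings out of output per worker
    undepreciated = (1 - delta) * k      # what remains of today's capital
    return (investment + undepreciated) / (1 + n)


k0 = 1.0
k1 = next_capital(k0)
print(f"k_0 = {k0}, k_1 = {k1:.4f}")  # k_0 = 1.0, k_1 = 1.3333
```

With these values, investment is 0.5 and undepreciated capital is 0.9, so \(k_1 = 1.4 / 1.05 \approx 1.3333\).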
Over the course of this exercise you will build a Solow class that encapsulates this model economy. You will write methods to simulate the evolution of capital period by period, find the steady state numerically, and finally visualize how economies with different initial capital stocks converge to the same long-run equilibrium.
1.1 Creating the Solow Class
Create a class called Solow.
Write the __init__ method so that it takes the six parameters above as arguments with the following default values: n=0.05, s=0.25, delta=0.1, alpha=0.3, A=2.0, and k=1.0.
Assign each argument to an instance attribute (e.g. self.n = n). Note that unlike the other parameters, k is the state variable — it will change over time as we simulate the economy, while all other parameters remain fixed.
Solution: Create the Solow Class
class Solow:
    def __init__(self,
                 n=0.05,     # population growth rate
                 s=0.25,     # savings rate
                 delta=0.1,  # depreciation rate
                 alpha=0.3,  # capital share
                 A=2.0,      # productivity
                 k=1.0):     # current capital stock
        self.n, self.s, self.delta, self.alpha, self.A = n, s, delta, alpha, A
        self.k = k
Tuple unpacking
You can assign attributes one by one, which is more explicit and easier to read, or use tuple unpacking to assign all of them in a single line for more compact code — both produce exactly the same result.
# Without unpacking (recommended for clarity):
self.n = n
self.s = s
self.delta = delta
self.alpha = alpha
self.A = A
self.k = k

# With tuple unpacking (more compact, same result):
self.n, self.s, self.delta, self.alpha, self.A, self.k = n, s, delta, alpha, A, k
Test your code: it is a fundamental rule of programming to test your code frequently! Create two instances — one using all default values, and one with a custom parameter — and verify that the attributes were stored correctly.
Test your code
# 1. Create an instance with default values and check attribute A
economy = Solow()
print(f"Default A: {economy.A}")         # Expected: 2.0

# 2. Create an instance with a custom savings rate and verify it was stored
economy_high_s = Solow(s=0.4)
print(f"Custom s: {economy_high_s.s}")   # Expected: 0.4
print(f"Default n: {economy_high_s.n}")  # Expected: 0.05 (other parameters unchanged)
Default A: 2.0
Custom s: 0.4
Default n: 0.05
1.2 Simulating One Period
The law of motion you saw above defines exactly how the economy moves from one period to the next — that is what the update() method will implement. Each time it is called, it takes the current capital stock self.k, applies the formula, and updates self.k to the new value.
Create a method update(self) that calculates next period’s capital stock using the law of motion and updates self.k to this new value.
Solution: add update() method
class Solow:
    def __init__(self,
                 n=0.05,     # population growth rate
                 s=0.25,     # savings rate
                 delta=0.1,  # depreciation rate
                 alpha=0.3,  # capital share
                 A=2.0,      # productivity
                 k=1.0):     # current capital stock
        self.n, self.s, self.delta, self.alpha, self.A = n, s, delta, alpha, A
        self.k = k

    def update(self):
        """Update the current state (i.e., the capital stock) to the next period."""
        # Unpack parameters to simplify notation
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        # Calculate next period's capital stock
        k_next = (s * A * self.k**alpha + (1 - delta) * self.k) / (1 + n)
        # Update the capital stock
        self.k = k_next
1. When a method uses several parameters in a formula, repeatedly writing self. can make the expression cluttered and hard to read. Unpacking the attributes into local variables at the top of the method (n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A) keeps the formula clean and close to its mathematical notation.
Test your code: create a new instance of the class, print the initial capital stock \(k_0\), call the update() method, and print the new capital stock \(k_{1}\).
Test your code
# 1. Create a new instance
economy = Solow()

# 2. Print initial capital
print(f"Initial capital: {economy.k}")

# 3. Call the update method
economy.update()

# 4. Print new capital
print(f"Capital after one period: {economy.k:.4f}")  # Note: The expected output should be roughly 1.3333
Initial capital: 1.0
Capital after one period: 1.3333
1.3 Finding the Steady State
A key concept in the Solow model is the steady state, denoted as \(k^{*}\). This is the point of equilibrium where the capital stock per worker no longer changes over time, meaning \(k_{t+1}=k_t\). Once an economy reaches its steady state, it stays there forever (unless a parameter changes).
Our goal is to find the steady state capital stock \(k^{*}\) numerically. The idea behind the numerical approach is simple: we repeatedly apply the law of motion, plugging today’s capital in to get tomorrow’s, then plugging tomorrow’s back in to get the day after, and so on. Because the model converges, the change between consecutive periods gets smaller and smaller over time, and once that difference falls below some tolerance threshold we know the economy has effectively reached its steady state.
Add a method find_steady_state(self, tolerance=1e-5) that finds the steady state numerically. As a starting point, use the economy’s current capital stock self.k.
Start by setting k_current = self.k and initialising difference to a large number
Use a while loop that runs as long as difference > tolerance:
Calculate k_next using the law of motion formula
Update difference as the absolute difference between k_next and k_current (use abs())
Set k_current = k_next to move one period forward
Return k_current once the loop ends
Solution: add find_steady_state() method
class Solow:
    def __init__(self,
                 n=0.05,     # population growth rate
                 s=0.25,     # savings rate
                 delta=0.1,  # depreciation rate
                 alpha=0.3,  # capital share
                 A=2.0,      # productivity
                 k=1.0):     # current capital stock
        self.n, self.s, self.delta, self.alpha, self.A = n, s, delta, alpha, A
        self.k = k

    def update(self):
        """Update the current state (i.e., the capital stock) to the next period."""
        # Unpack parameters to simplify notation
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        # Calculate next period's capital stock
        k_next = (s * A * self.k**alpha + (1 - delta) * self.k) / (1 + n)
        # Update the capital stock
        self.k = k_next

    def find_steady_state(self, tolerance=1e-5):
        """Find the steady state numerically."""
        # 1. Unpack parameters
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        # 2. Set starting value and initial difference
        k_current = self.k
        difference = 1000
        # 3. Simulate forward
        while difference > tolerance:
            # Calculate tomorrow's capital
            k_next = (s * A * k_current**alpha + (1 - delta) * k_current) / (1 + n)
            # Update the difference
            difference = abs(k_next - k_current)
            # Move the simulation forward one step
            k_current = k_next
        return k_current
1. Start the loop from some initial value of \(k\). The initial value doesn’t matter as long as it is reasonable (for example, strictly positive), so starting with the economy’s current capital makes sense.
2. We need difference > tolerance to be True before the first iteration so the loop actually starts. Any value larger than tolerance would work; 1000 is simply a conveniently large number that exceeds any sensible tolerance.
3. Apply the law of motion to get next period’s capital.
4. How much did capital change? Once this is tiny (below tolerance), we have converged.
5. Move one period forward (i.e. the k_next we just calculated is now the k_current) and repeat.
6. The loop has ended, meaning we have converged. Return the steady state value.
Now that find_steady_state() works, let’s store the result properly as an attribute rather than just returning it. Add self.steady_state = None to __init__ and update the method to store the result as self.steady_state instead of returning it. This way the object “knows” its own steady state after the method is called.
Solution: add self.steady_state and adjust find_steady_state()
class Solow:
    def __init__(self,
                 n=0.05,     # population growth rate
                 s=0.25,     # savings rate
                 delta=0.1,  # depreciation rate
                 alpha=0.3,  # capital share
                 A=2.0,      # productivity
                 k=1.0):     # current capital stock
        self.n, self.s, self.delta, self.alpha, self.A = n, s, delta, alpha, A
        self.k = k
        self.steady_state = None

    def update(self):
        """Update the current state (i.e., the capital stock) to the next period."""
        # Unpack parameters to simplify notation
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        # Calculate next period's capital stock
        k_next = (s * A * self.k**alpha + (1 - delta) * self.k) / (1 + n)
        # Update the capital stock
        self.k = k_next

    def find_steady_state(self, tolerance=1e-5):
        """Find the steady state numerically and store it in self.steady_state."""
        # 1. Unpack parameters
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        # 2. Set starting value and initial difference
        k_current = self.k
        difference = 1000
        # 3. Simulate forward
        while difference > tolerance:
            # Calculate tomorrow's capital
            k_next = (s * A * k_current**alpha + (1 - delta) * k_current) / (1 + n)
            # Update the difference
            difference = abs(k_next - k_current)
            # Move the simulation forward one step
            k_current = k_next
        self.steady_state = k_current
1. The steady state is unknown until we compute it, so we initialise it to None, a clear signal that find_steady_state() has not been called yet.
2. Instead of returning the result, we store it as an attribute. The object now “knows” its own steady state and it can be accessed at any point via economy.steady_state.
Test your code: create a new instance of the economy, call find_steady_state() and print the steady_state attribute.
Test your code
# 1. Create a new instance
economy = Solow(k=2.0)

# 2. Call find_steady_state() and print the steady_state attribute
economy.find_steady_state()
print(f"Numerical Steady State: {economy.steady_state:.6f}")
Numerical Steady State: 5.584222
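As a cross-check that goes slightly beyond the exercise, the steady state of this law of motion can also be derived by hand. Setting \(k_{t+1} = k_t = k^{*}\) gives \((n+\delta)k^{*} = sA(k^{*})^{\alpha}\), and therefore \(k^{*} = \left(\frac{sA}{n+\delta}\right)^{1/(1-\alpha)}\). The sketch below evaluates this expression with the default parameters; analytical_steady_state is a hypothetical helper name, not part of the Solow class.

```python
def analytical_steady_state(n=0.05, s=0.25, delta=0.1, alpha=0.3, A=2.0):
    """Closed-form steady state k* = (s*A / (n + delta)) ** (1 / (1 - alpha))."""
    return (s * A / (n + delta)) ** (1 / (1 - alpha))


k_star = analytical_steady_state()
print(f"Analytical steady state: {k_star:.4f}")  # approximately 5.584

# Fixed-point check: applying the law of motion to k* should return k*
k_next = (0.25 * 2.0 * k_star**0.3 + (1 - 0.1) * k_star) / (1 + 0.05)
print(f"Change after one more period: {abs(k_next - k_star):.2e}")
```

The numerical result lies slightly below the closed-form value because the loop stops as soon as the period-to-period change drops below the tolerance, which happens shortly before the exact steady state is reached.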
1.4 Simulating the Economy
We can now simulate the full path of capital accumulation over time. Rather than re-implementing the law of motion, generate_sequence will simply call self.update() repeatedly — this is one of the key advantages of organizing code into a class: methods can build on each other.
Add a generate_sequence(self, t) method that returns a list path with the values of \(k\) over \(t\) periods.
Create an empty list path and append the initial capital stock as the first element
Use a for loop to run t times, calling self.update() and appending self.k at each step
Return path once the loop ends
Hint: since you don’t actually use the loop counter, use _ as the loop variable (for _ in range(t):). This is a Python convention that signals the variable is intentionally unused.
Solution: add generate_sequence method
class Solow:
    def __init__(self,
                 n=0.05,     # population growth rate
                 s=0.25,     # savings rate
                 delta=0.1,  # depreciation rate
                 alpha=0.3,  # capital share
                 A=2.0,      # productivity
                 k=1.0):     # current capital stock
        self.n, self.s, self.delta, self.alpha, self.A = n, s, delta, alpha, A
        self.k = k
        self.steady_state = None

    def update(self):
        """Update the current state (i.e., the capital stock) to the next period."""
        # Unpack parameters to simplify notation
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        # Calculate next period's capital stock
        k_next = (s * A * self.k**alpha + (1 - delta) * self.k) / (1 + n)
        # Update the capital stock
        self.k = k_next

    def find_steady_state(self, tolerance=1e-5):
        """Find the steady state numerically and store it in self.steady_state."""
        # 1. Unpack parameters
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        # 2. Set starting value and initial difference
        k_current = self.k
        difference = 1000
        # 3. Simulate forward
        while difference > tolerance:
            # Calculate tomorrow's capital
            k_next = (s * A * k_current**alpha + (1 - delta) * k_current) / (1 + n)
            # Update the difference
            difference = abs(k_next - k_current)
            # Move the simulation forward one step
            k_current = k_next
        self.steady_state = k_current

    def generate_sequence(self, t):
        """Generate and return the path k_0, ..., k_t (t + 1 values)."""
        path = []
        path.append(self.k)
        for _ in range(t):
            self.update()
            path.append(self.k)
        return path
1. An empty list path to store the capital stock at each period.
2. Record the initial capital stock \(k_0\) before the loop starts, since the path should include the starting point.
3. _ is used as the loop variable instead of i because the loop counter is never needed inside the loop. This is a Python convention that signals the variable is intentionally unused. The loop runs exactly t times.
4. Call self.update() to apply the law of motion and move one period forward. There is no need to re-implement the formula here.
5. Record the updated capital stock after each step. After t iterations, path contains \(k_0, k_1, \ldots, k_t\).
6. Return the complete path.
Test your code: create a new instance, print the initial capital stock, generate a sequence of length 5 and print it. Then print the capital stock again — has it changed? Why?
Test your code
# 1. Create a new instance
economy = Solow()

# 2. Print initial capital stock
print(f"Initial capital: {economy.k}")

# 3. Generate a sequence of length 5
path = economy.generate_sequence(5)
print(path)

# 4. Print capital stock after generating the sequence
print(f"Capital after generating sequence: {economy.k}")
Initial capital: 1.0
[1.0, 1.3333333333333333, 1.6619706464615955, 1.9791307461805898, 2.2808150655927233, 2.5648128561783388]
Capital after generating sequence: 2.5648128561783388
Yes, economy.k has changed — it now holds \(k_5\) rather than \(k_0\). This is because generate_sequence calls self.update() internally, which permanently modifies self.k at each step. This is an important thing to be aware of: if you need to generate multiple sequences from the same starting point, you should create a new instance each time rather than reusing the same object.
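One alternative to creating a new instance each time, sketched here as an assumption rather than as part of the exercise, is to run the simulation on a copy of the object using copy.copy from the standard library. The class below is a pared-down stand-in for the Solow class from the solution, with find_steady_state omitted for brevity.

```python
import copy


class Solow:
    """Minimal stand-in for the Solow class built above (illustration only)."""

    def __init__(self, n=0.05, s=0.25, delta=0.1, alpha=0.3, A=2.0, k=1.0):
        self.n, self.s, self.delta, self.alpha, self.A = n, s, delta, alpha, A
        self.k = k

    def update(self):
        n, s, delta, alpha, A = self.n, self.s, self.delta, self.alpha, self.A
        self.k = (s * A * self.k**alpha + (1 - delta) * self.k) / (1 + n)

    def generate_sequence(self, t):
        path = [self.k]
        for _ in range(t):
            self.update()
            path.append(self.k)
        return path


economy = Solow()

# Simulate on a shallow copy so that the original object keeps its k_0.
# A shallow copy is enough here because every attribute is a plain number.
path = copy.copy(economy).generate_sequence(5)

print(f"Last simulated value: {path[-1]:.4f}")
print(f"economy.k is still:   {economy.k}")
```

The copy carries the same parameters and starting capital, so repeated calls of this form always start from the same \(k_0\).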
1.5 Visualization
Now that our model can simulate a full path of capital accumulation, we can visualize how the economy evolves over time. A key prediction of the Solow model is convergence: regardless of where an economy starts, it will always move towards the same steady state \(k^{*}\).
Create two instances of the Solow economy with different initial capital stocks, one below (k=1.0) and one above (k=8.0) the steady state.
Generate a sequence of length 60 for each instance
Find the steady state for one of them (both have the same since they have the same parameters) and save it as kstar
Create a figure that plots the two sequences, as well as a horizontal line that indicates the steady state capital stock \(k^*\). Your plot should have a legend indicating the starting capital stock of each economy, and labeled axes.
Solution: Visualization
# 1. Create two instances with different initial capital stocks
s1 = Solow(k=1.0)
s2 = Solow(k=8.0)

# 2. Generate sequences
path1 = s1.generate_sequence(60)
path2 = s2.generate_sequence(60)

# 3. Find the steady state (same for both since parameters are identical)
s1.find_steady_state()
kstar = s1.steady_state

# 4. Plot
plt.figure(figsize=(9, 6))
plt.plot(path1, 'o-', lw=2, alpha=0.6, label='Initial capital $k_0 = 1.0$')
plt.plot(path2, 'o-', lw=2, alpha=0.6, label='Initial capital $k_0 = 8.0$')
plt.axhline(kstar, color='black', linestyle='--', label=f'Steady state $k^* = {kstar:.2f}$')
plt.xlabel('$t$', fontsize=14)
plt.ylabel('$k_t$', fontsize=14)
plt.title('Solow Model: Convergence to Steady State', fontsize=16)
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
1. Plot the two simulated paths: 'o-' draws both dots and a connecting line, lw=2 sets the line width, alpha=0.6 makes it slightly transparent, and label sets the legend text.
2. plt.axhline() draws a horizontal line across the full plot at the height kstar. linestyle='--' makes it dashed, and :.2f in the label formats the number as a float rounded to 2 decimal places; for example, 2.995732 would display as 3.00.
3. Displays the legend using the label strings we defined in each plt.plot() call and plt.axhline().
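The format specifiers used in the labels can be tried out on their own. The short sketch below hard-codes two values taken from earlier outputs (the numerical steady state and Switzerland's total GDP) purely for illustration.

```python
kstar = 5.584222       # steady state from 1.3, hard-coded here for illustration
total_gdp = 756900.0   # Switzerland's total GDP from the warm-up, in millions of USD

# :.2f rounds to two decimal places
label = f"Steady state $k^* = {kstar:.2f}$"
print(label)                          # Steady state $k^* = 5.58$

# :,.0f adds thousands separators and drops the decimals
print(f"${total_gdp:,.0f} million")   # $756,900 million
```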
Exercise 2: Growth Accounting Analyzer
In Exercise 1 you built a class to simulate a theoretical model economy. In this exercise you will use a class to organize a data analysis workflow.
The application is growth accounting — the same framework you worked through in Week 5. The goal is to decompose GDP growth into the contributions of capital, labor, and Total Factor Productivity (TFP) using the Solow residual method, now applied to the full cross-country dataset.
The section below contains a complete script that runs this analysis step by step: it loads the data, computes growth rates, estimates the capital share, calculates TFP growth, builds a summary table, and produces a chart. Read through it carefully before you start coding.
Your task is then to reorganize this script into a class called GrowthAccountingAnalyzer. Instead of a linear sequence of operations on a global DataFrame, the class will bundle the data and each step of the pipeline together into a single object.
Growth Accounting - Script
In Week 5 you applied the Solow residual method to two countries — Switzerland and the United States. Here we run the same analysis on the full dataset. Before writing the code, recall the key idea: assuming a Cobb-Douglas production function
\[Y_t = A_t K_t^{\alpha} L_t^{1-\alpha}\]
we can rearrange to isolate TFP growth \(g_A\) as the part of GDP growth not explained by capital and labor:
\[g_A = g_Y - [\alpha g_K + (1-\alpha) g_L]\]
This is called the Solow residual. The four variables we need from the dataset are:
| Variable | Description |
|---|---|
| rgdpna | Real GDP at constant 2021 national prices |
| rnna | Capital stock at constant 2021 national prices |
| emp | Number of persons employed |
| labsh | Labour share of GDP |
Step 1 — Load and sort the data
Load the data data/pwt_data_ex7.csv. Growth rates are computed as changes between consecutive rows, so the data must be in chronological order within each country before we do anything else.
df = pd.read_csv("data/pwt_data_ex7.csv")
df = df.sort_values(['Country', 'year'])
df.head()
Step 2 — Compute annual growth rates
.pct_change() computes the percentage change relative to the previous row. We call it inside .groupby('Country') so that the calculation restarts at the beginning of each country’s time series — without grouping, the first observation of every country after the first would be compared against the last observation of the previous country, which makes no sense.
df['g_y'] = df.groupby('Country')['rgdpna'].pct_change()  # GDP growth
df['g_k'] = df.groupby('Country')['rnna'].pct_change()    # Capital growth
df['g_l'] = df.groupby('Country')['emp'].pct_change()     # Labour growth
df[['Country', 'g_y', 'g_k', 'g_l']].head()
Notice that the first row of each country is NaN: there is no previous year to compare against, so the growth rate is undefined. This is expected and will be handled automatically when we take country averages later.
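To see the effect of grouping in isolation, here is a small sketch on a made-up two-country DataFrame (the values and the name toy are invented for illustration and have nothing to do with the real dataset). It compares grouped and ungrouped .pct_change() and shows that the country mean ignores the NaN in the first row, since pandas' .mean() skips missing values by default.

```python
import pandas as pd

# Made-up data: two countries, two years each, already sorted
toy = pd.DataFrame({
    'Country': ['A', 'A', 'B', 'B'],
    'year':    [2000, 2001, 2000, 2001],
    'rgdpna':  [100.0, 110.0, 200.0, 220.0],
})

# Grouped: the calculation restarts for each country
toy['g_grouped'] = toy.groupby('Country')['rgdpna'].pct_change()

# Ungrouped: B's first year is wrongly compared with A's last year
toy['g_ungrouped'] = toy['rgdpna'].pct_change()
print(toy)

# Country averages skip the NaN in each country's first row
country_means = toy.groupby('Country')['g_grouped'].mean()
print(country_means)
```

In the grouped column both countries show a growth rate of 10% in 2001 and NaN in 2000, whereas the ungrouped column reports a spurious jump of roughly 82% for country B in 2000.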
Step 3 — Estimate the capital share \(\alpha\)
We treat capital share as constant within a country and estimate it from the data. Since labsh (the labour share) is already in the dataset, the capital share for any given year is simply \(1 - \texttt{labsh}\). We then average this across all years for each country to get a single representative value of \(\alpha\).
The key tool here is .transform('mean'). Unlike .agg('mean'), which would collapse the DataFrame to one row per country, .transform('mean') broadcasts the country-level mean back to every row of that country. This keeps the DataFrame at its original length so we can use alpha row-by-row in the formula below.
# Annual capital share (one value per row)
df['one_minus_labsh'] = 1 - df['labsh']

# Country-level mean, broadcast back to every row of that country
df['alpha'] = df.groupby('Country')['one_minus_labsh'].transform('mean')
df[['Country', 'g_y', 'g_k', 'g_l', 'alpha']].head()
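The difference between .agg('mean') and .transform('mean') is easiest to see on a tiny example. The sketch below uses an invented four-row DataFrame (the labour shares are made up) and compares the length of the two results.

```python
import pandas as pd

# Made-up labour shares for two countries
toy = pd.DataFrame({
    'Country': ['A', 'A', 'B', 'B'],
    'labsh':   [0.60, 0.70, 0.50, 0.50],
})
toy['one_minus_labsh'] = 1 - toy['labsh']

# .agg collapses to one value per country
per_country = toy.groupby('Country')['one_minus_labsh'].agg('mean')

# .transform returns one value per original row
per_row = toy.groupby('Country')['one_minus_labsh'].transform('mean')

print(len(per_country), len(per_row))  # 2 4
print(per_row)
```

Because per_row has the same length and index as toy, it can be assigned directly as a new column, which is what the alpha column relies on.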
Step 4 — Compute TFP growth (the Solow residual)
With growth rates and \(\alpha\) in hand, the Solow residual formula translates directly into a single line:
df['g_tfp'] = df['g_y'] - (df['alpha'] * df['g_k'] + (1 - df['alpha']) * df['g_l'])
df[['Country', 'g_y', 'g_k', 'g_l', 'alpha', 'g_tfp']].head()
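To make the Solow residual concrete, here is a worked example with made-up numbers for a single country-year. The growth rates and the capital share are assumptions chosen only so the arithmetic is easy to follow.

```python
# Made-up values for one country-year
g_y, g_k, g_l = 0.03, 0.04, 0.01   # growth of GDP, capital, labour
alpha = 0.3                         # capital share

contrib_k = alpha * g_k             # capital contribution: 0.012
contrib_l = (1 - alpha) * g_l       # labour contribution:  0.007
g_tfp = g_y - (contrib_k + contrib_l)

print(f"{contrib_k:.3f} {contrib_l:.3f} {g_tfp:.3f}")  # 0.012 0.007 0.011
```

By construction the three pieces add up to GDP growth: 0.012 + 0.007 + 0.011 = 0.03.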
Step 5 — Build the country-level summary table
Year-by-year figures are noisy. We collapse to country-level averages and compute the weighted contributions of capital and labour — how much of GDP growth each factor accounts for on average.
# Average all growth variables across years, for each country
summary = df.groupby('Country')[['g_y', 'g_k', 'g_l', 'alpha', 'g_tfp']].mean()

# Weighted contributions: raw growth rates scaled by their factor shares
summary['contrib_k'] = summary['alpha'] * summary['g_k']        # alpha * g_K
summary['contrib_l'] = (1 - summary['alpha']) * summary['g_l']  # (1-alpha) * g_L

# Reset index so 'Country' becomes a regular column
summary = summary.reset_index()
summary.head()
Step 6 — Visualize the decomposition
We wrap the chart in a function so we can reuse it for any selection of countries without repeating code. The chart is a stacked bar chart: each bar’s total height equals average GDP growth, split into the three components.
def plot_growth_decomposition(countries, summary_df):
    """Plot the growth decomposition as a stacked bar chart."""
    # Filter the summary table to the requested countries
    data = summary_df[summary_df['Country'].isin(countries)]

    # Select the three components and rename them for the legend
    plot_data = data.set_index('Country')[['contrib_k', 'contrib_l', 'g_tfp']]
    plot_data.columns = ['Capital contribution', 'Labour contribution', 'TFP growth']

    # Draw the stacked bar chart
    plot_data.plot(kind='bar', stacked=True, figsize=(8, 5))
    plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
    plt.ylabel('Average annual growth rate')
    plt.title('Growth Accounting Decomposition')
    plt.legend()
    plt.xticks(rotation=0)  # keep country names horizontal
    plt.show()
1. We set Country as the index so pandas uses it as the x-axis labels, then select and rename the three component columns. These names will appear directly in the legend.
2. A dashed line at zero marks the baseline, so bars for negative components (for example, negative TFP growth) are easy to spot below it.
# Test the function
plot_growth_decomposition(countries=['Switzerland', 'United States'], summary_df=summary)
You have now completed the full pipeline as a plain script: load → compute growth rates → estimate \(\alpha\) → Solow residual → summarize → plot. In the next section, you will reorganize exactly this logic into a class, so that the data, the intermediate results, and the methods that produce them all live together in one object.
Growth Accounting - Class
2.1 Create a Growth Accounting Class
This is the foundation of your class. Looking at the script, the first thing that happens is loading and sorting the data — that is what this step translates into class form.
Create a class called GrowthAccountingAnalyzer and write the __init__ method. It should take no arguments and initialize a single attribute self.data = None.
Add a method read_data(self, filepath) that reads a CSV file from the given filepath, sorts it by country and year, and stores the result in self.data.
Solution 2.1: Create a growth accounting class
class GrowthAccountingAnalyzer:
    def __init__(self):
        self.data = None

    def read_data(self, filepath):
        """Read data from a CSV file and store it as an attribute."""
        df = pd.read_csv(filepath)
        self.data = df.sort_values(['Country', 'year'])
1. The data has not been loaded yet, so we initialise data to None.
2. Read the CSV from the provided filepath.
3. Sort by country and year (important for the growth rate calculations later) and store the result as an attribute.
Test your code:
Create an instance and print analyzer.data — what do you expect to see, and why?
Call read_data() and print the first few rows of analyzer.data to confirm the data loaded correctly.
Test your Code
# 1. Create an instance
analyzer = GrowthAccountingAnalyzer()
print(analyzer.data)  # Expected: None, because the data hasn't been loaded yet
None
# 2. Load the data and check again
analyzer.read_data('data/pwt_data_ex7.csv')
analyzer.data.head()
2.2 Prepare the Data
Add a method prepare_data(self) that computes all the variables needed for growth accounting and adds them directly to self.data. Looking at the script, this method covers steps 2 through 4. Specifically it should:
Calculate the growth rates of GDP, capital and labor (g_y, g_k, g_l) using .pct_change(), grouped by country.
Compute the capital share alpha as the country-level mean of 1 - labsh. Use groupby and transform.
Calculate TFP growth g_tfp using the Solow residual formula: \(g_A = g_Y - [\alpha g_K + (1-\alpha) g_L]\)
Solution 2.2: Add method prepare_data(self)
class GrowthAccountingAnalyzer:
    def __init__(self):
        self.data = None

    def read_data(self, filepath):
        """Read data from a CSV file and store it as an attribute."""
        df = pd.read_csv(filepath)
        self.data = df.sort_values(['Country', 'year'])

    def prepare_data(self):
        """Compute growth rates, capital share and TFP growth."""
        # 1. Calculate growth rates
        self.data['g_y'] = self.data.groupby('Country')['rgdpna'].pct_change()
        self.data['g_k'] = self.data.groupby('Country')['rnna'].pct_change()
        self.data['g_l'] = self.data.groupby('Country')['emp'].pct_change()

        # 2. Estimate capital share from the data
        self.data['one_minus_labsh'] = 1 - self.data['labsh']
        self.data['alpha'] = self.data.groupby('Country')['one_minus_labsh'].transform('mean')

        # 3. Calculate TFP growth
        self.data['g_tfp'] = self.data['g_y'] - (self.data['alpha'] * self.data['g_k']
                                                 + (1 - self.data['alpha']) * self.data['g_l'])
Test your code: run the full pipeline up to this point and inspect analyzer.data.
analyzer = GrowthAccountingAnalyzer()
analyzer.read_data('data/pwt_data_ex7.csv')
analyzer.prepare_data()
analyzer.data.head()
2.3 Create the Country Summary Table
The data in self.data has one row per country per year. To compare countries — how much of their growth came from capital accumulation, labor, or TFP — we aggregate it into a single row per country.
Go back to __init__ and add a second attribute self.summary = None. Just like self.data, it should exist from the moment the object is created — even before the method that populates it is called.
Add a method create_country_summary(self) that aggregates the prepared data into a country-level summary table. It should:
Compute the country-level mean of g_y, g_k, g_l, alpha and g_tfp
Add columns contrib_k and contrib_l for the weighted contributions of capital and labor to GDP growth: \(\alpha g_K\) and \((1-\alpha) g_L\) respectively
Store the result as self.summary
Solution 2.3: Create the Country Summary Table
class GrowthAccountingAnalyzer:
    def __init__(self):
        self.data = None
        self.summary = None  # (1)

    def read_data(self, filepath):
        """Read data from a CSV file and store it as an attribute."""
        df = pd.read_csv(filepath)
        self.data = df.sort_values(['Country', 'year'])

    def prepare_data(self):
        """Compute growth rates, capital share and TFP growth."""
        # 1. Calculate growth rates
        self.data['g_y'] = self.data.groupby('Country')['rgdpna'].pct_change()
        self.data['g_k'] = self.data.groupby('Country')['rnna'].pct_change()
        self.data['g_l'] = self.data.groupby('Country')['emp'].pct_change()
        # 2. Estimate capital share from the data
        self.data['one_minus_labsh'] = 1 - self.data['labsh']
        self.data['alpha'] = self.data.groupby('Country')['one_minus_labsh'].transform('mean')
        # 3. Calculate TFP growth
        self.data['g_tfp'] = self.data['g_y'] - (
            self.data['alpha'] * self.data['g_k']
            + (1 - self.data['alpha']) * self.data['g_l']
        )

    def create_country_summary(self):
        """Aggregate the data into a country-level summary table."""
        summary = self.data.groupby('Country')[['g_y', 'g_k', 'g_l', 'alpha', 'g_tfp']].mean()  # (2)
        summary['contrib_k'] = summary['alpha'] * summary['g_k']  # (3)
        summary['contrib_l'] = (1 - summary['alpha']) * summary['g_l']  # (3)
        self.summary = summary.reset_index()  # (4)
1. Initialise summary to None alongside data, so the attribute exists from the start, even before create_country_summary() is called.
2. Aggregate to country level by taking the mean of all growth variables across years.
3. Compute the weighted contributions of capital and labor: alpha*g_K and (1-alpha)*g_L.
4. Reset the index so Country becomes a regular column, then store the result as an attribute.
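A quick sanity check on the summary table is that the three components add up to average GDP growth. The sketch below uses a small made-up table that is already "prepared" (the countries and values are hypothetical) and repeats the aggregation steps. The identity holds exactly here because alpha is constant within each country and no rows are missing; with real data, gaps in individual columns can open a small wedge.

```python
import pandas as pd

# Hypothetical prepared data: growth rates and a constant alpha per country.
prepared = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "g_y":   [0.03, 0.01, 0.02, 0.04],
    "g_k":   [0.04, 0.02, 0.01, 0.03],
    "g_l":   [0.01, 0.00, 0.00, 0.02],
    "alpha": [0.4, 0.4, 0.3, 0.3],
})
prepared["g_tfp"] = prepared["g_y"] - (
    prepared["alpha"] * prepared["g_k"]
    + (1 - prepared["alpha"]) * prepared["g_l"]
)

# Same aggregation as in create_country_summary().
summary = prepared.groupby("Country")[["g_y", "g_k", "g_l", "alpha", "g_tfp"]].mean()
summary["contrib_k"] = summary["alpha"] * summary["g_k"]
summary["contrib_l"] = (1 - summary["alpha"]) * summary["g_l"]
summary = summary.reset_index()

# Difference between average GDP growth and the sum of the three components.
gap = summary["g_y"] - (
    summary["contrib_k"] + summary["contrib_l"] + summary["g_tfp"]
)
print(summary)
print("Largest gap:", gap.abs().max())
```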
Test your code: run the full pipeline and display analyzer.summary.
2.4 Visualize the Growth Decomposition
The last step is to visualize the results. In the script, you wrapped this chart in a standalone function plot_growth_decomposition that needed summary_df passed in as an explicit argument. As a method, it can access self.summary directly — no argument needed beyond the list of countries to plot.
Add a method plot_growth_decomposition(self, countries) that takes a list of country names and plots the growth accounting decomposition as a stacked bar chart. The chart should show:
A stacked bar for each country with the contributions of capital, labor and TFP
A dashed horizontal line at zero
Labeled axes and a title
A legend identifying each component
Solution 2.4: Visualize the Growth Decomposition
class GrowthAccountingAnalyzer:
    def __init__(self):
        self.data = None
        self.summary = None

    def read_data(self, filepath):
        """Read data from a CSV file and store it as an attribute."""
        df = pd.read_csv(filepath)
        self.data = df.sort_values(['Country', 'year'])

    def prepare_data(self):
        """Compute growth rates, capital share and TFP growth."""
        # 1. Calculate growth rates
        self.data['g_y'] = self.data.groupby('Country')['rgdpna'].pct_change()
        self.data['g_k'] = self.data.groupby('Country')['rnna'].pct_change()
        self.data['g_l'] = self.data.groupby('Country')['emp'].pct_change()
        # 2. Estimate capital share from the data
        self.data['one_minus_labsh'] = 1 - self.data['labsh']
        self.data['alpha'] = self.data.groupby('Country')['one_minus_labsh'].transform('mean')
        # 3. Calculate TFP growth
        self.data['g_tfp'] = self.data['g_y'] - (
            self.data['alpha'] * self.data['g_k']
            + (1 - self.data['alpha']) * self.data['g_l']
        )

    def create_country_summary(self):
        """Aggregate the data into a country-level summary table."""
        summary = self.data.groupby('Country')[['g_y', 'g_k', 'g_l', 'alpha', 'g_tfp']].mean()
        summary['contrib_k'] = summary['alpha'] * summary['g_k']
        summary['contrib_l'] = (1 - summary['alpha']) * summary['g_l']
        self.summary = summary.reset_index()

    def plot_growth_decomposition(self, countries):
        """Plot the growth decomposition as a stacked bar chart."""
        # Filter the summary table to the requested countries
        data = self.summary[self.summary['Country'].isin(countries)]
        # Select and rename columns for the legend labels
        plot_data = data.set_index('Country')[['contrib_k', 'contrib_l', 'g_tfp']]
        plot_data.columns = ['Capital contribution', 'Labour contribution', 'TFP growth']
        # stacked=True handles negative values correctly
        plot_data.plot(kind='bar', stacked=True, figsize=(8, 5))
        plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
        plt.ylabel('Average annual growth rate')
        plt.title('Growth Accounting Decomposition')
        plt.legend()
        plt.xticks(rotation=0)  # keep country names horizontal
        plt.show()
Test your code: run the full pipeline and plot the decomposition for Switzerland, Germany and France. Look at the chart: which country has the highest TFP contribution?
2.5 Functionality and Robustness
You now have a working class that can load data, run the full growth accounting pipeline, and visualize the results for any selection of countries. Before we call it done, it is worth stepping back and thinking critically about the tool you have built.
Take the plot_growth_decomposition method as an example and think about two things:
1. Functionality — could you extend the method to make it more useful? Think about what a researcher using this tool might want to do that it currently cannot. Are there arguments you could add to give users more control over what the chart shows?
2. Robustness — what could go wrong when someone uses the method? Think about two kinds of problems:
Valid but tricky inputs: a user provides legitimate inputs that nonetheless produce a misleading or broken chart. For example, what happens if a user requests a country that is in the dataset but has a lot of missing values? Or requests only one country?
Invalid inputs: a user makes a mistake — for example, a typo in a country name, or passes a single string instead of a list. Does the method fail silently, produce a confusing error, or give a clear and helpful message?
Discuss with your neighbor: what would you change or add, and how would you implement it?
Ideas 2.5: Functionality and Robustness
Functionality
At the moment the chart shows the growth decomposition averaged over the full sample period for each country. A user might instead be interested in a specific year. Adding an optional year argument that filters the data to that year before plotting would make the method much more flexible.
Add a sort_by argument that lets users sort the bars by a chosen component (e.g. by TFP growth), making it easier to compare across many countries.
Add a top_n argument that automatically selects the n countries with the highest or lowest TFP growth, so users don’t have to specify a long list manually.
Add a save_path argument that saves the figure to a file if provided, rather than just displaying it.
…
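As one possible way to implement the sort_by and top_n ideas, the selection logic can live in a small helper that returns the table to plot. Everything in this sketch is an assumption for illustration: the helper name select_plot_data, its arguments, the choice to rank by g_tfp when only top_n is given, and the made-up summary values.

```python
import pandas as pd

def select_plot_data(summary, countries=None, sort_by=None, top_n=None):
    """Return the rows and columns to plot (hypothetical helper)."""
    data = summary
    if countries is not None:
        # Keep only the requested countries.
        data = data[data["Country"].isin(countries)]
    if top_n is not None and sort_by is None:
        # Assumption: without an explicit sort key, rank by TFP growth.
        sort_by = "g_tfp"
    if sort_by is not None:
        # Largest values first.
        data = data.sort_values(sort_by, ascending=False)
    if top_n is not None:
        data = data.head(top_n)
    return data.set_index("Country")[["contrib_k", "contrib_l", "g_tfp"]]

# Made-up summary table with three hypothetical countries.
summary = pd.DataFrame({
    "Country":   ["X", "Y", "Z"],
    "contrib_k": [0.010, 0.012, 0.008],
    "contrib_l": [0.004, 0.002, 0.006],
    "g_tfp":     [0.005, 0.015, 0.010],
})

top_two = select_plot_data(summary, top_n=2)
by_capital = select_plot_data(summary, countries=["X", "Z"], sort_by="contrib_k")
print(top_two)
print(by_capital)
```

The returned table has the same shape as plot_data in the solution, so it could be passed to .plot(kind='bar', stacked=True) unchanged. Keeping the selection separate from the plotting also makes it easy to test without opening a figure.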
Robustness
If a user passes a country name that does not exist in the data (e.g. a typo like 'Switzeland'), the method silently drops that country from the chart and gives no error message. It should raise a clear error listing which country names were not found and suggest checking analyzer.summary['Country'].unique().
If a user passes a single string instead of a list (e.g. 'Switzerland' instead of ['Switzerland']), the call to isin() fails with a TypeError about list-like objects that never mentions the countries argument. The method should detect this and either raise an informative error or convert the string to a list automatically.
If prepare_data() has not been called, the columns g_y, g_k, g_l, alpha and g_tfp do not exist in self.data yet, so create_country_summary() cannot build self.summary. The class should check for this and tell the user which method to call first.
If create_country_summary() has not been called before plot_growth_decomposition(), self.summary is still None and the method will crash. It should check for this and tell the user which method to call first.
…
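The input checks from the list above could be sketched as a small helper that runs at the top of the plotting method. The function name check_countries and the wording of the error message are assumptions made for this example; the list of available names would come from analyzer.summary['Country'] in the real class.

```python
def check_countries(countries, available):
    """Validate requested country names (hypothetical helper)."""
    # A single string is wrapped in a list instead of being rejected.
    if isinstance(countries, str):
        countries = [countries]
    # Collect every requested name that is not in the data.
    missing = [c for c in countries if c not in available]
    if missing:
        raise ValueError(
            f"Countries not found: {missing}. "
            "Check analyzer.summary['Country'].unique() for valid names."
        )
    return list(countries)

available = ["France", "Germany", "Switzerland"]

# A single string is accepted and converted to a one-element list.
print(check_countries("Switzerland", available))

# A typo produces an error that names the offending entry.
try:
    check_countries(["Switzeland", "France"], available)
except ValueError as err:
    print(err)
```

Raising an error here, rather than plotting whatever matched, means a typo cannot quietly change which countries appear in the chart.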
Note on the Group Project
In the group project you will build a class for a data analysis task of your own choice, similar to the GrowthAccountingAnalyzer you built here. For the methods that produce output, like figures or tables, you should think carefully about two things, just as we did in this exercise:
Functionality: Your main output methods should go beyond the standard output by offering at least one way for users to customize what they produce. As the discussion above shows, there are many ways to do this: optional arguments that filter the data, change what is shown, or control how the output is formatted. Think about what a researcher using your tool might actually want to do with it.
Robustness: You should anticipate things that can go wrong and handle them gracefully. This includes invalid inputs (typos, wrong types), missing data, and cases where methods are called in the wrong order. Rather than letting the method crash with an unhelpful error, you should check for these problems explicitly and provide a clear message that tells the user what went wrong and how to fix it.
2.6 Making the Class Easier to Use
At the moment a user has to remember to call read_data() as a separate step before doing anything else. A more natural design is to load the data at the moment the object is created — that way the analyzer is ready to use immediately after instantiation.
It is also easy to accidentally call create_country_summary() before prepare_data() has been run, which will produce an unhelpful error. A simple flag attribute can guard against this.
Update __init__ to take a filepath argument and call self.read_data(filepath) directly, so the data is loaded as soon as the object is created.
Add a boolean attribute self.data_prepared to __init__, initialised to False. Set it to True at the end of prepare_data(). Then add an early exit at the top of create_country_summary() that prints a clear message and returns if prepare_data() has not been called yet.
Solution 2.6: Making the Class Easier to Use
class GrowthAccountingAnalyzer:
    def __init__(self, filepath):
        self.data = None
        self.summary = None
        self.data_prepared = False  # (1)
        self.read_data(filepath)  # (2)

    def read_data(self, filepath):
        """Read data from a CSV file and store it as an attribute."""
        df = pd.read_csv(filepath)
        self.data = df.sort_values(['Country', 'year'])

    def prepare_data(self):
        """Compute growth rates, capital share and TFP growth."""
        # 1. Calculate growth rates
        self.data['g_y'] = self.data.groupby('Country')['rgdpna'].pct_change()
        self.data['g_k'] = self.data.groupby('Country')['rnna'].pct_change()
        self.data['g_l'] = self.data.groupby('Country')['emp'].pct_change()
        # 2. Estimate capital share from the data
        self.data['one_minus_labsh'] = 1 - self.data['labsh']
        self.data['alpha'] = self.data.groupby('Country')['one_minus_labsh'].transform('mean')
        # 3. Calculate TFP growth
        self.data['g_tfp'] = self.data['g_y'] - (
            self.data['alpha'] * self.data['g_k']
            + (1 - self.data['alpha']) * self.data['g_l']
        )
        self.data_prepared = True  # (3)

    def create_country_summary(self):
        """Aggregate the data into a country-level summary table."""
        if not self.data_prepared:  # (4)
            print("Please call prepare_data() before create_country_summary().")
            return
        summary = self.data.groupby('Country')[['g_y', 'g_k', 'g_l', 'alpha', 'g_tfp']].mean()
        summary['contrib_k'] = summary['alpha'] * summary['g_k']
        summary['contrib_l'] = (1 - summary['alpha']) * summary['g_l']
        self.summary = summary.reset_index()

    def plot_growth_decomposition(self, countries):
        """Plot the growth decomposition as a stacked bar chart."""
        # Filter the summary table to the requested countries
        data = self.summary[self.summary['Country'].isin(countries)]
        # Select and rename columns for the legend labels
        plot_data = data.set_index('Country')[['contrib_k', 'contrib_l', 'g_tfp']]
        plot_data.columns = ['Capital contribution', 'Labour contribution', 'TFP growth']
        # stacked=True handles negative values correctly
        plot_data.plot(kind='bar', stacked=True, figsize=(8, 5))
        plt.axhline(0, color='black', linestyle='--', linewidth=0.8)
        plt.ylabel('Average annual growth rate')
        plt.title('Growth Accounting Decomposition')
        plt.legend()
        plt.xticks(rotation=0)  # keep country names horizontal
        plt.show()
1. A boolean flag initialised to False. It records whether prepare_data() has been called yet, which gives other methods a reliable way to check whether the data is ready.
2. self.read_data(filepath) is a method call inside __init__, so every time a new instance of the class is created, read_data() runs automatically as part of the initialization. You can call any method of the class from within __init__ this way, which is useful for setup steps that should always happen at the moment an object is created.
3. Once all the computations are complete, the flag is flipped to True. From this point on, methods that depend on the prepared data can safely proceed.
4. If prepare_data() has not been called yet, we print a helpful message telling the user what to do next and immediately exit with return. The rest of the method is skipped and self.summary stays None.
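Printing a message and returning is easy to follow, but code that calls the method cannot tell that anything went wrong. A common alternative, shown here only as a sketch and not as the exercise solution, is to raise an exception. The class GuardedAnalyzer below is a stripped-down stand-in with placeholder bodies, used purely to illustrate the control flow.

```python
class GuardedAnalyzer:
    """Minimal stand-in for the analyzer (hypothetical, no real data)."""

    def __init__(self):
        self.summary = None
        self.data_prepared = False

    def prepare_data(self):
        # Placeholder for the real computations.
        self.data_prepared = True

    def create_country_summary(self):
        if not self.data_prepared:
            # Raising stops the program flow and can be caught by the caller.
            raise RuntimeError(
                "Call prepare_data() before create_country_summary()."
            )
        # Placeholder for the real aggregation.
        self.summary = "summary table placeholder"


analyzer = GuardedAnalyzer()
try:
    analyzer.create_country_summary()
except RuntimeError as err:
    print("Caught:", err)

# Calling the methods in the right order works without an error.
analyzer.prepare_data()
analyzer.create_country_summary()
print(analyzer.summary)
```

With an exception, a script that calls the methods in the wrong order stops at the first problem instead of continuing with self.summary still set to None.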
Test your code: verify that the error handling works by creating an instance and calling create_country_summary() without calling prepare_data() first. Then run the full pipeline correctly.
Test your Code
# The data is now loaded at instantiation, so no separate read_data() call is needed
analyzer = GrowthAccountingAnalyzer('data/pwt_data_ex7.csv')
# Try calling create_country_summary() before prepare_data(): this should print a clear message
analyzer.create_country_summary()
# Now run the full pipeline correctly
analyzer = GrowthAccountingAnalyzer('data/pwt_data_ex7.csv')
analyzer.prepare_data()
analyzer.create_country_summary()
analyzer.plot_growth_decomposition(countries=['Switzerland', 'Germany', 'France'])