Let’s make a simple linear model first.
library(partialling.out)
library(tinytable)
library(tinyplot)
library(palmerpenguins)
library(fwlplot)
lm(bill_length_mm ~ bill_depth_mm + species, data = penguins)
model <-summary(model)
#>
#> Call:
#> lm(formula = bill_length_mm ~ bill_depth_mm + species, data = penguins)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -8.0300 -1.5828 0.0733 1.6925 10.0313
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 13.2164 2.2475 5.88 9.83e-09 ***
#> bill_depth_mm 1.3940 0.1220 11.43 < 2e-16 ***
#> speciesChinstrap 9.9390 0.3678 27.02 < 2e-16 ***
#> speciesGentoo 13.4033 0.5118 26.19 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.518 on 338 degrees of freedom
#> (2 observations deleted due to missingness)
#> Multiple R-squared: 0.7892, Adjusted R-squared: 0.7874
#> F-statistic: 421.9 on 3 and 338 DF, p-value: < 2.2e-16
Using the partialling_out
function, you can get the residualised variable of interest (bill_length_mm
) and of the first explanatory variable (bill_depth_mm
), i.e. it would return the residuals of the following two regressions.
lm(bill_length_mm ~ species, data = penguins)
modely <-
lm(bill_depth_mm ~ species, data = penguins) modelx <-
partialling_out(model, data = penguins)
res <-
tt(head(res)) |>
format_tt(digits = 2) |>
style_tt(align = "c")
res_bill_length_mm | res_bill_depth_mm |
---|---|
0.31 | 0.35 |
0.71 | -0.95 |
1.51 | -0.35 |
-2.09 | 0.95 |
0.51 | 2.25 |
0.11 | -0.55 |
Accordingly, the coefficient of res_bill_depth_mm
in the model lm(res_bill_length_mm ~ res_bill_depth_mm)
will be the same of the coefficient of bill_depth_mm
in the original model.
lm(res_bill_length_mm ~ res_bill_depth_mm, data = res)
resmodel <-
print(c(model$coefficients[2], resmodel$coefficients[2]))
#> bill_depth_mm res_bill_depth_mm
#> 1.394011 1.394011
If both
is set to FALSE
, the function will return the actual Y values and the residualised X values.
tt(head(partialling_out(model, penguins, both = FALSE))) |>
format_tt(digits = 2) |>
style_tt(align = "c")
bill_length_mm | res_bill_depth_mm |
---|---|
39 | 0.35 |
40 | -0.95 |
40 | -0.35 |
37 | 0.95 |
39 | 2.25 |
39 | -0.55 |
If weights
are specified in partialling_out
they will be used to weight the underlying partial models. The weights will also be returned as a column in the result data.frame. If the original model is weighted but the partial models aren’t, the function will throw a warning (also if the original model is not weighted but partial models are).
lm(
model <-~ bill_depth_mm + species,
bill_length_mm data = penguins,
weights = penguins$body_mass_g
) partialling_out(model, data = penguins, weights = penguins$body_mass_g)
res <-
tt(head(res)) |>
format_tt(digits = 2) |>
style_tt(align = "c")
res_bill_length_mm | res_bill_depth_mm | weights |
---|---|---|
0.129 | 0.27 | 3750 |
0.529 | -1.03 | 3800 |
1.329 | -0.43 | 3250 |
-2.271 | 0.87 | 3450 |
0.329 | 2.17 | 3650 |
-0.071 | -0.63 | 3625 |
Note that, for the FWL theorem to hold, if the partial models are weighted, the linear model between the two residualised variables must be weighted as well.
lm(res_bill_length_mm ~ res_bill_depth_mm, data = res)
unweighted_model <-
lm(
weighted_model <-~ res_bill_depth_mm,
res_bill_length_mm weights = res$weights,
data = res
)
data.frame(
"original model" = model$coefficients[2],
"unweighted model" = unweighted_model$coefficients[2],
"weighted_model" = weighted_model$coefficients[2]
|>
) tt() |>
format_tt(digits = 4) |>
style_tt(align = "c")
original.model | unweighted.model | weighted_model |
---|---|---|
1.442 | 1.394 | 1.442 |
As stated, the model will also work with feols
or felm
models
library(fixest)
feols(bill_length_mm ~ bill_depth_mm | species, data = penguins)
model_fixest <-
partialling_out(model_fixest, data = penguins)
res_fixest <-tt(head(res_fixest)) |>
format_tt(digits = 2) |>
style_tt(align = "c")
res_bill_length_mm | res_bill_depth_mm |
---|---|
0.31 | 0.35 |
0.71 | -0.95 |
1.51 | -0.35 |
-2.09 | 0.95 |
0.51 | 2.25 |
0.11 | -0.55 |
library(lfe)
felm(bill_length_mm ~ bill_depth_mm | species, data = penguins)
model_lfe <-
partialling_out(model_lfe, data = penguins)
res_lfe <-
tt(head(res_lfe)) |>
format_tt(digits = 2) |>
style_tt(align = "c")
res_bill_length_mm | res_bill_depth_mm |
---|---|
0.31 | 0.35 |
0.71 | -0.95 |
1.51 | -0.35 |
-2.09 | 0.95 |
0.51 | 2.25 |
0.11 | -0.55 |
So far, interaction terms with :
, *
, or ^
(in feols
) as well as varying slopes in feols
with [
or [[
are supported only for variables other than the main explanatory variable, i.e.
feols(bill_depth_mm ~ bill_length_mm | species^sex, data = penguins) |>
partialling_out(data = penguins)
is supported, but
feols(bill_depth_mm ~ i(bill_length_mm, species) | sex, data = penguins) |>
partialling_out(data = penguins)
will yield an error. Therefore, if your main explanatory variable is an interaction, it should be built outside the model if you intend to partial it out afterwards.
So far, AsIs expressions, and polynomial terms in the formula are not yet supported, so these two expressions will yield an error.
partialling_out(lm(bill_depth_mm ~ I(bill_length_mm^2) + species,
data = penguins))
partialling_out(lm(bill_depth_mm ~ poly(bill_length_mm, 2) + species,
data = penguins[!is.na(penguins$bill_length_mm), ]))
The library includes the function plot_partial_residuals()
to plot a scatterplot of the residualised variables with a tendence line (which can be disabled if the parametre add_lm
is set to FALSE
) using tinyplot
as backend. If the parametre quantile
is set to TRUE
it will divide the main explanatory variable in the specified number of quantiles (by default, 50) and plot the mean values for X and Y for each quantile.
tinytheme("tufte")
par(mfrow = c(2, 1))
plot_partial_residuals(res_fixest)
plot_partial_residuals(res_fixest, quantile = TRUE)
Plotting all residuals is equivalent to the following approach in fwlplot
fwlplot(bill_length_mm ~ bill_depth_mm | species, data = penguins)
Any parameters that could be passed to lm()
, feols()
, or felm()
, can be passed to partialling_out()
.
feols(
model_fixest <-~ bill_depth_mm | species + island,
bill_length_mm data = penguins,
cluster = ~species
)
tt(head(partialling_out(model_fixest, data = penguins, cluster = ~species))) |>
format_tt(digits = 2) |>
style_tt(align = "c")
res_bill_length_mm | res_bill_depth_mm |
---|---|
0.149 | 0.27 |
0.549 | -1.03 |
1.349 | -0.43 |
-2.251 | 0.87 |
0.349 | 2.17 |
-0.051 | -0.63 |
To the authors of the fwlplot package, Kyle Butts and Grant McDermott, which has provided inspiration and ideas for this project. To my colleague Andreu Arenas-Jal for his insight and guiding.