The SplitWise package provides tools for transforming numeric variables in regression models by either applying a single-split dummy encoding or retaining them as linear terms. This vignette demonstrates the application of SplitWise using the mtcars dataset, showcasing both univariate and iterative transformation approaches.
The mtcars dataset is a built-in R dataset that comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).
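To make the two options concrete, the following base-R sketch shows what a single-split dummy encoding of one predictor looks like next to the plain linear term. The cutoff of 3.5 for wt and the names df_demo, wt_dummy, fit_dummy and fit_linear are hypothetical and chosen only for illustration; they are not values or objects produced by SplitWise.

```r
# mtcars ships with base R: 32 rows, 11 columns (mpg plus 10 other variables)
dim(mtcars)

# Hypothetical cutoff, picked by hand for illustration (not chosen by SplitWise)
cutoff <- 3.5

# Single-split dummy encoding: 1 if wt >= cutoff, else 0
df_demo <- mtcars
df_demo$wt_dummy <- as.integer(df_demo$wt >= cutoff)

# Fit mpg on the dummy and, separately, on the untransformed linear term
fit_dummy  <- lm(mpg ~ wt_dummy, data = df_demo)
fit_linear <- lm(mpg ~ wt, data = df_demo)

# Lower AIC indicates the better-fitting encoding for this single predictor
c(dummy = AIC(fit_dummy), linear = AIC(fit_linear))
```

Comparing the two AIC values is one simple way to judge whether a threshold or a linear term describes a predictor better; how SplitWise itself makes that choice is not shown here.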
The iterative transformation approach evaluates each variable's transformation in the context of the variables already added to the model. Here is an example using backward stepwise selection:
# Apply iterative transformations with backward stepwise selection
model_iter <- splitwise(
  mpg ~ .,
  data = mtcars,
  transformation_mode = "iterative",
  direction = "backward",
  trace = 0
)
# Display the summary of the model
summary(model_iter)
#>
#> Call:
#> stats::lm(formula = mpg ~ cyl + disp_dummy + hp + drat_dummy +
#> am, data = df_final)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -3.7656 -1.1218 -0.1794 1.2778 2.8054
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 31.181550 2.141748 14.559 5.17e-14 ***
#> cyl -0.819230 0.394147 -2.078 0.04767 *
#> disp_dummy -6.542518 1.074109 -6.091 1.95e-06 ***
#> hp -0.029006 0.009164 -3.165 0.00393 **
#> drat_dummy 3.608055 0.983141 3.670 0.00110 **
#> am 1.467368 0.866328 1.694 0.10226
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1.71 on 26 degrees of freedom
#> Multiple R-squared: 0.9325, Adjusted R-squared: 0.9195
#> F-statistic: 71.82 on 5 and 26 DF, p-value: 2.222e-14
#>
#> Transformation Mode: iterative
#>
#> Dummy-Encoded Variables:
#> - disp : 1 if x >= 101.550 ; else 0
#> - drat : 1 if x >= 3.035 ; else 0
#> - wt : 1 if 1.885 < x < 3.013 ; else 0
#> - qsec : 1 if 16.580 < x < 17.175 ; else 0
#> - carb : 1 if 1.500 < x < 5.000 ; else 0
#>
#> Final AIC: 132.5
#> Final BIC: 142.76
# Print the model details
print(model_iter)
#> SplitWise Linear Models
#> Transformation mode: iterative
#> Call:
#> splitwise(formula = mpg ~ ., data = mtcars, transformation_mode = "iterative",
#> direction = "backward", trace = 0)
#>
#> Dummy-Encoded Variables:
#> - disp : 1 if x >= 101.550 ; else 0
#> - drat : 1 if x >= 3.035 ; else 0
#> - wt : 1 if 1.885 < x < 3.013 ; else 0
#> - qsec : 1 if 16.58 < x < 17.175 ; else 0
#> - carb : 1 if 1.5 < x < 5 ; else 0
#>
#> Coefficients:
#> (Intercept) cyl disp_dummy hp drat_dummy am
#> 31.18155004 -0.81922986 -6.54251800 -0.02900627 3.60805503 1.46736833
#>
#> Model Fit Statistics:
#> AIC: 132.5
#> BIC: 142.76
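The reported rules can be checked by hand. The sketch below rebuilds the two dummies that enter the final iterative model (disp_dummy and drat_dummy) in base R and refits the same formula with lm(). It assumes the printed thresholds (101.550 and 3.035) are the exact rules used to build the package's internal df_final; df_manual and fit_manual are hypothetical names introduced here.

```r
# Rebuild the two dummies from the printed rules (assumed to be exact)
df_manual <- mtcars
df_manual$disp_dummy <- as.integer(df_manual$disp >= 101.550)  # 1 if disp >= 101.550
df_manual$drat_dummy <- as.integer(df_manual$drat >= 3.035)    # 1 if drat >= 3.035

# Same formula as in the summary output above, fitted on the hand-built data
fit_manual <- lm(mpg ~ cyl + disp_dummy + hp + drat_dummy + am, data = df_manual)

# Inspect the coefficients and information criteria of the manual fit
coef(fit_manual)
AIC(fit_manual)
BIC(fit_manual)
```

If the assumption holds, these values should be close to the coefficients, AIC and BIC printed above. A noticeable difference would suggest that the package applies additional steps not visible in the output.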
In the univariate transformation approach, each numeric predictor is transformed independently without considering the context of other variables. Below is an example of applying univariate transformations with backward stepwise selection:
# Apply univariate transformations with backward stepwise selection
model_uni <- splitwise(
  mpg ~ .,
  data = mtcars,
  transformation_mode = "univariate",
  direction = "backward",
  trace = 0
)
# Display the summary of the model
summary(model_uni)
#>
#> Call:
#> stats::lm(formula = mpg ~ wt + qsec + am, data = df_final)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -3.4811 -1.5555 -0.7257 1.4110 4.6610
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 9.6178 6.9596 1.382 0.177915
#> wt -3.9165 0.7112 -5.507 6.95e-06 ***
#> qsec 1.2259 0.2887 4.247 0.000216 ***
#> am 2.9358 1.4109 2.081 0.046716 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.459 on 28 degrees of freedom
#> Multiple R-squared: 0.8497, Adjusted R-squared: 0.8336
#> F-statistic: 52.75 on 3 and 28 DF, p-value: 1.21e-11
#>
#> Transformation Mode: univariate
#>
#> Dummy-Encoded Variables:
#> - qsec : 1 if 17.175 < x < 18.410 ; else 0
#> - gear : 1 if x >= 3.500 ; else 0
#>
#> Final AIC: 154.12
#> Final BIC: 161.45
# Print the model details
print(model_uni)
#> SplitWise Linear Models
#> Transformation mode: univariate
#> Call:
#> splitwise(formula = mpg ~ ., data = mtcars, transformation_mode = "univariate",
#> direction = "backward", trace = 0)
#>
#> Dummy-Encoded Variables:
#> - qsec : 1 if 17.175 < x < 18.41 ; else 0
#> - gear : 1 if x >= 3.500 ; else 0
#>
#> Coefficients:
#> (Intercept) wt qsec am
#> 9.617781 -3.916504 1.225886 2.935837
#>
#> Model Fit Statistics:
#> AIC: 154.12
#> BIC: 161.45
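None of the terms retained in the univariate model carries a _dummy suffix, so the final fit appears to use only untransformed predictors. Under that reading, the same model can be reproduced with a plain lm() call in base R, as sketched below; fit_uni is a hypothetical name introduced here.

```r
# Refit the final univariate model directly on the untransformed mtcars columns
fit_uni <- lm(mpg ~ wt + qsec + am, data = mtcars)

# Coefficients of the direct fit
coef(fit_uni)

# Information criteria, rounded to two decimals as in the printed output
round(c(AIC = AIC(fit_uni), BIC = BIC(fit_uni)), 2)
```

The rounded AIC and BIC can be compared with the Final AIC and Final BIC lines printed above to confirm that the selected model is an ordinary linear regression on wt, qsec and am.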
This vignette illustrated how to utilize the SplitWise package to perform both univariate and iterative transformations on the mtcars dataset. Depending on the analysis requirements, users can choose the appropriate transformation approach to enhance their regression models.