Introduction

The SplitWise package transforms numeric variables in regression models: each variable is either dummy-encoded using split points (a single threshold or a range, as in the outputs below) or retained as a linear term. This vignette applies SplitWise to the mtcars dataset and demonstrates both the iterative and the univariate transformation modes.

The mtcars Dataset

The mtcars dataset is a built-in R dataset that comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

# Load the mtcars dataset
data(mtcars)
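
A quick look at the data confirms its shape before any modelling. The calls below use only base R and are not part of SplitWise.

```r
# Inspect the built-in dataset before fitting any model
data(mtcars)
dim(mtcars)      # 32 rows (cars) and 11 columns (mpg plus 10 other variables)
str(mtcars)      # every column is stored as numeric
head(mtcars, 3)  # first few rows
```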

Iterative Transformations

The iterative transformation approach evaluates each variable’s transformation in the context of variables already added to the model. Here is an example using backward stepwise selection:

# Apply iterative transformations with backward stepwise selection
model_iter <- splitwise(
  mpg ~ .,
  data = mtcars,
  transformation_mode = "iterative",
  direction = "backward",
  trace = 0
)

# Display the summary of the model
summary(model_iter)
#> 
#> Call:
#> stats::lm(formula = mpg ~ cyl + disp_dummy + hp + drat_dummy + 
#>     am, data = df_final)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.7656 -1.1218 -0.1794  1.2778  2.8054 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 31.181550   2.141748  14.559 5.17e-14 ***
#> cyl         -0.819230   0.394147  -2.078  0.04767 *  
#> disp_dummy  -6.542518   1.074109  -6.091 1.95e-06 ***
#> hp          -0.029006   0.009164  -3.165  0.00393 ** 
#> drat_dummy   3.608055   0.983141   3.670  0.00110 ** 
#> am           1.467368   0.866328   1.694  0.10226    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.71 on 26 degrees of freedom
#> Multiple R-squared:  0.9325, Adjusted R-squared:  0.9195 
#> F-statistic: 71.82 on 5 and 26 DF,  p-value: 2.222e-14
#> 
#> Transformation Mode: iterative 
#> 
#> Dummy-Encoded Variables:
#>   - disp : 1 if x >= 101.550 ; else 0
#>   - drat : 1 if x >= 3.035 ; else 0
#>   - wt : 1 if 1.885 < x < 3.013 ; else 0
#>   - qsec : 1 if 16.580 < x < 17.175 ; else 0
#>   - carb : 1 if 1.500 < x < 5.000 ; else 0
#> 
#> Final AIC: 132.5 
#> Final BIC: 142.76

# Print the model details
print(model_iter)
#> SplitWise Linear Models
#> Transformation mode: iterative 
#> Call:
#> splitwise(formula = mpg ~ ., data = mtcars, transformation_mode = "iterative", 
#>     direction = "backward", trace = 0)
#> 
#> Dummy-Encoded Variables:
#>   - disp : 1 if x >=  101.550 ; else 0
#>   - drat : 1 if x >=  3.035 ; else 0
#>   - wt : 1 if  1.885  < x <  3.013 ; else 0
#>   - qsec : 1 if  16.58  < x <  17.175 ; else 0
#>   - carb : 1 if  1.5  < x <  5 ; else 0
#> 
#> Coefficients:
#> (Intercept)         cyl  disp_dummy          hp  drat_dummy          am 
#> 31.18155004 -0.81922986 -6.54251800 -0.02900627  3.60805503  1.46736833 
#> 
#> Model Fit Statistics:
#>   AIC: 132.5 
#>   BIC: 142.76
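
The rules listed under "Dummy-Encoded Variables" can be read as plain indicator columns. The sketch below rebuilds by hand the two dummies that appear in the final formula and refits the model with base R's lm(), taking the thresholds from the printed output. It assumes the fitted dummies follow the printed rules exactly (the thresholds are shown rounded to three decimals). The names df_manual, wt_range and fit_manual are illustrative only, and this is not necessarily how SplitWise builds df_final internally.

```r
# Rebuild the dummy columns from the thresholds printed above
# (assumption: the printed rules describe the columns exactly)
df_manual <- mtcars
df_manual$disp_dummy <- as.integer(df_manual$disp >= 101.55)  # single threshold
df_manual$drat_dummy <- as.integer(df_manual$drat >= 3.035)   # single threshold

# A two-threshold rule, such as the one reported for wt, is a range indicator
wt_range <- as.integer(mtcars$wt > 1.885 & mtcars$wt < 3.013)
table(wt_range)

# Refit the final formula with plain lm() for comparison with the summary
fit_manual <- lm(mpg ~ cyl + disp_dummy + hp + drat_dummy + am, data = df_manual)
coef(fit_manual)
AIC(fit_manual)
```

If the assumption holds, the coefficients and AIC should be close to those reported by summary(model_iter).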

Univariate Transformations

In the univariate transformation approach, each numeric predictor is transformed independently without considering the context of other variables. Below is an example of applying univariate transformations with backward stepwise selection:

# Apply univariate transformations with backward stepwise selection
model_uni <- splitwise(
  mpg ~ .,
  data = mtcars,
  transformation_mode = "univariate",
  direction = "backward",
  trace = 0
)

# Display the summary of the model
summary(model_uni)
#> 
#> Call:
#> stats::lm(formula = mpg ~ wt + qsec + am, data = df_final)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.4811 -1.5555 -0.7257  1.4110  4.6610 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   9.6178     6.9596   1.382 0.177915    
#> wt           -3.9165     0.7112  -5.507 6.95e-06 ***
#> qsec          1.2259     0.2887   4.247 0.000216 ***
#> am            2.9358     1.4109   2.081 0.046716 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.459 on 28 degrees of freedom
#> Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
#> F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11
#> 
#> Transformation Mode: univariate 
#> 
#> Dummy-Encoded Variables:
#>   - qsec : 1 if 17.175 < x < 18.410 ; else 0
#>   - gear : 1 if x >= 3.500 ; else 0
#> 
#> Final AIC: 154.12 
#> Final BIC: 161.45

# Print the model details
print(model_uni)
#> SplitWise Linear Models
#> Transformation mode: univariate 
#> Call:
#> splitwise(formula = mpg ~ ., data = mtcars, transformation_mode = "univariate", 
#>     direction = "backward", trace = 0)
#> 
#> Dummy-Encoded Variables:
#>   - qsec : 1 if  17.175  < x <  18.41 ; else 0
#>   - gear : 1 if x >=  3.500 ; else 0
#> 
#> Coefficients:
#> (Intercept)          wt        qsec          am 
#>    9.617781   -3.916504    1.225886    2.935837 
#> 
#> Model Fit Statistics:
#>   AIC: 154.12 
#>   BIC: 161.45
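
The final formula here contains no term with a _dummy suffix, so the selected model uses only the original columns and can be refit with base R to check the reported fit statistics. The sketch below does that. The name fit_uni is illustrative, and the code is a cross-check rather than part of the SplitWise workflow.

```r
# Refit the selected formula on the untransformed data
fit_uni <- lm(mpg ~ wt + qsec + am, data = mtcars)

# Information criteria; lower values indicate a better trade-off
# between goodness of fit and model complexity
c(AIC = AIC(fit_uni), BIC = BIC(fit_uni))

# Adjusted R-squared, for comparison with the summary output
summary(fit_uni)$adj.r.squared
```

These values can be set against the AIC of 132.5 and BIC of 142.76 reported for the iterative model.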

Conclusion

This vignette showed how to use the SplitWise package to apply both iterative and univariate transformations to the mtcars dataset. In this example the iterative mode produced the better-fitting model (AIC 132.5 versus 154.12), but the appropriate mode depends on the data and the goals of the analysis.
