Package {svmodt}


Type: Package
Title: Linear SVM-Based Recursive Decision Trees
Version: 0.1.0
Description: Implements Support Vector Machine Oblique Decision Trees (SVMODT). Recursively builds classification trees using linear Support Vector Machines (SVM) hyperplanes at each node instead of axis-parallel splits, creating oblique decision boundaries. Features include multiple feature selection methods, dynamic feature subset strategies, class weight support for imbalanced datasets, pruning and feature penalization.
License: GPL (≥ 3)
Encoding: UTF-8
LazyData: true
Suggests: knitr, rmarkdown, bookdown, testthat (≥ 3.0.0), rpart, rsample, gridExtra, tidyr, kableExtra, palmerpenguins, dplyr
VignetteBuilder: knitr
Depends: R (≥ 3.5)
Imports: rlang, e1071, FSelectorRcpp, ggplot2
RoxygenNote: 7.3.3
Config/testthat/edition: 3
URL: https://github.com/AneeshAgarwala/svmodt
BugReports: https://github.com/AneeshAgarwala/svmodt/issues
NeedsCompilation: no
Packaged: 2026-06-24 09:10:26 UTC; AneeshAG
Author: Aneesh Agarwal [aut, cre, cph], Jack Jewson [aut, ths], Erik Sverdrup [aut, ths]
Maintainer: Aneesh Agarwal <aaga0022@student.monash.edu>
Repository: CRAN
Date/Publication: 2026-06-30 11:10:02 UTC

Apply a scaler transformation to a data frame

Description

This internal helper function applies a scaling transformation to a data frame using a provided scaler object. It returns the unscaled data in case of failure.

Usage

apply_scaler(df, scaler)

Arguments

df

A data frame containing numeric features to be scaled.

scaler

A scaler object with a 'transform' method or function used to scale the data.

Details

This function is intended for internal use within the package and is not exported. It wraps the scaler's 'transform()' call in error handling to prevent failures from interrupting higher-level processes.

Value

A scaled data frame. If scaling fails or invalid inputs are provided, the original (unscaled) data frame is returned.
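
The fallback behaviour described above can be sketched in base R. This is only an illustration, not the package source: the name apply_scaler_sketch is hypothetical, and the scaler is assumed to be a list holding a transform function.

```r
# Hypothetical sketch: apply a scaler, falling back to the input on failure.
apply_scaler_sketch <- function(df, scaler) {
  # Invalid inputs: return the data unchanged.
  if (!is.data.frame(df) || !is.list(scaler) || !is.function(scaler$transform)) {
    return(df)
  }
  tryCatch(
    scaler$transform(df),
    error = function(e) df  # scaling failed: return the unscaled data
  )
}

# Toy scalers (assumed structure): one centres every column, one always fails.
toy    <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))
centre <- list(transform = function(d) as.data.frame(scale(d, scale = FALSE)))
broken <- list(transform = function(d) stop("cannot scale"))

scaled   <- apply_scaler_sketch(toy, centre)  # centred columns
fallback <- apply_scaler_sketch(toy, broken)  # identical to toy
```

Returning the input rather than raising an error keeps a single bad node from aborting tree construction or prediction, at the cost of silently passing unscaled data downstream.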


Check whether a decision-value vector crosses zero

Description

Check whether a decision-value vector crosses zero

Usage

boundary_in_grid(dec_values)

Dynamically determine the number of features to consider at a node

Description

Computes the number of features to be used for splitting at a given tree depth based on the specified strategy. Supports constant, decreasing, and random feature selection strategies.

Usage

calculate_dynamic_max_features(
  data,
  response,
  base_max_features,
  depth,
  strategy = "constant",
  decrease_rate = 0.8,
  random_range = c(0.3, 1),
  verbose = FALSE
)

Arguments

data

A data frame containing the predictor variables and the response variable.

response

A character string specifying the name of the response variable to exclude from the feature set.

base_max_features

Integer; the base number of features to consider. If 'NULL', all available features (excluding the response) are used.

depth

Integer; the current depth of the node in the tree (used for depth-dependent strategies).

strategy

Character string specifying how to determine the number of features. One of:

  • '"constant"' – always use 'base_max_features' (default).

  • '"decrease"' – exponentially decrease the number of features with depth.

  • '"random"' – randomly select the number of features within a range.

decrease_rate

Numeric; factor in (0, 1] controlling how fast the number of features decreases with depth when 'strategy = "decrease"'. Default is 0.8.

random_range

Numeric vector of length 2 specifying the lower and upper bounds (as proportions of total features) for random selection when 'strategy = "random"'. Default is 'c(0.3, 1.0)'.

verbose

Logical; if 'TRUE', prints details about the chosen strategy and resulting feature count.

Details

This function helps control model complexity and randomness by varying the number of features used at each split.

Input parameters are validated to ensure sensible defaults. The result is capped to avoid exceeding the total number of available features.

Value

Integer; the number of features to consider at the current node. The value is always constrained between 1 and the total number of available features.
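
A minimal sketch of the three strategies, under stated assumptions: the function name dynamic_max_features_sketch is hypothetical, the "decrease" schedule is assumed to be geometric in depth with no reduction at depth 1, and the "random" strategy is assumed to draw a uniform proportion of the available features. The package may use different formulas.

```r
# Hypothetical sketch of depth-dependent feature counts.
dynamic_max_features_sketch <- function(n_available,
                                        base_max_features = NULL,
                                        depth = 1,
                                        strategy = c("constant", "decrease", "random"),
                                        decrease_rate = 0.8,
                                        random_range = c(0.3, 1)) {
  strategy <- match.arg(strategy)
  if (is.null(base_max_features)) base_max_features <- n_available

  k <- switch(strategy,
    constant = base_max_features,
    # Assumed schedule: geometric decay in depth.
    decrease = floor(base_max_features * decrease_rate^(depth - 1)),
    # Assumed: uniform proportion of the available features.
    random = round(n_available * runif(1, random_range[1], random_range[2]))
  )

  # Constrain the result to [1, n_available].
  as.integer(max(1, min(k, n_available)))
}

k_root <- dynamic_max_features_sketch(10, base_max_features = 8, depth = 1)
k_deep <- dynamic_max_features_sketch(10, base_max_features = 8, depth = 3,
                                      strategy = "decrease")
```

With these assumptions k_root is 8 and k_deep is 5, because 8 * 0.8^2 = 5.12 is rounded down.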


Calculate feature associations with a response variable

Description

Computes the association strength between each predictor and the response variable. For numeric predictors, the absolute Pearson correlation is used. For categorical predictors, association is estimated using an ANOVA-based pseudo-R^2 measure.

Usage

calculate_feature_associations(data, response, predictors)

Arguments

data

A data frame containing the response and predictor variables.

response

A string specifying the response variable name.

predictors

A character vector of predictor names to evaluate.

Details

  • Numeric predictors: computed using the absolute Pearson correlation.

  • Categorical predictors: uses the square root of the ratio of between-group sum of squares to total sum of squares from an ANOVA model.

Value

A named numeric vector of association values (0 to 1) for each predictor.


Calculate node impurity

Description

Computes the impurity of a node using either Gini impurity or entropy.

Usage

calculate_impurity(y, method = c("gini", "entropy"))

Arguments

y

A vector of class labels for the node.

method

A string specifying the impurity measure: either "gini" or "entropy".

Details

If method = "gini", the impurity is calculated as:

G = 1 - \sum_i p_i^2

where p_i is the proportion of samples in class i in the node.

If method = "entropy", the impurity is calculated as:

H = - \sum_i p_i \log(p_i)

Value

A numeric value representing the impurity of the node.
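
The two formulas above translate directly into base R. The sketch below is illustrative only: the name impurity_sketch is hypothetical, and the natural logarithm is assumed for entropy because the formula does not state a base.

```r
# Hypothetical sketch of Gini impurity and entropy for a vector of labels.
impurity_sketch <- function(y, method = c("gini", "entropy")) {
  method <- match.arg(method)
  p <- as.numeric(table(y)) / length(y)  # class proportions p_i
  p <- p[p > 0]                          # drop empty levels so 0 * log(0) never occurs
  if (method == "gini") 1 - sum(p^2) else -sum(p * log(p))
}

balanced <- c("a", "a", "b", "b")
g <- impurity_sketch(balanced, "gini")     # 1 - (0.25 + 0.25) = 0.5
h <- impurity_sketch(balanced, "entropy")  # log(2)
```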


Calculate class weights for a node

Description

Computes class weights for a given set of target values based on the chosen weighting strategy. Supports unweighted, balanced, and custom weighting schemes, with optional verbose diagnostic output.

Usage

calculate_node_class_weights(
  y,
  class_weights = "none",
  custom_class_weights = NULL,
  verbose = FALSE
)

Arguments

y

A vector of class labels at the current node.

class_weights

Character string specifying the weighting strategy. Options are:

  • '"none"' – no weighting (default).

  • '"balanced"' – weights inversely proportional to class frequencies.

  • '"custom"' – user-provided custom weights.

custom_class_weights

Named numeric vector of custom class weights (used only if 'class_weights = "custom"'). Names must match the unique class labels in 'y'.

verbose

Logical; if 'TRUE', prints detailed information about computed weights.

Details

The function caps computed class weights at 10 to avoid excessively large scaling factors.

Value

A named numeric vector of class weights for each unique class in 'y', or 'NULL' if equal weights are used ('class_weights = "none"') or if the custom weights are invalid.
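
A sketch of how such weights might be computed. The "balanced" formula n / (k * n_c), with n samples, k classes and n_c samples in class c, is an assumption borrowed from common practice and is not confirmed by this manual. The name node_class_weights_sketch is hypothetical.

```r
# Hypothetical sketch of per-node class weights.
node_class_weights_sketch <- function(y,
                                      class_weights = c("none", "balanced", "custom"),
                                      custom_class_weights = NULL,
                                      cap = 10) {
  class_weights <- match.arg(class_weights)
  classes <- as.character(unique(y))

  if (class_weights == "none") return(NULL)

  if (class_weights == "custom") {
    # Custom weights must be numeric and named after every class in y.
    ok <- is.numeric(custom_class_weights) &&
      !is.null(names(custom_class_weights)) &&
      all(classes %in% names(custom_class_weights))
    if (!ok) return(NULL)
    return(custom_class_weights[classes])
  }

  # Assumed "balanced" formula: n / (k * n_c), then capped.
  counts <- table(as.character(y))
  w <- length(y) / (length(counts) * as.numeric(counts))
  names(w) <- names(counts)
  pmin(w, cap)
}

y <- c(rep("benign", 95), rep("malignant", 5))
w <- node_class_weights_sketch(y, "balanced")
```

Here the minority class receives 100 / (2 * 5) = 10, which sits exactly at the cap; a rarer class would be clipped to 10 rather than growing without bound.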


Select a subset of features based on correlation, mutual information, or randomness

Description

Chooses up to a specified number of features from a dataset using one of three methods: random sampling, correlation with the response, or mutual information ranking.

Usage

choose_features(
  data,
  response,
  max_features,
  method = c("random", "mutual", "cor"),
  n_subsets = 1
)

Arguments

data

A data frame containing the response and predictor variables.

response

A string specifying the response variable name.

max_features

Integer specifying the maximum number of features to select.

method

Selection strategy. One of:

  • "random" – randomly selects features.

  • "mutual" – ranks features by mutual information with the response (requires FSelectorRcpp).

  • "cor" – ranks features by absolute correlation with the response.

Details

  • If the number of predictors is less than or equal to max_features, all are returned.

  • If method = "mutual" and FSelectorRcpp is not installed or fails, the function gracefully falls back to the correlation-based method.

  • The correlation method internally calls calculate_feature_associations.

Value

A character vector of selected feature names.


Select features with optional penalty for previously used features

Description

Internal helper function to select a subset of features while optionally penalizing features that have been used in ancestor nodes. Supports random selection, mutual information, or correlation-based ranking.

Usage

choose_features_with_penalty(
  data,
  response,
  max_features,
  method = c("random", "mutual", "cor"),
  penalize_used = FALSE,
  penalty_weight = 0.5,
  used_features = character(0),
  n_subsets = 1,
  verbose = FALSE
)

Arguments

data

A data frame containing predictors and the response.

response

Name of the response variable.

max_features

Maximum number of features to select.

method

Feature selection method; one of "random", "mutual", or "cor".

penalize_used

Logical; if TRUE, previously used features are penalized.

penalty_weight

Numeric (0 to 0.99); fraction by which to reduce the score/weight of used features.

used_features

Character vector of features previously used in the tree.

verbose

Logical; if TRUE, prints information about penalties applied.

Details

  • Penalized features have their selection weight or score reduced by multiplying by (1 - penalty_weight).

  • For method = "random", the penalty reduces the probability of sampling a feature.

  • For method = "mutual" or "cor", the penalty reduces feature importance or correlation.

  • If no valid features are available for correlation, the function falls back to random selection with penalty.

  • Ensures that no feature is entirely excluded; penalty_weight is capped below 1.

Value

Character vector of selected feature names.

See Also

choose_features, calculate_feature_associations
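
The penalty mechanism can be illustrated with two small base R helpers. Both names (sample_with_penalty_sketch, rank_with_penalty_sketch) are hypothetical, and the code is a sketch of the multiplicative (1 - penalty_weight) rule described in this entry, not the package source.

```r
# Hypothetical sketch: random selection with reduced sampling probability
# for features already used in ancestor nodes.
sample_with_penalty_sketch <- function(predictors, max_features,
                                       used_features = character(0),
                                       penalty_weight = 0.5) {
  penalty_weight <- min(max(penalty_weight, 0), 0.99)  # never exclude a feature entirely
  w <- rep(1, length(predictors))
  w[predictors %in% used_features] <- 1 - penalty_weight
  k <- min(max_features, length(predictors))
  sample(predictors, size = k, prob = w)
}

# Hypothetical sketch: score-based selection where used features have their
# score (e.g. correlation or mutual information) shrunk before ranking.
rank_with_penalty_sketch <- function(scores, max_features,
                                     used_features = character(0),
                                     penalty_weight = 0.5) {
  penalty_weight <- min(max(penalty_weight, 0), 0.99)
  used <- names(scores) %in% used_features
  scores[used] <- scores[used] * (1 - penalty_weight)
  k <- min(max_features, length(scores))
  names(sort(scores, decreasing = TRUE))[seq_len(k)]
}

set.seed(1)
picked <- sample_with_penalty_sketch(c("x1", "x2", "x3", "x4"), 2,
                                     used_features = "x1", penalty_weight = 0.9)
ranked <- rank_with_penalty_sketch(c(x1 = 0.9, x2 = 0.8, x3 = 0.1), 2,
                                   used_features = "x1", penalty_weight = 0.5)
```

In the ranking example x1 drops from 0.9 to 0.45, so x2 moves ahead of it and ranked is c("x2", "x1").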


Convert SVM decision values to probabilities

Description

Converts numeric SVM decision values into probabilities using a logistic/sigmoid transformation. Optionally uses the model's training decision values for calibration. Intended for internal use within the SVM tree prediction workflow.

Usage

convert_decision_to_probs(decision_values, model = NULL)

Arguments

decision_values

Numeric vector of decision values.

model

Optional svm object; if provided, training decision values are used to calibrate scaling.

Value

Numeric vector of probabilities, clipped between 0.001 and 0.999.
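
A minimal sketch of the sigmoid transformation with clipping. The name decision_to_probs_sketch and the scale argument are hypothetical; how the package derives a calibration scale from the model's training decision values is not documented here, so scale is simply exposed as a parameter.

```r
# Hypothetical sketch: logistic transform of SVM decision values, clipped
# to [0.001, 0.999].
decision_to_probs_sketch <- function(decision_values, scale = 1) {
  p <- 1 / (1 + exp(-decision_values / scale))
  pmin(pmax(p, 0.001), 0.999)
}

probs <- decision_to_probs_sketch(c(-20, 0, 20))
```

A decision value of 0 (a point on the hyperplane) maps to 0.5, and the clipping keeps points far from the hyperplane from producing probabilities of exactly 0 or 1.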


Build a 2-D prediction grid in ORIGINAL (unscaled) feature space

Description

The two plot features vary over their observed range plus padding; every other node feature is fixed at its median. Returned unscaled so axis labels stay readable; callers scale it themselves before predicting.

Usage

create_decision_grid(
  data,
  plot_features,
  all_node_features,
  resolution = 100,
  pad_factor = 0.5
)

Calculate Entropy

Description

Computes the entropy for a vector of class labels.

Usage

entropy(y)

Arguments

y

A vector of class labels.

Value

Numeric value representing entropy (0 = pure, higher = more impure).


Evaluate Multiple Random Feature Subsets Using SVM Information Gain

Description

Generates and evaluates multiple random feature subsets, ranking them by the information gain achieved through SVM-based splits.

Usage

evaluate_random_subsets(
  data,
  predictors,
  response,
  n_subsets = 5,
  subset_size = 4,
  metric = c("entropy", "gini"),
  verbose = FALSE
)

Arguments

data

A data frame containing predictors and the response variable.

predictors

Character vector of available predictor names.

response

Character string specifying the response variable name.

n_subsets

Integer; number of random feature subsets to evaluate.

subset_size

Integer; number of features in each subset.

metric

Impurity measure for information gain. One of "entropy" or "gini".

verbose

Logical; if TRUE, prints evaluation progress.

Details

This function randomly samples n_subsets different combinations of subset_size features from the predictor pool, evaluates each subset using svm_info_gain, and returns them ranked by performance.

If subset_size is greater than the number of available predictors, it is automatically reduced to match the predictor count.

Value

A data frame with two columns:

features

List column containing character vectors of feature names.

info_gain

Numeric vector of information gain values.

The data frame is sorted in descending order by information gain.


Fit a linear SVM model with optional class weights

Description

Fits a linear Support Vector Machine (SVM) classifier using the e1071 package, with optional class-specific weights to handle class imbalance.

Usage

fit_svm_with_weights(X_scaled, y, class_weights_vec, verbose = FALSE, ...)

Arguments

X_scaled

A data frame or matrix of predictor variables.

y

A vector of class labels corresponding to the rows of X_scaled.

class_weights_vec

Optional named numeric vector of class weights. Names must match the unique class labels in y. Weights are capped at 10 to prevent instability.

verbose

Logical; if TRUE, prints diagnostic messages during fitting.

...

Additional arguments passed to svm.

Details

  • Uses a linear kernel by default.

  • Enables decision values and probability estimates.

  • Scaling is disabled (scale = FALSE).

  • When class_weights_vec is supplied, weights are capped at 10 and passed to svm via its class.weights parameter.

  • Returns NULL if data is empty or model fitting fails.

Value

A fitted svm model object (of class "svm") on success, or NULL if fitting fails.


Retrieve all class labels from a decision tree

Description

Recursively extracts all unique class labels stored in a decision tree's leaf nodes.

Usage

get_all_classes(tree)

Arguments

tree

A decision tree object, where each node may contain:

  • is_leaf – logical; TRUE if the node is a leaf.

  • class_prob – named numeric vector of class probabilities.

  • left, right – child node objects.

Value

A character vector of all unique class labels present in the tree.
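
The recursion can be sketched on a plain nested list that uses the fields listed under Arguments (is_leaf, class_prob, left, right). The function name collect_classes_sketch and the toy tree are hypothetical.

```r
# Hypothetical sketch: depth-first collection of class labels from leaves.
collect_classes_sketch <- function(tree) {
  if (is.null(tree)) return(character(0))            # missing child
  if (isTRUE(tree$is_leaf)) return(names(tree$class_prob))
  unique(c(collect_classes_sketch(tree$left),
           collect_classes_sketch(tree$right)))
}

# Toy tree: the right subtree has only one child.
toy_tree <- list(
  is_leaf = FALSE,
  left = list(is_leaf = TRUE, class_prob = c(B = 0.9, M = 0.1)),
  right = list(
    is_leaf = FALSE,
    left = list(is_leaf = TRUE, class_prob = c(M = 1)),
    right = NULL
  )
)

classes <- collect_classes_sketch(toy_tree)
```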


Fallback predictions for SVM decision tree nodes

Description

Generates class predictions and probabilities when SVM predictions are unavailable or insufficient. This function is intended for internal use within the SVM tree.

Usage

get_fallback_predictions(
  model,
  X_scaled,
  decision_values,
  svm_probs = NULL,
  all_classes,
  calibrate = TRUE
)

Arguments

model

An svm object fitted with training data.

X_scaled

Scaled predictor matrix for the current node.

decision_values

Numeric vector of SVM decision values.

svm_probs

Optional SVM probability matrix (from predict(..., probability=TRUE)).

all_classes

Character vector of all possible classes.

calibrate

Logical; if TRUE, calibrates decision values into probabilities.

Value

A list with elements:


Collect every feature name used anywhere in the tree (depth-first)

Description

Collect every feature name used anywhere in the tree (depth-first)

Usage

get_tree_features(tree)

Calculate Gini Impurity

Description

Computes the Gini impurity for a vector of class labels.

Usage

gini(y)

Arguments

y

A vector of class labels.

Value

Numeric value representing Gini impurity (0 = pure, higher = more impure).


Handle small child nodes in tree splitting

Description

Internal helper function to handle situations where one or both child nodes resulting from a split have fewer samples than min_samples. Depending on which child is too small, it may stop splitting, create only one child, or return a flag to continue normal processing.

Usage

handle_small_children(
  left_idx,
  right_idx,
  min_samples,
  data,
  response,
  depth,
  max_depth,
  max_features,
  feature_method,
  impurity_measure,
  max_features_strategy,
  max_features_decrease_rate,
  max_features_random_range,
  penalize_used_features,
  feature_penalty_weight,
  n_subsets,
  used_features,
  class_weights,
  custom_class_weights,
  min_impurity_decrease = 0.001,
  features,
  scaler,
  all_classes,
  verbose,
  ...
)

Arguments

left_idx

Indices of samples assigned to the left child.

right_idx

Indices of samples assigned to the right child.

min_samples

Minimum number of samples required for a node to be valid.

data

The full dataset being split.

response

Name of the response variable.

depth

Current depth of the node.

max_depth

Maximum allowed depth for the tree.

max_features

Maximum number of features to consider at each split.

feature_method

Feature selection method (e.g., "random", "cor", "mutual").

max_features_strategy

Strategy for dynamic feature selection ("constant", "decrease", "random").

max_features_decrease_rate

Numeric; factor controlling feature decrease with depth.

max_features_random_range

Numeric vector of length 2 specifying min/max proportion for random features.

penalize_used_features

Logical; whether to penalize previously used features.

feature_penalty_weight

Numeric weight for penalizing used features.

used_features

Character vector of features used in ancestor nodes.

class_weights

Named numeric vector of class weights.

custom_class_weights

Optional custom class weights.

features

Character vector of features used at this node.

scaler

Optional scaler applied to features at this node.

all_classes

Character vector of all possible classes.

verbose

Logical; if TRUE, prints messages for debugging.

...

Additional arguments passed to svm_split.

Details

  • If both children are smaller than min_samples, a leaf node is created.

  • If only one child is too small, the other child is recursively split.

  • This function ensures that tree nodes respect the minimum sample requirement, avoiding invalid splits that could destabilize the SVM-based tree.

Value

A list with components:


Calculate Information Gain for a Feature Split

Description

Computes the reduction in impurity (information gain) when splitting a target variable by a categorical feature.

Usage

info_gain(feature, target, metric = c("entropy", "gini"))

Arguments

feature

A vector representing the splitting feature (categorical or factor).

target

A vector of class labels for the target variable.

metric

The impurity measure to use: either "entropy" or "gini".

Details

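Information gain is computed as:

IG = H(parent) - \sum_{v \in Values} \frac{n_v}{n} H(child_v)

where H is the impurity measure selected by metric, Values is the set of distinct values of feature, n_v is the number of samples for which feature takes the value v, and n is the total number of samples.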

Value

A numeric value representing the information gain.
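
The information gain formula can be written out in a few lines of base R. This is a sketch, not the package source: info_gain_sketch is a hypothetical name, and the natural logarithm is assumed for entropy.

```r
# Hypothetical sketch: impurity reduction from splitting target by feature.
info_gain_sketch <- function(feature, target, metric = c("entropy", "gini")) {
  metric <- match.arg(metric)
  impurity <- function(y) {
    p <- as.numeric(table(y)) / length(y)
    p <- p[p > 0]
    if (metric == "gini") 1 - sum(p^2) else -sum(p * log(p))
  }
  groups   <- split(target, feature, drop = TRUE)      # one group per feature value
  weights  <- lengths(groups) / length(target)         # n_v / n
  children <- vapply(groups, impurity, numeric(1))     # H(child_v)
  impurity(target) - sum(weights * children)
}

target <- c("a", "a", "b", "b")
perfect <- info_gain_sketch(c("L", "L", "R", "R"), target)  # children are pure
useless <- info_gain_sketch(c("L", "R", "L", "R"), target)  # children mirror the parent
```

A split that separates the classes perfectly recovers the full parent entropy, log(2), while a split that leaves both children as mixed as the parent yields a gain of 0.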


Create a leaf node for a decision tree

Description

Constructs a leaf node object containing class probabilities, predicted class, and metadata.

Usage

leaf_node(y, n, all_classes = NULL, features = character(0), scaler = NULL)

Arguments

y

Vector of class labels for the samples in the node.

n

Number of samples in the node.

all_classes

Optional character vector of all possible classes. If NULL, classes are inferred from y.

features

Character vector of features used at this node (default empty).

scaler

Optional scaler object applied to the features at this node.

Details

  • If some classes are missing in y, probabilities for those classes are set to 0.

  • If all probabilities are 0 or NA, a uniform probability distribution is used.

  • Probabilities are normalized to sum to 1.
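
The probability handling described above can be sketched as follows. The function name leaf_probs_sketch is hypothetical, and so are the prediction and n fields of the returned list; only is_leaf and class_prob appear elsewhere in this manual.

```r
# Hypothetical sketch: class probabilities for a leaf, padded to all_classes.
leaf_probs_sketch <- function(y, all_classes = NULL) {
  if (is.null(all_classes)) all_classes <- sort(unique(as.character(y)))

  # Classes absent from y get a count (and hence probability) of 0.
  counts <- table(factor(as.character(y), levels = all_classes))
  probs <- as.numeric(counts)
  names(probs) <- all_classes

  total <- sum(probs, na.rm = TRUE)
  if (is.na(total) || total == 0) {
    probs[] <- 1 / length(probs)  # uniform fallback
  } else {
    probs <- probs / total        # normalize to sum to 1
  }

  list(
    is_leaf = TRUE,
    prediction = names(probs)[which.max(probs)],  # assumed field name
    class_prob = probs,
    n = length(y)                                 # assumed field name
  )
}

leaf  <- leaf_probs_sketch(c("B", "B", "M"), all_classes = c("B", "M", "other"))
empty <- leaf_probs_sketch(character(0), all_classes = c("B", "M"))
```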

Value

A list representing a leaf node with components:


Plot method for svmodt_node objects

Description

Thin S3 wrapper that dispatches to plot_boundary or plot_surface depending on plot.type.

Usage

## S3 method for class 'svmodt_node'
plot(
  x,
  y = NULL,
  ...,
  data = NULL,
  response = NULL,
  plot.type = c("surface", "boundary"),
  features = NULL,
  max_depth = NULL,
  check_accuracy = TRUE,
  resolution = NULL
)

Arguments

x

An svmodt_node returned by svm_split.

y

Ignored; present only to satisfy the graphics::plot generic signature.

...

Currently unused.

data

The original training data frame (required).

response

Character string naming the response column (required).

plot.type

One of "surface" (default) or "boundary".

features

Length-2 character vector of axis features ("surface" only; default uses root node features).

max_depth

Maximum depth to visualize ("boundary" only; default NULL = full tree).

check_accuracy

Logical; show per-node accuracy ("boundary" only; default TRUE).

resolution

Grid resolution per axis. Default 100 for "boundary", 200 for "surface".

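Value

For plot.type = "boundary", the list returned by plot_boundary; for plot.type = "surface", the ggplot2 object returned by plot_surface.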

Examples


tree <- svm_split(wdbc, response = "diagnosis", max_depth = 3)

# All-node boundary panels - prints first, returns list
viz <- plot(tree,
  data = wdbc, response = "diagnosis",
  plot.type = "boundary"
)
viz$plots[[2]] # second node

# Global decision surface
plot(tree,
  data = wdbc, response = "diagnosis",
  plot.type = "surface"
)

# Surface with explicit feature axes
plot(tree,
  data = wdbc, response = "diagnosis",
  plot.type = "surface",
  features = c("radius_mean", "concavity_mean")
)



Plot SVM decision boundaries for every node in the tree

Description

Traverses the tree recursively and produces one plot per internal node, showing the SVM hyperplane for that node's binary split, the background region colouring, and the actual data points (coloured by true class). Each node receives only the subset of data that reaches it during training.

Usage

plot_boundary(
  tree,
  data,
  response_col = NULL,
  max_depth = NULL,
  check_accuracy = TRUE,
  resolution = 100
)

Arguments

tree

An svmodt_node object returned by svm_split.

data

The original training data frame.

response_col

Character string naming the response column in data. Auto-detected when NULL (first factor/character column not used as a predictor).

max_depth

Maximum tree depth to visualize. NULL (default) shows all nodes.

check_accuracy

Logical; if TRUE (default), compute and display training accuracy at each node.

resolution

Integer; grid resolution per axis (default 100). Increase for smoother boundaries at the cost of speed.

Value

Invisibly returns a list with four elements:

plots

Named list of ggplot2 objects, one per node. Names encode depth and path, e.g. "depth_1_Root", "depth_2_Root_L".

grid_data

Named list of data frames (full expanded grid used for each node's contour calculation).

accuracy_info

Named list of per-node metadata: depth, path, sample count, accuracy, features, whether the boundary was visible, and the pad factor that was needed.

response_col

The response column name used.


Plot the SVM decision boundary for a single internal node

Description

Internal workhorse called by plot_boundary for each node during tree traversal. Builds the grid in original space, scales it with the node's own scaler, predicts decision values, and returns a ggplot2 object together with metadata. The grid is expanded automatically (up to pad_factor = 3) if the hyperplane falls outside the data range.

Usage

plot_node_boundary(
  data,
  node_features,
  svm_model,
  scaler,
  response_col,
  title = "SVM Decision Boundary",
  resolution = 100
)

Plot the global decision surface of the full tree

Description

Predicts class labels across a 2-D grid using the complete tree (not individual node SVMs), then overlays the original data points. Because predictions come from svm_predict_tree, multiclass trees are handled correctly: each grid cell receives the final leaf prediction, which respects every one-vs-rest (OVR) split along the path.

Usage

plot_surface(tree, data, response, features = NULL, resolution = 200)

Arguments

tree

An svmodt_node object returned by svm_split.

data

The original training data frame.

response

Character string naming the response column in data.

features

Character vector of length 2 giving the two features to plot on the x and y axes. Defaults to the first two features used at the root.

resolution

Integer; grid resolution per axis (default 200). Higher values give smoother region boundaries.

Details

All features not used as plot axes are held fixed at their in-sample median (numeric) or mode (categorical). You choose which two features to plot via features; if omitted the first two features used at the root node are used.

Value

A ggplot2 object. The background tiles show the predicted class for each grid cell; points show true class labels.


Predict method for svmodt_node objects

Description

Predict method for svmodt_node objects

Usage

## S3 method for class 'svmodt_node'
predict(object, newdata, return_probs = FALSE, calibrate_probs = TRUE, ...)

Arguments

object

An object of class svmodt_node.

newdata

A data frame of new predictor values.

return_probs

Logical; if TRUE, returns predictions and probabilities.

calibrate_probs

Logical; if TRUE, uses logistic calibration on decision values.

...

Currently unused.

Value

If return_probs = FALSE (the default), a character vector of predicted class labels, one element per row of newdata.

If return_probs = TRUE, a named list with two elements:

predictions

Character vector of predicted class labels (length = nrow(newdata)).

probabilities

Numeric matrix of class probabilities with nrow(newdata) rows and one column per class. Column names are the class labels; each row sums to 1. When calibrate_probs = TRUE, probabilities are derived from the SVM decision value via logistic calibration; otherwise empirical class frequencies at the leaf node are used.

Examples


# Train an SVMODT tree
tree <- svm_split(
  data = wdbc,
  response = "diagnosis",
  max_depth = 3,
  max_features = 2,
  feature_method = "cor"
)

# Predict on WDBC data (returns a character vector of class labels)
preds <- predict(tree, newdata = wdbc)

# Predict with probabilities and logistic calibration
result <- predict(tree, newdata = wdbc,
  return_probs = TRUE, calibrate_probs = TRUE
)
head(result$predictions)
head(result$probabilities)


Print method for svmodt_node objects

Description

Print method for svmodt_node objects

Usage

## S3 method for class 'svmodt_node'
print(x, ...)

Arguments

x

An object of class svmodt_node.

...

Further arguments passed to print_svm_tree.

Value

Invisibly returns x (the svmodt_node object), called for its side effect of printing a human-readable summary of the tree structure to the console.

Examples


tree <- svm_split(
  data = wdbc,
  response = "diagnosis",
  max_features = 2,
  max_depth = 3,
  min_samples = 5,
  feature_method = "random",
  verbose = TRUE
)
print(tree)


Print the structure of an SVM-based decision tree

Description

Recursively prints the structure of an SVM-based decision tree.

Usage

print_svm_tree(
  tree,
  indent = "",
  show_probabilities = FALSE,
  show_feature_info = TRUE,
  show_penalties = TRUE
)

Arguments

tree

An object of class svmodt_node (leaf or tree).

indent

String used for indentation (for recursive calls).

show_probabilities

Logical; whether to display class probabilities at leaf nodes.

show_feature_info

Logical; whether to show features used at nodes.

show_penalties

Logical; whether to show penalty flags at nodes.

Value

Invisibly returns NULL. Prints to console.


Scale Numeric Features for Tree Nodes

Description

Internal utility function to standardize numeric features (zero mean, unit variance) and remove constant columns. Returns both the scaled training data and a transformer function for applying the same scaling to new data.

Usage

scale_node(df)

Arguments

df

A data frame containing numeric (or factor) features to be scaled.

Details

  • Constant features (zero variance or only one unique value) are automatically removed.

  • Standard deviation of zero is replaced with 1 to prevent division by zero.

  • Designed for internal use in SVM tree building and prediction pipelines.

Value

A list with two elements:

train

The scaled training data frame.

transform

A function that applies the same scaling to a new data frame.
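
A minimal sketch of a scaler that returns both the scaled data and a reusable transform closure. The name scale_node_sketch is hypothetical, and for simplicity this version keeps numeric columns only; how the package treats factor columns is not shown here.

```r
# Hypothetical sketch: standardize numeric columns and drop constant ones.
scale_node_sketch <- function(df) {
  num  <- names(df)[vapply(df, is.numeric, logical(1))]
  keep <- num[vapply(df[num], function(x) length(unique(x)) > 1, logical(1))]

  mu <- vapply(df[keep], mean, numeric(1))
  s  <- vapply(df[keep], stats::sd, numeric(1))
  s[is.na(s) | s == 0] <- 1  # guard against division by zero

  # The closure remembers mu, s and keep from the training data.
  transform <- function(newdata) {
    out <- newdata[, keep, drop = FALSE]
    for (nm in keep) out[[nm]] <- (out[[nm]] - mu[[nm]]) / s[[nm]]
    out
  }

  list(train = transform(df), transform = transform)
}

train_df <- data.frame(x = c(1, 2, 3), const = c(5, 5, 5))
sc <- scale_node_sketch(train_df)
new_scaled <- sc$transform(data.frame(x = 4, const = 5))
```

Because the training mean of x is 2 and its standard deviation is 1, the new value 4 is mapped to 2, and the constant column is dropped from both outputs.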


Check Stopping Conditions for Tree Splitting

Description

Internal utility function to determine if a node in a tree should stop splitting based on depth, purity, or minimum sample size.

Usage

stop_conditions_met(data, y, depth, max_depth, min_samples, verbose)

Arguments

data

A data frame of predictor features at the current node.

y

A vector of target values corresponding to data.

depth

Current depth of the node in the tree.

max_depth

Maximum allowed depth for the tree.

min_samples

Minimum number of samples required to split a node.

verbose

Logical; if TRUE, prints the reason for stopping.

Details

  • Stops if the node reaches max_depth.

  • Stops if all target values in the node are identical (pure node).

  • Stops if the number of samples is less than min_samples.

Value

Logical; TRUE if the node meets any stopping condition, FALSE otherwise.
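
The three stopping conditions reduce to a single logical expression. In this sketch the name stop_conditions_sketch is hypothetical, and "reaches max_depth" is interpreted as depth >= max_depth, which is an assumption about the exact comparison used by the package.

```r
# Hypothetical sketch: TRUE if any stopping condition holds.
stop_conditions_sketch <- function(y, depth, max_depth, min_samples) {
  depth >= max_depth ||          # assumed comparison for "reaches max_depth"
    length(unique(y)) <= 1 ||    # pure (or empty) node
    length(y) < min_samples      # too few samples to split
}

mixed <- c("a", "b", "a", "b", "a", "b")
keep_going <- stop_conditions_sketch(mixed, depth = 1, max_depth = 3, min_samples = 5)
too_deep   <- stop_conditions_sketch(mixed, depth = 3, max_depth = 3, min_samples = 5)
```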


Calculate Information Gain Using SVM-based Splits

Description

Computes the information gain achieved by splitting data using a linear SVM trained on a subset of features. The SVM's decision values determine the split, and information gain is calculated based on the resulting partitions.

Usage

svm_info_gain(
  feature_subset,
  data,
  response,
  metric = c("entropy", "gini"),
  verbose = FALSE
)

Arguments

feature_subset

Character vector of feature names to use for the SVM split.

data

A data frame containing predictors and the response variable.

response

Character string specifying the response variable name.

metric

Impurity measure for information gain calculation. One of:

  • "entropy" – entropy-based information gain (default).

  • "gini" – Gini impurity-based information gain.

verbose

Logical; if TRUE, prints diagnostic information.

Details

This function:

  1. Fits a linear SVM using the specified feature subset.

  2. Extracts decision values (distances from the hyperplane).

  3. Creates a binary split: samples with negative decision values go left, positive values go right.

  4. Calculates information gain using the info_gain function.

The SVM split creates an oblique (non-axis-aligned) partition, potentially capturing more complex decision boundaries than single-feature splits.

Value

Numeric value representing the information gain achieved by the SVM split.
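
The four steps listed under Details can be sketched with e1071 directly. This is an illustration under assumptions, not the package source: svm_info_gain_sketch is a hypothetical name, only the two-class case is handled (the first column of decision values is used), features are passed unscaled, decision values of exactly 0 are sent right, and only the entropy metric is implemented.

```r
# Hypothetical sketch; requires the e1071 package.
svm_info_gain_sketch <- function(feature_subset, data, response) {
  x <- as.matrix(data[, feature_subset, drop = FALSE])
  y <- droplevels(as.factor(data[[response]]))

  # 1. Fit a linear SVM on the chosen features.
  model <- e1071::svm(x, y, kernel = "linear", scale = FALSE)

  # 2. Extract decision values (signed distance-like scores).
  pred <- predict(model, x, decision.values = TRUE)
  dec  <- attr(pred, "decision.values")[, 1]

  # 3. Negative values go left, the rest go right.
  side <- ifelse(dec < 0, "left", "right")

  # 4. Entropy-based information gain of that binary partition.
  entropy <- function(v) {
    p <- as.numeric(table(v)) / length(v)
    p <- p[p > 0]
    -sum(p * log(p))
  }
  children <- vapply(split(y, side), entropy, numeric(1))
  weights  <- as.numeric(table(side)) / length(y)
  entropy(y) - sum(weights * children)
}

if (requireNamespace("e1071", quietly = TRUE)) {
  two_class <- droplevels(iris[iris$Species != "setosa", ])
  gain <- svm_info_gain_sketch(c("Petal.Length", "Petal.Width"),
                               two_class, "Species")
}
```

For a balanced two-class problem the gain is bounded above by log(2), the entropy of the parent node, so values close to that bound indicate a near-perfect oblique split.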


Predict Using a Support Vector Machine Oblique Decision Tree

Description

Predicts class labels or class probabilities for new data using a tree constructed with SVM splits. Handles leaf nodes, internal nodes, recursive traversal, and fallback mechanisms when SVM predictions or scaling fail.

Usage

svm_predict_tree(tree, newdata, return_probs = FALSE, calibrate_probs = TRUE)

Arguments

tree

A tree node object (leaf or internal) created by svm_split or svm_split_enhanced.

newdata

A data frame of new predictor values. **Must contain the same features** as those used to fit the tree. Any additional columns (including responses) are ignored.

return_probs

Logical; if TRUE, returns both predicted class labels and class probabilities.

calibrate_probs

Logical; if TRUE, converts SVM decision values to probabilities using logistic calibration (sigmoid) based on the distance from the hyperplane. If FALSE, fallback probabilities are computed from class frequencies at the leaf node.

Details

The function traverses the SVM-based oblique decision tree recursively and predicts class labels or probabilities. Key behaviors:

Value

If return_probs = FALSE, a character vector of predicted class labels. If return_probs = TRUE, a list with elements:


Build an Oblique Decision Tree Using SVM Splits

Description

Constructs a decision tree where each internal node uses a Support Vector Machine (SVM) to determine the split. Supports dynamic feature selection, feature penalization, scaling, and class weighting.

Usage

svm_split(
  data,
  response,
  depth = 1,
  max_depth = 10,
  min_samples = 5,
  max_features = NULL,
  feature_method = c("random", "mutual", "cor"),
  impurity_measure = c("entropy", "gini"),
  max_features_strategy = c("constant", "random", "decrease"),
  max_features_decrease_rate = 0.8,
  max_features_random_range = c(0.3, 1),
  penalize_used_features = FALSE,
  feature_penalty_weight = 0.5,
  n_subsets = 1,
  used_features = character(0),
  class_weights = c("none", "balanced", "custom"),
  custom_class_weights = NULL,
  min_impurity_decrease = 0.001,
  verbose = FALSE,
  all_classes = NULL,
  ...
)

Arguments

data

A data frame containing predictors and the response variable.

response

Character string specifying the response column in 'data'. All other columns are treated as predictors.

depth

Integer indicating the current recursion depth (used internally; default is 1).

max_depth

Maximum depth of the tree.

min_samples

Minimum number of samples required to attempt a split.

max_features

Maximum number of features to consider at each split.

feature_method

Feature selection method at each node. One of:

  • '"random"': randomly select features,

  • '"mutual"': select based on mutual information with the response,

  • '"cor"': select based on correlation with the response.

impurity_measure

Impurity criterion used to evaluate information gain. One of:

  • '"gini"': use Gini impurity,

  • '"entropy"': use Shannon entropy.

max_features_strategy

Strategy to adjust the number of features per node:

  • '"constant"': keep 'max_features' constant,

  • '"decrease"': reduce features with depth,

  • '"random"': randomly vary number of features within a range.

max_features_decrease_rate

Numeric fraction for decreasing features if 'max_features_strategy = "decrease"'.

max_features_random_range

Numeric vector of length 2 specifying min and max fraction of features if 'max_features_strategy = "random"'.

penalize_used_features

Logical; if TRUE, features used in ancestor nodes are penalized to encourage diversity.

feature_penalty_weight

Numeric (0–1) weight for penalizing previously used features.

n_subsets

Number of random feature combinations evaluated at each node when 'feature_method = "random"'.

used_features

Character vector of features already used in ancestor nodes (used internally).

class_weights

Character string specifying how to handle class imbalance. One of:

  • '"none"': no weighting,

  • '"balanced"': weight classes inversely proportional to their frequency,

  • '"custom"': use 'custom_class_weights'.

custom_class_weights

Optional named numeric vector specifying custom weights per class.

min_impurity_decrease

Minimum decrease in impurity required for a split to be considered valid.

verbose

Logical; if TRUE, prints information about each node during tree construction.

all_classes

Optional character vector of all possible response classes (used internally).

...

Additional arguments passed to the underlying SVM fitting function.
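
To make the three 'max_features_strategy' options concrete, the base R sketch below computes a per-node feature count. The helper n_features_at_node() is hypothetical, and both the geometric shrinkage used for "decrease" and the uniform draw used for "random" are assumptions about how the documented parameters could be applied, not a description of the package internals.

```r
# Hypothetical helper: number of features to consider at a node.
n_features_at_node <- function(n_total, depth, max_features = NULL,
                               strategy = c("constant", "random", "decrease"),
                               decrease_rate = 0.8,
                               random_range = c(0.3, 1)) {
  strategy <- match.arg(strategy)
  base <- if (is.null(max_features)) n_total else min(max_features, n_total)
  k <- switch(strategy,
    constant = base,
    # Assumption: geometric shrinkage, base * rate^(depth - 1).
    decrease = base * decrease_rate^(depth - 1),
    # Assumption: keep a uniformly drawn fraction of `base`.
    random = base * runif(1, random_range[1], random_range[2])
  )
  # Keep at least one feature and never exceed the number available.
  max(1L, min(n_total, as.integer(round(k))))
}

n_features_at_node(30, depth = 1, max_features = 10, strategy = "decrease")  # 10
n_features_at_node(30, depth = 3, max_features = 10, strategy = "decrease")  # 6
```

With these assumptions, deeper nodes fit SVMs on progressively fewer features, while the "random" strategy varies the count from node to node within the given range.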

Details

This function recursively splits the dataset using an SVM at each node. Splitting stops when the maximum depth is reached, the node contains fewer than 'min_samples' samples, or all samples belong to the same class. Features are scaled and selected dynamically at each node, and previously used features can be penalized to promote diversity. Class weighting schemes support imbalanced datasets. The result is an oblique decision tree, in which splits are linear hyperplanes rather than axis-aligned thresholds.
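
As a sketch of what "balanced" class weighting means, the base R snippet below computes weights inversely proportional to class frequency using the common n / (k * n_class) normalisation. The helper balanced_class_weights() is hypothetical, and the normalisation is an assumption; the package may scale its weights differently. A named numeric vector of this shape is also the form described for 'custom_class_weights'.

```r
# Hypothetical helper: weights inversely proportional to class frequency.
balanced_class_weights <- function(y) {
  counts <- table(droplevels(as.factor(y)))  # ignore unused factor levels
  w <- length(y) / (length(counts) * as.numeric(counts))
  names(w) <- names(counts)
  w
}

y <- factor(c(rep("B", 6), rep("M", 2)))
w <- balanced_class_weights(y)
w  # B = 8 / (2 * 6), approximately 0.667; M = 8 / (2 * 2) = 2
```

The minority class receives the larger weight, so misclassifying it costs more during SVM fitting.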

Value

A nested list representing the decision tree. Each node contains:

is_leaf

Logical; TRUE if the node is a leaf.

model

Fitted SVM model at this node (for internal nodes).

features

Vector of features selected for this node.

scaler

Scaling information used at this node.

left

Left child node (decision value > 0).

right

Right child node (decision value ≤ 0).

depth

Depth of this node in the tree.

n

Number of samples at this node.

max_features_used

Number of features considered at this node.

penalty_applied

Logical; TRUE if feature penalization was applied.

class_weights_used

Class weights applied at this node.
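
The way such a nested list is traversed can be sketched in base R with a toy tree. The node fields is_leaf, features, left, and right follow the list above; the fields w, b, and prediction are hypothetical stand-ins for the fitted SVM model and the stored leaf label, and predict_one() is not a package function.

```r
# Toy leaf constructor; `prediction` is a hypothetical field name.
leaf <- function(label) list(is_leaf = TRUE, prediction = label)

toy_tree <- list(
  is_leaf = FALSE,
  features = c("x1", "x2"),
  w = c(1, -1), b = 0,   # stand-in hyperplane: decision value = x1 - x2
  left = leaf("M"),      # taken when decision value > 0
  right = leaf("B")      # taken when decision value <= 0
)

# Recursively route one row down the tree until a leaf is reached.
predict_one <- function(node, row) {
  if (node$is_leaf) return(node$prediction)
  x <- unlist(row[node$features])
  decision_value <- sum(node$w * x) + node$b
  child <- if (decision_value > 0) node$left else node$right
  predict_one(child, row)
}

newdata <- data.frame(x1 = c(2, 0.5), x2 = c(1, 3))
preds <- vapply(seq_len(nrow(newdata)),
                function(i) predict_one(toy_tree, newdata[i, , drop = FALSE]),
                character(1))
preds  # "M" "B"
```

In a real tree the decision value would come from the SVM stored in 'model', applied to the node's selected and scaled features.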

Examples


data(wdbc)
tree <- svm_split(
  data = wdbc,
  response = "diagnosis",
  max_depth = 3,
  min_samples = 5,
  feature_method = "random",
  verbose = TRUE
)



Trace the prediction path of a sample through an svmodt tree

Description

Generic function that walks the tree for a single row of new data, printing the SVM decision value and chosen branch at every internal node and the final predicted class at the leaf.

Usage

trace_path(object, ...)

## S3 method for class 'svmodt_node'
trace_path(object, sample_data, sample_idx = 1, ...)

Arguments

object

An svmodt_node returned by svm_split.

...

Currently unused.

sample_data

A data frame of new predictor values (one or more rows).

sample_idx

Integer; which row to trace (default 1).

Value

Invisibly returns the predicted class label (character string).

Methods (by class)

Examples


tree <- svm_split(wdbc, response = "diagnosis", max_depth = 3)
trace_path(tree, wdbc, sample_idx = 5)



Trace Prediction Path for a Sample

Description

Shows the path taken by a single sample through the SVM tree, including decision values, branches, and final prediction.

Usage

trace_prediction_path(tree, sample_data, sample_idx = 1)

Arguments

tree

The tree object (for example, one returned by svm_split).

sample_data

Data frame containing the sample(s).

sample_idx

Index of the sample to trace (default 1).

Value

The predicted class for the sample (a character string). Called primarily for its side effect of printing the full decision path to the console, including node features, SVM decision values, branch directions, and the final predicted class label.


Wisconsin Diagnostic Breast Cancer Dataset

Description

The WDBC dataset contains quantitative measurements from digitized images of fine needle aspirates (FNA) of breast masses. It is commonly used for classification tasks to distinguish between benign and malignant tumors.

Usage

wdbc

Format

A data frame with 569 rows and 32 columns:

radius_mean

Mean of radius

radius_se

Standard error of radius

radius_worst

Worst (largest) radius

texture_mean

Mean of texture

texture_se

Standard error of texture

texture_worst

Worst texture

perimeter_mean

Mean of perimeter

perimeter_se

Standard error of perimeter

perimeter_worst

Worst perimeter

area_mean

Mean area

area_se

Standard error of area

area_worst

Worst area

smoothness_mean

Mean smoothness

smoothness_se

Standard error of smoothness

smoothness_worst

Worst smoothness

compactness_mean

Mean compactness

compactness_se

Standard error of compactness

compactness_worst

Worst compactness

concavity_mean

Mean concavity

concavity_se

Standard error of concavity

concavity_worst

Worst concavity

concave.points_mean

Mean concave points

concave.points_se

Standard error of concave points

concave.points_worst

Worst concave points

symmetry_mean

Mean symmetry

symmetry_se

Standard error of symmetry

symmetry_worst

Worst symmetry

fractal_dimension_mean

Mean fractal dimension

fractal_dimension_se

Standard error of fractal dimension

fractal_dimension_worst

Worst fractal dimension

diagnosis

Factor with levels 'B' (benign) and 'M' (malignant)

Source

Dr. William H. Wolberg, W. Nick Street, and Olvi L. Mangasarian, University of Wisconsin–Madison. Original dataset available at: <https://archive.ics.uci.edu/dataset/17/breast+cancer+wisconsin+diagnostic>


Wine Dataset

Description

The Wine dataset contains the results of a chemical analysis of wines derived from three different cultivars grown in the same region of Italy. The dataset is commonly used for multiclass classification tasks, where the objective is to identify the cultivar of origin based on physicochemical properties.

Usage

wine

Format

A data frame with 178 rows and 14 columns:

class

Factor with levels 1, 2, and 3 indicating cultivar

alcohol

Alcohol content

malic_acid

Malic acid concentration

ash

Ash content

alcalinity_of_ash

Alkalinity of ash

magnesium

Magnesium content

total_phenols

Total phenols

flavanoids

Flavonoid content

nonflavanoid_phenols

Nonflavanoid phenols

proanthocyanins

Proanthocyanin content

color_intensity

Color intensity

hue

Hue

od280_od315

OD280/OD315 of diluted wines

proline

Proline concentration

Source

Aeberhard, S. & Forina, M. (1992). Wine Dataset. UCI Machine Learning Repository. Original dataset available at: <https://archive.ics.uci.edu/dataset/109/wine>
