Help for package svmodt

Type:

Package

Title:

Linear SVM-Based Recursive Decision Trees

Version:

0.1.0

Description:

Implements Support Vector Machine Oblique Decision Trees (SVMODT). Recursively builds classification trees using linear Support Vector Machine (SVM) hyperplanes at each node instead of axis-parallel splits, creating oblique decision boundaries. Features include multiple feature selection methods, dynamic feature subset strategies, class weight support for imbalanced datasets, pruning, and feature penalization.

License:

GPL (≥ 3)

Encoding:

UTF-8

LazyData:

true

Suggests:

knitr, rmarkdown, bookdown, testthat (≥ 3.0.0), rpart, rsample, gridExtra, tidyr, kableExtra, palmerpenguins, dplyr

VignetteBuilder:

knitr

Depends:

R (≥ 3.5)

Imports:

rlang, e1071, FSelectorRcpp, ggplot2

RoxygenNote:

7.3.3

Config/testthat/edition:

URL:

https://github.com/AneeshAgarwala/svmodt

BugReports:

https://github.com/AneeshAgarwala/svmodt/issues

NeedsCompilation:

Packaged:

2026-06-24 09:10:26 UTC; AneeshAG

Author:

Aneesh Agarwal [aut, cre, cph], Jack Jewson [aut, ths], Erik Sverdrup [aut, ths]

Maintainer:

Aneesh Agarwal <aaga0022@student.monash.edu>

Repository:

CRAN

Date/Publication:

2026-06-30 11:10:02 UTC

Apply a scaler transformation to a data frame

Description

This internal helper function applies a scaling transformation to a data frame using a provided scaler object. It returns the unscaled data in case of failure.

Usage

apply_scaler(df, scaler)

Arguments

df

A data frame containing numeric features to be scaled.

scaler

A scaler object with a 'transform' method or function used to scale the data.

Details

This function is intended for internal use within the package and is not exported. It wraps the scaler's 'transform()' call in error handling to prevent failures from interrupting higher-level processes.

Value

A scaled data frame. If scaling fails or invalid inputs are provided, the original (unscaled) data frame is returned.
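
The fallback behaviour described above can be sketched as follows. This is a minimal illustration, assuming the scaler is a list carrying a 'transform' function; 'apply_scaler_sketch' is a hypothetical name and not the package's internal code.

```r
# Hypothetical stand-in for the internal helper: try the scaler and
# fall back to the unscaled input if the inputs are invalid or scaling fails.
apply_scaler_sketch <- function(df, scaler) {
  if (!is.data.frame(df) || !is.list(scaler) || !is.function(scaler$transform)) {
    return(df)
  }
  tryCatch(scaler$transform(df), error = function(e) df)
}

df <- data.frame(x = c(1, 2, 3))

# A scaler that works (centres x) and one that always fails
good <- list(transform = function(d) { d$x <- d$x - mean(d$x); d })
bad  <- list(transform = function(d) stop("scaling failed"))

scaled   <- apply_scaler_sketch(df, good)  # centred copy of df
fallback <- apply_scaler_sketch(df, bad)   # df returned unchanged
```

Returning the input instead of raising an error keeps a single bad node from aborting a whole prediction pass, at the cost of silently using unscaled values.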

Check whether a decision-value vector crosses zero

Description

Check whether a decision-value vector crosses zero

Usage

boundary_in_grid(dec_values)

Dynamically determine the number of features to consider at a node

Description

Computes the number of features to be used for splitting at a given tree depth based on the specified strategy. Supports constant, decreasing, and random feature selection strategies.

Usage

calculate_dynamic_max_features(
  data,
  response,
  base_max_features,
  depth,
  strategy = "constant",
  decrease_rate = 0.8,
  random_range = c(0.3, 1),
  verbose = FALSE
)

Arguments

data

A data frame containing the predictor variables and the response variable.

response

A character string specifying the name of the response variable to exclude from the feature set.

base_max_features

Integer; the base number of features to consider. If 'NULL', all available features (excluding the response) are used.

depth

Integer; the current depth of the node in the tree (used for depth-dependent strategies).

strategy

Character string specifying how to determine the number of features. One of:

'"constant"' – always use 'base_max_features' (default).
'"decrease"' – exponentially decrease the number of features with depth.
'"random"' – randomly select the number of features within a range.

decrease_rate

Numeric; factor in (0, 1] controlling how fast the number of features decreases with depth when 'strategy = "decrease"'. Default is 0.8.

random_range

Numeric vector of length 2 specifying the lower and upper bounds (as proportions of total features) for random selection when 'strategy = "random"'. Default is 'c(0.3, 1.0)'.

verbose

Logical; if 'TRUE', prints details about the chosen strategy and resulting feature count.

Details

This function helps control model complexity and randomness by varying the number of features used at each split.

Input parameters are validated to ensure sensible defaults. The result is capped to avoid exceeding the total number of available features.

Value

Integer; the number of features to consider at the current node. The value is always constrained between 1 and the total number of available features.
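
As a rough illustration of the three strategies, the sketch below takes the total feature count directly instead of a data frame. The function name and the exact formulas (geometric decay starting at depth 1, a uniform draw over the proportion range) are assumptions for illustration, not the package source.

```r
# Illustrative only: formulas are assumed, not taken from the package.
dynamic_max_features_sketch <- function(n_features, base_max_features = NULL,
                                        depth = 1, strategy = "constant",
                                        decrease_rate = 0.8,
                                        random_range = c(0.3, 1)) {
  if (is.null(base_max_features)) base_max_features <- n_features
  k <- switch(strategy,
    constant = base_max_features,
    # assumed form: shrink geometrically, no shrinkage at depth 1
    decrease = floor(base_max_features * decrease_rate^(depth - 1)),
    # assumed form: draw a proportion of the total feature count
    random   = round(n_features * runif(1, random_range[1], random_range[2])),
    stop("unknown strategy: ", strategy)
  )
  # keep the result between 1 and the number of available features
  as.integer(max(1, min(k, n_features)))
}

k_constant <- dynamic_max_features_sketch(10, base_max_features = 6, depth = 1)
k_decrease <- dynamic_max_features_sketch(10, base_max_features = 6, depth = 4,
                                          strategy = "decrease")
set.seed(1)
k_random <- dynamic_max_features_sketch(10, strategy = "random")
```

Under these assumptions, 'k_constant' is 6 and 'k_decrease' is 3 (6 * 0.8^3 = 3.072, rounded down); 'k_random' depends on the seed but always lies between 1 and 10.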

Calculate feature associations with a response variable

Description

Computes the association strength between each predictor and the response variable. For numeric predictors, the absolute Pearson correlation is used. For categorical predictors, association is estimated using an ANOVA-based pseudo-R^2 measure.

Usage

calculate_feature_associations(data, response, predictors)

Arguments

data

A data frame containing the response and predictor variables.

response

A string specifying the response variable name.

predictors

A character vector of predictor names to evaluate.

Details

- **Numeric predictors:** Computed using the absolute Pearson correlation.
- **Categorical predictors:** Uses the square root of the ratio of between-group sum of squares to total sum of squares from an ANOVA model.

Value

A named numeric vector of association values (0 to 1) for each predictor.

Calculate node impurity

Description

Computes the impurity of a node using either Gini impurity or entropy.

Usage

calculate_impurity(y, method = c("gini", "entropy"))

Arguments

y

A vector of class labels for the node.

method

A string specifying the impurity measure: either "gini" or "entropy".

Details

If method = "gini", the impurity is calculated as:

G = 1 - \sum_i p_i^2

where p_i is the proportion of samples in class i in the node.

If method = "entropy", the impurity is calculated as:

H = - \sum_i p_i \log(p_i)

Value

A numeric value representing the impurity of the node.
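
Both formulas can be checked by hand with a few lines of base R. The sketch below assumes the natural logarithm for entropy, as the formula above writes a plain log; 'impurity_sketch' is a hypothetical name.

```r
# Gini impurity and entropy from class proportions (natural log assumed).
impurity_sketch <- function(y, method = c("gini", "entropy")) {
  method <- match.arg(method)
  p <- as.numeric(table(y)) / length(y)
  p <- p[p > 0]  # drop empty factor levels so 0 * log(0) never appears
  if (method == "gini") 1 - sum(p^2) else -sum(p * log(p))
}

y <- c("a", "a", "b", "b")
g <- impurity_sketch(y, "gini")      # 0.5
h <- impurity_sketch(y, "entropy")   # log(2), about 0.693
pure <- impurity_sketch(c("a", "a")) # 0 for a pure node
```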

Calculate class weights for a node

Description

Computes class weights for a given set of target values based on the chosen weighting strategy. Supports unweighted, balanced, and custom weighting schemes, with optional verbosity for diagnostic output.

Usage

calculate_node_class_weights(
  y,
  class_weights = "none",
  custom_class_weights = NULL,
  verbose = FALSE
)

Arguments

y

A vector of class labels at the current node.

class_weights

Character string specifying the weighting strategy. Options are:

'"none"' – no weighting (default).
'"balanced"' – weights inversely proportional to class frequencies.
'"custom"' – user-provided custom weights.

custom_class_weights

Named numeric vector of custom class weights (used only if 'class_weights = "custom"'). Names must match the unique class labels in 'y'.

verbose

Logical; if 'TRUE', prints detailed information about computed weights.

Details

The function caps computed class weights at 10 to avoid excessively large scaling factors.

Value

A named numeric vector of class weights for each unique class in 'y', or 'NULL' if equal weights are used ('class_weights = "none"') or if the custom weights are invalid.
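
A minimal sketch of the "balanced" strategy with the cap at 10 is shown below. The formula n / (number of classes * class count) is a common convention and is assumed here; the package may compute the weights differently. 'balanced_weights_sketch' is a hypothetical name.

```r
# Weights inversely proportional to class frequency, capped at `cap`.
balanced_weights_sketch <- function(y, cap = 10) {
  counts <- table(y)
  counts <- counts[counts > 0]
  # assumed convention: n / (n_classes * class_count)
  w <- length(y) / (length(counts) * as.numeric(counts))
  names(w) <- names(counts)
  w[w > cap] <- cap  # cap very large weights for rare classes
  w
}

y <- c(rep("benign", 98), rep("malignant", 2))
w <- balanced_weights_sketch(y)
```

Here the uncapped weight for "malignant" would be 100 / (2 * 2) = 25, so the cap reduces it to 10, while "benign" receives 100 / (2 * 98), about 0.51.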

Select a subset of features based on correlation, mutual information, or randomness

Description

Chooses up to a specified number of features from a dataset using one of three methods: random sampling, correlation with the response, or mutual information ranking.

Usage

choose_features(
  data,
  response,
  max_features,
  method = c("random", "mutual", "cor"),
  n_subsets = 1
)

Arguments

data

A data frame containing the response and predictor variables.

response

A string specifying the response variable name.

max_features

Integer specifying the maximum number of features to select.

method

Selection strategy. One of:

"random" – randomly selects features.
"mutual" – ranks features by mutual information with the response (requires FSelectorRcpp).
"cor" – ranks features by absolute correlation with the response.

Details

- If the number of predictors is less than or equal to max_features, all are returned.
- If method = "mutual" and FSelectorRcpp is not installed or fails, the function gracefully falls back to the correlation-based method.
- The correlation method internally calls calculate_feature_associations.

Value

A character vector of selected feature names.

Select features with optional penalty for previously used features

Description

Internal helper function to select a subset of features while optionally penalizing features that have been used in ancestor nodes. Supports random selection, mutual information, or correlation-based ranking.

Usage

choose_features_with_penalty(
  data,
  response,
  max_features,
  method = c("random", "mutual", "cor"),
  penalize_used = FALSE,
  penalty_weight = 0.5,
  used_features = character(0),
  n_subsets = 1,
  verbose = FALSE
)

Arguments

data

A data frame containing predictors and the response.

response

Name of the response variable.

max_features

Maximum number of features to select.

method

Feature selection method; one of "random", "mutual", or "cor".

penalize_used

Logical; if TRUE, previously used features are penalized.

penalty_weight

Numeric (0–0.99); fraction by which to reduce the score/weight of used features.

used_features

Character vector of features previously used in the tree.

verbose

Logical; if TRUE, prints information about penalties applied.

Details

- Penalized features have their selection weight or score reduced by multiplying by (1 - penalty_weight).
- For method = "random", the penalty reduces the probability of sampling a feature.
- For method = "mutual" or "cor", the penalty reduces feature importance or correlation.
- If no valid features are available for correlation, the function falls back to random selection with penalty.
- Ensures that no feature is entirely excluded; penalty_weight is capped below 1.

Value

Character vector of selected feature names.
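
For the "random" method, the penalty can be pictured as a weighted draw. The sketch below is an assumption-laden illustration in base R ('penalized_random_features_sketch' is a hypothetical name); it shows only the random case, not the mutual-information or correlation rankings.

```r
# Weighted sampling: used features keep a reduced, but never zero, weight.
penalized_random_features_sketch <- function(predictors, max_features,
                                             used_features = character(0),
                                             penalty_weight = 0.5) {
  penalty_weight <- min(max(penalty_weight, 0), 0.99)  # keep every weight > 0
  w <- rep(1, length(predictors))
  w[predictors %in% used_features] <- 1 - penalty_weight
  k <- min(max_features, length(predictors))
  sample(predictors, size = k, prob = w)
}

set.seed(42)
predictors <- c("radius", "texture", "area", "smoothness")
picked <- penalized_random_features_sketch(
  predictors, max_features = 2,
  used_features = "radius", penalty_weight = 0.9
)
```

With penalty_weight = 0.9, "radius" is drawn with one tenth of the weight of the other features, so it can still appear but does so rarely.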

Convert SVM decision values to probabilities

Description

Converts numeric SVM decision values into probabilities using a logistic/sigmoid transformation. Optionally uses the model's training decision values for calibration. Intended for internal use within the SVM tree prediction workflow.

Usage

convert_decision_to_probs(decision_values, model = NULL)

Arguments

decision_values

Numeric vector of decision values.

model

Optional svm object; if provided, training decision values are used to calibrate scaling.

Value

Numeric vector of probabilities, clipped between 0.001 and 0.999.
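
The sigmoid mapping and clipping can be sketched as below. The 'scale' argument is a hypothetical stand-in for whatever calibration the package derives from the training decision values; only the logistic form and the clipping bounds come from the description above.

```r
# Logistic transform of decision values, clipped to [0.001, 0.999].
decision_to_probs_sketch <- function(decision_values, scale = 1) {
  p <- 1 / (1 + exp(-decision_values / scale))
  pmin(pmax(p, 0.001), 0.999)
}

probs <- decision_to_probs_sketch(c(-20, 0, 20))  # 0.001 0.500 0.999
```

Clipping keeps the probabilities away from exactly 0 and 1, so downstream normalization or log-based scores stay finite.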

Build a 2-D prediction grid in ORIGINAL (unscaled) feature space

Description

The two plot features vary over their observed range plus padding; every other node feature is fixed at its median. Returned unscaled so axis labels stay readable; callers scale it themselves before predicting.

Usage

create_decision_grid(
  data,
  plot_features,
  all_node_features,
  resolution = 100,
  pad_factor = 0.5
)
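
The grid construction described above can be sketched with expand.grid. This is an illustration under assumptions: 'decision_grid_sketch' is a hypothetical name, and pad_factor is taken to mean a fraction of each axis range added on both sides.

```r
# Two plot features span a padded range; other node features sit at their median.
decision_grid_sketch <- function(data, plot_features, all_node_features,
                                 resolution = 100, pad_factor = 0.5) {
  axis_seq <- function(x) {
    r <- range(x, na.rm = TRUE)
    pad <- pad_factor * diff(r)  # assumed meaning of pad_factor
    seq(r[1] - pad, r[2] + pad, length.out = resolution)
  }
  grid <- expand.grid(lapply(data[plot_features], axis_seq))
  for (f in setdiff(all_node_features, plot_features)) {
    grid[[f]] <- median(data[[f]], na.rm = TRUE)
  }
  grid
}

grid <- decision_grid_sketch(
  iris,
  plot_features = c("Sepal.Length", "Petal.Length"),
  all_node_features = c("Sepal.Length", "Petal.Length", "Petal.Width"),
  resolution = 5
)
```

The result stays in original units; a caller would pass it through the node's scaler before asking the SVM for decision values.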

Calculate Entropy

Description

Computes the entropy for a vector of class labels.

Usage

entropy(y)

Arguments

y

A vector of class labels.

Value

Numeric value representing entropy (0 = pure, higher = more impure).

Evaluate Multiple Random Feature Subsets Using SVM Information Gain

Description

Generates and evaluates multiple random feature subsets, ranking them by the information gain achieved through SVM-based splits.

Usage

evaluate_random_subsets(
  data,
  predictors,
  response,
  n_subsets = 5,
  subset_size = 4,
  metric = c("entropy", "gini"),
  verbose = FALSE
)

Arguments

data

A data frame containing predictors and the response variable.

predictors

Character vector of available predictor names.

response

Character string specifying the response variable name.

n_subsets

Integer; number of random feature subsets to evaluate.

subset_size

Integer; number of features in each subset.

metric

Impurity measure for information gain. One of "entropy" or "gini".

verbose

Logical; if TRUE, prints evaluation progress.

Details

This function randomly samples n_subsets different combinations of subset_size features from the predictor pool, evaluates each subset using svm_info_gain, and returns them ranked by performance.

If subset_size is greater than the number of available predictors, it is automatically reduced to match the predictor count.

Value

A data frame with two columns:

features: List column containing character vectors of feature names.
info_gain: Numeric vector of information gain values.

The data frame is sorted in descending order by information gain.

Fit a linear SVM model with optional class weights

Description

Fits a linear Support Vector Machine (SVM) classifier using the e1071 package, with optional class-specific weights to handle class imbalance.

Usage

fit_svm_with_weights(X_scaled, y, class_weights_vec, verbose = FALSE, ...)

Arguments

X_scaled

A data frame or matrix of predictor variables.

y

A vector of class labels corresponding to the rows of X_scaled.

class_weights_vec

Optional named numeric vector of class weights. Names must match the unique class labels in y. Weights are capped at 10 to prevent instability.

verbose

Logical; if TRUE, prints diagnostic messages during fitting.

...

Additional arguments passed to svm.

Details

- Uses a **linear kernel** by default.
- Enables decision values and probability estimates.
- Scaling is disabled (scale = FALSE).
- When class_weights_vec is supplied, weights are capped at 10 and passed to svm via its class.weights parameter.
- Returns NULL if data is empty or model fitting fails.

Value

A fitted svm model object (of class "svm") on success, or NULL if fitting fails.

Retrieve all class labels from a decision tree

Description

Recursively extracts all unique class labels stored in a decision tree's leaf nodes.

Usage

get_all_classes(tree)

Arguments

tree

A decision tree object, where each node may contain:

is_leaf – logical; TRUE if the node is a leaf.
class_prob – named numeric vector of class probabilities.
left, right – child node objects.

Value

A character vector of all unique class labels present in the tree.
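
The recursion over the node fields listed above can be sketched on a toy tree built from plain lists. 'collect_classes_sketch' is a hypothetical name, and the toy nodes carry only the fields needed for the illustration.

```r
# Depth-first walk: leaves contribute the names of their class_prob vector.
collect_classes_sketch <- function(node) {
  if (is.null(node)) return(character(0))
  if (isTRUE(node$is_leaf)) return(names(node$class_prob))
  unique(c(collect_classes_sketch(node$left),
           collect_classes_sketch(node$right)))
}

toy_tree <- list(
  is_leaf = FALSE,
  left  = list(is_leaf = TRUE, class_prob = c(B = 0.9, M = 0.1)),
  right = list(
    is_leaf = FALSE,
    left  = list(is_leaf = TRUE, class_prob = c(M = 1)),
    right = NULL  # a missing child is simply skipped
  )
)

classes <- collect_classes_sketch(toy_tree)  # "B" "M"
```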

Fallback predictions for SVM decision tree nodes

Description

Generates class predictions and probabilities when SVM predictions are unavailable or insufficient. This function is intended for internal use within the SVM tree.

Usage

get_fallback_predictions(
  model,
  X_scaled,
  decision_values,
  svm_probs = NULL,
  all_classes,
  calibrate = TRUE
)

Arguments

model

An svm object fitted with training data.

X_scaled

Scaled predictor matrix for the current node.

decision_values

Numeric vector of SVM decision values.

svm_probs

Optional SVM probability matrix (from predict(..., probability=TRUE)).

all_classes

Character vector of all possible classes.

calibrate

Logical; if TRUE, calibrates decision values into probabilities.

Value

A list with elements:

predictions: Character vector of predicted classes.
probabilities: Matrix of class probabilities (rows = samples, columns = classes).

Collect every feature name used anywhere in the tree (depth-first)

Description

Collect every feature name used anywhere in the tree (depth-first)

Usage

get_tree_features(tree)

Calculate Gini Impurity

Description

Computes the Gini impurity for a vector of class labels.

Usage

gini(y)

Arguments

y

A vector of class labels.

Value

Numeric value representing Gini impurity (0 = pure, higher = more impure).

Handle small child nodes in tree splitting

Description

Internal helper function to handle situations where one or both child nodes resulting from a split have fewer samples than min_samples. Depending on which child is too small, it may stop splitting, create only one child, or return a flag to continue normal processing.

Usage

handle_small_children(
  left_idx,
  right_idx,
  min_samples,
  data,
  response,
  depth,
  max_depth,
  max_features,
  feature_method,
  impurity_measure,
  max_features_strategy,
  max_features_decrease_rate,
  max_features_random_range,
  penalize_used_features,
  feature_penalty_weight,
  n_subsets,
  used_features,
  class_weights,
  custom_class_weights,
  min_impurity_decrease = 0.001,
  features,
  scaler,
  all_classes,
  verbose,
  ...
)

Arguments

left_idx

Indices of samples assigned to the left child.

right_idx

Indices of samples assigned to the right child.

min_samples

Minimum number of samples required for a node to be valid.

data

The full dataset being split.

response

Name of the response variable.

depth

Current depth of the node.

max_depth

Maximum allowed depth for the tree.

max_features

Maximum number of features to consider at each split.

feature_method

Feature selection method (e.g., "random", "cor", "mutual").

max_features_strategy

Strategy for dynamic feature selection ("constant", "decrease", "random").

max_features_decrease_rate

Numeric; factor controlling feature decrease with depth.

max_features_random_range

Numeric vector of length 2 specifying min/max proportion for random features.

penalize_used_features

Logical; whether to penalize previously used features.

feature_penalty_weight

Numeric weight for penalizing used features.

used_features

Character vector of features used in ancestor nodes.

class_weights

Named numeric vector of class weights.

custom_class_weights

Optional custom class weights.

features

Character vector of features used at this node.

scaler

Optional scaler applied to features at this node.

all_classes

Character vector of all possible classes.

verbose

Logical; if TRUE, prints messages for debugging.

...

Additional arguments passed to svm_split.

Details

- If both children are smaller than min_samples, a leaf node is created.
- If only one child is too small, the other child is recursively split.
- This function ensures that tree nodes respect the minimum sample requirement, avoiding invalid splits that could destabilize the SVM-based tree.

Value

A list with components:

stop Logical; TRUE if splitting should stop at this node.
node Either a leaf node object (if stopping) or a partially built internal node with only one child (if one child is too small).

Calculate Information Gain for a Feature Split

Description

Computes the reduction in impurity (information gain) when splitting a target variable by a categorical feature.

Usage

info_gain(feature, target, metric = c("entropy", "gini"))

Arguments

feature

A vector representing the splitting feature (categorical or factor).

target

A vector of class labels for the target variable.

metric

The impurity measure to use: either "entropy" or "gini".

Details

Information gain is computed as:

IG = H(parent) - \sum_{v \in Values} \frac{n_v}{n} H(child_v)

where:

H(parent) is the impurity of the original target vector,
H(child_v) is the impurity of the subset of target where feature = v,
n_v is the number of samples where feature = v,
n is the total number of samples.

Value

A numeric value representing the information gain.
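
The formula above translates directly into base R. In this sketch the impurity function is passed in as an argument; both function names are hypothetical, and entropy uses the natural logarithm as an assumption.

```r
# Entropy of a label vector (natural log assumed).
entropy_sketch <- function(y) {
  p <- as.numeric(table(y)) / length(y)
  p <- p[p > 0]
  -sum(p * log(p))
}

# IG = H(parent) - sum_v (n_v / n) * H(child_v)
info_gain_sketch <- function(feature, target, impurity = entropy_sketch) {
  n <- length(target)
  children <- split(target, feature, drop = TRUE)
  weighted <- sum(vapply(
    children,
    function(ch) length(ch) / n * impurity(ch),
    numeric(1)
  ))
  impurity(target) - weighted
}

target <- c("M", "M", "B", "B")
gain_perfect <- info_gain_sketch(c("left", "left", "right", "right"), target)
gain_none    <- info_gain_sketch(c("left", "right", "left", "right"), target)
```

A split that separates the classes completely recovers the full parent entropy, log(2) here, while a split that leaves each child as mixed as the parent yields a gain of 0.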

Create a leaf node for a decision tree

Description

Constructs a leaf node object containing class probabilities, predicted class, and metadata.

Usage

leaf_node(y, n, all_classes = NULL, features = character(0), scaler = NULL)

Arguments

y

Vector of class labels for the samples in the node.

n

Number of samples in the node.

all_classes

Optional character vector of all possible classes. If NULL, classes are inferred from y.

features

Character vector of features used at this node (default empty).

scaler

Optional scaler object applied to the features at this node.

Details

- If some classes are missing in y, probabilities for those classes are set to 0.
- If all probabilities are 0 or NA, a uniform probability distribution is used.
- Probabilities are normalized to sum to 1.

Value

A list representing a leaf node with components:

is_leaf Logical; TRUE.
prediction Predicted class (majority class in the node).
n Number of samples in the node.
features Features used at this node.
scaler Optional scaler applied to the node features.
class_prob Named numeric vector of class probabilities (sums to 1).
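
The probability handling described in Details can be sketched as follows. 'leaf_probs_sketch' is a hypothetical name; it omits the features and scaler components and makes no claim about the object class the package assigns.

```r
# Class probabilities over all_classes, with a uniform fallback for empty nodes.
leaf_probs_sketch <- function(y, all_classes = NULL) {
  if (is.null(all_classes)) all_classes <- sort(unique(as.character(y)))
  counts <- table(factor(as.character(y), levels = all_classes))
  probs <- as.numeric(counts)
  names(probs) <- all_classes
  if (sum(probs) == 0) {
    probs[] <- 1 / length(probs)  # no usable labels: uniform distribution
  } else {
    probs <- probs / sum(probs)   # normalize to sum to 1
  }
  list(
    is_leaf = TRUE,
    prediction = names(probs)[which.max(probs)],
    n = length(y),
    class_prob = probs
  )
}

leaf <- leaf_probs_sketch(c("B", "B", "M"), all_classes = c("B", "M", "X"))
leaf$class_prob  # B = 2/3, M = 1/3, X = 0
leaf$prediction  # "B"
```

Carrying a zero entry for classes absent from the node keeps every leaf's probability vector the same length, which makes it straightforward to stack leaf outputs into one matrix.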

Plot method for svmodt_node objects

Description

Thin S3 wrapper that dispatches to plot_boundary or plot_surface depending on plot.type.

Usage

## S3 method for class 'svmodt_node'
plot(
  x,
  y = NULL,
  ...,
  data = NULL,
  response = NULL,
  plot.type = c("surface", "boundary"),
  features = NULL,
  max_depth = NULL,
  check_accuracy = TRUE,
  resolution = NULL
)

Arguments

x

An svmodt_node returned by svm_split.

y

Ignored; present only to satisfy the graphics::plot generic signature.

...

Currently unused.

data

The original training data frame (required).

response

Character string naming the response column (required).

plot.type

One of "surface" (default) or "boundary".

features

Length-2 character vector of axis features ("surface" only; default uses root node features).

max_depth

Maximum depth to visualize ("boundary" only; default NULL = full tree).

check_accuracy

Logical; show per-node accuracy ("boundary" only; default TRUE).

resolution

Grid resolution per axis. Default 100 for "boundary", 200 for "surface".

Value

"boundary": invisibly returns the list from plot_boundary.
"surface": invisibly returns the ggplot2 object from plot_surface.

Examples


tree <- svm_split(wdbc, response = "diagnosis", max_depth = 3)

# All-node boundary panels - prints first, returns list
viz <- plot(tree,
  data = wdbc, response = "diagnosis",
  plot.type = "boundary"
)
viz$plots[[2]] # second node

# Global decision surface
plot(tree,
  data = wdbc, response = "diagnosis",
  plot.type = "surface"
)

# Surface with explicit feature axes
plot(tree,
  data = wdbc, response = "diagnosis",
  plot.type = "surface",
  features = c("radius_mean", "concavity_mean")
)

Plot SVM decision boundaries for every node in the tree

Description

Traverses the tree recursively and produces one plot per internal node, showing the SVM hyperplane for that node's binary split, the background region colouring, and the actual data points (coloured by true class). Each node receives only the subset of data that reaches it during training.

Usage

plot_boundary(
  tree,
  data,
  response_col = NULL,
  max_depth = NULL,
  check_accuracy = TRUE,
  resolution = 100
)

Arguments

tree

An svmodt_node object returned by svm_split.

data

The original training data frame.

response_col

Character string naming the response column in data. Auto-detected when NULL (first factor/character column not used as a predictor).

max_depth

Maximum tree depth to visualize. NULL (default) shows all nodes.

check_accuracy

Logical; if TRUE (default), compute and display training accuracy at each node.

resolution

Integer; grid resolution per axis (default 100). Increase for smoother boundaries at the cost of speed.

Value

Invisibly returns a list with four elements:

plots: Named list of ggplot2 objects, one per node. Names encode depth and path, e.g. "depth_1_Root", "depth_2_Root_L".
grid_data: Named list of data frames (full expanded grid used for each node's contour calculation).
accuracy_info: Named list of per-node metadata: depth, path, sample count, accuracy, features, whether the boundary was visible, and the pad factor that was needed.
response_col: The response column name used.

Plot the SVM decision boundary for a single internal node

Description

Internal workhorse called by plot_boundary for each node during tree traversal. Builds the grid in original space, scales it with the node's own scaler, predicts decision values, and returns a ggplot2 object together with metadata. The grid is expanded automatically (up to pad_factor = 3) if the hyperplane falls outside the data range.

Usage

plot_node_boundary(
  data,
  node_features,
  svm_model,
  scaler,
  response_col,
  title = "SVM Decision Boundary",
  resolution = 100
)

Plot the global decision surface of the full tree

Description

Predicts class labels across a 2-D grid using the complete tree (not individual node SVMs), then overlays the original data points. Because predictions come from svm_predict_tree, multiclass trees are handled correctly: each grid cell receives the final leaf prediction, which respects all one-vs-rest (OVR) splits along the path.

Usage

plot_surface(tree, data, response, features = NULL, resolution = 200)

Arguments

tree

An svmodt_node object returned by svm_split.

data

The original training data frame.

response

Character string naming the response column in data.

features

Character vector of length 2 giving the two features to plot on the x and y axes. Defaults to the first two features used at the root.

resolution

Integer; grid resolution per axis (default 200). Higher values give smoother region boundaries.

Details

All features not used as plot axes are held fixed at their in-sample median (numeric) or mode (categorical). You choose which two features to plot via features; if omitted the first two features used at the root node are used.

Value

A ggplot2 object. The background tiles show the predicted class for each grid cell; points show true class labels.

Predict method for svmodt_node objects

Description

Predict method for svmodt_node objects

Usage

## S3 method for class 'svmodt_node'
predict(object, newdata, return_probs = FALSE, calibrate_probs = TRUE, ...)

Arguments

object

An object of class svmodt_node.

newdata

A data frame of new predictor values.

return_probs

Logical; if TRUE, returns predictions and probabilities.

calibrate_probs

Logical; if TRUE, uses logistic calibration on decision values.

...

Currently unused.

Value

If return_probs = FALSE (the default), a character vector of predicted class labels, one element per row of newdata.

If return_probs = TRUE, a named list with two elements:

predictions: Character vector of predicted class labels (length = nrow(newdata)).
probabilities: Numeric matrix of class probabilities with nrow(newdata) rows and one column per class. Column names are the class labels; each row sums to 1. When calibrate_probs = TRUE, probabilities are derived from the SVM decision value via logistic calibration; otherwise empirical class frequencies at the leaf node are used.

Examples


# Train SVMODT tree
tree <- svm_split(
  data = wdbc,
  response = "diagnosis",
  max_depth = 3,
  max_features = 2,
  feature_method = "cor"
)

# Predict on WDBC data (returns a character vector of class labels)
preds <- predict(tree, newdata = wdbc)

# Predict with probabilities and logistic calibration
result <- predict(tree, newdata = wdbc,
  return_probs = TRUE, calibrate_probs = TRUE
)
head(result$predictions)
head(result$probabilities)

Print method for svmodt_node objects

Description

Print method for svmodt_node objects

Usage

## S3 method for class 'svmodt_node'
print(x, ...)

Arguments

x

An object of class svmodt_node.

...

Further arguments passed to print_svm_tree.

Value

Invisibly returns x (the svmodt_node object), called for its side effect of printing a human-readable summary of the tree structure to the console.

Examples


tree <- svm_split(
  data = wdbc,
  response = "diagnosis",
  max_features = 2,
  max_depth = 3,
  min_samples = 5,
  feature_method = "random",
  verbose = TRUE
)
print(tree)

Print an SVM Decision Tree

Description

Recursively prints the structure of an SVM-based decision tree.

Usage

print_svm_tree(
  tree,
  indent = "",
  show_probabilities = FALSE,
  show_feature_info = TRUE,
  show_penalties = TRUE
)

Arguments

tree

An object of class svmodt_node (leaf or tree).

indent

String used for indentation (for recursive calls).

show_probabilities

Logical; whether to display class probabilities at leaf nodes.

show_feature_info

Logical; whether to show features used at nodes.

show_penalties

Logical; whether to show penalty flags at nodes.

Value

Invisibly returns NULL. Prints to console.

Scale Numeric Features for Tree Nodes

Description

Internal utility function to standardize numeric features (zero mean, unit variance) and remove constant columns. Returns both the scaled training data and a transformer function for applying the same scaling to new data.

Usage

scale_node(df)

Arguments

df

A data frame containing numeric (or factor) features to be scaled.

Details

- Constant features (zero variance or only one unique value) are automatically removed.
- Standard deviation of zero is replaced with 1 to prevent division by zero.
- Designed for internal use in SVM tree building and prediction pipelines.

Value

A list with two elements:

train: The scaled training data frame.
transform: A function that applies the same scaling to a new data frame.
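
The pairing of scaled training data with a reusable transformer can be sketched with a closure. 'scale_node_sketch' is a hypothetical name, and for brevity this version keeps numeric columns only; how the package treats factor columns is not shown here.

```r
# Standardize numeric columns, drop constant ones, and return a transformer
# that reuses the training means and standard deviations.
scale_node_sketch <- function(df) {
  num  <- names(df)[vapply(df, is.numeric, logical(1))]
  keep <- num[vapply(df[num], function(x) length(unique(x)) > 1, logical(1))]
  mu <- vapply(df[keep], mean, numeric(1))
  s  <- vapply(df[keep], sd, numeric(1))
  s[is.na(s) | s == 0] <- 1  # guard against division by zero

  transform <- function(new_df) {
    out <- new_df[keep]
    for (nm in keep) out[[nm]] <- (out[[nm]] - mu[[nm]]) / s[[nm]]
    out
  }
  list(train = transform(df), transform = transform)
}

train <- data.frame(a = c(1, 2, 3), b = c(5, 5, 5), c = c(10, 20, 60))
sc <- scale_node_sketch(train)
names(sc$train)  # "a" "c": the constant column b is dropped

# New data is scaled with the training statistics, not its own
new_scaled <- sc$transform(data.frame(a = 2, b = 5, c = 30))
```

Because the closure captures the training means and standard deviations, a single new row can be scaled consistently even though its own standard deviation is undefined.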

Check Stopping Conditions for Tree Splitting

Description

Internal utility function to determine if a node in a tree should stop splitting based on depth, purity, or minimum sample size.

Usage

stop_conditions_met(data, y, depth, max_depth, min_samples, verbose)

Arguments

data

A data frame of predictor features at the current node.

y

A vector of target values corresponding to data.

depth

Current depth of the node in the tree.

max_depth

Maximum allowed depth for the tree.

min_samples

Minimum number of samples required to split a node.

verbose

Logical; if TRUE, prints the reason for stopping.

Details

- Stops if the node reaches max_depth.
- Stops if all target values in the node are identical (pure node).
- Stops if the number of samples is less than min_samples.

Value

Logical; TRUE if the node meets any stopping condition, FALSE otherwise.

Calculate Information Gain Using SVM-based Splits

Description

Computes the information gain achieved by splitting data using a linear SVM trained on a subset of features. The SVM's decision values determine the split, and information gain is calculated based on the resulting partitions.

Usage

svm_info_gain(
  feature_subset,
  data,
  response,
  metric = c("entropy", "gini"),
  verbose = FALSE
)

Arguments

feature_subset

Character vector of feature names to use for the SVM split.

data

A data frame containing predictors and the response variable.

response

Character string specifying the response variable name.

metric

Impurity measure for information gain calculation. One of:

"entropy" – entropy-based information gain (default).
"gini" – Gini impurity-based information gain.

verbose

Logical; if TRUE, prints diagnostic information.

Details

This function:

Fits a linear SVM using the specified feature subset.
Extracts decision values (distances from the hyperplane).
Creates a binary split: samples with negative decision values go left, positive values go right.
Calculates information gain using the info_gain function.

The SVM split creates an oblique (non-axis-aligned) partition, potentially capturing more complex decision boundaries than single-feature splits.

Value

Numeric value representing the information gain achieved by the SVM split.

Predict Using a Support Vector Machine Oblique Decision Tree

Description

Predicts class labels or class probabilities for new data using a tree constructed with SVM splits. Handles leaf nodes, internal nodes, recursive traversal, and fallback mechanisms when SVM predictions or scaling fail.

Usage

svm_predict_tree(tree, newdata, return_probs = FALSE, calibrate_probs = TRUE)

Arguments

tree

A tree node object (leaf or internal) created by svm_split or svm_split_enhanced.

newdata

A data frame of new predictor values. **Must contain the same features** as those used to fit the tree. Any additional columns (including responses) are ignored.

return_probs

Logical; if TRUE, returns both predicted class labels and class probabilities.

calibrate_probs

Logical; if TRUE, converts SVM decision values to probabilities using logistic calibration (sigmoid) based on the distance from the hyperplane. If FALSE, fallback probabilities are computed from class frequencies at the leaf node.

Details

The function traverses the SVM-based oblique decision tree recursively and predicts class labels or probabilities. Key behaviors:

Leaf nodes: Return the majority class stored in the node, along with class probabilities.
Internal nodes:
- Scale features according to the node's scaling parameters.
- Compute SVM decision values.
- Recursively traverse left and right children depending on the sign of the decision value.
Binary support:
- Binary SVMs produce a single decision value per node.
Fallback predictions: If scaling fails, SVM predictions are unavailable, or child nodes are missing, predictions are generated in this order:
- SVM-provided probabilities (if available).
- Calibrated decision values using a logistic/sigmoid function (if calibrate_probs = TRUE).
- Leaf node class distribution (empirical frequencies) or uniform probabilities as a last resort.
Probability normalization: All returned probabilities are normalized so that each row sums to 1.
Feature requirement: newdata must contain all the features used to train the tree; any extra columns, including responses, are ignored.
Calibration behavior:
- calibrate_probs = FALSE returns class frequencies at the leaf node.
- calibrate_probs = TRUE uses the distance from the hyperplane for logistic post-processing into probabilities.
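
The logistic calibration and normalization steps can be sketched as follows. This is an illustrative approximation under stated assumptions: the helper name decision_to_probs, the scale argument, and the mapping of positive decision values to the first class are all hypothetical, and the package's actual calibration may differ.

```r
# Convert binary SVM decision values to a two-column probability matrix.
# Assumption: a positive decision value favours the first class in `classes`.
decision_to_probs <- function(dv, classes, scale = 1) {
  p_first <- 1 / (1 + exp(-scale * dv))  # logistic (sigmoid) transform
  probs <- cbind(p_first, 1 - p_first)
  colnames(probs) <- classes
  probs / rowSums(probs)                 # each row sums to 1
}

dv <- c(-2, 0, 3)  # hypothetical distances from the hyperplane
probs <- decision_to_probs(dv, classes = c("M", "B"))
print(round(probs, 3))
```

A sample lying on the hyperplane (decision value 0) receives probability 0.5 for each class, and confidence grows with distance from the hyperplane.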

Value

If return_probs = FALSE, a character vector of predicted class labels. If return_probs = TRUE, a list with elements:

predictions: Character vector of predicted class labels.
probabilities: Numeric matrix of class probabilities (rows = samples, columns = classes).

Build an Oblique Decision Tree Using SVM Splits

Description

Constructs a decision tree where each internal node uses a Support Vector Machine (SVM) to determine the split. Supports dynamic feature selection, feature penalization, scaling, and class weighting.

Usage

svm_split(
  data,
  response,
  depth = 1,
  max_depth = 10,
  min_samples = 5,
  max_features = NULL,
  feature_method = c("random", "mutual", "cor"),
  impurity_measure = c("entropy", "gini"),
  max_features_strategy = c("constant", "random", "decrease"),
  max_features_decrease_rate = 0.8,
  max_features_random_range = c(0.3, 1),
  penalize_used_features = FALSE,
  feature_penalty_weight = 0.5,
  n_subsets = 1,
  used_features = character(0),
  class_weights = c("none", "balanced", "custom"),
  custom_class_weights = NULL,
  min_impurity_decrease = 0.001,
  verbose = FALSE,
  all_classes = NULL,
  ...
)

Arguments

data

A data frame containing predictors and the response variable.

response

Character string specifying the response column in 'data'. All other columns are treated as predictors.

depth

Integer indicating the current recursion depth (used internally; default is 1).

max_depth

Maximum depth of the tree.

min_samples

Minimum number of samples required to attempt a split.

max_features

Maximum number of features to consider at each split.

feature_method

Feature selection method at each node. One of:

'"random"': randomly select features,
'"mutual"': select based on mutual information with the response,
'"cor"': select based on correlation with the response.

impurity_measure

Impurity measure used to evaluate information gain. One of:

'"entropy"': use Shannon entropy,
'"gini"': use Gini impurity.

max_features_strategy

Strategy to adjust the number of features per node:

'"constant"': keep 'max_features' constant,
'"decrease"': reduce features with depth,
'"random"': randomly vary number of features within a range.

max_features_decrease_rate

Numeric fraction for decreasing features if 'max_features_strategy = "decrease"'.

max_features_random_range

Numeric vector of length 2 specifying min and max fraction of features if 'max_features_strategy = "random"'.

penalize_used_features

Logical; if TRUE, features used in ancestor nodes are penalized to encourage diversity.

feature_penalty_weight

Numeric (0–1) weight for penalizing previously used features.

n_subsets

Number of random feature combinations evaluated at each node when 'feature_method = "random"'.

used_features

Character vector of features already used in ancestor nodes (used internally).

class_weights

Character string specifying how to handle class imbalance. One of:

'"none"': no weighting,
'"balanced"': weight classes inversely proportional to their frequency,
'"custom"': use 'custom_class_weights'.

custom_class_weights

Optional named numeric vector specifying custom weights per class.

min_impurity_decrease

Minimum decrease in impurity required for a split to be considered valid.

verbose

Logical; if TRUE, prints information about each node during tree construction.

all_classes

Optional character vector of all possible response classes (used internally).

...

Additional arguments passed to the underlying SVM fitting function.
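
One way to read the three 'max_features_strategy' options is sketched below. The helper name n_features_at_node and the exact formulas (geometric decay by 'max_features_decrease_rate' per level, and a uniform draw of the feature fraction within 'max_features_random_range') are assumptions for illustration, not the package's internal implementation.

```r
# Hypothetical helper: number of features to consider at a node.
n_features_at_node <- function(n_total, max_features, depth,
                               strategy = "constant",
                               decrease_rate = 0.8,
                               random_range = c(0.3, 1)) {
  k <- switch(strategy,
    constant = max_features,
    # Assumed rule: shrink geometrically with depth (depth 1 = root).
    decrease = max_features * decrease_rate^(depth - 1),
    # Assumed rule: draw a fraction of all features from the given range.
    random   = n_total * runif(1, random_range[1], random_range[2]),
    stop("unknown strategy: ", strategy)
  )
  # Always use at least one feature and never more than are available.
  max(1L, min(n_total, as.integer(floor(k))))
}

k_constant <- n_features_at_node(30, max_features = 10, depth = 3)
k_decrease <- n_features_at_node(30, max_features = 10, depth = 3,
                                 strategy = "decrease")
set.seed(1)
k_random <- n_features_at_node(30, max_features = 10, depth = 3,
                               strategy = "random")
```

Under these assumptions, the "decrease" strategy uses fewer features at deeper nodes, while "random" varies the count from node to node.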

Details

This function recursively splits the dataset using an SVM at each node. Splitting stops when maximum depth is reached, the node contains fewer than 'min_samples', or all samples belong to the same class. Features are scaled and selected dynamically at each node, and previously used features can be penalized to promote diversity. Class weighting schemes support handling imbalanced datasets. This approach allows construction of an **oblique decision tree**, where splits are linear hyperplanes rather than axis-aligned.
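
As an illustration of the "balanced" class weighting scheme, the sketch below weights each class inversely to its frequency. The helper name balanced_class_weights and the normalization n / (k * n_c) are assumptions; the package may scale its weights differently.

```r
# Hypothetical helper: weights inversely proportional to class frequency.
# n = number of samples, k = number of observed classes, n_c = class count.
balanced_class_weights <- function(y) {
  counts <- table(y)
  counts <- counts[counts > 0]  # ignore classes absent from this node
  w <- length(y) / (length(counts) * as.numeric(counts))
  names(w) <- names(counts)
  w
}

y <- factor(c(rep("B", 6), rep("M", 2)))
w <- balanced_class_weights(y)
print(w)
```

The minority class "M" receives the larger weight. A named numeric vector of this shape is also the form expected by 'custom_class_weights'.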

Value

A nested list representing the decision tree. Each node contains:

is_leaf: Logical; TRUE if the node is a leaf.
model: Fitted SVM model at this node (for internal nodes).
features: Vector of features selected for this node.
scaler: Scaling information used at this node.
left: Left child node (decision value > 0).
right: Right child node (decision value ≤ 0).
depth: Depth of this node in the tree.
n: Number of samples at this node.
max_features_used: Number of features considered at this node.
penalty_applied: Logical; TRUE if feature penalization was applied.
class_weights_used: Class weights applied at this node.
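
Because the tree is a nested list, it can be inspected with ordinary recursion. The sketch below builds a small hand-made list that mimics the documented fields (is_leaf, left, right, depth, n) and counts its leaves; the helper names count_leaves and tree_depth are hypothetical, and a real tree returned by svm_split carries additional fields.

```r
# Count leaf nodes; a missing child contributes nothing.
count_leaves <- function(node) {
  if (is.null(node)) return(0L)
  if (isTRUE(node$is_leaf)) return(1L)
  count_leaves(node$left) + count_leaves(node$right)
}

# Deepest level reached, read from the stored `depth` field.
tree_depth <- function(node) {
  if (is.null(node)) return(0)
  if (isTRUE(node$is_leaf)) return(node$depth)
  max(tree_depth(node$left), tree_depth(node$right))
}

# Hand-made stand-in for a fitted tree (no SVM models attached).
toy_tree <- list(
  is_leaf = FALSE, depth = 1, n = 10,
  left = list(is_leaf = TRUE, depth = 2, n = 4),
  right = list(
    is_leaf = FALSE, depth = 2, n = 6,
    left  = list(is_leaf = TRUE, depth = 3, n = 3),
    right = list(is_leaf = TRUE, depth = 3, n = 3)
  )
)

n_leaves  <- count_leaves(toy_tree)
max_level <- tree_depth(toy_tree)
```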

Examples


data(wdbc)
tree <- svm_split(
  data = wdbc,
  response = "diagnosis",
  max_depth = 3,
  min_samples = 5,
  feature_method = "random",
  verbose = TRUE
)

Trace the prediction path of a sample through an svmodt tree

Description

Generic function that walks the tree for a single row of new data, printing the SVM decision value and chosen branch at every internal node and the final predicted class at the leaf.

Usage

trace_path(object, ...)

## S3 method for class 'svmodt_node'
trace_path(object, sample_data, sample_idx = 1, ...)

Arguments

object

An svmodt_node returned by svm_split.

...

Currently unused.

sample_data

A data frame of new predictor values (one or more rows).

sample_idx

Integer; which row to trace (default 1).

Value

Invisibly returns the predicted class label (character string).

Methods (by class)

trace_path(svmodt_node): Method for svmodt_node objects.

Examples


tree <- svm_split(wdbc, response = "diagnosis", max_depth = 3)
trace_path(tree, wdbc, sample_idx = 5)

Trace Prediction Path for a Sample

Description

Shows the path taken by a single sample through the SVM tree, including decision values, branches, and final prediction.

Usage

trace_prediction_path(tree, sample_data, sample_idx = 1)

Arguments

tree

The tree object.

sample_data

Data frame containing the sample(s).

sample_idx

Index of the sample to trace (default 1).

Value

The predicted class for the sample (a character string). Called primarily for its side effect of printing the full decision path to the console, including node features, SVM decision values, branch directions, and the final predicted class label.
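
The kind of walk this function performs can be sketched on a toy tree. This is not the package's implementation: the constructors leaf and node, the fields w, b, and prediction, and the function trace_toy_path are hypothetical, with a weight vector and intercept standing in for a fitted SVM. The sketch assumes positive decision values go to the left child.

```r
# Toy node constructors; `w` and `b` are stand-ins for a fitted linear SVM.
leaf <- function(label) list(is_leaf = TRUE, prediction = label)
node <- function(w, b, left, right) {
  list(is_leaf = FALSE, w = w, b = b, left = left, right = right)
}

trace_toy_path <- function(tree, sample_data, sample_idx = 1) {
  x <- unlist(sample_data[sample_idx, , drop = FALSE])
  current <- tree
  depth <- 1
  while (!current$is_leaf) {
    # Linear decision value: w . x + b over the node's own features.
    dv <- sum(current$w * x[names(current$w)]) + current$b
    branch <- if (dv > 0) "left" else "right"
    cat(sprintf("Depth %d: decision value = %.3f -> %s\n", depth, dv, branch))
    current <- current[[branch]]
    depth <- depth + 1
  }
  cat(sprintf("Leaf at depth %d: predicted class = %s\n",
              depth, current$prediction))
  invisible(current$prediction)
}

toy_tree <- node(
  w = c(x1 = 1, x2 = -1), b = 0,
  left  = leaf("A"),
  right = node(w = c(x1 = 0, x2 = 1), b = -2,
               left = leaf("B"), right = leaf("C"))
)
samples <- data.frame(x1 = c(3, 1, 0), x2 = c(1, 4, 1))
result <- trace_toy_path(toy_tree, samples, sample_idx = 2)
```

As with the documented function, the path is printed as a side effect and the predicted label is returned invisibly.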

Wisconsin Diagnostic Breast Cancer Dataset

Description

The WDBC dataset contains quantitative measurements from digitized images of fine needle aspirates (FNA) of breast masses. It is commonly used for classification tasks to distinguish between benign and malignant tumors.

Usage

wdbc

Format

A data frame with 569 rows and 32 columns:

radius_mean: Mean of radius
radius_se: Standard error of radius
radius_worst: Worst (largest) radius
texture_mean: Mean of texture
texture_se: Standard error of texture
texture_worst: Worst texture
perimeter_mean: Mean of perimeter
perimeter_se: Standard error of perimeter
perimeter_worst: Worst perimeter
area_mean: Mean area
area_se: Standard error of area
area_worst: Worst area
smoothness_mean: Mean smoothness
smoothness_se: Standard error of smoothness
smoothness_worst: Worst smoothness
compactness_mean: Mean compactness
compactness_se: Standard error of compactness
compactness_worst: Worst compactness
concavity_mean: Mean concavity
concavity_se: Standard error of concavity
concavity_worst: Worst concavity
concave.points_mean: Mean concave points
concave.points_se: Standard error of concave points
concave.points_worst: Worst concave points
symmetry_mean: Mean symmetry
symmetry_se: Standard error of symmetry
symmetry_worst: Worst symmetry
fractal_dimension_mean: Mean fractal dimension
fractal_dimension_se: Standard error of fractal dimension
fractal_dimension_worst: Worst fractal dimension
diagnosis: Factor with levels 'B' and 'M'

Source

Dr. William H. Wolberg, W. Nick Street, and Olvi L. Mangasarian, University of Wisconsin–Madison. Original dataset available at: <https://archive.ics.uci.edu/dataset/17/breast+cancer+wisconsin+diagnostic>

Wine Dataset

Description

The Wine dataset contains the results of a chemical analysis of wines derived from three different cultivars grown in the same region of Italy. The dataset is commonly used for multiclass classification tasks, where the objective is to identify the cultivar of origin based on physicochemical properties.

Usage

wine

Format

A data frame with 178 rows and 14 columns:

class: Factor with levels 1, 2, and 3 indicating cultivar
alcohol: Alcohol content
malic_acid: Malic acid concentration
ash: Ash content
alcalinity_of_ash: Alkalinity of ash
magnesium: Magnesium content
total_phenols: Total phenols
flavanoids: Flavonoid content
nonflavanoid_phenols: Nonflavonoid phenol content
proanthocyanins: Proanthocyanin content
color_intensity: Color intensity
hue: Hue
od280_od315: OD280/OD315 of diluted wines
proline: Proline concentration

Source

Aeberhard, S. & Forina, M. (1992). Wine Dataset. UCI Machine Learning Repository. Original dataset available at: <https://archive.ics.uci.edu/dataset/109/wine>
