| Title: | High-Performance Machine Learning Framework with C++ Acceleration |
|---|---|
| Description: | Machine learning utilities for fast vectorized model training. Methods are based on standard statistical learning references such as Hastie et al. (2009) <doi:10.1007/978-0-387-84858-7>. |
| Authors: | Musheer Mohd [aut, cre] |
| Maintainer: | Musheer Mohd <[email protected]> |
| License: | Apache License (>= 2) |
| Version: | 0.1.0 |
| Built: | 2026-06-05 08:03:40 UTC |
| Source: | https://github.com/mohd-musheer/vectorforgeml |
Computes classification accuracy.
accuracy_score(y_true, y_pred)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |

Provides functionality for accuracy_score operations.
numeric accuracy
y_true <- c(1,0,1,1)
y_pred <- c(1,0,0,1)
accuracy_score(y_true, y_pred)
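As a rough illustration of what this metric measures, the sketch below computes accuracy in base R as the fraction of positions where the two label vectors agree. The helper name `accuracy_manual` is hypothetical and not part of the package; the package's own implementation may differ.

```r
# Hypothetical base-R helper, not the package's implementation.
accuracy_manual <- function(y_true, y_pred) {
  stopifnot(length(y_true) == length(y_pred))
  mean(y_true == y_pred)  # share of matching labels
}

y_true <- c(1, 0, 1, 1)
y_pred <- c(1, 0, 0, 1)
acc <- accuracy_manual(y_true, y_pred)
print(acc)  # 0.75: three of the four labels match
```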
Applies transformations to specific columns.
Provides functionality for ColumnTransformer operations.
ColumnTransformer object
model <- ColumnTransformer$new(num_cols="A", cat_cols="B")
Computes confusion matrix.
confusion_matrix(y_true, y_pred)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |

Provides functionality for confusion_matrix operations.
matrix
y_true <- c(1,0,1,1)
y_pred <- c(1,0,0,1)
confusion_matrix(y_true, y_pred)
Calculates accuracy, precision, recall, F1 from confusion matrix.
confusion_stats(cm)

| cm | confusion matrix |
|---|---|

Provides functionality for confusion_stats operations.
list
cm <- matrix(c(10, 2, 1, 15), nrow=2)
try({ confusion_stats(cm) })
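The four statistics can be derived by hand from a 2x2 confusion matrix, which the following base-R sketch shows. It assumes that rows are true classes, columns are predicted classes, and the second row/column is the positive class; the manual does not state the layout that `confusion_stats` expects, so treat that as an assumption. `stats_from_cm` is a hypothetical name.

```r
# Hypothetical helper, not the package's implementation.
# Assumed layout: rows = true class, columns = predicted class,
# with the positive class in the second row/column.
stats_from_cm <- function(cm) {
  tn <- cm[1, 1]; fp <- cm[1, 2]
  fn <- cm[2, 1]; tp <- cm[2, 2]
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  list(
    accuracy = (tp + tn) / sum(cm),
    precision = precision,
    recall = recall,
    f1 = 2 * precision * recall / (precision + recall)
  )
}

cm <- matrix(c(10, 2, 1, 15), nrow = 2)  # filled column by column
stats <- stats_from_cm(cm)
str(stats)
```

Under a different layout (for example, predicted classes in rows), precision and recall swap roles, so it is worth checking the orientation before comparing numbers.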
Tree-based classification/regression algorithm.
Provides functionality for DecisionTree operations.
DecisionTree object
model <- DecisionTree$new()
X <- matrix(rnorm(20), nrow=10)
y <- sample(0:1, 10, replace=TRUE)
model$fit(X,y)
model$predict(X)
Removes columns with zero variance.
drop_constant_columns(X, eps = 1e-12)

| X | input matrix/dataframe |
|---|---|
| eps | numeric tolerance below which a column is treated as constant (default 1e-12) |

Provides functionality for drop_constant_columns operations.
cleaned matrix
x <- data.frame(a=c(1,1,1), b=c(1,2,3))
drop_constant_columns(x)
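To make the idea concrete, here is a minimal base-R sketch of dropping near-constant columns. It assumes `eps` is compared against each column's sample variance, which the manual does not confirm; `drop_constant_manual` is a hypothetical name, not a package function.

```r
# Hypothetical helper, not the package's implementation.
# Assumption: a column counts as constant when its variance is <= eps.
drop_constant_manual <- function(X, eps = 1e-12) {
  X <- as.data.frame(X)
  keep <- vapply(X, function(col) stats::var(col) > eps, logical(1))
  X[, keep, drop = FALSE]  # drop = FALSE keeps a data frame even for one column
}

x <- data.frame(a = c(1, 1, 1), b = c(1, 2, 3))
cleaned <- drop_constant_manual(x)
print(names(cleaned))  # only "b" remains, since column a has zero variance
```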
Harmonic mean of precision and recall.
f1_score(y_true, y_pred, positive = NULL)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |
| positive | positive class label |

Provides functionality for f1_score operations.
numeric f1 score
y_true <- c(1,0,1,1)
y_pred <- c(1,0,0,1)
f1_score(y_true, y_pred)
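The harmonic-mean relationship can be checked by hand. The sketch below computes precision, recall, and F1 for one positive class in base R; `prf_manual` is a hypothetical helper, and it assumes the positive class is `1`, whereas the package default is `positive = NULL` and its selection rule is not documented here.

```r
# Hypothetical helper, not the package's implementation.
prf_manual <- function(y_true, y_pred, positive = 1) {
  tp <- sum(y_true == positive & y_pred == positive)  # true positives
  fp <- sum(y_true != positive & y_pred == positive)  # false positives
  fn <- sum(y_true == positive & y_pred != positive)  # false negatives
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  c(precision = precision,
    recall = recall,
    f1 = 2 * precision * recall / (precision + recall))
}

res <- prf_manual(c(1, 0, 1, 1), c(1, 0, 0, 1))
print(res)  # precision 1, recall 2/3, f1 0.8
```

The same counts underlie `precision_score` and `recall_score`, so the three metrics can be cross-checked against one another on small inputs.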
Finds optimal K value for KNN.
find_best_k(X, y, k_values = seq(1, 15, 2))

| X | features |
|---|---|
| y | labels |
| k_values | candidate values of k to evaluate (default: odd values from 1 to 15) |

Provides functionality for find_best_k operations.
numeric best k
x <- matrix(rnorm(200), nrow=100)
y <- sample(0:1, 100, replace=TRUE)
find_best_k(x, y, k_values=c(1,3,5))
Internal helper for linear regression training.
fit_linear_model(X, y)

| X | numeric matrix |
|---|---|
| y | numeric vector |

Provides functionality for fit_linear_model operations.
model object
X <- matrix(rnorm(20), nrow=10)
y <- rnorm(10)
try({ fit_linear_model(X, y) })
Unsupervised clustering algorithm.
Provides functionality for KMeans operations.
KMeans object
x <- matrix(rnorm(20), nrow=10)
model <- KMeans$new()
model$fit(x)
Instance-based learning algorithm.
Provides functionality for KNN operations.
KNN object
model <- KNN$new(k=3, mode="classification")
X <- matrix(rnorm(20), nrow=10)
y <- sample(0:1, 10, replace=TRUE)
model$fit(X,y)
model$predict(X)
Converts categorical labels into numeric values.
Provides functionality for LabelEncoder operations.
LabelEncoder object
enc <- LabelEncoder$new()
x <- c("a", "b", "a")
enc$fit(x)
enc$transform(x)
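For comparison, label encoding can be sketched in base R by mapping each label to its position among the sorted unique labels. This is only an illustration: the codes below are 1-based, and the manual does not say whether `LabelEncoder` returns 0-based or 1-based values.

```r
# Base-R illustration; not the package's implementation.
x <- c("a", "b", "a")
levels_seen <- sort(unique(x))   # "fit": learn the label set
codes <- match(x, levels_seen)   # "transform": label -> integer code
print(codes)                     # 1 2 1
decoded <- levels_seen[codes]    # inverse mapping back to labels
```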
Fast linear regression implemented in C++ backend.
Provides functionality for LinearRegression operations.
LinearRegression object
model <- LinearRegression$new()
X <- matrix(rnorm(100),50,2)
y <- rnorm(50)
model$fit(X,y)
model$predict(X)
Binary classification logistic regression.
Provides functionality for LogisticRegression operations.
LogisticRegression object
model <- LogisticRegression$new()
X <- matrix(rnorm(20), nrow=10)
y <- sample(0:1, 10, replace=TRUE)
model$fit(X,y)
model$predict(X)
Computes macro-averaged F1 score.
macro_f1(y_true, y_pred)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |

Provides functionality for macro_f1 operations.
numeric score
Computes macro-averaged precision.
macro_precision(y_true, y_pred)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |

Provides functionality for macro_precision operations.
numeric score
Computes macro-averaged recall.
macro_recall(y_true, y_pred)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |

Provides functionality for macro_recall operations.
numeric score
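The macro-averaged metrics have no examples in this manual, so the following base-R sketch shows the general technique: compute the metric per class in a one-vs-rest fashion, then take the unweighted mean across classes. `macro_scores` is a hypothetical helper, and treating an undefined per-class score as 0 is an assumption; the package may handle that case differently.

```r
# Hypothetical helper, not the package's implementation.
macro_scores <- function(y_true, y_pred) {
  classes <- sort(unique(c(y_true, y_pred)))
  per_class <- sapply(classes, function(cl) {
    tp <- sum(y_true == cl & y_pred == cl)
    fp <- sum(y_true != cl & y_pred == cl)
    fn <- sum(y_true == cl & y_pred != cl)
    # Assumption: undefined ratios (zero denominator) count as 0.
    precision <- if (tp + fp > 0) tp / (tp + fp) else 0
    recall <- if (tp + fn > 0) tp / (tp + fn) else 0
    f1 <- if (precision + recall > 0) {
      2 * precision * recall / (precision + recall)
    } else 0
    c(precision = precision, recall = recall, f1 = f1)
  })
  rowMeans(per_class)  # unweighted mean over classes
}

y_true <- c(0, 1, 2, 2, 1, 0)
y_pred <- c(0, 2, 2, 2, 1, 0)
scores <- macro_scores(y_true, y_pred)
print(round(scores, 4))  # precision 0.8889, recall 0.8333, f1 0.8222
```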
Scales each feature to a fixed range, typically [0, 1], using its minimum and maximum.
Provides functionality for MinMaxScaler operations.
MinMaxScaler object
s <- MinMaxScaler$new()
x <- matrix(rnorm(20), nrow=10)
s$fit(x)
s$transform(x)
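As a point of reference, min-max scaling can be written in a few lines of base R. The sketch assumes a target range of [0, 1] computed per column; the range that `MinMaxScaler` actually uses is not stated in this manual. `minmax_manual` is a hypothetical name.

```r
# Hypothetical helper, not the package's implementation.
# Assumption: each column is mapped to [0, 1] via (x - min) / (max - min).
minmax_manual <- function(x) {
  col_min <- apply(x, 2, min)
  col_range <- apply(x, 2, max) - col_min
  sweep(sweep(x, 2, col_min, "-"), 2, col_range, "/")
}

x <- matrix(c(1, 2, 3, 10, 20, 40), nrow = 3)
scaled <- minmax_manual(x)
print(scaled)  # column 1: 0, 0.5, 1; column 2: 0, 1/3, 1
```

A constant column has a zero range and would produce a division by zero in this sketch, so such columns need separate handling.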
Calculates regression error.
mse(y_true, y_pred)

| y_true | true values |
|---|---|
| y_pred | predicted values |

Provides functionality for mse operations.
numeric mse
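This entry has no example, so here is a small base-R sketch of the standard definitions: mean squared error is the average of the squared residuals, and RMSE is its square root. The helper names are hypothetical and the package's own functions may differ in input checking.

```r
# Hypothetical helpers, not the package's implementation.
mse_manual <- function(y_true, y_pred) mean((y_true - y_pred)^2)
rmse_manual <- function(y_true, y_pred) sqrt(mse_manual(y_true, y_pred))

y_true <- c(3, -0.5, 2, 7)
y_pred <- c(2.5, 0, 2, 8)
err_mse <- mse_manual(y_true, y_pred)    # (0.25 + 0.25 + 0 + 1) / 4 = 0.375
err_rmse <- rmse_manual(y_true, y_pred)  # sqrt(0.375)
print(c(mse = err_mse, rmse = err_rmse))
```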
Converts categorical variables into binary vectors.
Provides functionality for OneHotEncoder operations.
OneHotEncoder object
enc <- OneHotEncoder$new()
df <- data.frame(a=c("x","y","x"))
enc$fit(df)
enc$transform(df)
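For readers who want to see what a one-hot encoding looks like, base R can produce one with `model.matrix()` by removing the intercept from the formula. This only illustrates the output shape; the column names and return type of `OneHotEncoder` may differ.

```r
# Base-R illustration; not the package's implementation.
df <- data.frame(a = factor(c("x", "y", "x")))
onehot <- model.matrix(~ a - 1, data = df)  # "- 1" drops the intercept
print(onehot)  # one indicator column per level: ax, ay
```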
Dimensionality reduction technique.
Provides functionality for PCA operations.
PCA object
model <- PCA$new(n_components=2)
X <- matrix(rnorm(30), nrow=10)
model$fit(X)
model$transform(X)
Chains preprocessing and model steps.
Provides functionality for Pipeline operations.
Pipeline object
model <- Pipeline$new(list(StandardScaler$new()))
Visualizes confusion matrix.
plot_confusion_matrix(cm, normalize = TRUE)

| cm | confusion matrix |
|---|---|
| normalize | whether to normalize the matrix before plotting (default TRUE) |

Provides functionality for plot_confusion_matrix operations.
plot
cm <- matrix(c(10, 2, 1, 15), nrow=2)
try({ plot_confusion_matrix(cm) })
Computes precision metric.
precision_score(y_true, y_pred, positive = NULL)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |
| positive | positive class label |

Provides functionality for precision_score operations.
numeric precision
y_true <- c(1,0,1,1)
y_pred <- c(1,0,0,1)
precision_score(y_true, y_pred)
Predict values using trained linear model.
predict_linear_model(model, X)

| model | trained model |
|---|---|
| X | matrix |

Provides functionality for predict_linear_model operations.
numeric vector
X <- matrix(rnorm(20), nrow=10)
y <- rnorm(10)
model <- fit_linear_model(X, y)
predict_linear_model(model, X)
Coefficient of determination.
r2_score(y_true, y_pred)

| y_true | true values |
|---|---|
| y_pred | predicted values |

Provides functionality for r2_score operations.
numeric r2 score
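Since this entry has no example, the sketch below shows the usual definition of the coefficient of determination, one minus the ratio of the residual sum of squares to the total sum of squares. `r2_manual` is a hypothetical helper, not the package function.

```r
# Hypothetical helper, not the package's implementation.
r2_manual <- function(y_true, y_pred) {
  ss_res <- sum((y_true - y_pred)^2)        # residual sum of squares
  ss_tot <- sum((y_true - mean(y_true))^2)  # total sum of squares
  1 - ss_res / ss_tot
}

y_true <- c(3, -0.5, 2, 7)
y_pred <- c(2.5, 0, 2, 8)
r2 <- r2_manual(y_true, y_pred)
print(round(r2, 4))  # 0.9486
```

A perfect prediction gives 1, and predicting the mean of `y_true` everywhere gives 0.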
Ensemble of decision trees.
Provides functionality for RandomForest operations.
RandomForest object
model <- RandomForest$new(ntrees=5)
X <- matrix(rnorm(20), nrow=10)
y <- sample(0:1, 10, replace=TRUE)
model$fit(X,y)
model$predict(X)
Computes recall metric.
recall_score(y_true, y_pred, positive = NULL)

| y_true | true labels |
|---|---|
| y_pred | predicted labels |
| positive | positive class label |

Provides functionality for recall_score operations.
numeric recall
y_true <- c(1,0,1,1)
y_pred <- c(1,0,0,1)
recall_score(y_true, y_pred)
Linear regression with L2 regularization.
Provides functionality for RidgeRegression operations.
RidgeRegression object
model <- RidgeRegression$new()
X <- matrix(rnorm(20), nrow=10)
y <- rnorm(10)
model$fit(X,y,lambda=1.0)
model$predict(X)
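To show what the L2 penalty does, here is a minimal closed-form ridge sketch in base R. It assumes no intercept and a penalty applied to every coefficient, which may not match how `RidgeRegression` treats the intercept or scales `lambda`; `ridge_manual` is a hypothetical name.

```r
# Hypothetical helper, not the package's implementation.
# Assumptions: no intercept term; all coefficients are penalized.
ridge_manual <- function(X, y, lambda = 1.0) {
  p <- ncol(X)
  # Solve (X'X + lambda * I) beta = X'y
  solve(crossprod(X) + lambda * diag(p), crossprod(X, y))
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 10)
y <- rnorm(10)
beta_ridge <- ridge_manual(X, y, lambda = 1.0)
beta_ols <- ridge_manual(X, y, lambda = 0)  # lambda = 0 reduces to least squares
print(cbind(ols = as.vector(beta_ols), ridge = as.vector(beta_ridge)))
```

With `lambda = 0` the result matches ordinary least squares, and increasing `lambda` shrinks the coefficient vector toward zero.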
Square root of MSE.
rmse(y_true, y_pred)

| y_true | true values |
|---|---|
| y_pred | predicted values |

Provides functionality for rmse operations.
numeric rmse
Multiclass logistic regression.
Provides functionality for SoftmaxRegression operations.
SoftmaxRegression object
model <- SoftmaxRegression$new()
X <- matrix(rnorm(20), nrow=10)
y <- sample(0:2, 10, replace=TRUE)
model$fit(X,y)
model$predict(X)
Standardizes features by removing mean and scaling to unit variance.

| X | input matrix/dataframe |
|---|---|

Provides functionality for StandardScaler operations.
StandardScaler object
s <- StandardScaler$new()
x <- matrix(rnorm(20), nrow=10)
s$fit(x)
s$transform(x)
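For comparison, base R's `scale()` performs the same kind of standardization: it subtracts each column's mean and divides by its standard deviation. Note that `scale()` uses the sample standard deviation (denominator n - 1); whether `StandardScaler` uses the sample or population form is not stated here, so small numeric differences are possible.

```r
# Base-R illustration; not the package's implementation.
x <- matrix(c(1, 2, 3, 10, 20, 40), nrow = 3)
z <- scale(x)              # center to mean 0, scale to sd 1 per column
print(colMeans(z))         # approximately 0 for each column
print(apply(z, 2, sd))     # 1 for each column
```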
Splits dataset into training and testing sets.
train_test_split(X, y, test_size = 0.2, seed = NULL)

| X | features |
|---|---|
| y | labels |
| test_size | proportion of the data to use for the test set (default 0.2) |
| seed | random seed for a reproducible split (default NULL) |

Provides functionality for train_test_split operations.
list
X <- matrix(rnorm(20), nrow=10)
y <- sample(0:1, 10, replace=TRUE)
train_test_split(X, y, test_size=0.2)
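The manual says only that the function returns a list, so the following base-R sketch shows one plausible shape for a random split. The helper `split_manual` and the element names `X_train`, `X_test`, `y_train`, and `y_test` are hypothetical; the names in the package's returned list may differ.

```r
# Hypothetical helper, not the package's implementation.
split_manual <- function(X, y, test_size = 0.2, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # optional reproducibility
  n <- nrow(X)
  n_test <- round(test_size * n)
  stopifnot(n_test >= 1, n_test < n)  # both parts must be non-empty
  test_idx <- sample(n, size = n_test)
  list(
    X_train = X[-test_idx, , drop = FALSE],
    X_test = X[test_idx, , drop = FALSE],
    y_train = y[-test_idx],
    y_test = y[test_idx]
  )
}

X <- matrix(rnorm(20), nrow = 10)
y <- sample(0:1, 10, replace = TRUE)
parts <- split_manual(X, y, test_size = 0.2, seed = 42)
print(sapply(parts[c("X_train", "X_test")], nrow))  # 8 training rows, 2 test rows
```

Passing the same `seed` twice yields the same partition, which is useful when comparing models on identical splits.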