10
Clustering
KNN
PCA
Naive Bayes
Advanced ML Models
Beyond linear models, Deepbox provides a full suite of classical ML algorithms. This example demonstrates four distinct paradigms: KMeans clustering, K-Nearest Neighbors for classification and regression, PCA for dimensionality reduction, and Gaussian Naive Bayes for probabilistic classification.
Deepbox Modules Used
- deepbox/ml
- deepbox/ndarray
- deepbox/metrics
- deepbox/preprocess

What You Will Learn
- KMeans discovers k clusters by minimizing within-cluster variance (inertia)
- KNN classifies by majority vote of the k nearest training points
- PCA projects data onto directions of maximum variance
- GaussianNB applies Bayes' theorem assuming Gaussian feature distributions
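To make the KNN bullet concrete before diving into the Deepbox API, here is a minimal majority-vote sketch in plain TypeScript with no Deepbox dependency; the data, helper names, and `k` value are made up for illustration.

```typescript
// Minimal KNN majority-vote sketch (plain TypeScript, no Deepbox).
type Point = number[];

const euclidean = (a: Point, b: Point): number =>
  Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));

const knnPredict = (X: Point[], y: number[], query: Point, k: number): number => {
  // Rank all training points by distance to the query, keep the k nearest.
  const nearest = X.map((x, i) => ({ dist: euclidean(x, query), label: y[i] }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);
  // Majority vote among the k nearest labels.
  const votes = new Map<number, number>();
  for (const { label } of nearest) votes.set(label, (votes.get(label) ?? 0) + 1);
  return [...votes.entries()].sort((a, b) => b[1] - a[1])[0][0];
};

const X = [[0, 0], [1, 1], [2, 2], [5, 5], [6, 6], [7, 7]];
const y = [0, 0, 0, 1, 1, 1];
console.log(knnPredict(X, y, [1.5, 1.5], 3)); // → 0
console.log(knnPredict(X, y, [6.5, 6.5], 3)); // → 1
```

`KNeighborsClassifier` below does the same thing over tensors, with `nNeighbors` playing the role of `k`.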
Source Code
10-advanced-ml-models/index.ts
import { isNumericTypedArray, isTypedArray } from "deepbox/core";
import { accuracy } from "deepbox/metrics";
import { GaussianNB, KMeans, KNeighborsClassifier, KNeighborsRegressor, PCA } from "deepbox/ml";
import { tensor } from "deepbox/ndarray";
import { trainTestSplit } from "deepbox/preprocess";

const expectNumericTypedArray = (
  value: unknown
): Float32Array | Float64Array | Int32Array | Uint8Array => {
  if (!isTypedArray(value) || !isNumericTypedArray(value)) {
    throw new Error("Expected numeric typed array");
  }
  return value;
};

console.log("=".repeat(60));
console.log("Example 10: Advanced ML Models");
console.log("=".repeat(60));

// ============================================================================
// Part 1: KMeans Clustering
// ============================================================================
console.log("\n📦 Part 1: KMeans Clustering");
console.log("-".repeat(60));

const clusterData = tensor([
  [1, 2],
  [1.5, 1.8],
  [5, 8],
  [8, 8],
  [1, 0.6],
  [9, 11],
  [8, 2],
  [10, 2],
  [9, 3],
]);

const kmeans = new KMeans({ nClusters: 3, randomState: 42 });
kmeans.fit(clusterData);

const clusterLabels = kmeans.predict(clusterData);
console.log("Cluster labels:", clusterLabels.toString());
console.log("Cluster centers shape:", kmeans.clusterCenters.shape);
console.log("Inertia:", kmeans.inertia.toFixed(4));
console.log("Number of iterations:", kmeans.nIter);

// ============================================================================
// Part 2: K-Nearest Neighbors Classification
// ============================================================================
console.log("\n📦 Part 2: K-Nearest Neighbors Classification");
console.log("-".repeat(60));

const XClass = tensor([
  [0, 0],
  [1, 1],
  [2, 2],
  [3, 3],
  [4, 4],
  [5, 5],
  [6, 6],
  [7, 7],
]);
const yClass = tensor([0, 0, 0, 0, 1, 1, 1, 1]);

const [XTrainKNN, XTestKNN, yTrainKNN, yTestKNN] = trainTestSplit(XClass, yClass, {
  testSize: 0.25,
  randomState: 42,
});

const knnClassifier = new KNeighborsClassifier({ nNeighbors: 3 });
knnClassifier.fit(XTrainKNN, yTrainKNN);

const yPredKNN = knnClassifier.predict(XTestKNN);
const knnAccuracy = accuracy(yTestKNN, yPredKNN);

console.log("KNN Classifier trained with k=3");
console.log("Test accuracy:", `${(Number(knnAccuracy) * 100).toFixed(2)}%`);

const probabilities = knnClassifier.predictProba(XTestKNN);
console.log("Prediction probabilities shape:", probabilities.shape);

// ============================================================================
// Part 3: K-Nearest Neighbors Regression
// ============================================================================
console.log("\n📦 Part 3: K-Nearest Neighbors Regression");
console.log("-".repeat(60));

const XReg = tensor([[0], [1], [2], [3], [4], [5]]);
const yReg = tensor([0, 1, 4, 9, 16, 25]);

const knnRegressor = new KNeighborsRegressor({ nNeighbors: 2 });
knnRegressor.fit(XReg, yReg);

const yPredReg = knnRegressor.predict(tensor([[2.5], [3.5]]));
console.log("Predictions for [2.5] and [3.5]:", yPredReg.toString());

const r2Score = knnRegressor.score(XReg, yReg);
console.log("R² score:", r2Score.toFixed(4));

// ============================================================================
// Part 4: PCA (Dimensionality Reduction)
// ============================================================================
console.log("\n📦 Part 4: PCA - Dimensionality Reduction");
console.log("-".repeat(60));

const XPca = tensor([
  [2.5, 2.4, 1.1],
  [0.5, 0.7, 0.3],
  [2.2, 2.9, 1.5],
  [1.9, 2.2, 0.9],
  [3.1, 3.0, 1.8],
  [2.3, 2.7, 1.2],
  [2.0, 1.6, 0.8],
  [1.0, 1.1, 0.5],
  [1.5, 1.6, 0.7],
  [1.1, 0.9, 0.4],
]);

const pca = new PCA({ nComponents: 2 });
pca.fit(XPca);

const XTransformed = pca.transform(XPca);
console.log("Original shape:", XPca.shape);
console.log("Transformed shape:", XTransformed.shape);
console.log("Explained variance ratio:", pca.explainedVarianceRatio.toString());

const varianceData = expectNumericTypedArray(pca.explainedVarianceRatio.data);
const totalVariance = Array.from(varianceData).reduce((a, b) => a + b, 0);
console.log("Total variance explained:", `${(totalVariance * 100).toFixed(2)}%`);

// Reconstruct data
const XReconstructed = pca.inverseTransform(XTransformed);
console.log("Reconstructed shape:", XReconstructed.shape);

// ============================================================================
// Part 5: Gaussian Naive Bayes
// ============================================================================
console.log("\n📦 Part 5: Gaussian Naive Bayes");
console.log("-".repeat(60));

const XNB = tensor([
  [1, 2],
  [2, 3],
  [3, 4],
  [4, 5],
  [5, 6],
  [6, 7],
  [7, 8],
  [8, 9],
]);
const yNB = tensor([0, 0, 0, 0, 1, 1, 1, 1]);

const [XTrainNB, XTestNB, yTrainNB, yTestNB] = trainTestSplit(XNB, yNB, {
  testSize: 0.25,
  randomState: 42,
});

const nb = new GaussianNB();
nb.fit(XTrainNB, yTrainNB);

const yPredNB = nb.predict(XTestNB);
const nbAccuracy = accuracy(yTestNB, yPredNB);

console.log("Gaussian Naive Bayes trained");
console.log("Test accuracy:", `${(Number(nbAccuracy) * 100).toFixed(2)}%`);

const nbProba = nb.predictProba(XTestNB);
console.log("Prediction probabilities shape:", nbProba.shape);

// ============================================================================
// Summary
// ============================================================================
console.log("\n💡 Key Takeaways");
console.log("-".repeat(60));
console.log("• KMeans: Unsupervised clustering for grouping similar data points");
console.log("• KNN: Instance-based learning for classification and regression");
console.log("• PCA: Dimensionality reduction while preserving variance");
console.log("• Naive Bayes: Probabilistic classifier based on Bayes' theorem");
console.log("• All models follow the fit/predict/score API");

console.log("\n✅ Advanced ML Models Example Complete!");
console.log("=".repeat(60));

Console Output
$ npx tsx 10-advanced-ml-models/index.ts
============================================================
Example 10: Advanced ML Models
============================================================
📦 Part 1: KMeans Clustering
------------------------------------------------------------
Cluster labels: tensor([2, 2, 1, ..., 0, 0, 0], dtype=int32)
Cluster centers shape: [ 3, 2 ]
Inertia: 18.6467
Number of iterations: 2
📦 Part 2: K-Nearest Neighbors Classification
------------------------------------------------------------
KNN Classifier trained with k=3
Test accuracy: 100.00%
Prediction probabilities shape: [ 2, 2 ]
📦 Part 3: K-Nearest Neighbors Regression
------------------------------------------------------------
Predictions for [2.5] and [3.5]: tensor([6.500, 12.50], dtype=float32)
R² score: 0.9126
📦 Part 4: PCA - Dimensionality Reduction
------------------------------------------------------------
Original shape: [ 10, 3 ]
Transformed shape: [ 10, 2 ]
Explained variance ratio: tensor([0.9602, 0.03214], dtype=float32)
Total variance explained: 99.23%
Reconstructed shape: [ 10, 3 ]
📦 Part 5: Gaussian Naive Bayes
------------------------------------------------------------
Gaussian Naive Bayes trained
Test accuracy: 100.00%
Prediction probabilities shape: [ 2, 2 ]
💡 Key Takeaways
------------------------------------------------------------
• KMeans: Unsupervised clustering for grouping similar data points
• KNN: Instance-based learning for classification and regression
• PCA: Dimensionality reduction while preserving variance
• Naive Bayes: Probabilistic classifier based on Bayes' theorem
• All models follow the fit/predict/score API
✅ Advanced ML Models Example Complete!
============================================================
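To close the loop on Part 1, the inertia that KMeans reports can be computed by hand: it is the sum of squared distances from each point to its assigned cluster center. The sketch below uses plain TypeScript with made-up points, labels, and centers (not Deepbox output).

```typescript
// Within-cluster sum of squared distances (inertia), computed by hand.
type Point = number[];

const sqDist = (a: Point, b: Point): number =>
  a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);

const inertia = (X: Point[], labels: number[], centers: Point[]): number =>
  X.reduce((sum, x, i) => sum + sqDist(x, centers[labels[i]]), 0);

// Two tight clusters around (0, 0) and (10, 10), with their true centroids.
const X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]];
const labels = [0, 0, 0, 1, 1, 1];
const centers = [[1 / 3, 1 / 3], [31 / 3, 31 / 3]];
console.log(inertia(X, labels, centers).toFixed(4)); // → 2.6667
```

KMeans iteratively moves the centers to reduce exactly this quantity, which is why a lower inertia (for a fixed k) indicates tighter clusters.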