Preprocessing — Scalers
Feature scaling helps all features contribute comparably to a model, rather than letting the feature with the largest numeric range dominate. This example demonstrates all seven Deepbox scalers: StandardScaler, MinMaxScaler, RobustScaler, MaxAbsScaler, Normalizer, PowerTransformer, and QuantileTransformer.
Deepbox Modules Used
- deepbox/ndarray
- deepbox/preprocess

What You Will Learn
- StandardScaler is the default — centers to zero mean and unit variance
- MinMaxScaler scales to [0, 1] — good for neural networks
- RobustScaler uses median/IQR — use when data has outliers
- Always fit on training data only, then transform both train and test
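The last point is the one that most often bites in practice: statistics learned from the test set leak information into training. Setting the Deepbox API aside, a minimal plain-TypeScript sketch of the fit-on-train, transform-both pattern (the `fitStandard` helper below is hypothetical, not part of Deepbox):

```typescript
// Hypothetical helper: learn per-feature mean and population standard
// deviation from the TRAINING data only, return a transform closure.
function fitStandard(train: number[][]) {
  const nFeatures = train[0].length;
  const mean = Array(nFeatures).fill(0);
  const std = Array(nFeatures).fill(0);
  for (const row of train) row.forEach((v, j) => (mean[j] += v));
  mean.forEach((_, j) => (mean[j] /= train.length));
  for (const row of train) row.forEach((v, j) => (std[j] += (v - mean[j]) ** 2));
  std.forEach((_, j) => (std[j] = Math.sqrt(std[j] / train.length)));
  // The returned transform always applies the TRAINING statistics.
  return (X: number[][]) =>
    X.map((row) => row.map((v, j) => (v - mean[j]) / std[j]));
}

const train = [[1, 100], [2, 200], [3, 300]];
const test = [[4, 400]]; // scaled with train's mean/std, not its own
const transform = fitStandard(train);
const trainScaled = transform(train);
const testScaled = transform(test);
```

Note that the test row lands outside [-1.23, 1.23] (the range of the scaled training column); that is expected and correct, since it lies outside the training distribution.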
Source Code
18-preprocessing-scalers/index.ts
```ts
import { tensor } from "deepbox/ndarray";
import {
  MaxAbsScaler,
  MinMaxScaler,
  Normalizer,
  PowerTransformer,
  QuantileTransformer,
  RobustScaler,
  StandardScaler,
} from "deepbox/preprocess";

console.log("=== Preprocessing: Scalers ===\n");

// Sample data with different scales
const X = tensor([
  [1, 100, 0.01],
  [2, 200, 0.02],
  [3, 300, 0.03],
  [4, 400, 0.04],
  [5, 500, 0.05],
  [100, 50, 0.5],
]);

// ---------------------------------------------------------------------------
// Part 1: StandardScaler — zero mean, unit variance
// ---------------------------------------------------------------------------
console.log("--- Part 1: StandardScaler ---");

const ss = new StandardScaler();
ss.fit(X);
const XStd = ss.transform(X);
console.log("Scaled (first 3 rows):\n", XStd.toString());

const XInv = ss.inverseTransform(XStd);
console.log("Inverse (first row):", XInv.toString());

// ---------------------------------------------------------------------------
// Part 2: MinMaxScaler — scale to [0, 1]
// ---------------------------------------------------------------------------
console.log("\n--- Part 2: MinMaxScaler ---");

const mms = new MinMaxScaler();
mms.fit(X);
const XMinMax = mms.transform(X);
console.log("Scaled (first 3 rows):\n", XMinMax.toString());

// ---------------------------------------------------------------------------
// Part 3: RobustScaler — uses median and IQR (robust to outliers)
// ---------------------------------------------------------------------------
console.log("\n--- Part 3: RobustScaler ---");

const rs = new RobustScaler();
rs.fit(X);
const XRobust = rs.transform(X);
console.log("Scaled (first 3 rows):\n", XRobust.toString());

// ---------------------------------------------------------------------------
// Part 4: MaxAbsScaler — scale by maximum absolute value
// ---------------------------------------------------------------------------
console.log("\n--- Part 4: MaxAbsScaler ---");

const mas = new MaxAbsScaler();
mas.fit(X);
const XMaxAbs = mas.transform(X);
console.log("Scaled (first 3 rows):\n", XMaxAbs.toString());

// ---------------------------------------------------------------------------
// Part 5: Normalizer — normalize each sample (row) to unit norm
// ---------------------------------------------------------------------------
console.log("\n--- Part 5: Normalizer ---");

const norm = new Normalizer();
const XNorm = norm.transform(X);
console.log("Normalized (first 3 rows):\n", XNorm.toString());

// ---------------------------------------------------------------------------
// Part 6: PowerTransformer — Gaussian-like transformation
// ---------------------------------------------------------------------------
console.log("\n--- Part 6: PowerTransformer ---");

const pt = new PowerTransformer();
pt.fit(X);
const XPower = pt.transform(X);
console.log("Transformed (first 3 rows):\n", XPower.toString());

// ---------------------------------------------------------------------------
// Part 7: QuantileTransformer — map to uniform or normal distribution
// ---------------------------------------------------------------------------
console.log("\n--- Part 7: QuantileTransformer ---");

const qt = new QuantileTransformer();
qt.fit(X);
const XQuantile = qt.transform(X);
console.log("Transformed (first 3 rows):\n", XQuantile.toString());

console.log("\n=== Preprocessing: Scalers Complete ===");
```

Console Output
$ npx tsx 18-preprocessing-scalers/index.ts
=== Preprocessing: Scalers ===
--- Part 1: StandardScaler ---
Scaled (first 3 rows):
tensor([[-0.5022, -0.9945, -0.5599]
[-0.4746, -0.3664, -0.5029]
[-0.4469, 0.2617, -0.4460]
[-0.4193, 0.8898, -0.3891]
[-0.3916, 1.518, -0.3321]
[2.235, -1.309, 2.230]], dtype=float64)
Inverse (first row): tensor([[1, 100, 0.01000]
[2, 200, 0.02000]
[3, 300, 0.03000]
[4, 400, 0.04000]
[5, 500, 0.05000]
[100, 50, 0.5000]], dtype=float64)
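These numbers can be reproduced by hand. For column 0 ([1, 2, 3, 4, 5, 100]), and assuming the population (divide-by-n) standard deviation, which is what matches the output above, a plain-TypeScript check independent of Deepbox:

```typescript
const col = [1, 2, 3, 4, 5, 100];
const mean = col.reduce((a, b) => a + b, 0) / col.length; // 19.1667
const variance = col.reduce((a, v) => a + (v - mean) ** 2, 0) / col.length;
const std = Math.sqrt(variance); // ≈ 36.17
const z = col.map((v) => (v - mean) / std);
console.log(z[0].toFixed(4)); // -0.5022, the first entry above
console.log(z[5].toFixed(3)); // 2.235, the outlier row
```

Note how the outlier drags the mean to 19.17 and inflates the standard deviation, compressing the five "normal" values into a narrow band. That is the weakness RobustScaler addresses below.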
--- Part 2: MinMaxScaler ---
Scaled (first 3 rows):
tensor([[0, 0.1111, 0]
[0.01010, 0.3333, 0.02041]
[0.02020, 0.5556, 0.04082]
[0.03030, 0.7778, 0.06122]
[0.04040, 1, 0.08163]
[1, 0, 1]], dtype=float64)
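MinMaxScaler applies (x − min) / (max − min) per column. For column 1, where min = 50 and max = 500, a quick check:

```typescript
const col = [100, 200, 300, 400, 500, 50];
const min = Math.min(...col);
const max = Math.max(...col);
const scaled = col.map((v) => (v - min) / (max - min));
console.log(scaled.map((v) => v.toFixed(4)).join(", "));
// 0.1111, 0.3333, 0.5556, 0.7778, 1.0000, 0.0000 — column 1 above
```

The minimum always maps to 0 and the maximum to 1, so a single outlier in a column squeezes everything else toward one end, as columns 0 and 2 above show.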
--- Part 3: RobustScaler ---
Scaled (first 3 rows):
tensor([[-1, -0.6000, -1.000]
[-0.6000, -0.2000, -0.6000]
[-0.2000, 0.2000, -0.2000]
[0.2000, 0.6000, 0.2000]
[0.6000, 1, 0.6000]
[38.60, -0.8000, 18.60]], dtype=float64)
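RobustScaler's formula is (x − median) / IQR. Assuming linearly interpolated quartiles (an assumption about Deepbox's quantile method, but one that matches the output), column 0 has median 3.5 and IQR 4.75 − 2.25 = 2.5, which explains the extreme 38.60 for the outlier:

```typescript
// Quantile with linear interpolation between order statistics
function quantile(sorted: number[], q: number): number {
  const pos = q * (sorted.length - 1);
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
}

const col = [1, 2, 3, 4, 5, 100];
const sorted = [...col].sort((a, b) => a - b);
const median = quantile(sorted, 0.5); // 3.5
const iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25); // 2.5
const scaled = col.map((v) => (v - median) / iqr);
console.log(scaled[5].toFixed(2)); // (100 - 3.5) / 2.5 = 38.60
```

The key property: the median and IQR are computed from the middle of the distribution, so the five regular values land in a tidy [-1, 1] band regardless of the outlier, while the outlier itself is pushed far out where a downstream model (or a clipping step) can deal with it.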
--- Part 4: MaxAbsScaler ---
Scaled (first 3 rows):
tensor([[0.01000, 0.2000, 0.02000]
[0.02000, 0.4000, 0.04000]
[0.03000, 0.6000, 0.06000]
[0.04000, 0.8000, 0.08000]
[0.05000, 1, 0.1000]
[1, 0.1000, 1]], dtype=float64)
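MaxAbsScaler just divides each column by its maximum absolute value, mapping into [-1, 1] while preserving sign and zeros (which keeps sparse data sparse). For column 0:

```typescript
const col = [1, 2, 3, 4, 5, 100];
const maxAbs = Math.max(...col.map(Math.abs)); // 100
const scaled = col.map((v) => v / maxAbs);
console.log(scaled[0], scaled[5]); // 0.01 1 — matching column 0 above
```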
--- Part 5: Normalizer ---
Normalized (first 3 rows):
tensor([[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.0001000]
[0.8944, 0.4472, 0.004472]], dtype=float64)
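The near-identical first five rows make sense: Normalizer divides each row by its own L2 norm, and in each of those rows the middle feature dominates, so the unit vectors all point almost the same way. Checking the last row by hand:

```typescript
const row = [100, 50, 0.5];
const l2 = Math.sqrt(row.reduce((a, v) => a + v * v, 0)); // ≈ 111.8
const unit = row.map((v) => v / l2);
console.log(unit.map((v) => v.toFixed(4)).join(", "));
// 0.8944, 0.4472, 0.0045 — matching the last row above
```

Unlike the other transforms here, Normalizer is per-sample rather than per-feature, which is why it needs no `fit` step; it is useful when only the direction of a sample matters (e.g. cosine similarity on text vectors).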
--- Part 6: PowerTransformer ---
Transformed (first 3 rows):
tensor([[0.5501, 17.37, 0.009707]
[0.7684, 25.09, 0.01885]
[0.8899, 30.99, 0.02748]
[0.9688, 35.94, 0.03561]
[1.025, 40.29, 0.04329]
[1.381, 11.87, 0.1737]], dtype=float64)
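Power transformers typically use the Yeo-Johnson family, fitting a per-column exponent λ by maximum likelihood to make the result as Gaussian as possible (whether Deepbox standardizes afterwards, and which λ values it fitted here, is not shown, so the sketch below illustrates the transform itself rather than reproducing the numbers above). For non-negative x the transform is ((x + 1)^λ − 1) / λ for λ ≠ 0, reducing to log(x + 1) at λ = 0 and the identity at λ = 1:

```typescript
// Yeo-Johnson for non-negative inputs (the negative-x branch is omitted
// in this sketch; the full transform handles it with a mirrored formula)
function yeoJohnson(x: number, lambda: number): number {
  if (x < 0) throw new Error("negative branch omitted in this sketch");
  return lambda === 0 ? Math.log(x + 1) : ((x + 1) ** lambda - 1) / lambda;
}

console.log(yeoJohnson(4, 1)); // λ = 1 is the identity: 4
console.log(yeoJohnson(4, 0.5).toFixed(4)); // λ < 1 compresses large values: 2.4721
```

The fitted λ < 1 for each column above is why the gap between 5 and 100 shrinks so dramatically relative to the gaps among 1 through 5.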
--- Part 7: QuantileTransformer ---
Transformed (first 3 rows):
tensor([[0, 0.2000, 0]
[0.2000, 0.4000, 0.2000]
[0.4000, 0.6000, 0.4000]
[0.6000, 0.8000, 0.6000]
[0.8000, 1, 0.8000]
[1, 0, 1]], dtype=float64)
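With only six samples, the uniform output is just each value's empirical rank mapped to [0, 1] as rank / (n − 1), which is why every column becomes exactly {0, 0.2, 0.4, 0.6, 0.8, 1} regardless of the raw magnitudes. A rank-based sketch (assuming no ties, as in this data):

```typescript
const col = [1, 2, 3, 4, 5, 100];
const sorted = [...col].sort((a, b) => a - b);
const scaled = col.map((v) => sorted.indexOf(v) / (col.length - 1));
console.log(scaled.join(", ")); // 0, 0.2, 0.4, 0.6, 0.8, 1
```

This makes QuantileTransformer the most aggressive option here: it discards the magnitudes entirely and keeps only the ordering, which tames any outlier but also erases the information that 100 was far from 5.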
=== Preprocessing: Scalers Complete ===