Preprocessing — Scalers
Feature scaling helps all features contribute comparably to a model, rather than letting the feature with the largest numeric range dominate. This example demonstrates all seven Deepbox scalers: StandardScaler, MinMaxScaler, RobustScaler, MaxAbsScaler, Normalizer, PowerTransformer, and QuantileTransformer.
Deepbox Modules Used
- deepbox/ndarray
- deepbox/preprocess

What You Will Learn
- StandardScaler is the default — centers to zero mean and unit variance
- MinMaxScaler scales to [0, 1] — good for neural networks
- RobustScaler uses median/IQR — use when data has outliers
- Always fit on training data only, then transform both train and test
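The last point is the one that most often bites in practice: statistics learned from the test set leak information into training. Setting the Deepbox API aside, a minimal plain-TypeScript sketch of the fit-on-train, transform-both pattern (the `fitStandard` helper below is hypothetical, not part of Deepbox):

```typescript
// Hypothetical helper: learn per-feature mean and population standard
// deviation from the TRAINING data only, return a transform closure.
function fitStandard(train: number[][]) {
  const nFeatures = train[0].length;
  const mean = Array(nFeatures).fill(0);
  const std = Array(nFeatures).fill(0);
  for (const row of train) row.forEach((v, j) => (mean[j] += v));
  mean.forEach((_, j) => (mean[j] /= train.length));
  for (const row of train) row.forEach((v, j) => (std[j] += (v - mean[j]) ** 2));
  std.forEach((_, j) => (std[j] = Math.sqrt(std[j] / train.length)));
  // The returned transform always applies the TRAINING statistics.
  return (X: number[][]) =>
    X.map((row) => row.map((v, j) => (v - mean[j]) / std[j]));
}

const train = [[1, 100], [2, 200], [3, 300]];
const test = [[4, 400]]; // scaled with train's mean/std, not its own
const transform = fitStandard(train);
const trainScaled = transform(train);
const testScaled = transform(test);
```

Note that the test row lands outside [-1.23, 1.23] (the range of the scaled training column); that is expected and correct, since it lies outside the training distribution.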
Source Code
18-preprocessing-scalers/index.ts
```ts
import { tensor } from "deepbox/ndarray";
import {
  MaxAbsScaler,
  MinMaxScaler,
  Normalizer,
  PowerTransformer,
  QuantileTransformer,
  RobustScaler,
  StandardScaler,
} from "deepbox/preprocess";

console.log("=== Preprocessing: Scalers ===\n");

// Sample data with different scales
const X = tensor([
  [1, 100, 0.01],
  [2, 200, 0.02],
  [3, 300, 0.03],
  [4, 400, 0.04],
  [5, 500, 0.05],
  [100, 50, 0.5],
]);

// ---------------------------------------------------------------------------
// Part 1: StandardScaler — zero mean, unit variance
// ---------------------------------------------------------------------------
console.log("--- Part 1: StandardScaler ---");

const ss = new StandardScaler();
ss.fit(X);
const XStd = ss.transform(X);
console.log("Scaled (first 3 rows):\n", XStd.toString());

const XInv = ss.inverseTransform(XStd);
console.log("Inverse (first row):", XInv.toString());

// ---------------------------------------------------------------------------
// Part 2: MinMaxScaler — scale to [0, 1]
// ---------------------------------------------------------------------------
console.log("\n--- Part 2: MinMaxScaler ---");

const mms = new MinMaxScaler();
mms.fit(X);
const XMinMax = mms.transform(X);
console.log("Scaled (first 3 rows):\n", XMinMax.toString());

// ---------------------------------------------------------------------------
// Part 3: RobustScaler — uses median and IQR (robust to outliers)
// ---------------------------------------------------------------------------
console.log("\n--- Part 3: RobustScaler ---");

const rs = new RobustScaler();
rs.fit(X);
const XRobust = rs.transform(X);
console.log("Scaled (first 3 rows):\n", XRobust.toString());

// ---------------------------------------------------------------------------
// Part 4: MaxAbsScaler — scale by maximum absolute value
// ---------------------------------------------------------------------------
console.log("\n--- Part 4: MaxAbsScaler ---");

const mas = new MaxAbsScaler();
mas.fit(X);
const XMaxAbs = mas.transform(X);
console.log("Scaled (first 3 rows):\n", XMaxAbs.toString());

// ---------------------------------------------------------------------------
// Part 5: Normalizer — normalize each sample (row) to unit norm
// ---------------------------------------------------------------------------
console.log("\n--- Part 5: Normalizer ---");

const norm = new Normalizer();
const XNorm = norm.transform(X);
console.log("Normalized (first 3 rows):\n", XNorm.toString());

// ---------------------------------------------------------------------------
// Part 6: PowerTransformer — Gaussian-like transformation
// ---------------------------------------------------------------------------
console.log("\n--- Part 6: PowerTransformer ---");

const pt = new PowerTransformer();
pt.fit(X);
const XPower = pt.transform(X);
console.log("Transformed (first 3 rows):\n", XPower.toString());

// ---------------------------------------------------------------------------
// Part 7: QuantileTransformer — map to uniform or normal distribution
// ---------------------------------------------------------------------------
console.log("\n--- Part 7: QuantileTransformer ---");

const qt = new QuantileTransformer();
qt.fit(X);
const XQuantile = qt.transform(X);
console.log("Transformed (first 3 rows):\n", XQuantile.toString());

console.log("\n=== Preprocessing: Scalers Complete ===");
```

Console Output
$ npx tsx 18-preprocessing-scalers/index.ts
=== Preprocessing: Scalers ===
--- Part 1: StandardScaler ---
Scaled (first 3 rows):
tensor([[-0.5022, -0.9945, -0.5599]
[-0.4746, -0.3664, -0.5029]
[-0.4469, 0.2617, -0.4460]
[-0.4193, 0.8898, -0.3891]
[-0.3916, 1.518, -0.3321]
[2.235, -1.309, 2.230]], dtype=float64)
Inverse (first row): tensor([[1, 100, 0.01000]
[2, 200, 0.02000]
[3, 300, 0.03000]
[4, 400, 0.04000]
[5, 500, 0.05000]
[100, 50, 0.5000]], dtype=float64)
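These numbers can be reproduced by hand. For column 0 ([1, 2, 3, 4, 5, 100]), and assuming the population (divide-by-n) standard deviation, which is what matches the output above, a plain-TypeScript check independent of Deepbox:

```typescript
const col = [1, 2, 3, 4, 5, 100];
const mean = col.reduce((a, b) => a + b, 0) / col.length; // 19.1667
const variance = col.reduce((a, v) => a + (v - mean) ** 2, 0) / col.length;
const std = Math.sqrt(variance); // ≈ 36.17
const z = col.map((v) => (v - mean) / std);
console.log(z[0].toFixed(4)); // -0.5022, the first entry above
console.log(z[5].toFixed(3)); // 2.235, the outlier row
```

Note how the outlier drags the mean to 19.17 and inflates the standard deviation, compressing the five "normal" values into a narrow band. That is the weakness RobustScaler addresses below.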
--- Part 2: MinMaxScaler ---
Scaled (first 3 rows):
tensor([[0, 0.1111, 0]
[0.01010, 0.3333, 0.02041]
[0.02020, 0.5556, 0.04082]
[0.03030, 0.7778, 0.06122]
[0.04040, 1, 0.08163]
[1, 0, 1]], dtype=float64)
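MinMaxScaler applies (x − min) / (max − min) per column. For column 1, where min = 50 and max = 500, a quick check:

```typescript
const col = [100, 200, 300, 400, 500, 50];
const min = Math.min(...col);
const max = Math.max(...col);
const scaled = col.map((v) => (v - min) / (max - min));
console.log(scaled.map((v) => v.toFixed(4)).join(", "));
// 0.1111, 0.3333, 0.5556, 0.7778, 1.0000, 0.0000 — column 1 above
```

The minimum always maps to 0 and the maximum to 1, so a single outlier in a column squeezes everything else toward one end, as columns 0 and 2 above show.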
--- Part 3: RobustScaler ---
Scaled (first 3 rows):
tensor([[-1, -0.6000, -1.000]
[-0.6000, -0.2000, -0.6000]
[-0.2000, 0.2000, -0.2000]
[0.2000, 0.6000, 0.2000]
[0.6000, 1, 0.6000]
[38.60, -0.8000, 18.60]], dtype=float64)
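RobustScaler's formula is (x − median) / IQR. Assuming linearly interpolated quartiles (an assumption about Deepbox's quantile method, but one that matches the output), column 0 has median 3.5 and IQR 4.75 − 2.25 = 2.5, which explains the extreme 38.60 for the outlier:

```typescript
// Quantile with linear interpolation between order statistics
function quantile(sorted: number[], q: number): number {
  const pos = q * (sorted.length - 1);
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
}

const col = [1, 2, 3, 4, 5, 100];
const sorted = [...col].sort((a, b) => a - b);
const median = quantile(sorted, 0.5); // 3.5
const iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25); // 2.5
const scaled = col.map((v) => (v - median) / iqr);
console.log(scaled[5].toFixed(2)); // (100 - 3.5) / 2.5 = 38.60
```

The key property: the median and IQR are computed from the middle of the distribution, so the five regular values land in a tidy [-1, 1] band regardless of the outlier, while the outlier itself is pushed far out where a downstream model (or a clipping step) can deal with it.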
--- Part 4: MaxAbsScaler ---
Scaled (first 3 rows):
tensor([[0.01000, 0.2000, 0.02000]
[0.02000, 0.4000, 0.04000]
[0.03000, 0.6000, 0.06000]
[0.04000, 0.8000, 0.08000]
[0.05000, 1, 0.1000]
[1, 0.1000, 1]], dtype=float64)
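MaxAbsScaler just divides each column by its maximum absolute value, mapping into [-1, 1] while preserving sign and zeros (which keeps sparse data sparse). For column 0:

```typescript
const col = [1, 2, 3, 4, 5, 100];
const maxAbs = Math.max(...col.map(Math.abs)); // 100
const scaled = col.map((v) => v / maxAbs);
console.log(scaled[0], scaled[5]); // 0.01 1 — matching column 0 above
```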
--- Part 5: Normalizer ---
Normalized (first 3 rows):
tensor([[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.00009999]
[0.009999, 0.9999, 0.0001000]
[0.8944, 0.4472, 0.004472]], dtype=float64)
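The near-identical first five rows make sense: Normalizer divides each row by its own L2 norm, and in each of those rows the middle feature dominates, so the unit vectors all point almost the same way. Checking the last row by hand:

```typescript
const row = [100, 50, 0.5];
const l2 = Math.sqrt(row.reduce((a, v) => a + v * v, 0)); // ≈ 111.8
const unit = row.map((v) => v / l2);
console.log(unit.map((v) => v.toFixed(4)).join(", "));
// 0.8944, 0.4472, 0.0045 — matching the last row above
```

Unlike the other transforms here, Normalizer is per-sample rather than per-feature, which is why it needs no `fit` step; it is useful when only the direction of a sample matters (e.g. cosine similarity on text vectors).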
--- Part 6: PowerTransformer ---
Transformed (first 3 rows):
tensor([[0.5501, 17.37, 0.009707]
[0.7684, 25.09, 0.01885]
[0.8899, 30.99, 0.02748]
[0.9688, 35.94, 0.03561]
[1.025, 40.29, 0.04329]
[1.381, 11.87, 0.1737]], dtype=float64)
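Power transformers typically use the Yeo-Johnson family, fitting a per-column exponent λ by maximum likelihood to make the result as Gaussian as possible (whether Deepbox standardizes afterwards, and which λ values it fitted here, is not shown, so the sketch below illustrates the transform itself rather than reproducing the numbers above). For non-negative x the transform is ((x + 1)^λ − 1) / λ for λ ≠ 0, reducing to log(x + 1) at λ = 0 and the identity at λ = 1:

```typescript
// Yeo-Johnson for non-negative inputs (the negative-x branch is omitted
// in this sketch; the full transform handles it with a mirrored formula)
function yeoJohnson(x: number, lambda: number): number {
  if (x < 0) throw new Error("negative branch omitted in this sketch");
  return lambda === 0 ? Math.log(x + 1) : ((x + 1) ** lambda - 1) / lambda;
}

console.log(yeoJohnson(4, 1)); // λ = 1 is the identity: 4
console.log(yeoJohnson(4, 0.5).toFixed(4)); // λ < 1 compresses large values: 2.4721
```

The fitted λ < 1 for each column above is why the gap between 5 and 100 shrinks so dramatically relative to the gaps among 1 through 5.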
--- Part 7: QuantileTransformer ---
Transformed (first 3 rows):
tensor([[0, 0.2000, 0]
[0.2000, 0.4000, 0.2000]
[0.4000, 0.6000, 0.4000]
[0.6000, 0.8000, 0.6000]
[0.8000, 1, 0.8000]
[1, 0, 1]], dtype=float64)
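With only six samples, the uniform output is just each value's empirical rank mapped to [0, 1] as rank / (n − 1), which is why every column becomes exactly {0, 0.2, 0.4, 0.6, 0.8, 1} regardless of the raw magnitudes. A rank-based sketch (assuming no ties, as in this data):

```typescript
const col = [1, 2, 3, 4, 5, 100];
const sorted = [...col].sort((a, b) => a - b);
const scaled = col.map((v) => sorted.indexOf(v) / (col.length - 1));
console.log(scaled.join(", ")); // 0, 0.2, 0.4, 0.6, 0.8, 1
```

This makes QuantileTransformer the most aggressive option here: it discards the magnitudes entirely and keeps only the ordering, which tames any outlier but also erases the information that 100 was far from 5.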
=== Preprocessing: Scalers Complete ===