The Problem
High-dimensional datasets are difficult to visualize, interpret, and compare across experiments. Teams often need consistent dimensionality reduction implementations across Python, Scala, and Spark to move from exploration to production.
What I Built
A compact library of dimensionality reduction implementations in Python, Scala, and PySpark, built to support both notebook exploration and large-scale Spark jobs. The repo focuses on clear, reproducible code that makes it easy to compare approaches across stacks.
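To illustrate the kind of implementation such a repo mirrors across stacks, here is a minimal PCA sketch in NumPy. The function name and signature are illustrative assumptions, not the library's actual API:

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components.

    Hypothetical sketch for illustration; not the library's real API.
    """
    # Center the data so the SVD recovers directions of maximum variance
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Project the centered data onto the leading components
    return X_centered @ components.T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = pca(X, 2)
print(Z.shape)  # (100, 2)
```

Keeping each language's version this small is what makes side-by-side benchmarking across Python, Scala, and PySpark tractable.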
Key Results
- 62 GitHub stars as a community reference
- Cross-language implementations that mirror each other for easier benchmarking
- Spark-friendly workflows for scaling to larger datasets