clustermon

UMAP of the Pokédex

Building a multi-lens UMAP projection of every base-form Pokémon, then clustering and visualizing the result. Notes from the workbench.

Try the Cross-Lens Explorer →

Step 01

UMAP

dimensionality reduction over the Pokédex feature matrix

We loaded the canonical PokéAPI tables into a single SQLite database, flattened every species into one wide feature vector (stats, types, abilities, move pool, egg groups, color, shape, habitat, growth rate, generation), z-scored the numeric block, then ran UMAP nine times (once per lens) at two output dimensionalities each. Four lenses come from the structured features above; two more (flavor, sprite) are dense embeddings of Pokédex text and official artwork; two are type-supervised fine-tunes of those encoders (flavor-ft, sprite-ft); and all is the L2-normalized concatenation of the four structured sub-blocks. The 2-D output is for the eye; the 10-D output is for HDBSCAN later.
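The preprocessing described above can be sketched in a few lines. This is only an illustration on toy data: it assumes the "L2-normalized concatenation" means each structured sub-block is row-normalized to unit length before the blocks are joined, so that a wide block such as moves cannot dominate a narrow one such as stats. The function names (`zscore_columns`, `l2_normalize`, `build_all_lens`) are hypothetical and not taken from the project.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid divide-by-zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def l2_normalize(row):
    """Scale one row to unit Euclidean length (all-zero rows are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row] if norm > 0 else list(row)

def build_all_lens(blocks):
    """Concatenate per-block, row-normalized features (assumed reading of 'all')."""
    n_rows = len(blocks[0])
    combined = []
    for i in range(n_rows):
        row = []
        for block in blocks:
            row.extend(l2_normalize(block[i]))
        combined.append(row)
    return combined

# Toy stand-ins: 3 species, a 2-column numeric block and a 3-column one-hot block.
stats_block = zscore_columns([[35, 55], [78, 84], [20, 10]])
types_block = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
all_lens = build_all_lens([stats_block, types_block])
```

With this reading, every sub-block contributes a unit-length slice to each row, whatever its width.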

Shape reduction · what UMAP actually did

Lens	Input (rows × features)	Output (rows × 2)	Reduction ratio
all	1,025 × 1,123	→1,025 × 2	561.5×
stats	1,025 × 6	→1,025 × 2	3×
types	1,025 × 36	→1,025 × 2	18×
abilities	1,025 × 284	→1,025 × 2	142×
moves	1,025 × 797	→1,025 × 2	398.5×
flavor	1,025 × 384	→1,025 × 2	192×
flavor-ft	1,025 × 384	→1,025 × 2	192×
sprite	1,025 × 512	→1,025 × 2	256×
sprite-ft	1,025 × 512	→1,025 × 2	256×
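As a quick arithmetic check on the table, the widths of the four structured lenses should add up to the width of the all lens, and each ratio is simply input width divided by 2. The snippet below only restates numbers from the table.

```python
# Feature widths per lens, copied from the table above.
lens_widths = {
    "stats": 6, "types": 36, "abilities": 284, "moves": 797,
    "flavor": 384, "flavor-ft": 384, "sprite": 512, "sprite-ft": 512,
}
structured = ("stats", "types", "abilities", "moves")

# The all lens is the concatenation of the four structured sub-blocks.
lens_widths["all"] = sum(lens_widths[name] for name in structured)

# Reduction ratio for the 2-d output: input features / 2.
ratios = {name: width / 2 for name, width in lens_widths.items()}
```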

UMAP scatter of every base-form Pokémon, with five spotlight species labeled. — 1,025 grey dots = species. Labeled larger dots = the five spotlights below. The geometry is real: nearby dots are Pokémon whose 1,194-dim feature vectors UMAP placed close together.

five spotlight rows · line-level "what UMAP did"

Pokémon	Stats · types · abilities	Feature density (1,194-d)	After (u0, u1)	5 nearest UMAP neighbors
#0025 pikachu	stats 35/55/40/50/50/90 · types electric · ability static, lightning rod	127/1,194 dims set (11% density)	(6.02, 4.13)	wattrel d=0.01, blitzle d=0.05, pawmo d=0.06, elekid d=0.07, mareep d=0.07
#0006 charizard	stats 78/84/78/109/85/100 · types fire / flying · ability blaze, solar power	154/1,194 dims set (13% density)	(-1.47, 3.60)	magmar d=0.06, pyroar d=0.12, delphox d=0.16, typhlosion d=0.16, rapidash d=0.18
#0150 mewtwo (legendary)	stats 106/110/90/154/90/130 · types psychic · ability pressure, unnerve	189/1,194 dims set (16% density)	(3.40, 4.76)	calyrex d=0.05, celebi d=0.05, azelf d=0.08, deoxys d=0.09, victini d=0.11
#0129 magikarp	stats 20/10/55/15/20/80 · types water · ability swift swim, rattled	27/1,194 dims set (2% density)	(1.84, -2.18)	wiglett d=0.09, arrokuda d=0.10, tympole d=0.10, poliwag d=0.12, goldeen d=0.12
#0143 snorlax	stats 160/110/65/65/110/30 · types normal · ability immunity, thick fat	160/1,194 dims set (13% density)	(-0.17, 6.53)	kecleon d=0.02, exploud d=0.05, oinkologne d=0.07, stantler d=0.09, greedent d=0.10

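# umap params for the 2-d viz pass (the 10-d clusterable pass uses min_dist=0.0)

import umap  # provided by the umap-learn package

reducer = umap.UMAP(
  n_components=2,
  min_dist=0.1,
  n_neighbors=30,
  metric="cosine",
  random_state=42,    # fixed seed for a repeatable layout
  n_jobs=1,
  transform_seed=42,
).fit(X)  # X: one lens matrix, 1,025 rows × up to 1,194 feature columns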
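The two passes can share one parameter builder. The sketch below assumes the 10-d pass differs from the 2-d pass only in `n_components` and `min_dist`; the source states the `min_dist` difference, and the rest is an assumption. `umap_kwargs` and `fit_both` are hypothetical helper names.

```python
def umap_kwargs(n_components):
    """Keyword arguments for one UMAP pass (2-d for viewing, 10-d for clustering)."""
    return dict(
        n_components=n_components,
        # Assumption: every setting except min_dist is shared between the passes.
        min_dist=0.1 if n_components == 2 else 0.0,
        n_neighbors=30,
        metric="cosine",
        random_state=42,
        n_jobs=1,
        transform_seed=42,
    )

def fit_both(X):
    """Return (2-d coords, 10-d coords) for one lens matrix X."""
    import umap  # imported lazily so the parameter builder has no dependencies
    viz = umap.UMAP(**umap_kwargs(2)).fit_transform(X)
    clusterable = umap.UMAP(**umap_kwargs(10)).fit_transform(X)
    return viz, clusterable
```

Calling `fit_both(X)` once per lens would give the nine 2-d and nine 10-d outputs described in Step 01.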

Step 01b

UMAP parameter sweep

140 fits across n_neighbors × min_dist, per lens

UMAP has two knobs that dominate the look of the projection. n_neighbors sets how many nearest neighbors each point uses to build the local manifold; small values (5) preserve fine-grained local detail, large values (100) emphasize global structure. min_dist sets the floor on point separation in the embedding: 0.0 lets clusters pack tight (best for downstream HDBSCAN), 0.5 leaves breathing room (better for the eye). Each panel below is one full UMAP fit at those settings; points are colored by primary type so you can see when type structure resolves and when it dissolves.

UMAP parameter sweep small-multiples for all lens — Scan top-to-bottom to see *n_neighbors* trade local for global structure; scan left-to-right to see *min_dist* loosen tight clumps into breathable shapes. The headline 2-d projection on Step 01 uses `n_neighbors=30, min_dist=0.1` (row 3, col 2); the 10-d input to HDBSCAN uses `min_dist=0.0` to give the clusterer maximum density to work with.
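A sweep like this is a Cartesian product over the two knobs. The grid below is a guess: the endpoints (5 and 100 for `n_neighbors`, 0.0 and 0.5 for `min_dist`) and the position of `(30, 0.1)` at row 3, col 2 come from the text, while the intermediate values are hypothetical placeholders. `sweep_grid`, `fit_panel`, and `run_sweep` are illustrative names.

```python
from itertools import product

N_NEIGHBORS = [5, 15, 30, 50, 100]  # rows; 15 and 50 are assumed values
MIN_DIST = [0.0, 0.1, 0.25, 0.5]    # columns; 0.25 is an assumed value

def sweep_grid(n_neighbors=N_NEIGHBORS, min_dist=MIN_DIST):
    """Yield (row, col, params) for every panel, 1-indexed like the figure."""
    for (r, nn), (c, md) in product(enumerate(n_neighbors, 1), enumerate(min_dist, 1)):
        yield r, c, {"n_neighbors": nn, "min_dist": md}

def fit_panel(X, n_neighbors, min_dist):
    """One full 2-d UMAP fit at the given settings."""
    import umap  # lazy import: only needed when a panel is actually fitted
    return umap.UMAP(
        n_components=2, n_neighbors=n_neighbors, min_dist=min_dist,
        metric="cosine", random_state=42,
    ).fit_transform(X)

def run_sweep(X):
    """Fit every panel of the grid for one lens matrix X."""
    return {(r, c): fit_panel(X, **params) for r, c, params in sweep_grid()}

# Panel settings keyed by (row, col), without fitting anything.
panels = {(r, c): params for r, c, params in sweep_grid()}
```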

Step 02

HDBSCAN

density-based clusters over the 10-d UMAP outputs

HDBSCAN finds clusters of varying density without being told how many to look for. We feed it the 10-d UMAP output (the 2-d is for the eye; 10-d gives the clusterer room to find structure), run it across all nine lenses, and score each result against primary type as ground truth. Headline numbers at mcs=15: the all lens (L2-normalized concat of the four structured sub-blocks) and the types lens both clear ARI ≈ 0.93; the moves lens recovers type at ARI ≈ 0.54 without ever seeing a type label.
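For readers who want to see what the ARI score measures, here is a small dependency-free version of the adjusted Rand index. It is a sketch, not the project's scoring code: in particular it treats HDBSCAN's noise label (`-1`) as just another cluster, which is an assumption about how noise was handled.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting agreement between two labelings, corrected for chance."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: nothing to disagree about
    # Pairs that fall in the same cell of the contingency table.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = same_true * same_pred / total_pairs
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything together
    return (same_both - expected) / (max_index - expected)

# Toy check: primary types vs. cluster ids for four species.
types = ["fire", "fire", "water", "grass"]
clusters = [0, 0, 1, 1]
score = adjusted_rand_index(types, clusters)
```

An ARI of 1 means the clusters reproduce the type partition exactly (up to renaming), and values near 0 mean agreement no better than chance, which is how the stats lens row should be read.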

dial · min_cluster_size = 15

8 · 15 · 30 · 60

smaller mcs = more, smaller clusters · larger mcs = fewer, broader clusters. The table, scatter, and spotlight rows below swap to precomputed values for the selected mcs.

cluster recovery per lens · one block per dial setting (default mcs=15) · truth = primary type

mcs	Lens	Clusters	Noise	ARI vs type	ARI vs type-pair	NMI vs type
8	abilities	31	150 (14.6%)	0.139	0.104	0.377
8	all	18	35 (3.4%)	0.933	0.486	0.952
8	flavor	17	537 (52.4%)	0.108	0.076	0.281
8	flavor-ft	5	4 (0.4%)	0.032	0.014	0.199
8	moves	17	145 (14.1%)	0.561	0.303	0.633
8	sprite	14	592 (57.8%)	0.155	0.117	0.371
8	sprite-ft	21	75 (7.3%)	0.793	0.354	0.837
8	stats	2	0 (0.0%)	-0.002	-0.001	0.007
8	types	18	0 (0.0%)	0.958	0.463	0.985
15	abilities	22	131 (12.8%)	0.127	0.087	0.340
15	all	18	35 (3.4%)	0.933	0.486	0.952
15	flavor	11	495 (48.3%)	0.072	0.044	0.204
15	flavor-ft	5	4 (0.4%)	0.032	0.014	0.199
15	moves	14	143 (14.0%)	0.536	0.284	0.616
15	sprite	2	111 (10.8%)	-0.002	0.004	0.031
15	sprite-ft	19	58 (5.7%)	0.800	0.376	0.839
15	stats	2	0 (0.0%)	-0.002	-0.001	0.007
15	types	18	0 (0.0%)	0.958	0.463	0.985
30	abilities	14	97 (9.5%)	0.120	0.073	0.282
30	all	16	62 (6.0%)	0.969	0.452	0.962
30	flavor	8	518 (50.5%)	0.069	0.040	0.184
30	flavor-ft	4	1 (0.1%)	0.032	0.014	0.190
30	moves	8	25 (2.4%)	0.249	0.119	0.484
30	sprite	2	111 (10.8%)	-0.002	0.004	0.031
30	sprite-ft	15	35 (3.4%)	0.742	0.315	0.814
30	stats	2	0 (0.0%)	-0.002	-0.001	0.007
30	types	16	29 (2.8%)	0.994	0.434	0.995
60	abilities	2	43 (4.2%)	0.004	0.002	0.025
60	all	6	67 (6.5%)	0.332	0.110	0.694
60	flavor	3	536 (52.3%)	0.033	0.014	0.078
60	flavor-ft	2	55 (5.4%)	0.020	0.009	0.097
60	moves	6	14 (1.4%)	0.229	0.109	0.432
60	sprite	4	523 (51.0%)	0.101	0.044	0.166
60	sprite-ft	6	157 (15.3%)	0.414	0.134	0.642
60	stats	2	0 (0.0%)	-0.002	-0.001	0.007
60	types	7	99 (9.7%)	0.487	0.163	0.799

types lens trivially recovers types (the input one-hots ARE the labels). stats lens can't (6 dims have no density structure). moves is the genuine finding: move pools predict type without supervision.

baseline clusterers vs HDBSCAN · ARI vs primary type · k-sweep ∈ {6, 11, 18, 30, 50}

Lens	HDBSCAN	k-means best	DBSCAN	GMM best	Best baseline	Δ vs HDBSCAN	Supervised UMAP + HDBSCAN	Δ vs best unsup
all	0.933	0.788 (k=18)	0.878	0.892 (k=18)	0.892 (gmm@18)	+0.041	0.958	+0.026
stats	-0.002	0.025 (k=18)	0.000	0.025 (k=30)	0.025 (kmeans@18)	-0.027	0.033	+0.008
types	0.958	1.000 (k=18)	0.964	1.000 (k=18)	1.000 (kmeans@18)	-0.042	0.960	-0.040
abilities	0.127	0.116 (k=50)	0.104	0.108 (k=30)	0.116 (kmeans@50)	+0.011	0.283	+0.156
moves	0.536	0.426 (k=11)	0.231	0.377 (k=18)	0.426 (kmeans@11)	+0.110	0.899	+0.363
flavor	0.072	0.045 (k=30)	0.001	0.046 (k=18)	0.046 (gmm@18)	+0.026	0.481	+0.409
flavor-ft	0.032	0.185 (k=18)	0.032	0.167 (k=11)	0.185 (kmeans@18)	-0.153	0.608	+0.423
sprite	-0.002	0.094 (k=11)	-0.001	0.093 (k=18)	0.094 (kmeans@11)	-0.096	0.537	+0.443
sprite-ft	0.800	0.745 (k=18)	0.522	0.740 (k=18)	0.745 (kmeans@18)	+0.055	0.934	+0.134

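baselines fit on the same 10-d UMAP coords as HDBSCAN. k-means + GMM are forced to pick k; HDBSCAN discovers it. Positive Δ vs HDBSCAN means HDBSCAN beat the best baseline (k-means, DBSCAN, or GMM). Δ vs best unsup compares supervised UMAP + HDBSCAN against the best unsupervised score in the row.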
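The Δ vs HDBSCAN column is plain subtraction: HDBSCAN's ARI minus the best baseline ARI in the same row. The snippet below recomputes it for three rows copied from the table; `delta_vs_hdbscan` is an illustrative helper name.

```python
# ARI vs primary type, copied from three rows of the table above.
rows = {
    "all":       {"hdbscan": 0.933, "kmeans": 0.788, "dbscan": 0.878, "gmm": 0.892},
    "moves":     {"hdbscan": 0.536, "kmeans": 0.426, "dbscan": 0.231, "gmm": 0.377},
    "flavor-ft": {"hdbscan": 0.032, "kmeans": 0.185, "dbscan": 0.032, "gmm": 0.167},
}

def delta_vs_hdbscan(row):
    """Return (best baseline name, HDBSCAN ARI minus best baseline ARI)."""
    baselines = {name: ari for name, ari in row.items() if name != "hdbscan"}
    best = max(baselines, key=baselines.get)
    return best, round(row["hdbscan"] - baselines[best], 3)

deltas = {lens: delta_vs_hdbscan(row) for lens, row in rows.items()}
```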

fine-tuned encoders · sprite + flavor → type1 contrastive

Two parallel fine-tunes ask the same question of two modalities: does in-domain type supervision unlock combat-type signal that the off-the-shelf encoder doesn't see? sprite-ft takes a ViT-B-32 vision tower and trains on 820 (sprite, type-prompt) contrastive pairs; flavor-ft takes MiniLM-L6-v2 and trains on the matching (Pokédex blurb, type-prompt) pairs. Both use 80/20 stratified splits, fp16 mixed precision, gradual unfreezing (last few blocks then full encoder), cosine LR with 100-step warmup, and early stopping on held-out test ARI. The prompt-encoder side stays frozen in both; the only thing changing is what the input encoder learns to look for.
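The learning-rate schedule named above (cosine decay after a 100-step warmup) can be written as a pure function of the step index. This is a sketch under assumptions: the warmup is taken to be linear, and `base_lr` and `total_steps` are hypothetical values, since the source gives neither.

```python
import math

def lr_at(step, total_steps, base_lr=1e-5, warmup_steps=100):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        # Assumed linear ramp from 0 up to base_lr over the warmup window.
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)  # clamp if training runs past total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Hypothetical 1,000-step run, sampled every 50 steps.
schedule = [lr_at(s, total_steps=1000) for s in range(0, 1001, 50)]
```

The rate peaks exactly at the end of warmup and falls smoothly afterwards, which keeps the earliest updates to a freshly unfrozen encoder small.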

before / after · HDBSCAN @ mcs=15 · ARI vs primary type

Lens	ARI off-the-shelf	ARI fine-tuned (test)	ARI fine-tuned (train)	Δ test
sprite → sprite-ft	-0.002	0.371	0.928	+0.373
flavor → flavor-ft	0.072	0.025	0.033	-0.047

all-1025 ARI is 0.800 for sprite-ft and 0.032 for flavor-ft (train + test pooled); the test column above is the only number that defends against overfitting on a 1,025-sample dataset.

CLIP fine-tune training curves: train/test cross-entropy and test ARI/NMI over epochs

MiniLM fine-tune training curves: train/test cross-entropy and test ARI/NMI over epochs

the two iconic HDBSCAN plots · mcs=15

HDBSCAN condensed-tree dendrogram for all lens. Vertical axis is **λ (lambda)**, the inverse distance threshold (λ = 1/distance) at which clusters split or dissolve. Higher λ = denser neighborhood required. A cluster's *persistence* (its vertical extent on this tree, from the λ where it is born to the λ where it dissolves) shows how stable it is across density scales; the dashed blue boxes mark the clusters HDBSCAN ultimately selected: long persistent bars beat short transient ones.
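To make "persistence" concrete, the toy calculation below uses the usual HDBSCAN notion of cluster stability: for every member point, the λ at which it leaves the cluster minus the λ at which the cluster was born, summed over the members. This is the unnormalized quantity; the library's reported persistence scores may be scaled differently, and the distances used here are invented for illustration.

```python
def lam(distance):
    """Lambda is the inverse of the distance threshold."""
    return 1.0 / distance

def cluster_stability(lambda_birth, lambda_leave):
    """Sum over member points of (lambda when the point leaves - lambda at birth)."""
    return sum(l - lambda_birth for l in lambda_leave)

# A persistent cluster: appears at distance 2.0, its points hold on until 1.0 or 0.5.
persistent = cluster_stability(lam(2.0), [lam(1.0), lam(1.0), lam(0.5), lam(0.5)])

# A transient cluster: appears at distance 2.0, all points fall away by distance 1.6.
transient = cluster_stability(lam(2.0), [lam(1.6)] * 4)
```

Both toy clusters have four points and the same birth λ, yet the first accumulates far more stability, which is the sense in which long bars beat short ones.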

HDBSCAN soft-membership heatmap for all lens — Each row is one Pokémon; each column is one HDBSCAN cluster. Cell brightness = the soft-membership probability that point belongs to that cluster. Rows are sorted by primary (argmax) cluster so coherent bands appear along the diagonal. Rows whose probabilities sum to noticeably less than 1 are *uncertain* points; they sit on cluster boundaries or in low-density regions where the model is hedging (i.e., the soft analog of HDBSCAN noise).
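The reading rules in this caption (argmax column is the primary cluster, a low row sum flags an uncertain point) translate directly into code. The sketch works on a plain list-of-lists membership matrix; in the `hdbscan` library such a matrix would typically come from `all_points_membership_vectors` on a clusterer fitted with prediction data, though whether the page was built that way is an assumption. The 0.9 cutoff is a hypothetical threshold.

```python
def primary_cluster(row):
    """Index of the column with the highest membership probability."""
    return max(range(len(row)), key=row.__getitem__)

def summarize_membership(matrix, uncertain_below=0.9):
    """Per-row primary cluster, total membership mass, and an 'uncertain' flag."""
    summary = []
    for row in matrix:
        total = sum(row)
        summary.append({
            "primary": primary_cluster(row),
            "total": total,
            "uncertain": total < uncertain_below,  # hypothetical cutoff
        })
    return summary

def sort_rows_for_heatmap(matrix):
    """Order rows by primary cluster so bands line up along the diagonal."""
    return sorted(matrix, key=primary_cluster)

# Toy matrix: 4 points, 3 clusters.
membership = [
    [0.05, 0.90, 0.03],  # confident member of cluster 1
    [0.97, 0.01, 0.01],  # confident member of cluster 0
    [0.30, 0.25, 0.10],  # mass sums to 0.65: boundary / low-density point
    [0.02, 0.02, 0.95],  # confident member of cluster 2
]
summary = summarize_membership(membership)
ordered = sort_rows_for_heatmap(membership)
```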

Step 02b

Library-canonical evals

GLOSH outlier scores · cluster persistence · UMAP diagnostics

Three eval surfaces the libraries themselves treat as canonical but the page didn't show until now. GLOSH outlier scores rank every point by how far it sits from any cluster's dense core (∈ [0,1]; 1 = extreme outlier). Cluster persistence quantifies how robustly each cluster survives across the density hierarchy, and serves as the numeric companion to the dendrogram above. UMAP diagnostics show per-region embedding quality so you can spot where the projection had to lie.

cluster persistence · all

LLM coherence 2.09 · 72% type-only · 83% agree

Sorted desc by cluster_persistence_. Color = relative persistence within this lens. Highest-persistence clusters tend to coincide with the highest LLM coherence scores (top-3 by persistence on the moves lens match clusters the judge rated 4 or 5), and noise-heavy lenses like flavor have lower median persistence. Numbers ground against domain truth, not just against each other.

cluster	size	persistence
#1	40	0.7865
#2	27	0.6666
#8	58	0.6177
#9	58	0.6031
#0	99	0.5423
#13	33	0.5379
#4	29	0.4703
#6	79	0.4555
#11	34	0.4492
#7	63	0.4128
#16	46	0.3867
#3	131	0.3675
#10	44	0.3600
#12	40	0.3278
#5	91	0.2973
#14	30	0.2964
#15	46	0.2365
#17	42	0.2125

top GLOSH outliers · all

mean=0.200 · p90=0.525 · max=0.846

Pokémon	Primary type	GLOSH score
#0250 ho-oh	fire	0.846
#0798 kartana	grass	0.846
#0795 pheromosa	bug	0.843
#0794 buzzwole	bug	0.840
#0426 drifblim	ghost	0.831
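The summary line (mean, p90, max) and the top-k list are simple reductions over the per-point outlier scores. The sketch below runs on made-up scores; with the `hdbscan` library the real values would normally be read from the fitted clusterer's `outlier_scores_` attribute. The percentile uses linear interpolation, which is an assumption about how p90 was computed.

```python
from statistics import mean

def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def summarize_scores(scores):
    """Mean, 90th percentile, and max of a collection of outlier scores."""
    return {"mean": mean(scores), "p90": percentile(scores, 90), "max": max(scores)}

def top_outliers(scores_by_name, k=5):
    """Names of the k points with the highest outlier score, highest first."""
    return sorted(scores_by_name, key=scores_by_name.get, reverse=True)[:k]

# Made-up scores for five placeholder points.
toy_scores = {"a": 0.0, "b": 0.1, "c": 0.2, "d": 0.3, "e": 0.9}
stats_line = summarize_scores(list(toy_scores.values()))
top_two = top_outliers(toy_scores, k=2)
```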

UMAP neighborhood diagnostic for all lens — Per-point Jaccard preservation of local neighborhoods (input ↔ embedding). Darker / cooler = better preserved.

UMAP local_dim diagnostic for all lens — Local intrinsic dimensionality: regions where the embedding had to force high-d structure into 2-d.

UMAP pca diagnostic for all lens — Embedding colored by PCA-RGB of input space. Smooth color transitions = global structure preserved.
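The neighborhood diagnostic above reduces to a per-point Jaccard overlap between two k-nearest-neighbor sets, one computed in the input space and one in the embedding. Below is a brute-force sketch on toy points. It uses Euclidean distance in both spaces for simplicity, whereas the pipeline's input metric is cosine, and the value of `k` is a free choice here.

```python
import math

def knn_sets(points, k):
    """For each point, the set of indices of its k nearest other points."""
    sets = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        ranked = sorted(others, key=lambda j: math.dist(p, points[j]))
        sets.append(set(ranked[:k]))
    return sets

def jaccard_preservation(high_dim, low_dim, k):
    """Per-point Jaccard overlap of k-NN sets: 1.0 = neighborhood fully preserved."""
    high_sets = knn_sets(high_dim, k)
    low_sets = knn_sets(low_dim, k)
    return [len(a & b) / len(a | b) for a, b in zip(high_sets, low_sets)]

# Toy data: a faithful embedding and a scrambled one.
high = [(0.0,), (1.0,), (3.0,), (10.0,)]
faithful = [(0.0,), (1.0,), (3.0,), (10.0,)]
scrambled = [(0.0,), (10.0,), (3.0,), (1.0,)]

good = jaccard_preservation(high, faithful, k=1)
bad = jaccard_preservation(high, scrambled, k=1)
```

Brute force is quadratic in the number of points, which is fine at 1,025 species; a larger dataset would want an approximate neighbor index instead.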

UMAP scatter colored by HDBSCAN cluster id, 18 clusters and 35 noise points. — Same 1,025 points as Step 01's grey scatter, now colored by the 18 clusters HDBSCAN found in the 10-d UMAP space. Noise points (×) are species too isolated to confidently belong to any cluster. Spotlights labeled with their cluster id; clusters defined in 10-d sometimes look overlapped in 2-d because dims 3-10 hide separations.

spotlight cluster membership · all lens · one block per dial setting (default mcs=15)

mcs	Pokémon	Cluster · size	Prob.	Outlier	Top 5 same-cluster Pokémon
8	#0025 pikachu	#8 · 58	0.976	0.024	elekid, togedemaru, emolga, dedenne, zebstrika
8	#0006 charizard	#7 · 63	1.000	0.000	magmortar, centiskorch, heatran, magmar, ninetales
8	#0150 mewtwo	#9 · 58	1.000	0.000	necrozma, mew, mesprit, calyrex, solgaleo
8	#0129 magikarp	#3 · 131	1.000	0.000	feebas, finneon, arrokuda, tympole, luvdisc
8	#0143 snorlax	#5 · 91	1.000	0.000	terapagos, kecleon, blissey, dudunsparce, audino
15	#0025 pikachu	#8 · 58	1.000	0.000	elekid, togedemaru, emolga, dedenne, zebstrika
15	#0006 charizard	#7 · 63	1.000	0.000	magmortar, centiskorch, heatran, magmar, ninetales
15	#0150 mewtwo	#9 · 58	0.765	0.235	necrozma, mew, mesprit, calyrex, solgaleo
15	#0129 magikarp	#3 · 131	1.000	0.000	feebas, finneon, arrokuda, tympole, luvdisc
15	#0143 snorlax	#5 · 91	1.000	0.000	terapagos, kecleon, blissey, dudunsparce, audino
30	#0025 pikachu	#6 · 58	1.000	0.000	elekid, togedemaru, emolga, dedenne, zebstrika
30	#0006 charizard	#5 · 63	1.000	0.000	magmortar, centiskorch, heatran, magmar, ninetales
30	#0150 mewtwo	#7 · 58	0.897	0.103	necrozma, mew, mesprit, calyrex, solgaleo
30	#0129 magikarp	#2 · 131	1.000	0.000	feebas, finneon, arrokuda, tympole, luvdisc
30	#0143 snorlax	#3 · 120	1.000	0.000	terapagos, kecleon, blissey, dudunsparce, audino
60	#0025 pikachu	#5 · 466	0.663	0.484	elekid, togedemaru, emolga, dedenne, zebstrika
60	#0006 charizard	#4 · 63	1.000	0.000	magmortar, centiskorch, heatran, magmar, ninetales
60	#0150 mewtwo	#5 · 466	0.709	0.449	necrozma, mew, mesprit, calyrex, solgaleo
60	#0129 magikarp	#1 · 131	1.000	0.000	feebas, finneon, arrokuda, tympole, luvdisc
60	#0143 snorlax	#2 · 120	1.000	0.000	terapagos, kecleon, blissey, dudunsparce, audino

interactive · open in a new tab

Each lens has its own self-contained datamapplot HTML: zoom, pan, search by Pokémon name, hover any point for its type and cluster id, cluster centroids labeled with the dominant type.

all → · stats → · types → · abilities → · moves → · flavor → · flavor-ft → · sprite → · sprite-ft →

interactive · cross-lens explorer

Pick any Pokémon and see how each of the nine lenses describes its neighborhood. Same creature, nine different recommendation engines, watch them disagree. Click any neighbor sprite to navigate. This is the project's deliverable demo.

Open explorer →

What we learned

Three things the machinery told us

Findings that weren't visible until the pipeline produced numbers

Move pools alone encode most of type identity. The moves lens recovers primary type at ARI 0.54 without ever seeing a type label, and reaches 0.90 once UMAP is supervised with type labels. After the L2-balanced rebuild, the combined all lens reaches 0.93 unsupervised, close to its supervised ceiling of 0.96, meaning the structured features taken together carry the type signal almost completely, with moves doing most of the heavy lifting. The dual-type (type1, type2) ARI is a stricter measure: at mcs=15 it lands at 0.49 for all and 0.46 for types, and only 0.28 for moves, suggesting secondary type is genuinely harder to recover from any single feature surface.
Fine-tuning works for vision, but text remains harder: the modality matters. The pretrained sprite lens scored ARI -0.002 vs primary type; fine-tuning the ViT-B-32 vision tower on 820 (sprite, type-prompt) pairs lifted held-out test ARI to 0.371. Running the same recipe on the text side, fine-tuning MiniLM on (Pokédex blurb, type-prompt) pairs, does not help: ARI goes from 0.072 off the shelf to 0.025 on the held-out test split. Pokédex blurbs are written without combat type in mind; even with supervision the encoder can only find what's in the input.
GLOSH outliers on the flavor lens cleanly surface Ultra Beasts. Four of the top five by outlier score are Ultra Beasts: Kartana (0.839), Celesteela (0.838), Pheromosa (0.837), Nihilego (0.837). They are the canonical "weird" Pokémon subgroup, and they surface with no supervision and no mention of "Ultra Beast" anywhere in the feature matrix. The eval surface was independently validated against domain truth.
