Gemma4 31B - Layer Analysis Study
60-layer transformer analysis: which layers matter, which are redundant, and what can be safely pruned.
300 Probe Prompts (6 categories)
Architecture: Full Attention (10): layers 5, 11, 17, 23, 29, 35, 41, 47, 53, 59
Phase A: Block Influence & Activation Analysis
Measures how much each layer modifies the residual stream. Low BI = layer barely changes anything = redundant.
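A common way to compute BI is one minus the mean cosine similarity between the hidden states entering and leaving a layer; the sketch below assumes that definition and that the two residual-stream snapshots are captured with forward hooks (the study's exact formula may differ):

```python
import numpy as np

def block_influence(h_in, h_out):
    """Block Influence of one layer: 1 - mean cosine similarity between
    the residual stream entering (h_in) and leaving (h_out) the layer.
    Shapes: (tokens, hidden), captured e.g. with forward hooks."""
    num = (h_in * h_out).sum(axis=-1)
    den = np.linalg.norm(h_in, axis=-1) * np.linalg.norm(h_out, axis=-1)
    return float(1.0 - (num / np.maximum(den, 1e-9)).mean())

# A layer that leaves the stream untouched scores BI ≈ 0
h = np.random.randn(32, 64)
print(block_influence(h, h))  # ≈ 0 for an identity-like layer
```

Under this definition a BI below 0.005 means the layer's output is nearly parallel to its input, i.e. the layer is close to a no-op on the residual stream.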
Block Influence (BI) Score per Layer
BI Score by Category (Heatmap)
Residual Stream Norm Evolution
Phase A Key Findings
- Layers 10-22 have extremely low BI (<0.005) across ALL categories — they barely modify the residual stream
- Layers 52-59 have the highest BI (0.1-0.95) — these are the "decision" layers
- Full-attention layers 11, 17 are surprisingly redundant (BI <0.004)
- Cross-category variance is near zero for the redundant layers — they're uniformly unused
Phase B: Logit Lens — Where Decisions Happen
Projects each layer's hidden state through the final norm+lm_head to see what token it would predict. Reveals at which layer the model "commits" to an answer.
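The projection step can be sketched in numpy; the RMSNorm final norm and the toy unembedding matrix below are assumptions for illustration, not the study's actual weights:

```python
import numpy as np

def rms_norm(h, weight, eps=1e-6):
    # Assumption: the final norm is RMSNorm, as in Gemma-family models
    return h / np.sqrt((h * h).mean(axis=-1, keepdims=True) + eps) * weight

def logit_lens_rank(hidden, norm_weight, unembed, target_id):
    """Rank of the target token when one layer's hidden state is
    projected through the final norm and the unembedding matrix."""
    logits = rms_norm(hidden, norm_weight) @ unembed.T  # (vocab,)
    return int((logits > logits[target_id]).sum()) + 1  # 1 = top-1

# Toy example: 3-dim hidden state, 4-token vocabulary
unembed = np.array([[1., 0., 0.],
                    [0., 1., 0.],
                    [0., 0., 1.],
                    [0., 0., 0.]])
h = np.array([1., 0., 0.])  # hidden state aligned with token 0
print(logit_lens_rank(h, np.ones(3), unembed, 0))  # → 1 (already top-1)
```

Running this per layer over the probe prompts yields the rank curves plotted below; a rank near vocabulary size means the layer's state carries no vocabulary-space signal yet.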
Average Target Token Rank per Layer (log scale)
Decision Layer per Category
Average layer at which the ground-truth token first enters top-1
Phase B Key Findings
- Until layer ~30, the target token rank is ~250k (random) — no decision has been made
- Layers 57-59 are where the rank collapses: 51 → 3 → 1 (the "decision cascade")
- Conversational prompts decide earliest (layer 49); factual and multilingual decide latest (layers 58-59)
- The first 30 layers work in a latent space that doesn't project to vocabulary
Phase C: Ablation Study — What Breaks When You Remove a Layer
Each layer is individually disabled (skip ablation) and the perplexity delta is measured. Negative delta = removing this layer IMPROVES the model.
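Skip ablation can be sketched as swapping a decoder layer for an identity module and re-measuring perplexity. A minimal sketch, assuming the Hugging Face contract where a decoder layer returns a tuple whose first element is the hidden states (this varies across versions and architectures, and the layer path shown is hypothetical):

```python
import math
import torch

class SkipLayer(torch.nn.Module):
    """Identity stand-in for a skipped decoder layer: the residual
    stream passes through unchanged."""
    def forward(self, hidden_states, *args, **kwargs):
        return (hidden_states,)  # keep the tuple contract (assumption)

def perplexity(nlls):
    """Perplexity from per-token negative log-likelihoods (nats)."""
    return math.exp(sum(nlls) / len(nlls))

def ppl_delta_pct(ablated_ppl, baseline_ppl):
    """Perplexity delta as % of baseline; negative = ablation helps."""
    return 100.0 * (ablated_ppl - baseline_ppl) / baseline_ppl

# e.g. model.model.layers[17] = SkipLayer()  # hypothetical module path
```

The same machinery covers block ablation: replace several layers at once, then compare the joint delta against the sum of the single-layer deltas.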
Single-Layer Ablation: Perplexity Delta (% change from baseline)
Block Ablation: Dropping Multiple Layers Together
Phase C Key Findings
- Layers 23-38 have NEGATIVE deltas — removing them individually IMPROVES perplexity by 20-74%
- Layers 0, 1, 58, 59 are absolutely critical — removing any one destroys the model
- But compound effects are huge: dropping 10 "safe" layers together doubles perplexity (+100%)
- The gap between single-layer and block ablation shows layers compensate for each other
Synthesis: Evidence-Based Drop Plan
Combined Safety Score per Layer
Combining BI score (Phase A), cross-category variance, and ablation impact (Phase C)
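One way to combine the three signals is to min-max normalize each across layers and average them into a "risk", with safety as its complement; the equal weighting below is an assumption, since the study's exact formula is not given:

```python
import numpy as np

def minmax(x):
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min() + 1e-12)

def safety_score(bi, cat_var, ablation_delta_pct):
    """Per-layer 'safe to drop' score in [0, 1]: high when BI is low,
    cross-category variance is low, and the single-layer ablation
    delta is low or negative. Equal weights are an assumption."""
    risk = (minmax(bi) + minmax(cat_var) + minmax(ablation_delta_pct)) / 3.0
    return 1.0 - risk

# Demo with two hypothetical layers: one inert, one decision layer
s = safety_score(bi=[0.001, 0.9], cat_var=[0.0, 0.5],
                 ablation_delta_pct=[-20.0, 900.0])
print(s)  # inert layer scores near 1.0, decision layer near 0.0
```

Ranking layers by this score and dropping from the top reproduces the kind of conservative/moderate/aggressive plans listed below.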
All 60 Layers — Complete Profile
Final Recommendations
- Drop 10 (conservative): layers [4,9,10,11,14,15,16,19,21,22] — PPL +100% (needs LoRA recovery)
- Drop 14 (moderate): adds [8,12,13,18] — PPL +201% (needs strong LoRA)
- Drop 18+ (aggressive): PPL +531%+ — likely irrecoverable with LoRA alone
- Never touch: layers 0, 1, 58, 59 (PPL explosion >1000%)
- Surprise: full-attention layer 11 is safe to drop (BI=0.004, ablation -12%)
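Applying a drop plan amounts to removing the listed entries from the model's decoder stack before LoRA recovery. A minimal sketch, assuming a Hugging Face-style model with a `torch.nn.ModuleList` at `model.model.layers` (the attribute path varies by architecture):

```python
import types
import torch

def drop_layers(model, drop):
    """Remove the listed decoder layers in place; survivors keep their
    order. The model then needs LoRA/finetuning recovery, per above."""
    keep = [layer for i, layer in enumerate(model.model.layers)
            if i not in set(drop)]
    model.model.layers = torch.nn.ModuleList(keep)
    model.config.num_hidden_layers = len(keep)
    return model

# Toy stand-in for a 60-layer checkpoint (hypothetical module layout)
model = torch.nn.Module()
model.model = torch.nn.Module()
model.model.layers = torch.nn.ModuleList(torch.nn.Identity() for _ in range(60))
model.config = types.SimpleNamespace(num_hidden_layers=60)

drop_layers(model, [4, 9, 10, 11, 14, 15, 16, 19, 21, 22])  # conservative plan
print(len(model.model.layers))  # → 50
```

Updating `num_hidden_layers` keeps the config consistent if the pruned checkpoint is saved and reloaded.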