Question
Does the suppressed part of a language model's output distribution (the non-argmax tokens) carry pragmatic and social meaning that the chosen tokens don't?
Setup
Model: Qwen2.5-7B (4-bit quantized) on RTX 3060 Ti (8 GB). Four analysis parts:
- (A) Shadow narratives — generate text by following the top-2 token at each step instead of the top-1.
- (B) Shadow divergence — compute Jensen-Shannon divergence (JSD) on 14 minimal pairs, over the full distribution vs the shadow only (excluding the top-1 token), and measure the amplification ratio.
- (C) Entropy geography — per-token entropy across 8 text genres.
- (D) Trace analysis — compare surface vs shadow JSD for negation, hedging, irony, quantifiers, and temporal shifts.
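Part (B)'s shadow-only JSD can be sketched as below. This is a minimal sketch assuming the top-1 token is zeroed out and the remaining mass renormalized (the study's exact masking rule may differ); note that `scipy.spatial.distance.jensenshannon` returns the JS *distance* (square root of the divergence), so it is squared here.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def shadow_jsd(p, q, shadow_only=False):
    """JSD between two next-token distributions.

    With shadow_only=True, the argmax token of each distribution is
    removed and the remainder renormalized, so the comparison covers
    only the suppressed alternatives. (Assumed masking rule.)
    """
    p = np.asarray(p, dtype=float).copy()
    q = np.asarray(q, dtype=float).copy()
    if shadow_only:
        p[p.argmax()] = 0.0
        q[q.argmax()] = 0.0
        p /= p.sum()
        q /= q.sum()
    # scipy returns the JS distance (sqrt of divergence); square it.
    return jensenshannon(p, q, base=2) ** 2

# Toy minimal pair: identical argmax, mass shuffled among alternatives.
p = np.array([0.70, 0.15, 0.10, 0.05])
q = np.array([0.70, 0.05, 0.10, 0.15])
amplification = shadow_jsd(p, q, shadow_only=True) / shadow_jsd(p, q)
```

In this toy case the divergence lives entirely among the non-argmax tokens, so the shadow-only JSD amplifies, mirroring the euphemism rows in the table below.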
Results (11 of the 14 minimal pairs shown)
| Pair Type | JSD Full Distribution | JSD Shadow Only | Amplification |
|---|---|---|---|
| Corporate euphemism (let go / fired) | 0.088 | 0.215 | 2.45x |
| Register shift (passed away / died) | 0.033 | 0.076 | 2.30x |
| Effort implicature (managed to / did) | 0.058 | 0.104 | 1.80x |
| Modal hedging (may be unsafe / is unsafe) | 0.120 | 0.208 | 1.73x |
| Irony vs literal (wonderful day in traffic / traffic is terrible) | 0.149 | 0.247 | 1.66x |
| Hedged vs bare request | 0.186 | 0.264 | 1.42x |
| Hint vs command | 0.174 | 0.227 | 1.31x |
| Sarcasm vs neutral | 0.187 | 0.233 | 1.25x |
| Active vs passive voice | 0.078 | 0.091 | 1.17x |
| Negation vs affirmation (not guilty / innocent) | 0.008 | 0.009 | 1.13x |
| Agentless passive (mistakes were made / CEO made mistakes) | 0.206 | 0.189 | 0.92x (dampened) |
Key findings
- Euphemism and register shifts amplify maximally in the shadow (2.3-2.5x). 'Let go' vs 'fired' differ modestly on the surface but diverge dramatically in what they suppress — the model encodes taboo alternatives without expressing them.
- Irony amplifies 1.66x — the literal meaning persists in the shadow distribution even when the model outputs the ironic interpretation.
- Pure semantic equivalences (negation, active/passive) don't amplify (<1.2x). Shadow amplification specifically detects pragmatic and social layers of meaning, not denotational semantics.
- 8 of 14 minimal pairs showed amplification >1.2x. Mean amplification across all pairs: 1.47x. Only one pair showed dampening (agentless passive at 0.92x), where surface JSD was already maximal (0.206).
- Shadow narratives (top-2 token path) produce text that is incoherent but semantically adjacent — dream-like free association that orbits the same topic as the surface text. The surface (top-1) token's probability averages 5x the shadow (top-2) token's.
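The top-2 decoding rule behind the shadow narratives is simple to state. A minimal sketch with toy logits (standing in for the Qwen2.5-7B forward pass, which is assumed):

```python
import numpy as np

def ranked_token(logits, rank=2):
    """Return the id of the rank-th most probable token (rank=1 is argmax).

    Following rank=2 at every decoding step yields the 'shadow narrative'
    path described above; rank=1 yields the ordinary surface path.
    """
    order = np.argsort(logits)[::-1]  # token ids, most probable first
    return int(order[rank - 1])

# Toy step: surface path takes token 3, shadow path takes token 0.
logits = np.array([2.0, 0.5, -1.0, 3.0])
surface = ranked_token(logits, rank=1)
shadow = ranked_token(logits, rank=2)
```

In a real decoding loop the chosen token is fed back into the model at each step, so the shadow path diverges cumulatively rather than token-by-token.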
Lesson learned
The shadow (suppressed alternatives) is a rich, underexplored signal. Standard LLM analysis focuses on argmax or top-k tokens, but the shape of what the model rejects encodes social knowledge (taboo, register, politeness) that the surface output doesn't reveal. Shadow amplification could serve as a metric for detecting euphemism, irony, and register shifts without labeled data.
Tools used
Claude Opus 4 for experiment design and code generation. Qwen2.5-7B (4-bit) as the model under study. scipy for JSD computation.