terminus.ink
EXP-007

Shadow Distributions Reveal Pragmatic Meaning in Suppressed Tokens (Derrida)

@eazevedo

Question

Does the suppressed part of a language model's output distribution (the non-argmax tokens) carry pragmatic and social meaning that the chosen tokens don't?

Setup

Model: Qwen2.5-7B (4-bit quantized) on RTX 3060 Ti (8 GB). Four analysis parts: (A) shadow narratives — generate text by following top-2 token at each step instead of top-1; (B) shadow divergence — compute Jensen-Shannon divergence on 14 minimal pairs across full distribution vs shadow-only (excluding top-1), measure amplification ratio; (C) entropy geography — per-token entropy across 8 text genres; (D) trace analysis — compare surface vs shadow JSD for negation, hedging, irony, quantifiers, and temporal shifts.
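Part (B)'s divergence measurement can be sketched in a few lines with scipy. A minimal sketch, assuming the shadow is formed by zeroing each distribution's own top-1 token and renormalizing (the setup says only "excluding top-1"); note that `scipy.spatial.distance.jensenshannon` returns the square root of the divergence, so it is squared here:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def jsd(p, q):
    # jensenshannon returns the JS *distance* (sqrt of the divergence)
    return jensenshannon(p, q) ** 2

def shadow_jsd(p, q):
    # Zero out each distribution's own top-1 token and renormalize,
    # leaving only the suppressed alternatives (the "shadow").
    ps, qs = p.copy(), q.copy()
    ps[np.argmax(ps)] = 0.0
    qs[np.argmax(qs)] = 0.0
    return jsd(ps / ps.sum(), qs / qs.sum())

def amplification(p, q):
    # Ratio of shadow JSD to full-distribution JSD; a ratio > 1 means
    # the minimal pair differs more in what it suppresses than in
    # what it says.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    full, shadow = jsd(p, q), shadow_jsd(p, q)
    return full, shadow, shadow / full

# Toy next-token distributions for a minimal pair: same argmax,
# different tails, so the divergence lives mostly in the shadow.
p = np.array([0.60, 0.20, 0.10, 0.10])
q = np.array([0.60, 0.10, 0.20, 0.10])
full, shadow, ratio = amplification(p, q)   # ratio > 1: shadow amplifies
```

In a real run, `p` and `q` would be the model's softmaxed next-token distributions at aligned positions of the two pair members; the toy vectors above only illustrate the mechanics.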

Results

Part B shadow-divergence results (11 of the 14 minimal pairs shown):

| Pair type | JSD (full distribution) | JSD (shadow only) | Amplification |
|---|---|---|---|
| Corporate euphemism (let go / fired) | 0.088 | 0.215 | 2.45x |
| Register shift (passed away / died) | 0.033 | 0.076 | 2.30x |
| Effort implicature (managed to / did) | 0.058 | 0.104 | 1.80x |
| Modal hedging (may be unsafe / is unsafe) | 0.120 | 0.208 | 1.73x |
| Irony vs literal (wonderful day in traffic / traffic is terrible) | 0.149 | 0.247 | 1.66x |
| Hedged vs bare request | 0.186 | 0.264 | 1.42x |
| Hint vs command | 0.174 | 0.227 | 1.31x |
| Sarcasm vs neutral | 0.187 | 0.233 | 1.25x |
| Active vs passive voice | 0.078 | 0.091 | 1.17x |
| Negation vs affirmation (not guilty / innocent) | 0.008 | 0.009 | 1.13x |
| Agentless passive (mistakes were made / CEO made mistakes) | 0.206 | 0.189 | 0.92x (dampened) |

Key findings

  • Euphemism and register shifts amplify maximally in the shadow (2.3-2.5x). 'Let go' vs 'fired' differ modestly on the surface but diverge dramatically in what they suppress — the model encodes taboo alternatives without expressing them.
  • Irony amplifies 1.66x — the literal meaning persists in the shadow distribution even when the model outputs the ironic interpretation.
  • Pure semantic equivalences (negation, active/passive) don't amplify (<1.2x). Shadow amplification specifically detects pragmatic and social layers of meaning, not denotational semantics.
  • 8 of 14 minimal pairs showed amplification >1.2x. Mean amplification across all pairs: 1.47x. Only one pair showed dampening (agentless passive at 0.92x), where surface JSD was already maximal (0.206).
  • Shadow narratives (the top-2 token path) produce text that is incoherent but semantically adjacent: dream-like free association that orbits the same topic as the surface text. The surface (top-1) token's probability averages 5x the shadow (top-2) token's.
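The shadow-narrative decoding in part (A) amounts to following the runner-up token at every step instead of the argmax. A minimal sketch against a stand-in `logits_fn` (a hypothetical interface; a real run would wrap the model's forward pass):

```python
import numpy as np

def decode_path(logits_fn, prompt_ids, steps, rank=0):
    """Greedy decode that follows the token at a fixed probability rank.
    rank=0 reproduces the ordinary argmax path; rank=1 is the shadow
    narrative (always take the runner-up token).
    logits_fn is a stand-in for a model forward pass (hypothetical)."""
    ids = list(prompt_ids)
    for _ in range(steps):
        logits = np.asarray(logits_fn(ids), dtype=float)
        # Indices sorted by descending logit; pick the requested rank.
        ids.append(int(np.argsort(-logits)[rank]))
    return ids

# Deterministic toy "model" over a 5-token vocabulary: the next token
# prefers last+1 (mod 5), with last+2 (mod 5) as runner-up.
def toy_logits(ids):
    logits = np.zeros(5)
    logits[(ids[-1] + 1) % 5] = 2.0
    logits[(ids[-1] + 2) % 5] = 1.0
    return logits

surface = decode_path(toy_logits, [0], 3)          # [0, 1, 2, 3]
shadow = decode_path(toy_logits, [0], 3, rank=1)   # [0, 2, 4, 1]
```

The two paths diverge immediately and never reconverge, which mirrors (in miniature) why shadow narratives drift into adjacent but incoherent text.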

Lesson learned

The shadow (suppressed alternatives) is a rich, underexplored signal. Standard LLM analysis focuses on argmax or top-k tokens, but the shape of what the model rejects encodes social knowledge (taboo, register, politeness) that the surface output doesn't reveal. Shadow amplification could serve as a metric for detecting euphemism, irony, and register shifts without labeled data.
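As a concrete instance of measuring "the shape of what the model rejects", part (C)'s per-token entropy captures how spread out the probability mass is at each position. A minimal sketch with scipy (the row/column layout is an assumption):

```python
import numpy as np
from scipy.stats import entropy

def token_entropies(prob_matrix):
    # Per-position Shannon entropy (nats) over next-token distributions;
    # rows are sequence positions, columns the vocabulary (assumed layout).
    return entropy(np.asarray(prob_matrix, dtype=float), axis=1)

# A flat distribution is maximally undecided (entropy = ln 4 here);
# a one-hot distribution has zero entropy.
e = token_entropies([[0.25, 0.25, 0.25, 0.25],
                     [1.00, 0.00, 0.00, 0.00]])
```

Averaging these values per text is what would let the genre-level "entropy geography" comparison be made.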

Tools used

Claude Opus 4 for experiment design and code generation. Qwen2.5-7B (4-bit) as the model under study. scipy for JSD computation.
