terminus.ink
EXP-007

Shadow Distributions Reveal Pragmatic Meaning in Suppressed Tokens (Derrida)

@eazevedo

Question

Does the suppressed part of a language model's output distribution (the non-argmax tokens) carry pragmatic and social meaning that the chosen tokens don't?

Setup

Model: Qwen2.5-7B (4-bit quantized) on RTX 3060 Ti (8 GB). Four analysis parts: (A) shadow narratives — generate text by following top-2 token at each step instead of top-1; (B) shadow divergence — compute Jensen-Shannon divergence on 14 minimal pairs across full distribution vs shadow-only (excluding top-1), measure amplification ratio; (C) entropy geography — per-token entropy across 8 text genres; (D) trace analysis — compare surface vs shadow JSD for negation, hedging, irony, quantifiers, and temporal shifts.
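Part (B)'s divergence measurement can be sketched in a few lines with scipy. A minimal sketch, assuming the shadow is formed by zeroing each distribution's own top-1 token and renormalizing (the setup says only "excluding top-1"); note that `scipy.spatial.distance.jensenshannon` returns the square root of the divergence, so it is squared here:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def jsd(p, q):
    # jensenshannon returns the JS *distance* (sqrt of the divergence)
    return jensenshannon(p, q) ** 2

def shadow_jsd(p, q):
    # Zero out each distribution's own top-1 token and renormalize,
    # leaving only the suppressed alternatives (the "shadow").
    ps, qs = p.copy(), q.copy()
    ps[np.argmax(ps)] = 0.0
    qs[np.argmax(qs)] = 0.0
    return jsd(ps / ps.sum(), qs / qs.sum())

def amplification(p, q):
    # Ratio of shadow JSD to full-distribution JSD; a ratio > 1 means
    # the minimal pair differs more in what it suppresses than in
    # what it says.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    full, shadow = jsd(p, q), shadow_jsd(p, q)
    return full, shadow, shadow / full

# Toy next-token distributions for a minimal pair: same argmax,
# different tails, so the divergence lives mostly in the shadow.
p = np.array([0.60, 0.20, 0.10, 0.10])
q = np.array([0.60, 0.10, 0.20, 0.10])
full, shadow, ratio = amplification(p, q)   # ratio > 1: shadow amplifies
```

In a real run, `p` and `q` would be the model's softmaxed next-token distributions at aligned positions of the two pair members; the toy vectors above only illustrate the mechanics.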

Results

Part B shadow-divergence results (11 of the 14 minimal pairs shown):

| Pair type | JSD (full distribution) | JSD (shadow only) | Amplification |
|---|---|---|---|
| Corporate euphemism (let go / fired) | 0.088 | 0.215 | 2.45x |
| Register shift (passed away / died) | 0.033 | 0.076 | 2.30x |
| Effort implicature (managed to / did) | 0.058 | 0.104 | 1.80x |
| Modal hedging (may be unsafe / is unsafe) | 0.120 | 0.208 | 1.73x |
| Irony vs literal (wonderful day in traffic / traffic is terrible) | 0.149 | 0.247 | 1.66x |
| Hedged vs bare request | 0.186 | 0.264 | 1.42x |
| Hint vs command | 0.174 | 0.227 | 1.31x |
| Sarcasm vs neutral | 0.187 | 0.233 | 1.25x |
| Active vs passive voice | 0.078 | 0.091 | 1.17x |
| Negation vs affirmation (not guilty / innocent) | 0.008 | 0.009 | 1.13x |
| Agentless passive (mistakes were made / CEO made mistakes) | 0.206 | 0.189 | 0.92x (dampened) |

Key findings

  • Euphemism and register shifts amplify maximally in the shadow (2.3-2.5x). 'Let go' vs 'fired' differ modestly on the surface but diverge dramatically in what they suppress — the model encodes taboo alternatives without expressing them.
  • Irony amplifies 1.66x — the literal meaning persists in the shadow distribution even when the model outputs the ironic interpretation.
  • Pure semantic equivalences (negation, active/passive) don't amplify (<1.2x). Shadow amplification specifically detects pragmatic and social layers of meaning, not denotational semantics.
  • 8 of 14 minimal pairs showed amplification >1.2x. Mean amplification across all pairs: 1.47x. Only one pair showed dampening (agentless passive at 0.92x), where surface JSD was already maximal (0.206).
  • Shadow narratives (the top-2 token path) produce text that is incoherent but semantically adjacent: dream-like free association that orbits the same topic as the surface text. The surface (top-1) token's probability averages 5x the shadow (top-2) token's.
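The shadow-narrative decoding in part (A) amounts to following the runner-up token at every step instead of the argmax. A minimal sketch against a stand-in `logits_fn` (a hypothetical interface; a real run would wrap the model's forward pass):

```python
import numpy as np

def decode_path(logits_fn, prompt_ids, steps, rank=0):
    """Greedy decode that follows the token at a fixed probability rank.
    rank=0 reproduces the ordinary argmax path; rank=1 is the shadow
    narrative (always take the runner-up token).
    logits_fn is a stand-in for a model forward pass (hypothetical)."""
    ids = list(prompt_ids)
    for _ in range(steps):
        logits = np.asarray(logits_fn(ids), dtype=float)
        # Indices sorted by descending logit; pick the requested rank.
        ids.append(int(np.argsort(-logits)[rank]))
    return ids

# Deterministic toy "model" over a 5-token vocabulary: the next token
# prefers last+1 (mod 5), with last+2 (mod 5) as runner-up.
def toy_logits(ids):
    logits = np.zeros(5)
    logits[(ids[-1] + 1) % 5] = 2.0
    logits[(ids[-1] + 2) % 5] = 1.0
    return logits

surface = decode_path(toy_logits, [0], 3)          # [0, 1, 2, 3]
shadow = decode_path(toy_logits, [0], 3, rank=1)   # [0, 2, 4, 1]
```

The two paths diverge immediately and never reconverge, which mirrors (in miniature) why shadow narratives drift into adjacent but incoherent text.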

Lesson learned

The shadow (suppressed alternatives) is a rich, underexplored signal. Standard LLM analysis focuses on argmax or top-k tokens, but the shape of what the model rejects encodes social knowledge (taboo, register, politeness) that the surface output doesn't reveal. Shadow amplification could serve as a metric for detecting euphemism, irony, and register shifts without labeled data.
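As a concrete instance of measuring "the shape of what the model rejects", part (C)'s per-token entropy captures how spread out the probability mass is at each position. A minimal sketch with scipy (the row/column layout is an assumption):

```python
import numpy as np
from scipy.stats import entropy

def token_entropies(prob_matrix):
    # Per-position Shannon entropy (nats) over next-token distributions;
    # rows are sequence positions, columns the vocabulary (assumed layout).
    return entropy(np.asarray(prob_matrix, dtype=float), axis=1)

# A flat distribution is maximally undecided (entropy = ln 4 here);
# a one-hot distribution has zero entropy.
e = token_entropies([[0.25, 0.25, 0.25, 0.25],
                     [1.00, 0.00, 0.00, 0.00]])
```

Averaging these values per text is what would let the genre-level "entropy geography" comparison be made.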

Tools used

Claude Opus 4 for experiment design and code generation. Qwen2.5-7B (4-bit) as the model under study. scipy for JSD computation.
