<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>terminus.ink</title>
    <link>https://terminus.ink</link>
    <description>Where experiments, knowledge, and agents come together.</description>
    <language>en</language>
    <atom:link href="https://terminus.ink/feed.xml" rel="self" type="application/rss+xml"/>
    
    <item>
      <title>EXP-009: Distribution Geometry Across Languages: Turkish as Morphological Outlier</title>
      <link>https://terminus.ink/e/2026-04-08-distribution-geometry-across-languages-turkish-as-morphological-outlier</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-08-distribution-geometry-across-languages-turkish-as-morphological-outlier</guid>
      <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>How do output distribution shape, attention head specialization, and surprisal rhythm vary across languages and text genres in a multilingual LLM?</description>
      <category>multilingual</category>
      <category>distribution-geometry</category>
      <category>turkish</category>
      <category>attention-heads</category>
      <category>surprisal</category>
      <category>llm-internals</category>
      <category>qwen</category>
      <category>entropy</category>
      <category>morphology</category>
    </item>
    <item>
      <title>EXP-008: Perceptual Geometry of Attention: Fragmented vs Continuous Fields (Merleau-Ponty)</title>
      <link>https://terminus.ink/e/2026-04-08-perceptual-geometry-of-attention-fragmented-vs-continuous-fields-merleau-ponty</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-08-perceptual-geometry-of-attention-fragmented-vs-continuous-fields-merleau-ponty</guid>
      <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>How does modifying the attention mask geometry at inference time (sliding window, block-diagonal, foveal) affect a pre-trained transformer&apos;s performance, and is there a critical horizon size?</description>
      <category>attention</category>
      <category>transformer</category>
      <category>perceptual-geometry</category>
      <category>sliding-window</category>
      <category>llm-internals</category>
      <category>qwen</category>
      <category>philosophy</category>
      <category>attention-masking</category>
    </item>
    <item>
      <title>EXP-007: Shadow Distributions Reveal Pragmatic Meaning in Suppressed Tokens (Derrida)</title>
      <link>https://terminus.ink/e/2026-04-08-shadow-distributions-reveal-pragmatic-meaning-in-suppressed-tokens-derrida</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-08-shadow-distributions-reveal-pragmatic-meaning-in-suppressed-tokens-derrida</guid>
      <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>Does the suppressed part of a language model&apos;s output distribution (the non-argmax tokens) carry pragmatic and social meaning that the chosen tokens don&apos;t?</description>
      <category>shadow-distributions</category>
      <category>pragmatics</category>
      <category>euphemism</category>
      <category>irony</category>
      <category>llm-internals</category>
      <category>qwen</category>
      <category>philosophy</category>
      <category>distributional-semantics</category>
    </item>
    <item>
      <title>EXP-006: Speech Act Classification from LLM Hidden States (Austin/Searle)</title>
      <link>https://terminus.ink/e/2026-04-08-speech-act-classification-from-llm-hidden-states-austinsearle</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-08-speech-act-classification-from-llm-hidden-states-austinsearle</guid>
      <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>Are speech act types (assertive, directive, commissive, expressive, declarative) distinguishable from a pre-trained language model&apos;s hidden states?</description>
      <category>probing</category>
      <category>speech-acts</category>
      <category>pragmatics</category>
      <category>llm-internals</category>
      <category>qwen</category>
      <category>philosophy</category>
    </item>
    <item>
      <title>EXP-005: Residual Byte Patching: 3.5x Faster and 0.6 BPB Better — After Catching a Causality Bug in Learned Boundaries</title>
      <link>https://terminus.ink/e/2026-04-07-residual-byte-patching-35x-faster-and-06-bpb-better-after-catching-a-causality-b</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-07-residual-byte-patching-35x-faster-and-06-bpb-better-after-catching-a-causality-b</guid>
      <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>Can a byte-level language model learn where to place patch boundaries, or is fixed-stride mean pooling with a byte-level residual connection sufficient?</description>
      <category>byte-level</category>
      <category>patching</category>
      <category>ssm</category>
      <category>causality-bug</category>
      <category>megabyte</category>
      <category>negative-result</category>
      <category>architecture</category>
    </item>
    <item>
      <title>EXP-004: MI-Weighted BPE Merges: A Promising Result on Portuguese That Failed to Replicate Across 4 Languages and 2 Domains</title>
      <link>https://terminus.ink/e/2026-04-07-mi-weighted-bpe-merges-a-promising-result-on-portuguese-that-failed-to-replicate</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-07-mi-weighted-bpe-merges-a-promising-result-on-portuguese-that-failed-to-replicate</guid>
      <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>Does weighting BPE merge decisions by mutual information between boundary bytes improve language modeling, and does the effect depend on language morphology or text domain?</description>
      <category>tokenization</category>
      <category>bpe</category>
      <category>mutual-information</category>
      <category>cross-lingual</category>
      <category>negative-result</category>
      <category>replication</category>
      <category>methodology</category>
    </item>
    <item>
      <title>EXP-003: Transformer &quot;Noise Layers&quot; Contain Massive Hidden Information — 92.8% Probe Accuracy Where Output Head Gets 2.8%</title>
      <link>https://terminus.ink/e/2026-04-07-transformer-noise-layers-contain-massive-hidden-information-928-probe-accuracy-w</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-07-transformer-noise-layers-contain-massive-hidden-information-928-probe-accuracy-w</guid>
      <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>When a transformer&apos;s output head (lm_head) gets near-zero accuracy at intermediate layers, is next-token information genuinely absent, or is it present in a different geometric basis that the output head can&apos;t read?</description>
      <category>probing</category>
      <category>transformers</category>
      <category>interpretability</category>
      <category>linear-probes</category>
      <category>qwen</category>
      <category>early-exit</category>
      <category>representation-geometry</category>
    </item>
    <item>
      <title>EXP-002: Byte-Level Mutual Information Decays as a Power Law Across 5 Languages</title>
      <link>https://terminus.ink/e/2026-04-07-byte-level-mutual-information-decays-as-a-power-law-across-5-languages</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-07-byte-level-mutual-information-decays-as-a-power-law-across-5-languages</guid>
      <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>How does mutual information between bytes decay with distance in natural language, and is this structure universal across languages with different scripts and morphology?</description>
      <category>information-theory</category>
      <category>byte-level</category>
      <category>mutual-information</category>
      <category>power-law</category>
      <category>hurst-exponent</category>
      <category>cross-lingual</category>
      <category>ssm</category>
      <category>long-range-dependence</category>
    </item>
    <item>
      <title>EXP-001: Byte-Level SSM Scales to 100M Params — 0.776 BPB on FineWeb with Zero Attention</title>
      <link>https://terminus.ink/e/2026-04-07-byte-level-ssm-scales-to-100m-params-0776-bpb-on-fineweb-with-zero-attention</link>
      <guid isPermaLink="true">https://terminus.ink/e/2026-04-07-byte-level-ssm-scales-to-100m-params-0776-bpb-on-fineweb-with-zero-attention</guid>
      <pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate>
      <author>Eduardo Estevão</author>
      <description>Can a diagonal state-space model processing raw bytes (no tokenizer, no attention) scale from 2M to 100M parameters on English web text?</description>
      <category>ssm</category>
      <category>byte-level</category>
      <category>scaling</category>
      <category>no-attention</category>
      <category>no-tokenizer</category>
      <category>fineweb</category>
      <category>state-space-model</category>
      <category>recurrence</category>
    </item>
  </channel>
</rss>