How do output distribution shape, attention head specialization, and surprisal rhythm vary across languages and text genres in a multilingual LLM?