Gemma4 and Qwen3 were trained by different organizations on different data with different architectures. Their internal representations are 99.2% similar at matched depth. Neither model knew the other existed.
LarQL · Interpretability · CKA · Cross-Model · Mechanistic Interpretability · Universal Constants
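The 99.2% figure presumably comes from a representational-similarity score at matched depth; since the tags mention CKA, here is a minimal sketch of linear CKA between two models' activations on the same prompts, assuming that is the metric used. The `get_activations` helper and the layer choices are hypothetical, not part of the article's actual pipeline.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two activation matrices of shape (n_samples, d)."""
    # Center each feature dimension.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # HSIC-style similarity: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return numerator / denominator

# Hypothetical usage: compare residual-stream activations from two models at
# layers sitting at the same relative depth (e.g. 50% of each model's layers).
# acts_a = get_activations(model_a, prompts, layer=la)   # shape (n_tokens, d_a)
# acts_b = get_activations(model_b, prompts, layer=lb)   # shape (n_tokens, d_b)
# print(linear_cka(acts_a, acts_b))
```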
Two natively-trained 1-bit language models, from two different organizations, converge on the same anomaly: the four-stage circuit that organizes every fp16 transformer simply isn't there. Both models still answer correctly. The structure is gone, but the behavior survived.
LarQL · Interpretability · Quantization · BitNet · Bonsai · Mechanistic Interpretability
A single rank-1 weight edit suppresses one learned fact while leaving the rest of the model intact. No fine-tuning. No retraining. Just a feature subtracted from one layer's gate matrix, with a receipt.
LarQL · Interpretability · Knowledge Editing · Unlearning · Mechanistic Interpretability
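The teaser above only sketches the mechanism (a feature direction subtracted from one layer's gate matrix), so here is a minimal illustration of what a rank-1 "project out a feature" edit can look like, assuming a unit-norm direction for the fact is already known. The module path `gate_proj` and the name `fact_direction` are assumptions for illustration, not the article's actual recipe.

```python
import torch

def rank1_fact_edit(W_gate, feature, scale=1.0):
    """Subtract a rank-1 update aligned with a learned feature direction from a
    gate/projection matrix, removing the contribution the layer reads from it.

    W_gate : (d_out, d_in) weight matrix of one layer's gate projection
    feature: (d_in,) direction associated with the fact to suppress
    """
    feature = feature / feature.norm()
    # Rank-1 update (W @ f) f^T; with scale=1, (W - (W f) f^T) f = 0,
    # so the edited matrix no longer responds to the feature direction.
    update = (W_gate @ feature).unsqueeze(1) @ feature.unsqueeze(0)  # (d_out, d_in)
    return W_gate - scale * update

# Hypothetical usage: edit one MLP gate matrix in place, no fine-tuning.
# layer = model.model.layers[k].mlp.gate_proj
# layer.weight.data = rank1_fact_edit(layer.weight.data, fact_direction)
```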
I've run LarQL on 9 models from 5 organizations, from a 360M toy to OpenAI's 120B MoE. Three numbers hold within ±15% across all of them. One pattern vanishes the moment you go to 1-bit weights.
LarQL · Interpretability · Transformers · Machine Learning · Mechanistic Interpretability