The Two Models That Never Met. Both Measured at the Same Depth.
Two natively trained 1-bit LLMs converge on the same activation anomaly without ever sharing weights. A note on convergence under pressure.