Regress, Don’t Guess – A Regression-like Loss on Number Tokens for Language Models. Jonas Zausinger, Lars Pennig, et al. 2024. NeurIPS 2024.