Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining. Daouda A. Sow, Herbert Woisetschläger, et al. 2025. ICLR 2025. Conference paper.
Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis. Hongru Yang, Yingbin Liang, et al. 2025. JMLR. Paper.