Workshop paper

MEAL: A Multi-dimensional Evaluation of Alignment Techniques for LLMs

Abstract

Alignment techniques are essential for making Large Language Models (LLMs) usable and useful in real-world applications, and diverse approaches have been developed, each with distinct advantages and limitations. However, the lack of a unified evaluation framework makes it difficult to compare these paradigms systematically and to guide deployment decisions. This paper introduces MEAL (Multi-dimensional Evaluation of ALignment techniques), a comprehensive and systematic evaluation framework for alignment techniques. It focuses on four key dimensions: alignment detection, alignment quality, computational efficiency, and robustness. Through experiments on models with different alignment strategies, we demonstrate the utility of our framework in identifying their strengths and limitations, providing valuable insights for future research directions.