Conference paper

Benchmarking AI Agents for IT Automation Tasks with ITBench

Abstract

Modern IT infrastructures have grown markedly more complex with the adoption of cloud computing and agile development methodologies, making their management increasingly challenging. These management tasks span multiple domains, including site reliability engineering (SRE), compliance and security operations (CISO), and financial operations (FinOps). AI agents have shown initial promise for automating some of these complex tasks.