ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents. Ido Levy, Ben Wiesel, et al. 2026. ICLR 2026. Conference paper.
From Benchmarks to Business Impact: Deploying IBM Generalist Agent in Enterprise Production. Segev Shlomov, Alon Oved, et al. 2026. IAAI 2026. Conference paper.
ST-WEBAGENTBENCH: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents. Ido Levy, Ben Wiesel, et al. 2025. ICML 2025. Workshop paper.
Towards a Resilient Intelligent Automation System. Segev Shlomov, Sami Marreed, et al. 2024. IJCAI 2024. Demo paper.
The Second Resiliency of Intelligent Automation Systems Challenge. Segev Shlomov, Sami Marreed, et al. 2024. IJCAI 2024. Workshop.
The Resiliency of Intelligent Automation Systems Challenge. Segev Shlomov, Sami Marreed, et al. 2023. IJCAI 2023. Workshop.