Multi-modal agents for business intelligence
With installations on nearly a billion consumer devices around the world, voice-driven assistants such as Alexa, Siri, and Google Home are clearly popular: consumers find it compelling to interact with AI agents as quasi-human entities. Inevitably, people accustomed to using voice-driven assistants at home and in the car will expect such technologies in the workplace. What form will this take? Simple extrapolations from the consumer space (e.g., running meetings or presentations) promise only modest value. I propose that AAMAS and the AI research community pursue a bolder vision in which software agents act as quasi-human collaborators on core business intelligence tasks: analyzing data, diagnosing problems, and making decisions. Moreover, these business intelligence agents should communicate multi-modally; that is, they must understand and employ speech complemented by nonverbal behaviors such as pointing, eye gaze, and facial expression. I outline the requisite agent, MAS, and AI technologies and pose several fundamental research challenges raised by this vision.