This one-day workshop aims to bring together academics and industry practitioners to explore collaborative challenges in speech interaction. Recent improvements in speech recognition and computing power have led to conversational interfaces being introduced to many of the devices we use every day, such as smartphones, watches, and even televisions. These interfaces allow us to get things done, often by just speaking commands, and rely on a reasonably well understood single-user model. While research on speech recognition is well established, the social implications of these interfaces remain underexplored: how we socialise, work, and play around such technologies, and how they might be better designed to support collaborative collocated talk-in-action. Moreover, the advent of new products such as the Amazon Echo and Google Home, which are positioned as supporting multi-user interaction in collocated environments such as the home, makes the social and collaborative challenges around these products a timely topic. In the workshop, we will review current practices and reflect upon prior work on studying talk-in-action and collocated interaction. We wish to begin a dialogue that takes on the renewed interest in research on spoken interaction with devices, grounded in the existing practices of the CSCW community.