Ground-truth labeling is an important activity in machine learning. Many studies have examined how crowdworkers apply labels to records in machine learning datasets. However, there have been few studies that have examined the work of domain experts when their knowledge and expertise are needed to apply labels. We provide a grounded account of the work of labeling teams with domain experts, including the experiences of labeling, collaborative confgurations and work-practices, and quality issues. We show three major patterns in the social design of ground truth data: Principled design, Iterative design, and Improvisational design. We interpret our results through theories of from Human Centered Data Science, and particularly work on human interventions in data science work through the design and creation of data.