Background
Language use is of increasing interest in the study of mental illness. Analytical approaches range from phenomenological and qualitative methods to formal, computational, quantitative ones. Practically, language analysis may have utility in predicting clinical outcomes. We harnessed a real-world sample (blog entries) from groups with psychosis, strong beliefs, odd beliefs, illness, mental illness and/or social isolation to validate and extend laboratory findings about lexical differences between subjects with psychosis and control subjects.

Method
We describe the results of two experiments that used Linguistic Inquiry and Word Count software to assess word category frequencies. In experiment 1, we compared word use in subjects with psychosis and control subjects in the laboratory (23 per group), and related the results to subjects' symptoms. In experiment 2, we examined lexical patterns in blog entries written by people with psychosis and by eight comparison groups. In addition to between-group comparisons, we used factor analysis followed by clustering to discern the contributions of strong belief, odd belief and illness identity to lexical patterns.

Results
Consistent with others' work, we found that first-person pronouns, biological process words and negative emotion words were more frequent in the language of people with psychosis. We tested lexical differences between bloggers with psychosis and multiple relevant comparison groups. Clustering analysis revealed that word use frequencies did not group individuals with strong or odd beliefs, but instead grouped individuals with any illness (mental or physical).

Conclusions
Pairing laboratory and real-world samples reveals that lexical markers previously identified as specific language changes in depression and psychosis are probably markers of illness in general.
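
The word category frequencies assessed in both experiments can be illustrated with a minimal sketch. It assumes that a category's frequency is the percentage of all words in a text that belong to that category. The lexicon `TOY_CATEGORIES` and the function `category_frequencies` are hypothetical names introduced here for illustration only; the word lists are a few invented examples and are not the Linguistic Inquiry and Word Count dictionary, and the tokenization is deliberately simplistic.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only.
# This is NOT the Linguistic Inquiry and Word Count dictionary.
TOY_CATEGORIES = {
    "first_person": {"i", "me", "my", "mine", "myself"},
    "negative_emotion": {"sad", "afraid", "hurt", "angry"},
    "biological": {"sleep", "eat", "pain", "blood"},
}


def category_frequencies(text, categories=TOY_CATEGORIES):
    """Return each category's share of all tokens, as a percentage."""
    # Simplistic tokenizer (assumption): lowercase runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in categories}

    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1

    # Normalize by text length so texts of different sizes are comparable.
    return {name: 100.0 * counts[name] / total for name in categories}


if __name__ == "__main__":
    print(category_frequencies("I hurt my arm"))
```

Expressing counts as percentages of total words, rather than as raw counts, keeps a short laboratory sample and a long blog entry on the same scale before any between-group comparison.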