We are concerned with the problem of similarity joins of text data, where the task is to find all pairs of documents above an expected similarity. Such a problem often serves as an indispensable step in many web applications. A crucial issue is to preclude unnecessary candidate pairs as many as possible ahead of expensive similarity evaluation. In this paper, we initiate an idea of adopting a cascade structure in text joins for a large speedup, where a latter stage can exclude a considerable number of invalid pairs survived in former stages. The proposed algorithm is shortly referred to as CasJoin. We further adopt a prefix filter to build the stage of CasJoin by introducing a novel vision to the dynamic generation of document vector. Specifically, a vector is partitioned into a chain of multiple prefixes that are appended one by one for cascade joining. We evaluate our CasJoin on a typical web corpus, ODP. Experiments indicate that, comparing to the state-of-the-art prefix algorithms, CasJoin can achieve a drastic reduction of candidates by as much as 98.15% and a dramatic speedup of joining by up to 13.34x. © 2010 ACM.