Improving the efficiency of legal e-discovery services using text mining techniques

Sachindra Joshi; Prasad M. Deshpande; Thomas Hampp

doi:10.1109/SRII.2011.97

SRII 2011

Conference paper

26 Aug 2011

Abstract

E-Discovery Review is a type of legal service that aims at finding relevant electronically stored information (ESI) in a legal case. This requires manual review of a large number of documents by legal analysts and thus involves huge costs. In this paper, we investigate the use of IT, specifically text mining techniques, for improving the efficiency and quality of the e-discovery review service. We employ near-duplicate detection and automatic classification techniques that can be used to create coherent groups of documents. Since a group characterizes a syntactic or a semantic theme, all the documents in a group can be reviewed together. This leads to a faster and more consistent review of documents. Our experimental results on the publicly available Enron email corpus show that we can achieve high precision and recall in identifying the syntactic and semantic groups. We also conduct a user study that demonstrates an 80% reduction in review time and improved consistency in the review results, leading to better service quality. © 2011 IEEE.
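To make the idea of syntactic grouping concrete, here is a minimal sketch of near-duplicate grouping. The abstract does not say which algorithm the authors use, so everything here is an assumption chosen for illustration: word-level shingles, Jaccard similarity, a threshold of 0.5, and the hypothetical function names `shingles`, `jaccard`, and `group_near_duplicates`.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts are represented by a single shingle of all their words.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.5, k=3):
    """Group document indices whose shingle sets are (transitively) similar.

    Illustrative only: the threshold and shingle size are assumed values,
    not parameters taken from the paper.
    """
    sets = [shingles(d, k) for d in docs]
    parent = list(range(len(docs)))  # union-find forest over document indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair and merge the groups of sufficiently similar documents.
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


emails = [
    "please review the attached contract before friday",
    "please review the attached contract before monday",
    "lunch at noon tomorrow in the cafeteria",
]
groups = group_near_duplicates(emails)
```

With these three toy emails, the first two share four of six distinct shingles and fall into one group, while the third stays alone. The pairwise comparison is quadratic in the number of documents, so a corpus the size of Enron would call for an approximate scheme such as MinHash rather than this exhaustive loop.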
