Conference paper

An Indian Court Decision Annotated Corpus and Knowledge Graph


Document collections in the legal domain are growing rapidly, which calls for automated methods to analyze the data and curate information from it. Legal stakeholders face many challenges in extracting information about the main concepts, topics, and named entities from lengthy, unstructured court judgment documents. It has therefore become essential to automate the information extraction process and to store the documents in a properly structured format, together with their named legal entities, so that information can be retrieved easily. In this paper, we introduce an annotated Indian Court Decision Document Corpus, consisting of 10 coarse-grained classes and 30 fine-grained classes, as a benchmark data set for constructing the knowledge graph. We also construct a knowledge graph of Indian court case documents using a rule-based approach for Named Entity Recognition (NER) and Relation Extraction (RE). The results are evaluated against the proposed benchmark quantitatively, using precision, recall, and F1 score, and qualitatively, using SPARQL queries. The proposed approach achieves a good F1 score, although further work is required to improve recall.
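
As an illustration of the quantitative evaluation, the following is a minimal sketch of entity-level precision, recall, and F1 computed against gold annotations. It assumes that a predicted entity counts as correct only when its span and label match a gold entity exactly. The function name `prf1`, the `(start, end, label)` representation, and the sample labels are hypothetical and are not taken from the corpus or its annotation scheme.

```python
from typing import Iterable, Tuple

# Hypothetical entity representation: (start offset, end offset, label).
Entity = Tuple[int, int, str]


def prf1(gold: Iterable[Entity], predicted: Iterable[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span-and-label matching."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: the labels below are assumptions, not the paper's classes.
gold = [(0, 5, "JUDGE"), (10, 20, "COURT"), (30, 38, "STATUTE"), (40, 50, "PETITIONER")]
predicted = [(0, 5, "JUDGE"), (10, 20, "COURT")]

p, r, f = prf1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy case every predicted entity is correct but half of the gold entities are missed, which is the kind of high-precision, lower-recall profile that rule-based extractors often show.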