Big Data 2022
Workshop paper

Natural Language Interface for Process Mining Queries in Healthcare


The data required for analysis has recently become more diverse, and research on data extraction and analysis methods continues in order to respond effectively to these varied needs. Process mining analyzes the system logs kept by companies or healthcare institutions so that they can be used for process improvement. A process model extracted from the system logs makes it possible not only to grasp the exact flow of the current business process, but also to acquire additional information, such as the repetitive execution of activities in the part of the process flow where a bottleneck occurs. The manufacturing industry has made great efforts to improve process management, and as many companies turn their attention to big data, various data-related technologies are also emerging in the healthcare industry to provide patients with the care they need. Process mining tools let users retrieve data either by programming in a process mining query language through the APIs provided with the tool, or by manually creating reusable analytical documents with a user-friendly tool. However, these tasks require the user to be familiar with the query language APIs and, when creating analytical documents, to understand the data model and its relationships. This paper proposes a methodology that lets users easily extract the desired data through a natural language interface, which relieves non-professional users of the burden of programming in a process mining query language. The process mining query engine with a natural language interface presented in this paper consists of four major components. Among them, the natural language processing pipeline not only extracts an intermediate representation of the entities used to construct a process mining query language report from a natural language query, but also extracts a query hint from the context of that query.
The query hint is used to select, from the library, a process-specific function that fits the context of the user query while the natural language query is transformed into a process mining query report. The proposed method has the advantage that the user can get a rough picture of the process state just by entering a query in natural language. The proposed system provides users with four query processing options. That is, the user can (1) retrieve the intermediate representation of entities and the query hints from the NLP pipeline, (2) retrieve the process mining query language from the query language generator, (3) submit the query language to the process mining engine and execute the query, and (4) retrieve a natural language description of the intermediate representation of entities and the query hints to confirm that the query was processed correctly. The proposed system was implemented and tested: the query reports in process mining query language generated programmatically by the proposed query engine were executed on a process mining engine, and the query results were verified.
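
The four query processing options described in the abstract can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the paper's implementation: keyword matching stands in for the real NLP pipeline, the hint library, the activity vocabulary, and the query syntax are all invented for this example, and every function name (`nlp_pipeline`, `generate_query`, `execute`, `describe`) is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical hint library: trigger phrase -> process-specific function name.
HINT_LIBRARY = {
    "bottleneck": "WAITING_TIME",
    "repeat": "REWORK_COUNT",
    "how long": "THROUGHPUT_TIME",
}

# Hypothetical entity vocabulary: phrase in the query -> event-log activity name.
ACTIVITIES = {"triage": "Triage", "discharge": "Discharge", "lab test": "LabTest"}


@dataclass
class Intermediate:
    """Intermediate representation: recognized entities plus query hints."""
    activities: list[str]
    hints: list[str]


def nlp_pipeline(text: str) -> Intermediate:
    """Option 1: extract entities and query hints.

    Plain substring matching is an assumption standing in for real NLP.
    """
    lowered = text.lower()
    activities = [name for phrase, name in ACTIVITIES.items() if phrase in lowered]
    hints = [fn for phrase, fn in HINT_LIBRARY.items() if phrase in lowered]
    return Intermediate(activities, hints)


def generate_query(ir: Intermediate) -> str:
    """Option 2: render a query string in a made-up, SQL-like syntax."""
    # The first hint selects the process-specific function; fall back to a count.
    function = ir.hints[0] if ir.hints else "CASE_COUNT"
    quoted = ", ".join(f"'{a}'" for a in ir.activities)
    scope = f" WHERE activity IN ({quoted})" if ir.activities else ""
    return f"SELECT {function}(){scope}"


def execute(query: str, engine):
    """Option 3: submit the query to an engine (here, any callable)."""
    return engine(query)


def describe(ir: Intermediate) -> str:
    """Option 4: restate the extracted entities and hints for confirmation."""
    acts = ", ".join(ir.activities) or "all activities"
    hints = ", ".join(ir.hints) or "no specific hint"
    return f"Entities: {acts}. Query hints: {hints}."
```

For the input "Where is the bottleneck between triage and discharge?", this sketch extracts the activities `Triage` and `Discharge` with the hint `WAITING_TIME`, and `generate_query` returns `SELECT WAITING_TIME() WHERE activity IN ('Triage', 'Discharge')`. Keeping the four steps as separate functions mirrors the idea that a user can stop at any option, for example inspecting the description from option 4 before executing anything.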