A knowledge graph (KG) represents a network of entities and the relationships between them. KGs power a wide range of applications, including semantic search and discovery, reasoning, decision making, natural language processing, machine learning, and recommendation systems. Automatic KG construction from text is an active research area. Triple (subject-relation-object) extraction from text is the fundamental building block of KG construction and has been widely studied, from early benchmarks such as ACE 2002 to more recent ones such as WebNLG 2020, REBEL, and SynthIE. A number of works in the last few years have also exploited LLMs for KG construction. However, handcrafting reasonable task-specific prompts for LLMs is labour-intensive and brittle to changes in the underlying LLM. Recent work applying automatic prompt optimisation/engineering to various NLP tasks (e.g. ontology generation) addresses this challenge by generating optimal or near-optimal task-specific prompts from input-output examples.
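To make the task concrete, the sketch below shows the expected input and output of triple extraction. The sentence and the Wikidata-style relation labels are illustrative assumptions, not drawn from the benchmarks above.

```python
# Illustration of the triple extraction task: map raw text to
# (subject, relation, object) tuples drawn from a fixed relation schema.
# The relation labels are hypothetical, Wikidata-style names.
text = "Marie Curie won the Nobel Prize in Physics in 1903."

expected_triples = [
    ("Marie Curie", "award_received", "Nobel Prize in Physics"),
    ("Nobel Prize in Physics", "point_in_time", "1903"),
]
```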
This empirical study explores the application of automatic prompt optimisation to the triple extraction task through experimental benchmarking. We evaluate different settings by varying (a) the prompting strategy, (b) the LLM used for prompt optimisation and task execution, (c) the number of canonical relations in the schema (schema complexity), (d) the length and diversity of the input text, (e) the metric used to drive the prompt optimisation, and (f) the dataset used for training and testing. We evaluate three automatic prompt optimisers, namely DSPy, APE, and TextGrad, on two triple extraction datasets, SynthIE and REBEL. Our main contribution is to show that automatic prompt optimisation techniques can generate reasonable prompts, comparable to human-crafted ones, for triple extraction and achieve improved results, with significant gains observed as text size and schema complexity increase.
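As a rough illustration of the kind of pipeline evaluated here, the minimal sketch below wires a triple extraction module into DSPy's few-shot bootstrapping optimiser, driven by a triple-level F1 metric. The model name, relation schema, and toy training example are assumptions for illustration, not the paper's actual configuration.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Assumed model; the study varies the LLM used for optimisation and execution.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

class ExtractTriples(dspy.Signature):
    """Extract (subject, relation, object) triples from the text,
    using only relations from the given schema."""
    text: str = dspy.InputField()
    relations: str = dspy.InputField(desc="comma-separated canonical relations")
    triples: str = dspy.OutputField(desc="one 'subject | relation | object' per line")

def parse_triples(raw: str) -> set:
    # Parse the line-per-triple output format into a set of 3-tuples.
    return {tuple(part.strip() for part in line.split("|"))
            for line in raw.splitlines() if line.count("|") == 2}

def triple_f1(example, pred, trace=None) -> float:
    # Triple-level F1 between gold and predicted triples; this is the
    # metric that drives the prompt optimisation.
    gold, hyp = parse_triples(example.triples), parse_triples(pred.triples)
    if not gold or not hyp:
        return 0.0
    precision = len(gold & hyp) / len(hyp)
    recall = len(gold & hyp) / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy training example; the study trains on SynthIE and REBEL.
trainset = [
    dspy.Example(
        text="Ada Lovelace was born in London.",
        relations="born_in, employer",
        triples="Ada Lovelace | born_in | London",
    ).with_inputs("text", "relations"),
]

optimiser = BootstrapFewShot(metric=triple_f1, max_bootstrapped_demos=4)
compiled = optimiser.compile(dspy.Predict(ExtractTriples), trainset=trainset)
print(compiled(text="Grace Hopper worked for the US Navy.",
               relations="born_in, employer").triples)
```

APE and TextGrad optimise against the same interface, a task program plus a metric over input-output examples, so swapping optimisers mainly changes how candidate prompts are proposed and scored.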