A demonstration of codebreaker: A machine interpretable knowledge graph for code

Ibrahim Abdelaziz; Kavitha Srinivas; Julian Dolby; Jamie Mccusker

ISWC-Posters 2020

Conference paper

01 Nov 2020

A demonstration of codebreaker: A machine interpretable knowledge graph for code

Download paper

Abstract

Knowledge graphs have been extremely useful in powering diverse applications like natural language understanding. CodeBreaker attempts to construct machine interpretable knowledge graphs about program code to similarly power diverse applications such as code search, code understanding, and code automation. We have built such a 1.98 billion edges knowledge graph by a detailed analysis of function usage in 1.3 million Python programs in GitHub, documentation about the functions in 2300+ modules, forum discussions with more than 47 million posts, class hierarchy information, etc. In this work, we will demonstrate one application of this knowledge graph, which is a code recommendation engine for programmers within an IDE. All user interactions within the application get translated into SPARQL queries, which have quite different characteristics than queries against traditional knowledge graphs such as DBpedia or Wikidata. Aspects of code such as data flow are inherently transitive, hence the SPARQL is complex and requires property paths. One of our goals is to provide these queries as a basis for graph querying benchmarks, while allowing users the ability to interact with a real application built on top of a large graph database.

Paper