Tabular data, semantics and AI seminar series

Virtual
This event has ended.

About

This seminar series will address a broad range of topics related to achieving novel capabilities on tabular data through semantics and AI. Attendees will learn about and discuss the following topics: 

  • Better data understanding, exploration, and explanation through semantics and AI
  • Table augmentation and search with semantics
  • Semantic data management and organization
  • Human-computer interaction that enables users to effectively leverage a semantically driven system for data and predictive tasks
  • AI and semantics: from model building using semantics (e.g., Trusted AI) to using semantics in RL or AI planning
  • Benchmarks and evaluations of approaches

Why attend

Our first seminar is on November 3 and will feature Fatemeh Nargesian, an assistant professor in the Department of Computer Science at the University of Rochester. She received her PhD from the University of Toronto and was a research intern at IBM Watson in 2014 and 2016. Before the University of Toronto, she worked at the Clinical Health and Informatics Group at McGill University. Her primary research interests are in data intelligence, with a focus on data for ML, as well as time-series analysis.

Seminar title: Semantic Set Overlap for Join Search

Abstract: Set overlap has been extensively considered as a column joinability measure. However, search techniques based on vanilla overlap fail for semantic search since similar set elements may be unrelated at the character level. In this talk, first, I will introduce semantic overlap and its application to join search. While vanilla overlap requires exact matches between set elements, semantic overlap allows elements that are syntactically different but semantically related to increase the overlap. The semantic overlap is the maximum matching score of a bipartite graph, where an edge weight between two set elements is defined by a user-defined similarity function, e.g., cosine similarity between embeddings. Next, I will present KOIOS, an exact and efficient algorithm that solves the top-k set similarity search problem using semantic overlap. KOIOS is a filter-verification framework including powerful and cheap-to-update filters that prune sets during both the refinement and post-processing phases. Finally, I will discuss the empirical evaluation of KOIOS on web data and open data.
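To make the definition in the abstract concrete, here is a minimal sketch of semantic overlap as a maximum-weight bipartite matching between two small sets. It is not KOIOS and says nothing about how the talk's filters work: it enumerates every matching by brute force, which is only feasible for tiny sets. The function names, the toy two-dimensional embeddings, and the assumption that the similarity function is symmetric are all illustrative choices, not details from the talk.

```python
from itertools import permutations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def vanilla_overlap(q, c):
    """Number of elements that match exactly."""
    return len(set(q) & set(c))


def semantic_overlap(q, c, sim):
    """Best total similarity over all one-to-one matchings of q and c.

    Brute force over matchings; assumes sim is symmetric and the sets are tiny.
    """
    small, large = sorted((list(set(q)), list(set(c))), key=len)
    best = 0.0
    # Each permutation assigns every element of the smaller set
    # to a distinct element of the larger set.
    for perm in permutations(large, len(small)):
        score = sum(sim(a, b) for a, b in zip(small, perm))
        best = max(best, score)
    return best


# Hypothetical embeddings, hand-picked so the cosine values are easy to read.
EMB = {
    "NYC": (1.0, 0.0),
    "New York City": (1.0, 0.0),
    "Toronto": (0.0, 1.0),
    "Rochester": (0.6, 0.8),
}


def sim(a, b):
    return cosine(EMB[a], EMB[b])


query = {"NYC", "Toronto"}
candidate = {"New York City", "Toronto", "Rochester"}

print(vanilla_overlap(query, candidate), semantic_overlap(query, candidate, sim))
```

With these toy vectors the vanilla overlap is 1, because only "Toronto" matches exactly, while the semantic overlap is 2.0, because "NYC" can also be matched to "New York City". A practical implementation would replace the brute-force loop with a polynomial-time assignment solver.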

Please email Kavitha Srinivas with any questions about this event: kavitha.srinivas@ibm.com

Speakers

Fatemeh Nargesian, PhD

Assistant Professor of Computer Science
University of Rochester
