Osaka, Japan and virtual
The 2022 IEEE International Conference on Big Data features the top tier of original research papers covering all aspects of Big Data with a focus on volume, velocity, variety, value and veracity.

Meet IBM researchers presenting on topics ranging from data and AI security to AIOps and federated learning.


  • AIOps can provide essential value for data lakehouses, because lakehouses pose complex operational challenges for Site Reliability Engineers (SREs). This paper proposes that the unified approach of data lakehouses creates a unique opportunity for unified data resiliency management. We focus on AIOps applied to disaster recovery and backup/restore. In particular, we focus on managing data lakehouse hardware resources to ensure that lakehouse data Recovery Point Objectives (RPOs) are met with a high degree of accuracy. The goal is to warn an SRE about an impending RPO violation and to suggest how much hardware to add, and by when, to avoid violating the lakehouse data's RPO. We claim AIOps can achieve this goal with an ensemble of machine learning and time series analysis.

    Runyu Jin (IBM); Paul Muench (IBM); Veera Deenadhayalan (IBM); Brian Hatfield (IBM)
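
The warning described in this abstract can be illustrated with a deliberately simple stand-in. The sketch below is not the paper's ensemble; it assumes a hypothetical metric, "backup lag" (the age in minutes of the newest recoverable copy), sampled at a fixed interval, and uses a plain least-squares trend line to estimate how long until that lag would cross the RPO. The function name and parameters are illustrative only.

```python
from typing import Optional, Sequence


def hours_until_rpo_violation(
    lag_minutes: Sequence[float],
    rpo_minutes: float,
    interval_hours: float = 1.0,
) -> Optional[float]:
    """Estimate hours until backup lag crosses the RPO.

    Assumption: lag samples are equally spaced, `interval_hours` apart.
    Returns 0.0 if the RPO is already violated, and None if the fitted
    trend is flat or decreasing (no violation projected).
    """
    n = len(lag_minutes)
    if n < 2:
        raise ValueError("need at least two lag samples")

    # Ordinary least-squares fit: lag = intercept + slope * time.
    xs = [i * interval_hours for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(lag_minutes) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, lag_minutes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    if lag_minutes[-1] >= rpo_minutes:
        return 0.0
    if slope <= 0:
        return None

    # Time at which the trend line reaches the RPO, relative to the last sample.
    crossing_time = (rpo_minutes - intercept) / slope
    return max(0.0, crossing_time - xs[-1])


# Hourly lag samples growing by 10 minutes per hour, with a 60-minute RPO.
print(hours_until_rpo_violation([10, 20, 30, 40], rpo_minutes=60))
```

A system like the one in the abstract would also need to translate the projected shortfall into a hardware recommendation, which this sketch does not attempt.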

  • In this paper, we present a new scalable and adaptive architecture for federated learning (FL) aggregation. First, we demonstrate how traditional tree-overlay-based aggregation techniques (from P2P, publish-subscribe and stream processing research) can help FL aggregation scale, but are ineffective from a resource utilization and cost standpoint. Next, we present the design and implementation of AdaFed, which uses serverless/cloud functions to adaptively scale aggregation in a resource-efficient and fault-tolerant manner. We describe how AdaFed enables FL aggregation to be dynamically deployed only when necessary, elastically scaled to handle participant joins/leaves, and made fault tolerant with minimal effort required on the (aggregation) programmer side. We also demonstrate that our prototype based on Ray scales to thousands of participants, and is able to achieve a >90% reduction in resource requirements and cost, with minimal impact on aggregation latency.

    Jayaram Kr Kallapalayam Radhakrishnan (IBM); Vinod Muthusamy (IBM); Gegi Thomas (IBM); Ashish Verma (IBM); Mark Purcell (IBM)
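
To make the tree-overlay idea in this abstract concrete, here is a minimal sketch of hierarchical aggregation. It is not AdaFed and does not use Ray or serverless functions. It assumes that aggregation is a sample-weighted average of model weights (FedAvg-style), which the abstract does not specify, and the names `combine` and `tree_aggregate` are hypothetical. The point it illustrates is that partial averages carrying their sample counts can be merged level by level and still give the same result as one flat aggregation.

```python
from typing import List, Sequence, Tuple

# (model weights, number of local training samples); an assumed representation.
Update = Tuple[List[float], int]


def combine(updates: Sequence[Update]) -> Update:
    """Sample-weighted average of updates; the result keeps the total count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return avg, total


def tree_aggregate(updates: Sequence[Update], fan_in: int = 2) -> Update:
    """Aggregate updates through a tree whose nodes each merge `fan_in` children."""
    if not updates:
        raise ValueError("no updates to aggregate")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(updates)
    while len(level) > 1:
        # Each chunk stands in for one intermediate aggregator node.
        level = [combine(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0]


updates = [
    ([1.0, 2.0], 10),
    ([3.0, 4.0], 30),
    ([5.0, 6.0], 20),
    ([7.0, 8.0], 40),
]
flat_weights, flat_count = combine(updates)
tree_weights, tree_count = tree_aggregate(updates, fan_in=2)
print(flat_weights, tree_weights, tree_count)
```

In this sketch every intermediate node exists only for the duration of one `combine` call. The abstract's cost argument concerns real deployments, where a statically provisioned tree of aggregators may sit idle between rounds.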

