Validation / testing
Let's start the discussion on how to test ML-based systems.
We'll cover an elementary example of sequential reasoning from section 4.2 of Jaynes's Probability Theory: The Logic of Science (http://www.med.mcgill.ca/epidemiology/hanley/bios601/GaussianModel/JaynesProbabilityTheory.pdf).
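To make this concrete, here is a small sketch of sequential Bayesian reasoning in the spirit of that example; the prior odds, defect rates, and stream of observations below are all made up for illustration, and the point is only how the log-odds (evidence) are updated one observation at a time.

import math

# Two hypotheses about a batch of widgets (numbers are hypothetical):
#   H : the batch is good (defect rate 1/6)
#   H': the batch is bad  (defect rate 1/3)
p_defect_good = 1.0 / 6.0
p_defect_bad = 1.0 / 3.0

prior_odds = 10.0           # prior odds in favour of H (made up)
log_odds = math.log10(prior_odds)

# A stream of test results: True means a defective widget was observed.
observations = [True, False, True, True, False, False, True]

for defective in observations:
    # Likelihood ratio P(data | H) / P(data | H') for this observation.
    if defective:
        lr = p_defect_good / p_defect_bad
    else:
        lr = (1 - p_defect_good) / (1 - p_defect_bad)
    log_odds += math.log10(lr)   # Bayes' rule in log-odds (evidence) form
    print(f"defective={defective}  evidence={10 * log_odds:+.1f} dB")

posterior_odds = 10 ** log_odds
print("posterior P(H) =", posterior_odds / (1 + posterior_odds))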
Since we discussed the use of concentration inequalities in our meeting on reinforcement learning, we provide a refresher on the subject and discuss Markov's inequality.
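To make the refresher concrete, here is a small sketch that states Markov's inequality and checks it empirically on simulated non-negative data; the exponential distribution and the thresholds are arbitrary choices for illustration.

import numpy as np

# Markov's inequality: for a non-negative random variable X and any a > 0,
#     P(X >= a) <= E[X] / a.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)   # non-negative samples, E[X] = 2

for a in [1.0, 2.0, 5.0, 10.0]:
    empirical = np.mean(x >= a)    # estimated P(X >= a)
    bound = x.mean() / a           # Markov bound E[X] / a
    print(f"a={a:5.1f}  P(X>=a)~{empirical:.4f}  bound={bound:.4f}")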
We cover the Shapley value and its recent application to data analysis. For a deep dive, review this paper. For the curious, the slide picture was taken at Hummus Yosef.
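For intuition about how the Shapley value is estimated in practice (e.g., for data valuation), here is a minimal Monte Carlo sketch that averages marginal contributions over random orderings; the three "data points" and the utility table below are entirely made up.

import random

# Toy "players": three data points; utility(S) is a hypothetical score for a
# model trained on the subset S (numbers are made up for illustration).
players = ["x1", "x2", "x3"]
utility = {
    frozenset(): 0.50,
    frozenset({"x1"}): 0.60, frozenset({"x2"}): 0.55, frozenset({"x3"}): 0.52,
    frozenset({"x1", "x2"}): 0.70, frozenset({"x1", "x3"}): 0.63,
    frozenset({"x2", "x3"}): 0.58,
    frozenset({"x1", "x2", "x3"}): 0.72,
}

# Monte Carlo estimate of the Shapley value: average marginal contribution of
# each player over random orderings of the players.
random.seed(0)
shapley = {p: 0.0 for p in players}
n_perms = 10_000
for _ in range(n_perms):
    order = random.sample(players, len(players))
    coalition = frozenset()
    for p in order:
        with_p = coalition | {p}
        shapley[p] += utility[with_p] - utility[coalition]
        coalition = with_p

for p in players:
    print(p, round(shapley[p] / n_perms, 4))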
Chebyshev's inequality: for a random variable X with mean μ and standard deviation σ, P(|X − μ| ≥ kσ) ≤ 1/k² for every k > 0.
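To connect the inequality to the confidence intervals discussed below, here is a small sketch of the conservative, distribution-free interval it yields for a mean; the simulated 0/1 correctness data and the 95% level are arbitrary, and sigma is estimated from the sample rather than known.

import numpy as np

# Distribution-free (conservative) interval for a mean via Chebyshev's
# inequality applied to the sample mean: Var(sample mean) = sigma^2 / n.
# The Bernoulli "per-example correctness" data below is simulated.
rng = np.random.default_rng(1)
samples = rng.binomial(1, 0.9, size=500).astype(float)

n = len(samples)
mean = samples.mean()
std = samples.std(ddof=1)        # sigma is estimated, so the guarantee is approximate

alpha = 0.05                     # 1 - confidence level
k = 1.0 / np.sqrt(alpha)         # Chebyshev: P(|error| >= k*sigma/sqrt(n)) <= 1/k^2
half_width = k * std / np.sqrt(n)
print(f"mean={mean:.3f}, 95% Chebyshev interval = [{mean - half_width:.3f}, {mean + half_width:.3f}]")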
At the end of August 2019, I presented our paper on testing ML applications at FSE. You may find this paper interesting.
In this chapter on how to test/validate ML-based systems (under construction):
- We'll cover how to create a non-parametric confidence interval.
- We'll discuss the concept of the empirical distribution to better motivate the non-parametric confidence interval we have just discussed.
- Under such ideal assumptions, the central limit theorem can be used to construct a confidence interval (see section 2).
- Bootstrapping is used to overcome budget constraints (a sketch follows this list).
- We'll discuss convergence in distribution.
- We'll revisit the bootstrapping example and cast it in the context of an ML example (example 2).
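As a companion to the bootstrapping items above, here is a minimal sketch of a percentile-bootstrap confidence interval for a model's accuracy when only a small labeled test set is available; the simulated 0/1 correctness indicators and the number of resamples are placeholders.

import numpy as np

# Percentile-bootstrap confidence interval for accuracy from a small test set.
# The 0/1 "correct" indicators stand in for (prediction == label) results on a
# limited, budget-constrained labeled set; here they are simulated.
rng = np.random.default_rng(2)
correct = rng.binomial(1, 0.85, size=60)         # 60 labeled examples (hypothetical)

n_boot = 10_000
boot_acc = np.empty(n_boot)
for b in range(n_boot):
    # Resample with replacement from the empirical distribution of the test set.
    idx = rng.integers(0, len(correct), size=len(correct))
    boot_acc[b] = correct[idx].mean()

lo, hi = np.percentile(boot_acc, [2.5, 97.5])    # 95% percentile interval
print(f"accuracy={correct.mean():.3f}, 95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")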
This is a Python example of a non-parametric confidence interval with unlimited sampling.
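A minimal sketch, assuming "unlimited sampling" means we can draw as many fresh labeled examples as we like; the "model" is simulated as being correct with probability 0.9, and the interval is taken from empirical quantiles of the statistic's sampling distribution.

import numpy as np

# Non-parametric interval when sampling is unlimited: draw many independent
# test sets, compute the statistic (here, accuracy) on each, and take empirical
# quantiles of its sampling distribution; no distributional assumption is needed.
rng = np.random.default_rng(3)

def sample_accuracy(n_examples: int) -> float:
    # Stand-in for running the real model on n freshly sampled labeled examples.
    return rng.binomial(1, 0.9, size=n_examples).mean()

n_repeats, n_examples = 5_000, 200
accuracies = np.array([sample_accuracy(n_examples) for _ in range(n_repeats)])

lo, hi = np.percentile(accuracies, [2.5, 97.5])  # 95% interval from empirical quantiles
print(f"mean accuracy={accuracies.mean():.3f}, 95% interval = [{lo:.3f}, {hi:.3f}]")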