Hayim Shaul,
IBM Research - Israel
hayim.shaul@ibm.com
Ehud Aharoni,
IBM Research - Israel
aehud@il.ibm.com
Nir Drucker,
IBM Research - Israel
drucker.nir@gmail.com
Abstract
Outsourcing computations over sensitive data to a third-party cloud
environment should usually rely on dedicated privacy-preserving
solutions to adhere to privacy regulations such as the GDPR. One solution that gained great attention is fully homomorphic
encryption (FHE), a cryptographic method that allows users to perform
different types of computation on encrypted data. However, writing a
non-interactive FHE code that evaluates complex functions is a
task that is mostly left to experts. Otherwise, the resulting code may
become very slow and even impractical.
Tile tensors are a recently developed data structure that aims, together with
a dedicated language, to simplify the process of writing
complex FHE programs. This tutorial introduces developers of security
solutions without previous FHE background to the world
of FHE programming through the use of tile tensors. It provides step-by-step guidelines for implementing complex operators such as
matrix-multiplication and convolutions, and eventually guides participants toward writing their own privacy-preserving convolutional
neural network solution. The demonstrations in this tutorial
use Python and the HELayers library that implements tile tensors.
Tutorial Description
Topics
Summary
- FHE, security models and applications.
- SIMD packing: - Challenge and motivation
- Examples – Matrix multiplication algorithms
- Using a chain of operations to perform privacy-preserving machine learning (PPML)
- Automating the process
- Research opportunities
- Tile tensors
– Convolution algorithms
In detail
FHE, security models and applications: FHE is an encryption method that allows you to apply additions and multiplications to ciphertexts without decrypting them. This leads to interesting models where two or more parties can collaboratively compute a function over their private data without sharing it. As an example, this allows for inference-as-a-service, in which a client encrypts and uploads her (private) data to a server that uses a pre-trained model to perform inference over the data. In this situation, the server is not exposed to the sensitive data (making it easier to comply with privacy regulations), yet the client receives the resulting computed analytics, which she can decrypt. Traditionally, FHE was considered too slow compared to other privacy-preserving solutions (such as multi-party computations (MPC)or garbled circuit (GC)). However, in recent years, the run-times of FHE solutions have improved dramatically. One of the techniques that boosted FHE performance is packing multiple values into the slots of a single ciphertext. Operators applied on ciphertexts are then applied slot-wise in a single instruction multiple data (SIMD) manner. Taking advantage of these slots became a subject of research and multiple packing schemes were proposed.
In this tutorial, we discuss: (i) the challenges of designing an algorithm that utilizes slots; (ii) how packings can affect the run-time and RAM requirements of an algorithm; (3) how to write FHE code that can easily switch from one packing to another, and; (4) how to choose the packing technique for your protocol.
SIMD packing, challenges, motivation, and tile tensors: As hinted above, using SIMD improves the running time of an algorithm. Alas, utilizing SIMD is not always trivial. For example, even for the simple problem of matrix multiplication, several packing methods were proposed in several previous works ([1, 6, 9] to name a few). For more complex systems such as neural networks, the problem becomes harder and it's not always clear from the beginning which packing method will provide the best run-time results. To make matters worse, for the same problem, different packing schemes may be better, depending on the parameters of the input.
Until recently, changing the packing methods required rewriting a lot of the code. This made writing practical solutions with FHE much more expensive and time consuming. In [1, 3], a packing-oblivious language was proposed in which an algorithm is described at a high level, and the details of the packing are filled in later by the compiler, possibly even during run-time. Using the language proposed by [1], an algorithm can be described as operating on tensors. The details of how tensors are packed into ciphertext is given in a concise manner called a shape. Changing the packing can then be done easily by changing the shape of the tensor.
Specific demonstrations: As an example, we will show how different packing methods change the way a matrix is packed. In addition, we explore how packing the input of a convolutional layer affects the overall run-time and memory usage.
Automating the packing process: As mentioned above, different packing methods yield different run-times and RAM requirements. To measure run-time, we consider two metrics: (i) latency - the time passed from receiving an input until delivering the output, and (ii) throughput - the number of outputs we can deliver in a given amount of time, give infinite inputs. When implementing a system, we want to choose the packing method that complies with our RAM requirements and achieves the best possible run-time. We describe an optimizer module that scans all possible shapes and reports the best shape that complies with our RAM requirements. Our optimizer considers different shapes for the input and measures the running time and RAM requirements for each shape. To speed up the execution of the optimizer we also consider gradient-descent heuristics.
Demonstration
The tutorial will be accompanied by online (interactive) demonstrations using HELayers [2, 4, 7, 10], running locally on the participants’ laptops or through pre-defined Google Colab notebooks.
Goals
- Explore and experience a new and exciting data structure called tile tensors that enable writing efficient, non-interactive homomorphic encryption solutions.
- Experiment how homomorphic encryption packing affects run-time and memory.
- Learn a new language to write packing-oblivious code that allows switching between packing methods easily.
- Learn how to use the HELayers library (as a demonstration of the tile tensors technique) to write HE applications, specifically a PPML solution.
Expected background and prerequisites of participants
This tutorial is intended for researchers and practitioners from academia, government, industry, and anyone else interested in privacy-preserving applications. We will focus on the technical aspects of packing techniques and efficiency, and therefore the material will not be relevant to those interested in legal, ethical, or sociological security considerations. Nonetheless, social scientists and policymakers are welcome to participate and hear about recent real-life applications of privacy-preserving techniques. Prerequisite knowledge: This tutorial assumes college-level knowledge of mathematics, specifically linear algebra. We will demonstrate code in Python and encourage the participants to learn by writing code. Basic Python knowledge is therefore required. The tutorial will include a brief general overview of homomorphic encryption, which is sufficient for achieving the tutorial’s goals. The final demonstrations include convolutional neural networks, which we present as a series of algebraic operators. Thus, a deep understanding of machine learning concepts is not required.
Presenters
Hayim Shaul, IBM Research - Israel
Hayim is currently with IBM Research in Haifa, Israel. Hayim completed his PhD in computational geometry under the supervision of Prof. Micha Sharir in Tel- Aviv University. After that, he was the co-founder and CTO of DiviNetworks, a company funded by the IFC, dealing with network optimizations. After that, he joined MIT as a research fellow in the CSAIL lab doing research in homomorphic encryption. Today, Hayim is a researcher in IBM Research continuing his study of secure multi-party computation.
Ehud Aharoni, IBM Research - Israel
Ehud is currently with IBM Research in Haifa, Israel. Ehud received an M.Sc. in computer science from the Technion - Israel Institute of Technology. He worked several years on machine learning projects in the fields of hardware verification, healthcare, and anomaly detection for computer systems. Later, he worked on various applications of machine learning to cyber security. Ehud is currently working on novel encryption schemes in the context of security and privacy preservation.
Nir Drucker, IBM Research - Israel
Nir is currently with IBM Research in Haifa, Israel. Nir holds a PhD in applied mathematics (cryptography) from the University of Haifa and an M.Sc. in operations research from the Faculty of Industrial Engineering and Management at the Technion - Israel Institute of Technology. He worked for more than three years as a senior applied scientist at AWS (cryptography) and eight years as a software developer at Intel. Nir’s research interests focus on applied cryptography and applied security.
References
- Ehud Aharoni, Allon Adir, Moran Baruch, Nir Drucker, Gilad Ezov, Ariel Farkash, Lev Greenberg, Ramy Masalha, Guy Moshkowich, Dov Murik, Hayim Shaul, and Omri Soceanu. 2020. HeLayers: A Tile Tensors Framework for Large Neural Networks on Encrypted Data. CoRR abs/2011.0 (2020). Read more
- Ehud Aharoni, Moran Baruch, Nir Drucker, Gilad Ezov, Eyal Kushnir, Guy Moshkowich, and Omri Soceanu. 2022. Poster : Secure SqueezeNet inference in 4 minutes. 43rd IEEE Symposium on Security and Privacy (2022). Read more
- Ehud Aharoni, Nir Drucker, Gilad Ezov, Hayim Shaul, and Omri Soceanu. 5555.
Complex Encoded Tile Tensors: Accelerating Encrypted Analytics. IEEE Security
& Privacy 01 (jun 5555), 2–10.
Read more - John Buselli. 2021. Secure AI workloads using fully homomorphic
encrypted data. (sep 2021).
Read more - EU General Data Protection Regulation. 2016. Regulation (EU) 2016/679 of the
European Parliament and of the Council of 27 April 2016 on the protection
of natural persons with regard to the processing of personal data and on the
free movement of such data, and repealing Directive 95/46/EC (General Data
Protection Regulation). Official Journal of the European Union 119 (2016).
Read more - Shai Halevi and Victor Shoup. 2014. Algorithms in HElib. In Advances in Cryptology
- CRYPTO 2014 - 34th Annual Cryptology Conference, Santa Barbara, CA,
USA, August 17-21, 2014, Proceedings, Part I (Lecture Notes in Computer Science),
Juan A. Garay and Rosario Gennaro (Eds.), Vol. 8616. Springer, 554–571.
Read more - IBM. 2021. Fully Homomorphic Encryption (FHE) - Never decrypt your data, even
during computation. Technical Report.
Read more - IBM. 2021. HELayers SDK with a Python API for x86. (2021).
Read more - Xiaoqian Jiang, Miran Kim, Kristin Lauter, and Yongsoo Song. 2018. Secure
Outsourced Matrix Computation and Application to Neural Networks. In Proceedings
of the 2018 ACM SIGSAC Conference on Computer and Communications
Security (CCS ’18). New York, NY, USA, 1209–1222.
Read more - Michael Jordan and Rohit Panjala. 2021. The next step in homomorphic encryption
for Linux on IBM Z and LinuxONE. (oct 2021).
Read more
Resources
Important Dates
ACM CCS 2022 | November 7- 11, 2022 |
Tutorial | November 11, 2022 |