With the slowdown of Moore's law and the end of Dennard scaling, the energy efficiency of compute hardware translates into compute power. High-Performance Computing (HPC) systems therefore rely increasingly on accelerators such as Field-Programmable Gate Arrays (FPGAs) to serve highly demanding workloads, like Big Data applications or Deep Neural Networks. These FPGAs are reconfigurable and, in some systems, no longer bus-attached to a CPU but directly connected to the data center network fabric as standalone nodes. This mix of CPUs and FPGAs leads to Reconfigurable Heterogeneous HPC (ReH$_2$PC) clusters, for which no established programming model exists despite many past proposals. In contrast, the Message Passing Interface (MPI) has become the de facto standard for programming classical HPC clusters, owing to its high reusability and the fast application development it enables. This paper revisits the programming model of ReH$_2$PC clusters and argues that MPI is suitable for programming heterogeneous clusters of FPGAs and CPUs. We demonstrate a one-click solution for compiling and deploying a standard MPI application on ReH$_2$PC clusters. Our framework implements a High-Level Synthesis (HLS) library, a dedicated run-time environment for FPGAs and CPUs, and a transpiler that closes the semantic gap between the MPI API and FPGA designs. Our experiments with 31 FPGAs show an average speedup of 4× and a 90% reduction in power consumption compared to a cluster of CPUs.