Many modern applications generate a significant amount of data in geographically dispersed areas. To analyze and make use of these data, data fusion and machine learning techniques are usually applied, which have the potential to greatly increase the amount of information extracted from the data. These algorithms traditionally run in data center environments where all the data are available at a central location. It is challenging to run them in distributed coalition environments, where it is impractical to send all the raw data to a single place due to bandwidth and security constraints. This problem has gained notable attention recently. In this paper, we provide an overview of available techniques and recent results on performing data fusion and machine learning in a distributed coalition environment without sharing the raw data among local processing nodes. We discuss techniques for distributed model training and scoring, and outline some applications where these techniques are applicable and beneficial.