Palm oil is the world's most-produced vegetable oil by volume, and 75% of global production is used for food and cooking. Sustainable management of the producing areas calls for frequent assessment of field conditions. In this paper, we investigate an automatic deep-learning algorithm capable of building an inventory of individual oil-palm trees from aerial color images collected by unmanned aerial vehicles. The idea is to combine the outputs of two independent convolutional neural networks, trained on partially distinct subsets of samples and at different spatial scales, so as to capture both coarse and fine details of image patches. The estimated posterior probabilities are combined by simple averaging to improve detection accuracy and to estimate the confidence of each individual detection. Non-maxima suppression then removes weak detections. Experiments at three commercial oil-palm plantation sites in northern Brazil, aged 2, 4, and 16 years, yielded overall detection accuracies in the range 91.2–98.8% using orthomosaics of decimeter spatial resolution. The proposed approach can be a useful component of a forest monitoring system based on remote sensing.
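The fusion and suppression steps described above can be illustrated with a minimal sketch. It assumes that each network has already produced a per-pixel posterior probability map on a common grid; the function names, the probability threshold, and the minimum spacing between detections are hypothetical choices made for illustration, not values taken from this work. The suppression shown is a common greedy variant and may differ from the one actually used.

```python
import numpy as np


def average_posteriors(p_coarse, p_fine):
    """Fuse two posterior probability maps by simple (unweighted) averaging."""
    p_coarse = np.asarray(p_coarse, dtype=float)
    p_fine = np.asarray(p_fine, dtype=float)
    if p_coarse.shape != p_fine.shape:
        raise ValueError("both maps must be on the same grid")
    return 0.5 * (p_coarse + p_fine)


def non_maxima_suppression(prob, min_distance=2, threshold=0.5):
    """Greedy non-maxima suppression on a 2-D probability map.

    Candidates are pixels with probability >= threshold (assumed value).
    They are visited from highest to lowest score, and a candidate is kept
    only if it lies at least `min_distance` pixels (assumed value) from
    every detection kept so far. Returns a list of (row, col, confidence).
    """
    rows, cols = np.nonzero(prob >= threshold)
    scores = prob[rows, cols]
    order = np.argsort(-scores, kind="stable")  # descending by confidence
    kept = []
    for i in order:
        r, c, s = int(rows[i]), int(cols[i]), float(scores[i])
        far_enough = all(
            (r - kr) ** 2 + (c - kc) ** 2 >= min_distance ** 2
            for kr, kc, _ in kept
        )
        if far_enough:
            kept.append((r, c, s))
    return kept


def detect_trees(p_coarse, p_fine, min_distance=2, threshold=0.5):
    """Average the two maps, then suppress non-maximal responses."""
    fused = average_posteriors(p_coarse, p_fine)
    return non_maxima_suppression(fused, min_distance, threshold)
```

In this sketch the averaged probability retained for each surviving location doubles as the confidence of that detection. In practice `min_distance` would be tied to the expected crown size at the image resolution, which is why it is left as a parameter here.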