Wireless capsule endoscopy is a non-invasive screening procedure of the gastrointestinal (GI) tract performed with an ingestible capsule endoscope (CE) of the size of a large vitamin pill. Such endoscopes are equipped with a usually low-frame-rate color camera which enables the visualization of the GI lumen and the detection of pathologies. The localization of the commercially available CEs is performed in the 3D abdominal space using radio-frequency (RF) triangulation from external sensor arrays, in combination with transit time estimation. State-of-the-art approaches, such as magnetic localization, which have been experimentally proved more accurate than the RF approach, are still at an early stage. Recently, we have demonstrated that CE localization is feasible using solely visual cues and geometric models. However, such approaches depend on camera parameters, many of which are unknown. In this paper the authors propose a novel non-parametric visual odometry (VO) approach to CE localization based on a feed-forward neural network architecture. The effectiveness of this approach in comparison to state-of-the-art geometric VO approaches is validated using a robotic-assisted in vitro experimental setup.