Date of Award


Document Type


Degree Name

Master of Science in Electrical Engineering


Department of Electrical and Computer Engineering

First Advisor

John F. Raquet, PhD


While Convolutional Neural Networks (CNNs) can estimate frame-to-frame (F2F) motion even with monocular images, additional inputs can improve Visual Odometry (VO) predictions. In this thesis, a FlowNetS-based [1] CNN architecture estimates VO using sequential images from the KITTI Odometry dataset [2]. For each of three output types (full six degrees of freedom (6-DoF), Cartesian translation, and translational scale), a baseline network with only image-pair input is compared with a nearly identical architecture that is also given an additional rotation estimate, such as one from an Inertial Navigation System (INS). The inertially-aided networks show an order-of-magnitude improvement over the baseline when predicting rotation, but the aided rotation predictions are still worse than the input rotations. Translation predictions do not necessarily improve with the aiding either. A full-trajectory analysis gives similar results. The INS-aided neural networks are also tested for sensitivity to angular random walk (ARW) and bias errors in the sensor measurements.

AFIT Designator


DTIC Accession Number