GraspNet: An efficient convolutional neural network for real-time grasp detection for low-powered devices
Recent research on grasp detection has focused on improving accuracy through deep CNN models, but at the cost of large memory and computational resources. In this paper, we propose an efficient CNN architecture that achieves high grasp detection accuracy in real time while maintaining a compact model design. To this end, we introduce GraspNet, a CNN architecture with two main branches: (i) an encoder branch that downsamples an input image using our novel Dilated Dense Fire (DDF) modules, built from squeeze and dilated convolutions with dense residual connections; and (ii) a decoder branch that upsamples the output of the encoder branch to the original image size using deconvolutions and fuse connections. We evaluate GraspNet for grasp detection on offline datasets and in a real-world robotic grasping setup. Our experiments show that GraspNet achieves grasp detection accuracy competitive with state-of-the-art computation-efficient CNN models while running at real-time inference speed on embedded GPU hardware (Nvidia Jetson TX1), making it suitable for low-powered devices.
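To make the encoder-decoder size bookkeeping concrete, the following is a minimal sketch of how one spatial dimension could shrink in the encoder and be restored in the decoder. It uses only the standard output-size formulas for (dilated) convolutions and transposed convolutions. The number of stages, the kernel sizes, strides, paddings, and the 224-pixel input are all illustrative assumptions, not values taken from GraspNet, and the helper names (`conv_out`, `deconv_out`, `effective_kernel`, `trace_sizes`) are hypothetical.

```python
def conv_out(n, k, s=1, p=0, d=1):
    """Output size of a convolution along one dimension (standard formula)."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1


def deconv_out(n, k, s=1, p=0, output_padding=0, d=1):
    """Output size of a transposed convolution along one dimension."""
    return (n - 1) * s - 2 * p + d * (k - 1) + output_padding + 1


def effective_kernel(k, d):
    """Extent covered by a k-tap kernel with dilation d."""
    return k + (k - 1) * (d - 1)


def trace_sizes(n, stages=3):
    """Follow one spatial dimension through an assumed encoder-decoder.

    Assumption: each encoder stage halves the size with a 3x3, stride-2,
    padding-1 convolution, and each decoder stage doubles it with a 4x4,
    stride-2, padding-1 transposed convolution. The real network may
    downsample differently (for example with pooling).
    """
    sizes = [n]
    for _ in range(stages):
        n = conv_out(n, k=3, s=2, p=1)      # encoder: downsample by 2
        sizes.append(n)
    for _ in range(stages):
        n = deconv_out(n, k=4, s=2, p=1)    # decoder: upsample by 2
        sizes.append(n)
    return sizes


if __name__ == "__main__":
    print(trace_sizes(224))                      # sizes per stage
    print(effective_kernel(3, 2))                # span of a dilated 3-tap kernel
    print(conv_out(56, k=3, s=1, p=2, d=2))      # dilated conv keeping the size
```

Under these assumptions, a 224-pixel side goes through 112, 56, and 28 in the encoder and returns to 224 after three decoder stages. A 3x3 kernel with dilation 2 spans 5 pixels while keeping 9 weights, which is why dilation enlarges the receptive field without adding parameters; setting the padding equal to the dilation keeps the feature-map size unchanged. Input sizes that are not divisible by 2 raised to the number of stages would need `output_padding` or cropping to recover the original size exactly.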