We demonstrate a modified stochastic gradient (Tiki-Taka v2 or TTv2) algorithm for deep learning network training in a cross-bar array architecture based on ReRAM cells. There have been limited discussions on cross-bar arrays for training applications due to the challenges in the switching behavior of nonvolatile memory materials. TTv2 algorithm is known to overcome the device non-idealities for deep learning training. We demonstrate the feasibility of the algorithm for a linear regression task using 1R and 1T1R ReRAM devices. Using the measured device properties, we project the performance of a long short-term memory (LSTM) network with 78 K parameters. We show that TTv2 algorithm relaxes the criteria for symmetric device update response. In addition, further optimization of the algorithm increases noise robustness and significantly reduces the required number of states, thereby drastically improving the model accuracy even with non-ideal devices and achieving the test error close to that of the conventional learning algorithm with an ideal device.