In this paper, we analyze the performance of randomized benchmarking protocols on gate sets under a variety of realistic error models that include systematic rotations, amplitude damping, leakage to higher levels, and 1/f noise. We find that, in almost all cases, benchmarking estimates the average error rate to within a factor of 2, suggesting that randomized benchmarking protocols are a valuable tool for verification and validation of quantum operations. In addition, we derive models for fidelity decay curves under certain types of non-Markovian noise, such as 1/f noise and leakage errors. We also show that, provided the standard error of the fidelity measurements is small, only a small number of trials is required for high-confidence estimation of gate errors. © 2014 American Physical Society.