The method of Graph Cuts converts a Maximum a Posteriori (MAP) inference problem on a Markov Random Field (MRF) into a network-flow problem, which can be solved efficiently. Many computer vision problems can be conveniently cast as an inference task that finds the most likely label for each pixel. The method is widely used but computationally burdensome. Prior accelerator attempts have failed to exploit the problem's attractive maximum available parallelism: push-relabel flow solvers can run in parallel across every pixel. This paper describes the design and implementation of the first pixel-parallel Graph Cuts inference engine. Our prototype processes a 256-pixel tile of an image using 256 locally connected pixel processors. A checkerboard scheduling scheme achieves maximum parallelism while avoiding critical data dependencies. A 150 MHz FPGA implementation solves a segmentation task in 6 microseconds. We also discuss strategies for extending our prototype to larger 'virtual' images that exceed the physical extent of the inference tile. Our model suggests 2-40× speedups over previous accelerator experiments. To the best of our knowledge, this is the first fully functional, pixel-parallel accelerator demonstration for Graph Cuts inference.
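The checkerboard idea can be sketched in software: on a 4-connected pixel grid, pixels whose coordinates sum to the same parity share no edge, so an entire parity class can apply push-relabel updates simultaneously without data conflicts. The following is a minimal illustrative sketch, not the paper's hardware design; the helper names (`parity`, `neighbors`, `push_phase`) are hypothetical.

```python
from collections import defaultdict

def parity(x, y):
    # "Black" or "white" checkerboard class of a pixel.
    return (x + y) % 2

def neighbors(x, y, w, h):
    # 4-connected grid neighborhood, as in a locally connected pixel array.
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < w and 0 <= ny < h:
            yield nx, ny

def check_independent(w, h):
    # Same-parity pixels share no grid edge, so updating them together
    # cannot create a read/write conflict on residual capacities.
    for x in range(w):
        for y in range(h):
            for n in neighbors(x, y, w, h):
                assert parity(x, y) != parity(*n)
    return True

def push_phase(excess, height, cap, par, w, h):
    # One parallel phase: every active pixel of parity `par` pushes excess
    # to an admissible neighbor (one whose height is exactly one lower).
    # Because these pixels are mutually non-adjacent, the sequential loop
    # order is irrelevant, which models simultaneous hardware updates.
    for x in range(w):
        for y in range(h):
            p = (x, y)
            if parity(x, y) != par or excess[p] <= 0:
                continue
            for n in neighbors(x, y, w, h):
                if cap[(p, n)] > 0 and height[p] == height[n] + 1:
                    d = min(excess[p], cap[(p, n)])
                    cap[(p, n)] -= d
                    cap[(n, p)] += d
                    excess[p] -= d
                    excess[n] += d
                    if excess[p] == 0:
                        break
```

A full solver would alternate black and white phases (interleaved with local relabel steps) until no pixel holds pushable excess; the sketch above shows only why each phase is conflict-free.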