Abstract High-speed network intrusion detection systems (NIDSes) commonly employ TCAMs for fast pattern matching, and parallel TCAM-based pattern matching algorithms have proven promising to achieve even higher line rate. However, two challenges impede parallel TCAM-based pattern matching engines from being truly scalable, namely: (1) how to implement fine-grained parallelism to optimize load balancing and maximize throughput, and (2) how to reconcile between the performance gain and increased power consumption both due to parallelism. In this paper, we propose two techniques to answer the above challenges yielding an ultra-scalable NIDS. We first introduce the concept of negative pattern matching, by which we can splice flows into segments for fine-grained load balancing and optimized parallel speedup while ensuring correctness. negative pattern matching (NPM) also dramatically reduces the number of Ternary Content Addressable Memory (TCAM) lookups thus reducing the power consumption. Then we propose the idea of exclusive pattern matching, which divides the rule sets into subsets; each subset is queried selectively and independently given a certain input without affecting correctness. In concert, these two techniques improve both the pattern matching throughput and scalability in any scenario. Our experimental results show that up to 90% TCAM lookups can be saved, at the cost of merely 10% additional 2-byte index table lookups in the SRAM.