Machine learning yield models for integrated circuits (ICs) have recently gained prominence in the EDA community. They show promise for emulating memory design functionality and thereby speeding up variance reduction methods that rely on circuit simulation. A main challenge in this area is class imbalance, which arises naturally from the high targeted manufacturing yield: failing samples are rare, so the sampled memory datasets are imbalanced, and this can compromise model performance. In this work, we study the memory classification problem for modeling rare fail events in the context of importance sampling based yield analysis. We propose a comprehensive and computationally efficient method that jointly selects the combination of relevant features and the class balance ratio, both of which are key to classifier generalization. The methodology relies on synthetic minority oversampling techniques to reinforce the minority class while probing for the best data balance ratio, in conjunction with an iterative L1-SVM based approach that approximates L0-norm regularization for best feature subset selection. We compare the proposed methodology against standalone L1-SVM solutions, an unbalanced L0-norm approximation, and an algorithmic data balancing method, all in the context of yield estimation. The methodology is shown to produce high-fidelity classifiers: when analyzing the yield of a 14nm FinFET SRAM cross-section, it achieves a 179x speedup of the importance sampling simulations compared to approaches based purely on circuit simulation, with an average error of 0.19 sigma.
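
As an illustration of the oversampling step only, the following is a minimal sketch of synthetic minority oversampling at a chosen balance ratio. It is not the implementation used in this work: the function name `smote_oversample`, its parameters (`ratio`, `k`, `seed`), and the toy data are all hypothetical, and the iterative L1-SVM feature selection stage is not shown. Each synthetic fail sample is drawn on the line segment between a minority point and one of its nearest minority neighbours.

```python
import math
import random


def smote_oversample(minority, n_majority, ratio=1.0, k=3, seed=0):
    """Create synthetic minority points (hypothetical helper, for illustration).

    Enough points are generated so that
    (len(minority) + n_synthetic) / n_majority is approximately `ratio`.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    target = int(round(ratio * n_majority))
    n_new = max(0, target - len(minority))
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy data (assumed): 4 failing samples against 20 passing samples.
fails = [(3.0, 3.1), (3.2, 2.9), (2.8, 3.3), (3.1, 3.4)]
n_pass = 20

# Probe several candidate balance ratios, as a search over ratios would.
for ratio in (0.25, 0.5, 1.0):
    new_points = smote_oversample(fails, n_pass, ratio=ratio)
    print(ratio, len(fails) + len(new_points))
```

In a full flow, each candidate ratio would be followed by training and validating a classifier on the rebalanced data, so that the ratio is selected together with the feature subset rather than fixed in advance.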