With the wide adoption of FinFET technology in mass production, dynamic power has become the bottleneck to achieving low power. Therefore, clock power reduction is crucial in modern IC design. Register clustering can effectively save clock power because it significantly reduces the number of clock sinks, the register pin capacitance, the routed clock wirelength, and the number of clock buffers. In this paper, we propose effective mean shift, which naturally forms clusters according to the register distribution without disrupting the placement. Effective mean shift meets the requirements of a good register clustering algorithm: it needs no prespecified number of clusters, is insensitive to initialization, is robust to outliers, tolerates various register distributions, is efficient and scalable, and balances clock power reduction against timing degradation. Experimental results show that our approach outperforms state-of-the-art work in balancing power and timing, as well as in efficiency and scalability.
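
For readers unfamiliar with the underlying technique, the following is a minimal sketch of classic Gaussian-kernel mean shift on 2D register coordinates. It is not the effective mean shift proposed in this paper, whose details are not given in the abstract. The function name `mean_shift`, the `bandwidth` parameter, and the merge radius of half the bandwidth are illustrative assumptions. The sketch only illustrates why no cluster count has to be specified in advance: each point climbs to a local density mode, and points that reach the same mode form one cluster.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Classic Gaussian-kernel mean shift on 2D points (illustrative sketch).

    Returns (modes, labels): the density modes found and, for each input
    point, the index of the mode it converged to.
    """
    shifted = [tuple(p) for p in points]
    for _ in range(max_iter):
        largest_move = 0.0
        updated = []
        for (x, y) in shifted:
            wsum = sx = sy = 0.0
            # Kernel-weighted mean of all original points around (x, y).
            for (px, py) in points:
                d2 = (px - x) ** 2 + (py - y) ** 2
                w = math.exp(-d2 / (2.0 * bandwidth ** 2))
                wsum += w
                sx += w * px
                sy += w * py
            nx, ny = sx / wsum, sy / wsum
            largest_move = max(largest_move, math.hypot(nx - x, ny - y))
            updated.append((nx, ny))
        shifted = updated
        if largest_move < tol:
            break

    # Assumption: converged points closer than bandwidth / 2 share a mode.
    modes, labels = [], []
    for (x, y) in shifted:
        for k, (mx, my) in enumerate(modes):
            if math.hypot(x - mx, y - my) < bandwidth / 2.0:
                labels.append(k)
                break
        else:
            modes.append((x, y))
            labels.append(len(modes) - 1)
    return modes, labels
```

Calling `mean_shift(register_xy, bandwidth)` on a list of `(x, y)` tuples returns one label per register. The number of clusters follows from the point distribution and the chosen bandwidth rather than from a user-supplied count. This naive version costs quadratic time per iteration, so it conveys the idea only and makes no claim about the efficiency or scalability results reported here.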