New service from IBM Research and X-Force Red makes vulnerability management more efficient

IBM Research and IBM’s X-Force Red security teams have created a way for companies to measure an asset’s “resistance strength.”

Tools that use industry standards and benchmarks to monitor and visualize the security posture of hybrid cloud environments are becoming more popular as cloud management becomes more complex and attacks grow more sophisticated. The number of Common Vulnerabilities and Exposures (CVEs) in the National Vulnerability Database, for example, has risen steeply in the past year, from more than 144,000 in late 2020 to more than 171,000 at recent count.

To mitigate such security threats, more and more industry standards and benchmarks have been proposed to monitor, visualize and remediate cloud security postures. Tools to automate such practices are likewise becoming more prevalent.

However, the sheer number of standards makes it difficult for these tools to accurately analyze risks in different user-specific environments, delaying much-needed security improvements. The Center for Internet Security (CIS) alone features more than 140 published benchmarks.

To meet those needs, IBM Research and IBM’s X-Force Red security teams have created a way for companies to measure an asset’s “resistance strength”—a term coined by the popular FAIR risk management model to measure an asset’s ability to defend itself.

Where to start?

To measure the resistance strength of your cloud environment, you need to measure the risk of each control specified in best practices documents such as CIS Benchmarks or the US DoD Security Technical Implementation Guides (STIGs). While some of these documents offer guidance on risk priority, others do not, and there is no 10-point scale like the one we see in the CVE world. These findings add up fast: companies are typically overwhelmed by the amount of data they need to deal with when trying to increase resistance strength. Modern compute environments need to comply with tens (if not hundreds) of policies, and each policy may have hundreds of checks on average.

Typically, businesses will find that some portion of their checks are non-compliant at any given time due to, for example, misconfigurations, default passwords or lax controls on permissions. Fixing those problems requires expertise and time. As a result, these companies are asking how to prioritize risk so they can address the most urgent problems first.

Using AI and search to create a risk calculator

Our research focused on creating a starting point for businesses, using a combination of Watson Discovery search-based techniques and AI techniques—with STIGs as our training data—to predict risk associated with each check found in best practices documents.

Taking into account environmental factors—such as whether a network is public or private—we derive a threat risk score to measure the overall strength and weakness of a hybrid cloud environment. We also calculate risk for each check, asset and group within a given cloud.
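As a rough illustration of how per-check risk might be rolled up to assets and groups, here is a minimal sketch. It is not the scoring method used in the research: the numeric label weights, the exposure multipliers, and the `Finding` record are all hypothetical choices made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical numeric weights for the predicted risk label of a check.
LABEL_WEIGHT = {"High": 3.0, "Medium": 2.0, "Low": 1.0}
# Hypothetical environmental multipliers: public networks count for more.
EXPOSURE_FACTOR = {"public": 1.5, "private": 1.0}


@dataclass
class Finding:
    check_id: str
    asset: str
    group: str
    label: str       # predicted risk label: "High", "Medium" or "Low"
    network: str     # "public" or "private"
    compliant: bool  # True if the asset passed this check


def check_risk(finding):
    """Risk contributed by one check; passing checks contribute nothing."""
    if finding.compliant:
        return 0.0
    return LABEL_WEIGHT[finding.label] * EXPOSURE_FACTOR[finding.network]


def roll_up(findings, key):
    """Sum check-level risk by an arbitrary key (asset, group, ...)."""
    totals = defaultdict(float)
    for finding in findings:
        totals[key(finding)] += check_risk(finding)
    return dict(totals)


findings = [
    Finding("c1", "web-1", "frontend", "High", "public", False),
    Finding("c2", "web-1", "frontend", "Low", "public", True),
    Finding("c3", "db-1", "backend", "Medium", "private", False),
]
risk_by_asset = roll_up(findings, key=lambda f: f.asset)
risk_by_group = roll_up(findings, key=lambda f: f.group)
# Assets with the largest totals would be remediated first.
priority = sorted(risk_by_asset, key=risk_by_asset.get, reverse=True)
```

Sorting assets or groups by their totals gives one simple way to decide which problems to address first.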

One challenge is that STIG data is unbalanced in terms of the label distribution (High, Medium, Low), with fewer labels for High than for the other classifications. To avoid creating a biased model trained from the data and to be more sensitive to high-risk checks, we adopted an ensemble approach combining machine learning with Watson Discovery.
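The article does not describe how the two components are combined, so the following is only a sketch of one possible ensemble rule. It assumes each component returns a score per label, averages the two, and uses a deliberately low, hypothetical threshold for High so that the minority class is not drowned out. The function names and the 0.3 threshold are illustrative assumptions.

```python
LABELS = ("High", "Medium", "Low")


def normalize(scores):
    """Scale per-label scores so they sum to 1 (uniform if all are zero)."""
    total = sum(scores.get(label, 0.0) for label in LABELS)
    if total == 0:
        return {label: 1.0 / len(LABELS) for label in LABELS}
    return {label: scores.get(label, 0.0) / total for label in LABELS}


def ensemble_label(classifier_scores, search_scores, high_threshold=0.3):
    """Average two score sets; favor High when its averaged score is notable.

    classifier_scores: per-label scores from a trained classifier (assumed).
    search_scores: per-label scores from a search-based component (assumed),
        for example votes from the labels of the most similar known checks.
    high_threshold: hypothetical cutoff, set below 0.5 on purpose to make
        the ensemble more sensitive to the rarer High label.
    """
    c = normalize(classifier_scores)
    s = normalize(search_scores)
    averaged = {label: (c[label] + s[label]) / 2 for label in LABELS}
    if averaged["High"] >= high_threshold:
        return "High"
    return max(("Medium", "Low"), key=lambda label: averaged[label])


# The classifier leans Medium, but the search component sees High neighbors.
label_a = ensemble_label(
    {"High": 0.2, "Medium": 0.6, "Low": 0.2},
    {"High": 0.5, "Medium": 0.3, "Low": 0.2},
)
# Both components agree that High is unlikely.
label_b = ensemble_label(
    {"High": 0.1, "Medium": 0.7, "Low": 0.2},
    {"High": 0.1, "Medium": 0.5, "Low": 0.4},
)
```

Lowering the High threshold trades some precision on High for better recall, which matches the stated goal of being more sensitive to high-risk checks.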

Our ensemble approach to calculating risk for best practices checks achieves an F-score of 0.91. The F-score is a measure of accuracy that combines precision (the share of selected items that are actually relevant) and recall (the share of all truly relevant items that were selected).
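To make the relationship between precision, recall and the F-score concrete, here is a small worked example. It assumes the common F1 variant (the harmonic mean of precision and recall); the article does not say which variant or averaging scheme was used, and the counts below are made up for illustration.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall and F1 from raw counts.

    true_pos:  items selected by the model that are truly relevant
    false_pos: items selected by the model that are not relevant
    false_neg: relevant items the model failed to select
    """
    selected = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / selected if selected else 0.0
    recall = true_pos / relevant if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 checks correctly flagged High, 2 flagged High by
# mistake, and 2 truly High checks that were missed.
precision, recall, f1 = precision_recall_f1(true_pos=8, false_pos=2, false_neg=2)
```

With these counts, precision and recall are both 8/10, so the F1 score is also 0.8. Because it is a harmonic mean, the score drops sharply if either precision or recall is low.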

Based on the success of our research, our approach has been integrated into X-Force Red’s Vulnerability Management Services (VMS) to improve the offering’s ability to identify, prioritize and remediate vulnerabilities and other weaknesses. In this way, organizations can measure risk not only according to patch level, but also in terms of their assets’ inherent ability to thwart attackers.

Learn more about:

Cloud Security: We’re working on building the most secure cloud infrastructure platforms. Our research focuses on ensuring the integrity of everything in the stack, reducing the attack surface of cloud systems, and advancing the use of confidential computing and hardware security modules.
