Discovery and Compliance Scanning

TL;DR:

Discover resources in a target cloud account and evaluate them against pre-defined rules to generate alerts. Automate the process using AWS services and open-source software.

Overview

Today’s enterprise operates across thousands of resources distributed over multiple cloud environments. These environments evolve constantly as different teams provision and modify resources, which makes one problem increasingly difficult to solve: context and visibility.

How does one answer in real time:

Cloud Security Posture Management (CSPM) continuously monitors infrastructure, detects misconfigurations, and provides a real-time, unified view of resources and findings.

The project

To achieve the above, we need a method to (1) discover resources and (2) evaluate their configuration against a set of rules.

Staying consistent with the approach of Agentless Vulnerability Scanning, we will use AWS services to automate this process:

Workflow

1. Initiation

Scans are triggered in two ways:

2. Polling & Task dispatch

A Lambda poller continuously consumes messages from SQS and launches the appropriate ECS task based on the payload.
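The dispatch step can be sketched as follows. This is a minimal illustration, not the project's actual poller: it assumes the Lambda receives SQS records through an event source mapping, and the payload fields (`scan_type`, `account_id`), the task definition names, the cluster name, and the subnet ID are all hypothetical placeholders.

```python
import json

# Hypothetical mapping from the scan type in the payload to an ECS task definition.
TASK_DEFINITIONS = {
    "discovery": "discovery-scan",
    "compliance": "compliance-scan",
}


def build_run_task_args(message_body, cluster="scanner-cluster",
                        subnets=("subnet-placeholder",)):
    """Translate one SQS message body into keyword arguments for ecs.run_task."""
    payload = json.loads(message_body)
    scan_type = payload["scan_type"]
    if scan_type not in TASK_DEFINITIONS:
        raise ValueError(f"unknown scan type: {scan_type}")
    task_definition = TASK_DEFINITIONS[scan_type]
    return {
        "cluster": cluster,
        "launchType": "FARGATE",
        "taskDefinition": task_definition,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "assignPublicIp": "DISABLED",
            }
        },
        "overrides": {
            "containerOverrides": [
                {
                    # Assumes the container shares the task definition's name.
                    "name": task_definition,
                    "environment": [
                        {"name": "TARGET_ACCOUNT_ID", "value": payload["account_id"]}
                    ],
                }
            ]
        },
    }


def handler(event, context, ecs_client=None):
    """Lambda entry point: launch one ECS task per SQS record."""
    if ecs_client is None:
        import boto3  # imported lazily so the helper above has no AWS dependency
        ecs_client = boto3.client("ecs")
    launched = []
    for record in event["Records"]:
        response = ecs_client.run_task(**build_run_task_args(record["body"]))
        launched.extend(task["taskArn"] for task in response.get("tasks", []))
    return launched
```

Keeping the payload-to-arguments translation in a pure function makes the routing logic testable without AWS credentials; only `handler` touches the ECS API.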

3. Scan execution

The scanning software runs as a Docker container on ECS Fargate. Unlike the Agentless Vulnerability Scanning workflow, which uses ECS on EC2 due to custom volume mount requirements, this process is lightweight enough to run fully on Fargate.

The container assumes a role in the target cloud account, performs the scan, and loads the data into a Neo4j database.
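As a rough sketch of the cross-account step, the container could request temporary credentials from STS before scanning. The role name `ScannerReadOnly` and the session name are assumptions made for illustration; the real role depends on how the target account is set up.

```python
def assume_target_role(sts_client, account_id,
                       role_name="ScannerReadOnly",      # hypothetical role name
                       session_name="compliance-scan"):  # hypothetical session name
    """Return temporary credentials for a role in the target account."""
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    # Keys are named so the dict can be passed straight to boto3.Session(**...).
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

With boto3 installed, `boto3.Session(**assume_target_role(boto3.client("sts"), "123456789012"))` would yield a session scoped to the target account, where the account ID shown is a placeholder.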

4. Data load & Normalisation

Results are loaded into Neo4j to be queried and displayed on the frontend.

This extra transformation step aligns Steampipe data to the CloudQuery schema, enabling consistent correlation.
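One way to picture this normalisation is a small mapping function applied to every finding before it is loaded. The field names on both sides are assumptions chosen for illustration (a Steampipe-style control result with `resource`, `status`, `reason`, and `control_id`, mapped onto CloudQuery-style keys such as `arn`); the project's actual schema may differ.

```python
def normalise_finding(row):
    """Map one Steampipe-style control result onto CloudQuery-style key names."""
    return {
        # Assumption: the control result identifies the resource by its ARN,
        # which is the key used to correlate with discovered resources.
        "arn": row["resource"],
        "account_id": row.get("account_id"),
        "region": row.get("region"),
        "control_id": row["control_id"],
        "status": row["status"].lower(),
        "reason": row.get("reason", ""),
    }


def index_by_arn(findings):
    """Group normalised findings by ARN so they can be joined to resources."""
    grouped = {}
    for finding in findings:
        grouped.setdefault(finding["arn"], []).append(finding)
    return grouped
```

Sharing a single join key between both data sources is what makes the later correlation straightforward.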

Result

Due to the vast number of cloud resource types and possible relationships between them, it is incredibly difficult to represent an entire environment in a single diagram. This illustration shows a simplified example of correlations that may exist, in combination with Agentless Vulnerability Scanning.

The software

Discovery

CloudQuery is a tool built to extract, transform, and load (ETL) configuration data from cloud provider APIs into destinations such as databases or data lakes. Its Amazon Web Services integration was open-source when I developed this project, but later shifted to a paid model, leaving only the SDK as open-source.

Compliance

Steampipe is an open-source tool that enables dynamic querying of cloud provider APIs using SQL syntax. It supports many plugins, but it truly shines when combined with mods. For example, the Amazon Web Services Compliance mod provides a collection of pre-defined rules for benchmarks such as GDPR, HIPAA, PCI DSS, and Cyber Essentials. The engine can evaluate these rules against discovered cloud resources.
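To make the idea of evaluating rules against resources concrete, here is a conceptual sketch in Python. It is not how Steampipe works internally (its controls are expressed as SQL queries); the `Rule` class, the resource fields, and the example rule are all hypothetical. Only the `ok` and `alarm` status names are borrowed from Steampipe's control output.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    resource_type: str
    passes: Callable[[dict], bool]  # returns True when the resource is compliant


# Hypothetical rule: S3 buckets should have versioning enabled.
VERSIONING_ENABLED = Rule(
    rule_id="s3_bucket_versioning_enabled",
    resource_type="aws_s3_bucket",
    passes=lambda bucket: bucket.get("versioning_status") == "Enabled",
)


def evaluate(rules, resources):
    """Apply each rule to every resource of the matching type."""
    findings = []
    for resource in resources:
        for rule in rules:
            if rule.resource_type != resource["type"]:
                continue  # rule does not apply to this resource type
            status = "ok" if rule.passes(resource) else "alarm"
            findings.append(
                {"rule_id": rule.rule_id, "resource": resource["arn"], "status": status}
            )
    return findings
```

Each `alarm` finding is a candidate alert, and the `resource` field ties it back to a discovered resource.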

Database

Neo4j is an open-source graph database that focuses heavily on relationships between entities. It provides a more natural way of querying and visualizing complex data.
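As an illustration of why a graph fits this data, a finding can be attached to its resource with a single parameterised Cypher statement. The `Resource` and `Finding` labels, the `HAS_FINDING` relationship, and the property names are assumptions for this sketch, not the project's actual graph model. The `session` argument is expected to behave like a session from the official `neo4j` Python driver, whose `run` method accepts query parameters as keyword arguments.

```python
# Hypothetical graph model: (:Resource)-[:HAS_FINDING]->(:Finding)
LINK_FINDING = """
MERGE (r:Resource {arn: $arn})
MERGE (f:Finding {control_id: $control_id, arn: $arn})
SET f.status = $status, f.reason = $reason
MERGE (r)-[:HAS_FINDING]->(f)
"""


def load_finding(session, finding):
    """Upsert one finding and link it to the resource it refers to."""
    session.run(
        LINK_FINDING,
        arn=finding["arn"],
        control_id=finding["control_id"],
        status=finding["status"],
        reason=finding.get("reason", ""),
    )
```

Using `MERGE` rather than `CREATE` keeps repeated scans idempotent: re-running a scan updates the existing nodes instead of duplicating them.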