Validate Risks with Experiments

Create and run experiments that provide real insights on your systems

Put your systems to the test and improve your operational readiness with a wide variety of experiments. Identify all of your reliability gaps and system limitations.

  • Simulate network-level outages, latency, and traffic issues
  • Run actions to stress resources like CPU, disk space, and memory
  • Change instance states and test database failover processes
  • Inject application-level faults with delays and method exceptions

See Recommendations

Our Advice feature provides you with a list of recommended experiments

Start with Experiment Templates

Create new experiments fast by selecting from over 50 pre-built templates

Create Custom Actions

Build experiments from scratch and add your own custom faults

Lowering the chaos engineering learning curve

When it’s easy to build experiments, reliability can be an inclusive cross-team effort.

Building Your First Experiment

Learn how to create reliability tests quickly with the experiment editor and build the exact experiment you want in minutes.

Creating Experiments from Scratch

Build new experiments fast with hundreds of no-code actions.

Using Experiment Templates

Use a library of 80+ templates to generate and review ready-to-run experiments.

Running Your First Experiment

Watch experiments run in real-time to see how each step impacts your system.

Scheduling Your Experiment Runs

Run future experiments as one-offs or recurring tests.

Automation & Workflow Options

Create automated workflows with the platform's API, CLI, and MCP Server.

Define targets with granular precision and set a safe blast radius

It’s important to start with a small blast radius when running experiments for the first time. For example, you may target only 10% of the pods in a cluster with a given attack. As you build confidence in how your system will respond, you can expand your blast radius and take on more risk.
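
The idea of starting with a small blast radius can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function name select_blast_radius, the pod names, and the seed parameter are all hypothetical and exist only to show how a percentage maps to a concrete subset of targets.

```python
import math
import random


def select_blast_radius(targets, percentage, seed=None):
    """Return a random subset covering `percentage` percent of `targets`.

    Hypothetical helper for illustration; not part of any product API.
    """
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in (0, 100]")
    if not targets:
        return []
    # Round up so a small, non-empty target set still yields one target.
    count = max(1, math.ceil(len(targets) * percentage / 100))
    rng = random.Random(seed)  # a seed makes the selection reproducible
    return rng.sample(list(targets), count)


# Hypothetical pod names: 20 pods, of which 10% are selected.
pods = [f"checkout-{i}" for i in range(20)]
first_run = select_blast_radius(pods, 10, seed=42)
later_run = select_blast_radius(pods, 50, seed=42)
```

Here first_run contains 2 of the 20 pods and later_run contains 10, which mirrors the progression from a cautious first run to a wider one once confidence has grown.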

Targeting uses an intuitive query language based on discovered metadata. It’s easy to be specific and safe, so you know exactly what your experiment will impact. Each action has a blast radius you can adjust with a simple toggle control. There is always an emergency stop button close by to hit the brakes and roll back changes.
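
Metadata-based targeting can be pictured as filtering discovered targets by their attributes. The sketch below is not the product's query language; the attribute names (name, namespace, env) and the helper functions are assumptions chosen for illustration, and real discovered metadata will look different.

```python
def matches(target, query):
    """True if every attribute in `query` equals the target's value."""
    return all(target.get(key) == value for key, value in query.items())


def resolve_targets(discovered, query):
    """Return only the discovered targets whose metadata matches the query."""
    return [target for target in discovered if matches(target, query)]


# Hypothetical discovered metadata for three targets.
discovered = [
    {"name": "checkout-1", "namespace": "shop", "env": "staging"},
    {"name": "checkout-2", "namespace": "shop", "env": "production"},
    {"name": "search-1", "namespace": "search", "env": "staging"},
]

selected = resolve_targets(discovered, {"namespace": "shop", "env": "staging"})
print([target["name"] for target in selected])  # prints ['checkout-1']
```

Resolving the query before the run shows the exact list of targets an experiment would touch, which is the point of being specific about scope.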

Watch experiments run in real-time and validate monitoring alerts

When you start an experiment, you can watch it run in real time as each step executes and then review a summary of your system’s behavior. If your target is a Kubernetes cluster, for example, you’ll see the Kubernetes event log, which shows each change and the results of health checks.

You can also watch to see if your observability tool is raising an alert when expected. Just install the relevant extension and view these real-time events in the platform.
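
Checking whether an alert was raised when expected comes down to comparing alert timestamps with the experiment's time window. The following is a minimal sketch under stated assumptions: the function alert_fired_in_window, the grace period, and the sample timestamps are hypothetical and not tied to any specific observability tool or extension.

```python
from datetime import datetime, timedelta


def alert_fired_in_window(alert_times, start, end, grace=timedelta(0)):
    """True if any alert timestamp falls between `start` and `end + grace`.

    `grace` allows for alerts that arrive shortly after the experiment ends.
    """
    return any(start <= fired <= end + grace for fired in alert_times)


# Hypothetical experiment window of five minutes.
start = datetime(2024, 1, 1, 12, 0, 0)
end = start + timedelta(minutes=5)

# Hypothetical alert timestamps exported from a monitoring tool.
alerts = [start + timedelta(minutes=6)]

strict = alert_fired_in_window(alerts, start, end)
lenient = alert_fired_in_window(alerts, start, end, grace=timedelta(minutes=2))
```

With these sample values strict is False and lenient is True: the alert arrived one minute after the experiment ended, so it only counts when a grace period is allowed.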

Use a library of 200+ open source actions & templates

Create reliability tests across your tech stack with a wide range of pre-built actions and templates. View the full library in the Reliability Hub.

Explore More Actions

Browse open source actions that you can easily add to experiments.

Explore More Templates

Browse the full list of open source experiment templates in the Reliability Hub.

Missing an action that would unlock a useful experiment?

Our open source extension framework makes it easy to add custom components to the platform. Build your own custom actions using our language-agnostic ActionKit and create any experiment that would be useful for your organization.

Schedule experiments or automate tests with the API and CLI

You can run experiments manually, on a schedule, or with automation. Many teams incorporate experiments into their CI/CD workflow so they can continually re-run them and ensure that new deployments meet a certain reliability standard.

With the API and CLI, it’s easy to incorporate experiments into your development lifecycle on your terms.
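
A CI/CD reliability gate can be reduced to one decision: did every experiment run finish in a passing state? The sketch below deliberately makes no API or CLI calls, since those details are not given here. The state names ("completed", "failed"), the experiment names, and the function ci_exit_code are assumptions for illustration only.

```python
# Hypothetical set of run states that count as a pass.
PASSING_STATES = {"completed"}


def ci_exit_code(run_states):
    """Map experiment run states to a process exit code for a CI job.

    `run_states` is a dict of experiment name to final state string.
    Returns 0 when every run passed, otherwise 1.
    """
    if not run_states:
        return 1  # no experiments ran: fail rather than pass silently
    all_passed = all(state in PASSING_STATES for state in run_states.values())
    return 0 if all_passed else 1


# Hypothetical results gathered after a deployment to staging.
results = {"pod-restart": "completed", "db-failover": "failed"}
exit_code = ci_exit_code(results)
```

With the sample results above exit_code is 1, so a pipeline step that exits with this value would block the deployment until the failing experiment is investigated.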

Track your progress with experiment and usage reports

As you run experiments across teams, you can track your progress with built-in reports. See what types of attacks are being used most often, count experiment runs, and see how many issues you have found and fixed.

Browse Actions & Templates in the Reliability Hub

See what types of actions, targets, and templates are waiting for you and your team in our open source library.