About

How a conference workshop led to a community-built benchmark effort.

This effort began at the GWFreeride conference in Sexten, Italy (26–30 January 2026). Between talks and discussions on machine learning and gravitational waves, conversations in the spa, and, of course, skiing, the week made clear that the successful use of ML methods, and simulation-based inference (SBI) in particular, is growing fast in gravitational-wave science.

These new opportunities come with challenges. Two came up repeatedly:

There are no strong, standardised ways to compare different machine-learning methods on equal footing.
Evaluating whether a method is ready for production-style use is tedious and often ad hoc.

In response, a small group met over several workshop sessions during the conference and laid the groundwork for a benchmark, covering target audiences, goals, metrics, methods, and more.

After the conference

The work did not stop there: others joined, and momentum kept building. We are now an international team of machine-learning and gravitational-wave scientists, organised into several working groups that meet regularly. The goal is to ship a first version of the benchmark as soon as possible.

Benchmark definitions, evaluation metrics, and submission instructions are all on the pages linked below. If you have questions about the benchmark or are interested in contributing, don't hesitate to get in touch.

On this site

Benchmarks

Levels, tasks, and how ground-based work connects to longer-term LISA / PTA ideas.

Evaluation

Which metrics we use and why — accuracy, calibration, and distributional checks.

Submit

Blind evaluation flow: what to upload, how results are returned, and leaderboards.

Elsewhere

Code & organisation

Open-source repos and datasets under the gwbenchmark organisation.

GWFreeride workshop

Conference site (Sexten, Italy — programme and context).

Contact

Get in touch

For questions about the benchmark, contributing, or joining a working group, please drop James an email.

James Alvey