Submit your method

Choose a benchmark, download its task package, then upload posterior samples for scoring.

Work in progress

Submission details are not final yet

Exact file formats, manifests, and the upload path are still being built. When they are ready, the definitive instructions will appear on this page. Datasets and automated evaluation are planned to live on Hugging Face; the same links and summaries will be mirrored here.

Evaluation metrics · Benchmark definitions

How it works

01 Choose a benchmark

On the Benchmarks page, pick the track and level you want to enter (for now: LVK Level 0 or Level 1). Once the release is published, download it from Hugging Face: strain, PSD, metadata, simulator, and any reference materials for that level.

02 Run your inference

Train or configure your method using the released simulations and rules for that level, then run it on every test observation and collect posterior samples in the task parameterisation.

03 Package to the specification

Arrange samples and sidecar metadata exactly as the release notes describe — layout, naming, dtypes, and parameter ordering will be validated automatically once the checker ships.

04 Upload

Upload your posterior files through the channel we announce, together with any required companion fields (team or method id, seeds, short run notes). You will receive scores against reference posteriors and can opt in to the public leaderboard.

Downloading the data

We expect to host datasets on Hugging Face. The on-disk layout is still being finalised; the intent is roughly as follows.

Strain as NumPy NPZ files in the frequency domain, together with the detector PSD and the frequency array.
Metadata (event identifiers, splits, and related columns) as Parquet.
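
Until the layout is final, the following is only a rough sketch of how a single observation might be read. The key names strain, psd, and frequencies, the file name, and the one-detector-per-file shape check are all assumptions made for illustration, not the published schema. A tiny synthetic file is written first so the example runs without the real release.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_observation(npz_path):
    """Load one frequency-domain observation from an NPZ file.

    The key names below are hypothetical placeholders, not the final schema.
    """
    with np.load(npz_path) as data:
        strain = data["strain"]            # complex frequency-domain strain
        psd = data["psd"]                  # detector PSD on the same grid
        frequencies = data["frequencies"]  # frequency array
    # Assumption: one detector per file, so all three arrays share a shape.
    if not (strain.shape == psd.shape == frequencies.shape):
        raise ValueError("strain, psd and frequencies must share one shape")
    return strain, psd, frequencies


# Build a small synthetic file standing in for a released observation.
rng = np.random.default_rng(0)
frequencies = np.linspace(20.0, 1024.0, 128)
psd = np.full_like(frequencies, 1e-46)
strain = (rng.normal(size=128) + 1j * rng.normal(size=128)) * 1e-23

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "observation_0000.npz"  # hypothetical file name
    np.savez(path, strain=strain, psd=psd, frequencies=frequencies)
    loaded_strain, loaded_psd, loaded_frequencies = load_observation(path)

print(loaded_strain.dtype, loaded_strain.shape)
```

The Parquet metadata could be read in a similar spirit with pandas.read_parquet, which needs a Parquet engine such as pyarrow installed; the column names will only be known once the package ships.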

You will submit posterior samples in a prescribed layout, plus any small sidecar fields the release asks for. Exact schemas will ship with the first public package.
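
As a placeholder until the schemas exist, one way to keep samples and sidecar fields together is sketched here. The function write_submission, the file naming, the NPZ keys, the parameter names, and the sidecar field names are all hypothetical; the real layout may differ in every one of these respects.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def write_submission(out_dir, observation_id, samples, parameter_names, sidecar):
    """Write one posterior sample file plus a JSON sidecar (hypothetical layout)."""
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim != 2 or samples.shape[1] != len(parameter_names):
        raise ValueError("samples must have shape (n_samples, n_parameters)")
    if not np.all(np.isfinite(samples)):
        raise ValueError("samples contain NaN or infinite values")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Store the column order next to the samples so it cannot drift apart.
    sample_path = out_dir / f"{observation_id}.npz"
    np.savez(sample_path, samples=samples, parameter_names=np.array(parameter_names))

    sidecar_path = out_dir / f"{observation_id}.json"
    sidecar_path.write_text(json.dumps(sidecar, indent=2))
    return sample_path, sidecar_path


# Toy data: placeholder parameter names and random samples.
names = ["param_0", "param_1", "param_2", "param_3", "param_4"]
samples = np.random.default_rng(1).normal(size=(1000, len(names)))
sidecar = {"method_id": "example-method", "seed": 1, "notes": "toy run"}

with tempfile.TemporaryDirectory() as tmp:
    sample_path, sidecar_path = write_submission(
        tmp, "observation_0000", samples, names, sidecar
    )
    # Read everything back to confirm the round trip.
    with np.load(sample_path) as data:
        roundtrip_shape = data["samples"].shape
        roundtrip_names = data["parameter_names"].tolist()
    roundtrip_sidecar = json.loads(sidecar_path.read_text())

print(roundtrip_shape, roundtrip_names, roundtrip_sidecar["method_id"])
```

Checking shape and finiteness before writing is a design choice meant to catch problems locally, ahead of whatever the automated checker ends up enforcing.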

LVK levels

LVK Level 0 and Level 1 share the same BBH setting and on-disk data layout; only the parameter dimensionality differs. See the Benchmarks page for physics and priors. LISA and PTA are separate roadmap tracks and are not on this submission path yet.

Level 0 — fixed extrinsic (bbh-pe-l0)

Five intrinsic parameters; extrinsic quantities fixed by the benchmark. Full column ordering, splits, and evaluation rules will be documented in the release.

Level 1 — full parameter space (bbh-pe-l1)

Same data as Level 0; infer the full eleven-dimensional parameter vector. Task details will be spelled out alongside the Level 0 package.
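
Since the two levels differ only in dimensionality, a quick local shape check can catch a mismatched submission early. This is a minimal sketch: the dimensions 5 and 11 come from the descriptions above, while the helper check_dimensionality and the (n_samples, n_parameters) array convention are assumptions, not part of any released tooling.

```python
import numpy as np

# Dimensionalities as stated on this page; parameter names and ordering
# are not yet published, so only the column count is checked here.
EXPECTED_DIMENSIONS = {"bbh-pe-l0": 5, "bbh-pe-l1": 11}


def check_dimensionality(task_id, samples):
    """Return True if samples has shape (n_samples, expected dimension)."""
    if task_id not in EXPECTED_DIMENSIONS:
        raise KeyError(f"unknown task id: {task_id}")
    samples = np.asarray(samples)
    return samples.ndim == 2 and samples.shape[1] == EXPECTED_DIMENSIONS[task_id]


l0_ok = check_dimensionality("bbh-pe-l0", np.zeros((100, 5)))
l1_bad = check_dimensionality("bbh-pe-l1", np.zeros((100, 5)))
print(l0_ok, l1_bad)  # True False
```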

Higher benchmark levels may ask for more than posterior files alone — for example runnable simulator code, pinned dependencies, or Docker images so results can be reproduced under stricter pipeline rules. For Level 0 and Level 1 we intend to keep the barrier low: posteriors plus light metadata should be enough.

Questions while the pipeline is still moving: use the contact options on the About page.