17 May 2026·8 min read

Eight days before the WHO. AqtaBio's first prospective Ebola signal, and what it does and doesn't prove.

Tags: AqtaBio, Pre-etiologic surveillance, WHO PHEIC, Ebola, XGBoost, Public ledger

Tile heatmap showing the Congo Basin biome cluster from the 2026-05-09 W19 commit, with a red hot-zone surrounded by orange and green tiles, eight days before the WHO Bundibugyo PHEIC declaration. Biome-level signal: rank-1 tile in CAF (no PHEIC); rank-4 country DRC is where the PHEIC was declared.

TL;DR: Biome-correct, country-uncertain. On 9 May 2026 the inaugural entry in AqtaBio's append-only public ledger listed all five top ebola spillover risk tiles in the Congo Basin biome. Eight days later the WHO declared a Public Health Emergency of International Concern for Bundibugyo Ebola in DR Congo and Uganda. The rank-1 tile (CAF, 0.999) shares the biome but has no declared outbreak; DR Congo, where the PHEIC was declared, sat at rank 4 (0.732). This is the system's first prospective signal to precede a WHO declaration. It is commit #1, not #100. All prior validation was retrospective per-pathogen backtests; the cross-pathogen aggregate is a Q3 2026 medRxiv deliverable. This post is the honest version of all of that.

What happened, in two timestamps

2026-05-09 02:09:04 UTC. The first scheduled run of AqtaBio's public commitment workflow pulled the top risk tiles from the live MCP server and wrote commitments/2026-W19.json to the public research mirror, the inaugural entry in the append-only ledger. The top five ebola tiles in that file were all in the Congo Basin biome. The ranking by country was: CAF rank 1 (0.999), COG rank 2 (0.983), CAF rank 3 (0.967), DRC rank 4 (0.732), COG rank 5 (0.639). The top tile's uncertainty band was 0.001; both p10 and p90 rounded to 1.0. The model was not hedging on the biome. It was less confident on which country inside the biome.
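The "biome-correct, country-uncertain" reading can be checked directly from the five tiles quoted above. The snippet below is a minimal sketch: the list is transcribed from this post rather than parsed from commitments/2026-W19.json, and the helper name `best_rank_by_country` is hypothetical.

```python
# Top-five ebola tiles in rank order, transcribed from the post
# (country label, risk score). Not parsed from the ledger file itself.
W19_EBOLA_TOP5 = [
    ("CAF", 0.999),
    ("COG", 0.983),
    ("CAF", 0.967),
    ("DRC", 0.732),
    ("COG", 0.639),
]


def best_rank_by_country(tiles):
    """Map each country to (best 1-based rank, score at that rank)."""
    best = {}
    for rank, (country, score) in enumerate(tiles, start=1):
        # setdefault keeps the first (highest-ranked) tile per country.
        best.setdefault(country, (rank, score))
    return best


ranks = best_rank_by_country(W19_EBOLA_TOP5)
for country, (rank, score) in ranks.items():
    print(f"{country}: best rank {rank}, score {score}")
```

Three countries share the top five, and the country where the PHEIC was declared first appears at rank 4.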

2026-05-17 (today). The WHO declared a Public Health Emergency of International Concern for an ongoing Bundibugyo Ebola outbreak in DR Congo's Ituri province, with confirmed cross-border cases in Uganda. The reported figures: 246 suspected cases, 80 reported deaths, eight laboratory-confirmed, one case in Kinshasa, and one in M23-controlled Goma. Bundibugyo is one of the three ebolavirus species known to have caused large human outbreaks (alongside Zaire and Sudan), and it has no approved drugs or vaccines.

Eight days between the two events. The GitHub commit timestamp on W19 is independently verifiable by anyone with a browser. There is no backdating mechanism that survives the public Git history. But there is also no second commit to point at yet. W20 has not shipped (it ships Monday). Calling this a “weekly cadence” would require shipping W20, W21, W22 in order first. This post does not.
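The two timestamps can be sanity-checked with standard-library date arithmetic. This is a small sketch, assuming the lead time is counted in calendar days (the post gives a commit time but only a date for the declaration); the variable names are illustrative.

```python
from datetime import datetime, timezone

# Commit timestamp as stated in the post.
commit_ts = datetime(2026, 5, 9, 2, 9, 4, tzinfo=timezone.utc)
# Only the calendar date of the declaration is given, so compare dates.
declaration_day = datetime(2026, 5, 17, tzinfo=timezone.utc).date()

lead_days = (declaration_day - commit_ts.date()).days

# The ledger filename uses the ISO week of the commit date.
iso_year, iso_week, _ = commit_ts.isocalendar()
ledger_name = f"commitments/{iso_year}-W{iso_week:02d}.json"

print(lead_days, ledger_name)
```

The week number in the filename comes from the ISO calendar, which is why the first entry is W19 rather than W01.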

How the model made the call

What the model is actually measuring.

AqtaBio is not predicting where ebola will happen. It is predicting where the underlying conditions for spillover stack up: proximity to past spillover events, ecotone shift (the forest-edge zone where bats are pushed closer to villages when their normal range is disrupted), 12-month temperature anomaly, deforestation rate over three years, and armed-conflict density within 50 km (which proxies surveillance and healthcare-system collapse). When those features co-occur in a tile, the tile lights up regardless of whether an outbreak is actually under way. The model does not read case reports. It is asking a different question.

AqtaBio's production scorer is a weighted ensemble of three classifiers (XGBoost at 0.50, Random Forest at 0.35, Logistic Regression at 0.15); the ledger's “AqtaBio XGBoost ensemble” label (recorded as model.name in the W19 JSON, version v0.1.0) is the public shorthand for that, because XGBoost is the dominant scorer and the only component that drives the SHAP attributions below. It is trained on 25 historical zoonotic spillover events between 2003 and 2024 and scores 25 km tiles at monthly resolution. It does not read news feeds. It does not read case reports. It reads ecological, climatological, and structural features and asks where are the conditions that have historically preceded a spillover.
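For readers who want the ensemble arithmetic spelled out, here is a minimal sketch. It assumes the three components are combined as a weighted average of their predicted probabilities; the post states the weights but not the combination rule, so that is an assumption, and the function name and example inputs are hypothetical.

```python
# Weights as stated in the post.
WEIGHTS = {
    "xgboost": 0.50,
    "random_forest": 0.35,
    "logistic_regression": 0.15,
}


def ensemble_score(component_probs, weights=WEIGHTS):
    """Weighted average of per-model probabilities (assumed combination rule)."""
    if set(component_probs) != set(weights):
        raise ValueError("component_probs must have exactly the weighted model names")
    for name, p in component_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} probability {p} is outside [0, 1]")
    total = sum(weights.values())
    return sum(weights[name] * component_probs[name] for name in weights) / total


# Hypothetical component outputs for one tile, for illustration only.
example = {"xgboost": 0.8, "random_forest": 0.7, "logistic_regression": 0.6}
score = ensemble_score(example)
print(round(score, 3))
```

With these weights the XGBoost component contributes half of the final score, which is consistent with the post calling it the dominant scorer.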

For AF-025-10018, the rank-4 tile sitting inside COD (DR Congo, the country where the PHEIC was declared), the W19 ledger names three top SHAP drivers; their absolute magnitudes from a live MCP query on 2026-05-17 (re-fetchable in one curl) are:

The tile sits close to historical ebola spillover sites in the Congo Basin forest belt. Forest is converting to bushmeat-hunting territory faster than baseline. The wider biotic transition index registers compounding ecosystem disruption: land-use change, habitat fragmentation, edge effects between agricultural and forest land. Three risks compounding in eastern DRC, where the M23 / FARDC / ADF theatre leaves conflict-disrupted health services with little margin for any of them to slip past surveillance.

The Congo Basin signal is not ebola-only. In the same W19 commit, COD appears in the top-5 tiles for every one of the five pathogens AqtaBio currently scores (ebola, h5n1, cchfv, wnv, sea-cov), the only country in the W19 cohort to do so. That structural cross-pathogen pattern is the lead claim of the forthcoming medRxiv preprint, not this post; the point here is just that the W19 ebola call sits inside a broader regional risk signature, not as an isolated guess.

The model does not deserve credit for noticing those signals. Any competent epidemiologist looking at the same features would say the same thing. What the model does is run that calculation against every 25 km tile across the Congo Basin every week, write the result to a public ledger you cannot retroactively change, and let a public-health system decide whether to pre-position resources before the WHO machinery finishes ratifying that yes, this is now an emergency.

What this is not

The biggest caveat first: the model put DR Congo at rank 4, not rank 1. The 0.999 top tile sits in Central African Republic. CAF shares the Bundibugyo reservoir ecology with DRC and is roughly 200 to 500 km from Bunia, but no PHEIC has been declared in CAF. A sharp epidemiologist reading W19 will see this immediately and I owe it to them to put it at the top of the page, not at the bottom. The honest statement is “biome-correct”, not “country-correct”.

That said, the rank-4 DRC tile carried real conviction. The W19 commit recorded a risk score of 0.732 with p10 of 0.651, p90 of 0.812; even the lower bound of the model's uncertainty band sat above 0.65, which is in the high-risk bracket on AqtaBio's published tier table. The model was hedging on which country inside the biome would convert, not pulling DRC out of the danger zone.

“No claim that an outbreak WILL occur in the listed tiles; the commitment is that AqtaBio considered these tiles highest-risk for the named pathogen during the named window.”
verbatim from honest_caveats[2] in the 2026-W19 ledger entry, committed 2026-05-09 02:09:04 UTC

One more thing worth saying out loud. The W19 commit also flagged H5N1, CCHFV, and a Southeast Asian coronavirus lineage at near-maximum confidence in the same Congo Basin biome during the same week (rank-1 risk scores 0.999, 0.993, 0.998 with uncertainty bands all under 0.005); WNV ranked the biome at 0.940 with a wider band. Those four signals remain open. The single ebola call is what got the headline because the PHEIC happened; the multi-pathogen convergence is the more interesting scientific claim, and the lead claim of the forthcoming medRxiv preprint. A biome lit up across five independent disease models in the same week is what continuous pre-etiologic surveillance is supposed to look like.

Limits of this result

No province. 25 km tiles identify ecological zones, not administrative divisions. Naming Ituri needs a 5 km resolution upgrade or an admin-boundary overlay; both are on the v0.2 roadmap.
No species. “Ebola” is scored as a single class. Bundibugyo, Zaire, and Sudan ebolaviruses behave differently and have different vaccine coverage; per-species labels on the existing 25-event cohort are days of work, a v0.2 item.
Validation is mostly retrospective. Per-pathogen backtests hit AUROC up to 0.975 on ebola with held-out time-aware splits. The cross-pathogen aggregate (AUCPR, lead-time distribution, miss list) is the Q3 2026 medRxiv deliverable. Today is the first prospective signal to precede a WHO declaration. That is the precise framing, not “I predicted the outbreak”.
Cadence is not yet weekly. This is commit #1, not #100. The 2026-W19 filename is the ISO week, not the 19th entry. The workflow runs Mondays 09:00 UTC; “weekly cadence” becomes defensible after W20, W21, W22 ship in order.
Not fully reproducible end-to-end. The model code, feature list, MCP source, and W19 commit are public; the ingest pipeline and trained artefacts (models/{pathogen}/model.ubj) sit behind the pilot-engagement boundary. A normal split for AI shipping into regulated industries, but worth saying out loud.
No public-health responder acted on a live alert. The 8-day lead is currently counterfactual. Lead time only matters when someone uses it; that gap is what a pilot partnership exists to close.

Verify the claim in four ways, none of which require asking me

AqtaBio's prior-knowledge proof is intentionally not custodial. Four paths, any of which is sufficient:

Read the public ledger entry on GitHub. commitments/2026-W19.json. Scroll to pathogen: ebola and read the top three tiles. GitHub's commit timestamp on that file is the third-party witness.
Read the Internet Archive snapshot, if you do not want to trust github.com. The Wayback Machine has the W19 file captured at 2026-05-17 20:23:50 UTC, byte-identical to the GitHub blob (SHA-256 48436bd48b0314319abb8dbf61e0434922518171343120b0b110e83718150458). Two independent third parties now witness the same bytes.
Call the live MCP server yourself. The endpoint at https://qjtqgvpd9s.eu-west-1.awsapprunner.com/mcp is publicly callable with no auth, returns the same tile rankings the ledger captured, and has 19 tools you can poke at. The curl recipe is on aqtabio.org/preview.
Open the live banner on the dashboard. aqtabio.org/preview shows the current top three ebola tiles next to a link back to the 2026-W19 ledger entry, refreshed from the live MCP every five minutes. The “tiles monitored” count on that banner is the ebola-only slice of the 578 tiles seeded across all eight pathogens, not a different number; same pipeline, different filter.
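The byte-identity check in the second path can be reproduced locally. This is a sketch, assuming you have already downloaded the W19 file from either GitHub or the Wayback Machine to a local path of your choosing; the function names are illustrative, and the digest is the one quoted above.

```python
import hashlib

# SHA-256 digest of the W19 file as published in this post.
PUBLISHED_SHA256 = "48436bd48b0314319abb8dbf61e0434922518171343120b0b110e83718150458"


def sha256_of_file(path, chunk_size=65536):
    """Stream a file from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected=PUBLISHED_SHA256):
    """True if the local file's bytes hash to the published digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_published("2026-W19.json")` on a copy from each source should return the same answer; hash the raw bytes as downloaded, since any re-serialisation or line-ending conversion changes the digest.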

Why the ledger matters more than the prediction

Proving you knew first is the bottleneck for every “the model knew first” claim. Without a public, append-only commitment mechanism the claim is unfalsifiable, indistinguishable from motivated retrospective reading. Tweets do not count. Press releases do not count. Slack screenshots do not count. The mechanism that counts is a third-party-witnessed timestamp on the prediction itself.

That mechanism is independent of biosurveillance. Any AI system that claims to forecast (fraud detection that “saw the breach coming”, credit risk that “flagged the borrower”, climate models, market models, agentic alerts) can adopt the same pattern: scheduled signed commitment to a public ledger, no retroactive editing, anyone can verify. It is one of the cheapest and most useful pieces of AI governance plumbing in the regulated-industry toolbox. I use it for the spillover model. It would work equally well for the model you ship.
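The core of the pattern fits in a few lines. The sketch below is not AqtaBio's workflow: it assumes a plain hash commitment over a canonical JSON serialisation, with hypothetical field names, and leaves out signing and publication. The hash only becomes evidence once the entry is pushed somewhere a third party timestamps it.

```python
import hashlib
import json


def canonical_bytes(record):
    """Serialise with sorted keys and fixed separators so equal records hash equally."""
    return json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def commit(prediction, committed_at):
    """Build a ledger entry: the record plus the SHA-256 of its canonical form."""
    record = {"committed_at": committed_at, "prediction": prediction}
    return {"record": record, "sha256": hashlib.sha256(canonical_bytes(record)).hexdigest()}


def verify(entry):
    """Recompute the digest; any later edit to the record makes this False."""
    return hashlib.sha256(canonical_bytes(entry["record"])).hexdigest() == entry["sha256"]


# Illustrative entry; the field names are hypothetical.
entry = commit({"pathogen": "ebola", "top_tile_country": "CAF"}, "2026-05-09T02:09:04Z")
print(verify(entry))
```

Canonical serialisation matters because two JSON files with the same content but different key order or whitespace would otherwise produce different digests.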

What I am doing next

Three things, in order:

GPG-sign the ledger commits. The 2026-W19 commit carries GitHub's commit timestamp but no GPG signature today. I have the signing key generated and the workflow already configured; wiring the three GitHub Actions secrets gets the “Verified” badge on the commit and removes the last “GitHub could in theory…” objection.
Add ebolavirus species and a finer geographic feature set so the model can distinguish Bundibugyo from Zaire and name provinces rather than ecoregions. v0.2 milestone.
Pilot the alert path with a named public-health partner. The prior-knowledge mechanism is now proven. The alert path is what is missing. One health ministry, one regional public-health body, or one academic group to receive the signal and act on it. If that is you, partnerships@aqtabio.org.

The hard part is not the XGBoost. It is having the discipline to commit the output publicly before you know if you are right, and then being honest when the country-level ranking was off. Most systems do not survive that test because they never take it.


References

BBC News (2026-05-17). WHO declares Ebola outbreak in DR Congo an international emergency.
World Health Organization. Disease Outbreak News, Ebola virus disease.
Aqta-ai/aqtabio-research (public mirror). commitments/2026-W19.json, committed 2026-05-09 02:09:04 UTC.
AqtaBio MCP server (public, no auth).
AqtaBio /preview. Verify the live signal in 30 seconds.
AqtaBio methodology and model card.

Anya Chueayen

Founder of Aqta. Before this, I worked on integrity at social media platforms, the unglamorous side of AI where human behaviour, edge cases, and ethics collide at scale. That work convinced me that responsible AI needs infrastructure, not just good intentions. Based in Dublin, closely watching how regulation is reshaping what we build and how.

