AI for Science Datasets: Request for Proposals
Following the publication of the AI for Science Strategy by the United Kingdom’s Department for Science, Innovation and Technology (DSIT), Renaissance Philanthropy is releasing this call for AI for Science dataset proposals in collaboration with DSIT. This initiative is driven by the shared vision that high-quality, targeted data is critical for realising the transformative potential of AI in scientific discovery.
The history of AI for Science is built on a foundation of transformative datasets. The Protein Data Bank, an openly accessible repository, was the public dataset that enabled the Nobel Prize-winning AlphaFold and RFdiffusion models. More recent efforts, such as the OpenBind consortium, promise to generate large, high-quality datasets on protein–small molecule interactions, accelerating the discovery of life-saving therapies.
Context
In this call, we are seeking proposals for the creation, curation, and scale-up of valuable scientific datasets that will power the next generation of AI for Science breakthroughs.
Proposals should target datasets with the potential to accelerate progress in the following five priority areas:
Engineering biology
Fusion energy
Materials science
Medical research
Quantum technologies
This request for proposals will operate in two phases:
Phase 1: Participants will be invited to submit 4-page proposals, by 10 January 2026, following the framework outlined below. Up to 25 proposals will be awarded £500 prizes.
Phase 2: Phase 1 winners will be provided with support to help build coalitions around their ideas and develop more established plans. Up to 10 proposals will be awarded £5,000 prizes at the end of Phase 2.
Winners of the RFP will receive support in addition to prize funding, including help in identifying collaborators and funders.
Call for Proposals
Proposals should be submitted as PDFs using this link by January 10, 2026.
Proposals should be 4 pages long, with a font size of at least 11 points and 1-inch margins.
Proposals should have at least one collaboration with a UK-based organisation or individual.
Reach out to datasets@renphil.org with any questions.
Each proposal should address the following questions:
What problem do you want to solve?
Present the case for why this specific problem is a key bottleneck to the broader field or priority area. Why is this problem particularly amenable to being addressed by AI and high-quality data? Is current data enabling meaningful progress with machine learning? Is there research suggesting that this dataset would enable significant scientific progress? Why now?
What kind of data is needed?
What are the existing datasets in this domain? How comprehensive, accessible, and actively maintained are they? What are the systemic, structural, or technical reasons this dataset has not been created already? Identify the specific quality and quantity of data needed to train models to address this problem. What quality control and calibration steps need to be taken? What level of data granularity will be required for successful AI model development?
What capabilities are needed to generate the data and who can do it?
Identify the specific capabilities, equipment, and materials required to generate the dataset. Propose potential performers who would be well positioned to execute on this dataset proposal. What is the projected cost of generating this dataset? If costs are traditionally prohibitive, are there new technologies or infrastructure (e.g. high-throughput methods) that could dramatically reduce costs?
What are one or more technical milestones that should be met before devoting large amounts of funding to scaling the size of the dataset?
What “proof of concept” results would be needed to justify increasing the size of the dataset? What kind of collaboration between computational and experimental researchers is needed to validate the utility of the dataset for some critical prediction or design task?
How will the data infrastructure be maintained and continued?
Can this dataset be integrated into an existing, high-quality database (e.g. PDB, UniProt)? If not, what infrastructure would need to be developed for long-term hosting, discoverability, and integration with existing tools?
What model evaluation and benchmarking is needed to increase the utility of the dataset?
Consider standard tasks and metrics for evaluating ML models on the dataset, baseline models and reproducible evaluation pipelines, comparisons with existing datasets where appropriate, and CASP-style competitions.
What breakthroughs or downstream capabilities would this dataset unlock?
Describe the scientific, technological, or translational impacts that would become possible if this dataset existed. What new types of models and capabilities (including prediction, design, and inverse design) could this dataset enable the field to build? How might this accelerate discovery, reduce cost, or enable inventions that are currently impossible? How might the proposed dataset benefit the UK in particular?
Phase 1 Proposals
FAQs
-
This competition is run by Renaissance Philanthropy, following the publication of the UK Department for Science, Innovation and Technology’s (DSIT) AI for Science Strategy.
-
The prizes are being awarded by Renaissance Philanthropy.
-
UK DSIT provides strategic alignment and priority-area input.
-
No.
-
The competition is open globally to researchers, labs, companies, consortia, and independent teams. Each proposal must include at least one collaboration with a UK-based organisation or individual.
-
Phase 1: Up to 25 awardees, each receiving £250
Phase 2: Up to 10 awardees, each receiving £5,000
-
Proposals should be submitted as a PDF by January 10 to datasets@renphil.org.
-
No. Early-stage ideas are welcome. However, proposals should identify:
Who could perform the data generation
What expertise or capabilities are required
Any existing collaborations or relationships
This helps reviewers assess feasibility.
-
No, but reviewers will assess the feasibility of the dataset being created.
-
Yes. Winners will receive support in identifying collaborators, funders, and building coalitions around their ideas.