Request For Proposals

AI for Science Datasets

Following the publication of the UK Department for Science, Innovation and Technology’s (DSIT) AI for Science Strategy, Renaissance Philanthropy is releasing this call for AI for Science dataset proposals in collaboration with DSIT. This initiative is driven by the shared vision that high-quality, targeted data is critical for realising the transformative potential of AI in scientific discovery.

Context

The history of AI for Science is built on a foundation of transformative datasets. The Protein Data Bank, an openly accessible repository, was the public dataset that enabled the Nobel Prize-winning AlphaFold and RFdiffusion models. More recent efforts, such as the OpenBind consortium, promise to generate large, high-quality datasets on protein–small molecule interactions, accelerating the discovery of life-saving therapies.

Call for Proposals

In this call, we are seeking proposals for the creation, curation, and scale-up of valuable scientific datasets that will power the next generation of AI for Science breakthroughs.

We are seeking proposals for datasets with the potential to accelerate progress in the following five priority areas:

Engineering biology
Fusion energy
Materials science
Medical research
Quantum technologies

This request for proposals will operate in two phases:

Phase 1: Participants will be invited to submit 4-page proposals, by 17 January 2026, following the framework outlined below. Up to 25 proposals will be awarded £500 honoraria in recognition of time spent developing and submitting high-quality dataset concepts.
Phase 2: Phase 1 winners will be provided with support to help refine their dataset concepts, build coalitions, and develop credible plans for future funding and execution. Up to 10 proposals will be awarded £5,000 honoraria at the end of Phase 2 to support this refinement work.

Note on scope and funding: The prizes awarded through this competition are honoraria, intended to recognise time spent on idea development, scoping, and early validation. They are not intended to fund dataset build-out, storage, or long-term infrastructure costs. The purpose of the competition is to identify and de-risk high-impact dataset concepts that could then be positioned for substantially larger follow-on funding from government, philanthropy, or other partners.

Phase 1 Proposals

Proposals should be submitted as PDFs using this link by 17 January 2026.

Proposals should be 4 pages and use at least 11-point font and 1-inch margins.
Proposals should have at least one collaboration with a UK-based organisation or individual.
Reach out to datasets@renphil.org with any questions.

Each proposal should address the following questions:

What problem do you want to solve?
Present the case for why this specific problem is a key bottleneck to the broader field or priority area. Why is this problem particularly amenable to being addressed by AI and high-quality data? Is current data enabling meaningful progress with machine learning? Is there research suggesting that this dataset would enable significant scientific progress? Why now?
What kind of data is needed?
What are the existing datasets in this domain? How comprehensive, accessible, and actively maintained are they? What are the systemic, structural, or technical reasons this dataset has not been created already? Identify the specific quality and quantity of data needed to train models to address this problem. What quality control and calibration measures need to be taken? What level of data granularity will be required for successful AI model development?
What capabilities are needed to generate the data and who can do it?
Identify the specific capabilities, equipment, and materials needed to generate the needed dataset. Propose potential performers who would be well positioned to execute on this dataset proposal. What is the projected cost of generating this dataset? If costs are traditionally prohibitive, are there new technologies or infrastructure (e.g. high-throughput methods) that could dramatically reduce costs?
What are one or more technical milestones that should be met before devoting large amounts of funding to scaling the size of the dataset?
What “proof of concept” results would be needed to justify increasing the size of the dataset? What kind of collaboration between computational and experimental researchers is needed to validate the utility of the dataset for some critical prediction or design task?

How will the data infrastructure be maintained and continued?
Can this dataset be integrated into an existing, high-quality database (e.g. PDB, UniProt)? If not, what infrastructure would need to be developed for long-term hosting, discoverability, and integration with existing tools?
What model evaluation and benchmarking is needed to increase the utility of the dataset?
Consider standard tasks and metrics for evaluating ML models on the dataset, baseline models and reproducible evaluation pipelines, comparisons with existing datasets where appropriate, and CASP-style competitions.

What breakthroughs or downstream capabilities would this dataset unlock?
Describe the scientific, technological, or translational impacts that would become possible if this dataset existed. What new types of models and capabilities (including prediction, design, and inverse design) could this dataset enable the field to build? How might this accelerate discovery, reduce cost, or enable inventions that are currently impossible? How might the proposed dataset benefit the UK in particular?

Frequently Asked Questions

Who is running this competition?
This competition is run by Renaissance Philanthropy, following the publication of the UK Department for Science, Innovation and Technology’s (DSIT) AI for Science Strategy.

Who is funding the prizes?
The prizes are being awarded by Renaissance Philanthropy.

What is UK DSIT's role?
UK DSIT provides strategic alignment and priority-area input.

Does winning the RFP guarantee any funding from DSIT?
No. Winning the RFP does not guarantee further funding. However, a central aim of the competition is to identify datasets that DSIT and other funders could realistically consider for follow-on support, and winners will be actively supported in those conversations.

Who can apply?
The competition is open globally to researchers, labs, companies, consortia, and independent teams. Each proposal must include at least one collaboration with a UK-based organisation or individual.

What are the prizes?
Phase 1 awards are £500 honoraria for up to 25 shortlisted proposals.
Phase 2 awards are £5,000 honoraria for up to 10 final proposals.
These awards are intended to recognise time and effort spent on idea development and refinement. They are not intended to cover dataset build-out, storage, or operational costs at this stage.

How do I submit my application?
Proposals should be submitted here by 17 January 2026.

Do I need a full team in place to apply?
No. Early-stage ideas are welcome. However, proposals should identify:
- Who could perform the data generation
- What expertise or capabilities are required
- Any existing collaborations or relationships
This helps reviewers assess feasibility.

Will I need to build and manage the dataset?
No, but reviewers will assess the feasibility of the dataset being created. At this stage, we are not funding dataset development or long-term management. Applicants are asked to describe what dataset should exist, why it matters, and what would be required to build it if sufficient resources were available. One goal of the competition is to help position the strongest dataset ideas for future large-scale funding from DSIT or other partners.

Will winners receive support beyond prize funding?
Yes. Winners will receive support in identifying collaborators and funders and in building coalitions around their ideas.

Do applicants give up ownership of their ideas or IP by applying?
No. Applicants retain ownership of their ideas and any associated intellectual property. Submissions are treated as confidential and shared only with Renaissance Philanthropy, DSIT, and appointed reviewers. No proposal content will be made public without the explicit consent of the submitting team.