There are many forces in policymaking (and in our lives generally) that push us towards the short term. Many of the most important measurements in political life are on extremely tight timelines: election cycles, monthly unemployment reports, even the President's daily intelligence briefing. The pressure to get results — and to show results — on a tight turnaround is incredible.
One of my questions on Statecraft for a while has been: How do you build a machine to get long-term results? Whether it's a new agency or a new initiative, how do you set up a structure to work toward a goal that's 10, or 20, or 50 years away? And how do you protect that structure from short-term political pressures?
Today's interviewee is Sir Rory Collins. Sir Rory has spent a full 20 years building and leading one of the most important scientific resources in the world: the UK Biobank.
The Biobank represents a fascinating case study in long-term thinking. It's a database of half a million British participants whose health is being tracked longitudinally for the next 30 years. The Biobank was established with the knowledge that the upfront work, and the spending required, would only really start to pay off 15 years later. When Sir Rory went in for the 10-year review with funders, they asked what had been achieved so far. He said, “Nothing.”
But today, UK Biobank is paying massive dividends: It's democratized access to population-scale data for researchers worldwide, and it's already yielding amazing insights into the causes of and cures for disease. I wanted to understand how he built the UK Biobank, and, just as importantly, how he managed to sustain it over a long period of time.
We discussed:
How to create long-term value in research
How to recruit half a million research subjects
Why the Biobank deferred so many decisions
How other countries’ prospective studies are learning from the UK Biobank
[Publication note: I refer to Collins several times as “Sir Collins.” I have since been informed by editor Harry Fletcher-Wood that, as a knight, he is “Sir Rory” (whereas, if he were a lord, it would be “Lord Collins”). I regret the error. Fletcher-Wood’s judicious edits to the audio and transcript, as well as his insight into British norms, are greatly appreciated.]
I’m excited to have this conversation. I wanted to start with some basics for those of us on the American side who may not be familiar with the UK Biobank. There are 500,000 people in the Biobank. What has happened to them?
The UK Biobank was set up at the beginning of the century. It initially came out of an idea from the Wellcome Trust charity, which funds a lot of medical research, and the Medical Research Council (MRC), our UK equivalent of the National Institutes of Health [a US government agency funding health research]. The concept, which I think came from a lot of epidemiologists, geneticists, and other health researchers, was to set up a large, what we call a prospective cohort [which follows a group of people over time to see how their health develops].
We recruited half a million men and women who were aged 40–69 between 2006 and 2010. They're from all across the United Kingdom, from different socioeconomic groups, urban, rural, etc., who came along, answered lots of questions about their lifestyle, environment, family, and their medical history. They agreed to provide us with biological samples and physical measurements and, importantly, to allow us to follow their health, which we've been doing over the last 20 years, largely through linkage to the National Health Service health record systems [the national medical system, which all UK residents can access free], but also going back to participants themselves and asking them about different aspects of their health.
Talk to me about the scientific value of the Biobank. What have we learned so far? What are we hoping to learn in the future?
The beauty of studying people in middle age, when they're relatively healthy, is you can find out about their exposures, not just their genetic makeup, but also the way in which they're living, and all the things that predate the development of disease. The big advantage of having studied them then, and then following their health, is we can understand the causal determinants of disease: Not only the genetics, but also lifestyle, environment, the proteins and metabolites in their blood, that subsequently lead to them developing some particular condition.
The important thing that the Medical Research Council and the Wellcome Trust required of us, when we set it up, was to make this data available to researchers around the world for any kind of health-related research that's in the public interest. And that's the consent that these half-million altruistic participants have agreed to. So the data is being used by scientists everywhere to try to understand, why is it that one person gets a particular disease and another doesn't? Is it their genetics? Is it their lifestyle? And what are the pathways through which genes, environment, and lifestyle might lead to getting a disease?
There are thousands of researchers around the world working on the data. Last year alone, there were 5,000 peer-reviewed publications based on UK Biobank. It's unprecedented in terms of the scale of discovery and the range of discoveries that are emerging, all down to the altruism of those half-million participants.
Take us back to the early 2000s, as this idea was floating around the scientific landscape. You're an epidemiologist, and before you began leading the Biobank, you did research on large-scale population sets. How did people think about this challenge before the Biobank existed?
The decision to set up UK Biobank looks prescient now, but at the time it was a risky decision. The great thing about genetics is that your genes don't change. In some respects, the most efficient way of studying the genetic determinants of disease is to take people with a particular disease and compare them with those who don't, because the disease itself won't have changed your genetics. Therefore, you can see the differences in the genetics of someone who has a disease as compared with those who don't. That's how we've learned so much about genetic mutations that are strongly causal of disease.
Epidemiology [the study of how often diseases occur in different groups of people and why] is much harder, because your lifestyle and environment are influenced by disease. You want to study people when they're healthy, find out how they're living, and then follow them for a long period of time to understand what led to disease. For example, you study people’s smoking habits, follow them, and you find that smoking causes lung cancer, many other cancers, cardiovascular disease, respiratory disease, etc. But one of the best ways to get people to stop smoking is for them to have a heart attack: then they stop smoking. The disease has changed their environmental risk factor.
There was a lot of controversy at the beginning of the century as to whether a study like UK Biobank was the right thing to do if you were interested in genetics alone. But we're not: we're interested in how genes and environment and lifestyle interact. For that, you need to set up a large, prospective study. Large, because you need enough people to develop any particular condition in order to be able to have the statistical power to determine the causal factors. Prospective, with long-term follow-up, so that you are actually studying the causal associations.
So it required a kind of a vision of the importance, not just of genetics, but of all of the different drivers of disease, for the MRC and the Wellcome Trust to decide they were going to set this up, with a recognition that for the first 10 or even 15 years, it was unlikely that it would actually produce anything of material value in terms of discoveries. Now, of course, it's a gold mine.
What made you think this was even a feasible project to undertake?
There are a number of epidemiologists who have been involved in setting up large prospective studies. The classic in Britain was the British Doctors study that studied doctors and their smoking habits and has followed them for half a century. So the ability to recruit large numbers of people is certainly there. But when one looked at the studies that had been done, they had largely looked at old risk factors: blood pressure, body mass index, obesity, or extreme thinness, cholesterol levels, where cholesterol had been measured, and the samples hadn't been stored.
So a number of people, around the world, had seen the value in establishing a large prospective study with biological samples stored, and then just waiting until it was possible to assay those samples at that very large scale of hundreds of thousands of individuals, with markers that hadn't been studied so far. We knew how to recruit large numbers of people. In the UK, we had been running really quite large randomized trials, working with the National Health Service to recruit directly into studies. It seemed to us that one could take that approach of working with the NHS to invite people to join the study. And then, when they agreed to take part, to look at how one could build a factory production line, where they would go through answering lots of questions. We used touchscreen technology, which was relatively novel at that time, to get lots of questions answered. [We had] people moving through this production line of answering questions, having physical measurements, and collecting biological samples. If you invited enough people and you had enough funding, then you really could recruit very large numbers, which is what we did.
When you say that epidemiologists had put together very large prospective trials in the past, what's very large in that context?
There had been a number of long-term observational studies of things like blood pressure, blood cholesterol, body mass index, with tens of thousands [of participants]. Perhaps the biggest was the MRFIT screening study in the US of a third of a million American men, which showed this very lovely association between cholesterol level and the risk of coronary artery disease. It showed that throughout the range, higher cholesterol was associated with higher risk, and there really was no normal level whereby lower cholesterol was not likely to be cardioprotective.
These very large prospective studies, either individually or collectively through a meta-analysis, a combination of the results from different studies, showed that if you had hundreds of thousands of people in prospective studies, either individual [studies] or combined, then you could get very clear signals about the strength of the relationship of the risk factors that have been measured and disease. That generated the hypothesis that if we could create studies of hundreds of thousands of people, then we would be able to study many more risk factors, if we stored the samples and then just waited until we could analyze them at scale.
You and other advocates for UK Biobank began to shop the idea around and build support. Did you run into trouble with the long-term nature of the project — that all of the costs are upfront and all of the benefits are out at least a decade?
I know that a number of individual researchers put in grant applications that were not supported. The International Agency for Research on Cancer (IARC) in Lyon set up a study of the dietary determinants of cancer, where they collected biological samples, but it was really a study of studies. They did a study in each country and then IARC brought those together. They were all slightly different, but it did produce a large-scale prospective study. But these were seen as studies that were of value for non-genetic risk factors. When UK Biobank was being proposed by the Medical Research Council and the Wellcome Trust, I think that one of the drivers for it was the opportunity to study the genetic determinants of disease.
Quite rightly, people said, “If you're interested in genetic determinants of disease, you can get quick answers relatively cost-effectively by just studying people with disease and people without.” I think the messaging didn't really get across clearly enough that one wasn't interested solely in genetics, one was interested in all of the different risk factors: genetics, environmental, lifestyle. The beauty [in having all three] is that one can actually unpick the causal aspects of environmental and lifestyle risk factors. With Mendelian randomization, you use genetic changes in, say, body mass index to determine whether body mass index is actually causally related to disease. Having all of these things in one study is incredibly powerful, not only for determining the strength of an association, but determining its causal nature.
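The logic of Mendelian randomization can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a description of how UK Biobank data are actually analyzed: the single genetic variant, the effect sizes, the unmeasured confounder, and the simple ratio estimator are all hypothetical choices made for illustration. Because the variant is assigned at conception, it is unrelated to the confounder, so the ratio of its effect on the outcome to its effect on the exposure recovers the causal effect that a naive comparison overstates.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n=50_000, true_effect=0.3, seed=0):
    """Simulate genotype, exposure, and outcome with an unmeasured confounder.

    All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        # Genotype: 0, 1, or 2 copies of a variant with allele frequency 0.3
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)
        # Unmeasured confounder (e.g. some aspect of lifestyle)
        u = rng.gauss(0, 1)
        # Exposure (e.g. body mass index) depends on genotype and confounder
        xi = 1.0 * gi + u + rng.gauss(0, 1)
        # Outcome depends on exposure (true causal effect) and on the confounder
        yi = true_effect * xi + 2.0 * u + rng.gauss(0, 1)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y

g, x, y = simulate()

# Naive observational slope of outcome on exposure: biased by the confounder
naive = cov(x, y) / cov(x, x)

# Ratio estimate: (gene -> outcome association) / (gene -> exposure association)
mr = cov(g, y) / cov(g, x)

print(f"true causal effect: 0.30")
print(f"naive regression slope: {naive:.2f}")
print(f"Mendelian randomization estimate: {mr:.2f}")
```

In practice, analyses of this kind use many variants and check that the variants affect the outcome only through the exposure; the sketch skips those steps.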
Will you tell us a little bit more about those initial years in the early 2000s when you were building support politically and institutionally for it? Were you going to Whitehall and convincing MPs? What was the day-to-day as you tried to put together a coalition for it?
I wasn't seriously involved in it until around 2005. The decision was made by the Medical Research Council and Wellcome Trust that they wanted to fund a large prospective study. It really came from within those organizations and it's quite difficult to understand exactly why those decisions were made. I think partly they were a result of epidemiologists having argued the case for the value of large prospective studies that could study newer risk factors and the value of having much bigger studies than we'd had before. I think part of it was the excitement about the ability to study genetics at scale. These came together, in my view, to help the funders decide that they wanted to set up a large prospective study that would be able to look at the genetic determinants of risk, but also to look at those in the context of lifestyle and environment. So they made an internal decision to fund this study.
They set up a small group led by Tom Meade, an epidemiologist, to get input from scientists around the UK and beyond around what the scale of that study should be. They looked at power calculations to decide what sort of differences in risk could be detected with studies of 100,000, 200,000. They came up with half a million individuals. The age range was selected on the basis of: young enough that you would be studying people before disease changed the risk factors, but old enough that the study would start to produce results within 10-15 years. So there was always an expectation that this was a long-term commitment.
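A power calculation of the kind described here can be sketched roughly as follows. The numbers are illustrative assumptions, not the figures the planning group used: a condition that 0.2% of unexposed participants develop during follow-up, a risk factor carried by 30% of the cohort, and a relative risk of 1.3. The function name and the normal-approximation test are also choices made for this example.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(n_total, exposed_frac, risk_unexposed,
                          relative_risk, alpha=0.05):
    """Approximate power of a two-sided z-test comparing disease risk
    between exposed and unexposed participants (normal approximation)."""
    n1 = n_total * exposed_frac            # exposed participants
    n0 = n_total - n1                      # unexposed participants
    p0 = risk_unexposed                    # risk without the exposure
    p1 = risk_unexposed * relative_risk    # risk with the exposure
    # Standard error of the difference in observed risks
    se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(p1 - p0) / se
    # Chance the test statistic clears the critical value
    # (the negligible opposite tail is ignored)
    return NormalDist().cdf(z_effect - z_crit)

for n in (100_000, 200_000, 500_000):
    power = power_two_proportions(n, exposed_frac=0.3,
                                  risk_unexposed=0.002, relative_risk=1.3)
    print(f"{n:>7} participants: power = {power:.2f}")
```

Under these assumed inputs, power rises with cohort size because a larger cohort yields more cases of the condition, which is the quantity that drives the ability to detect a modest difference in risk.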
I think only organizations like the Medical Research Council and the Wellcome Trust could make a decision whereby the first 10 years are really just waiting. But I don't think they, or indeed any of us, thought it would become as productive as it has been. The reason it has is access. One of the things that the Wellcome Trust and the MRC decided at the very beginning was that this was not a resource that was being produced by researchers for themselves alone. This was going to be made available as widely as possible, in order that both the financial and the altruistic investment of the participants generated as much knowledge about how to prevent and treat disease as possible. That, I think, has been the transformative aspect of UK Biobank: the model of accessibility.
When you say that you and others didn't expect that long-term value, why not? What changed that made you update?
There was quite a lot of uncertainty about the quality of findings that would emerge if you just made data accessible to researchers around the world. There was also uncertainty about what kind of capacity there would be to turn the samples into data. To give you a sense of that, when it was decided to move forward, it was set up with half a million samples, and the idea was to collect them in multiple different tubes and sub-aliquot them [to divide the samples into additional sub-samples which can be used by different researchers]. In order to retrieve particular samples from particular individuals, it was decided to build an automated archive which was state-of-the-art, to keep the samples at -80 [Celsius; -112 Fahrenheit], but be able to track them, and retrieve them when you wanted to. That archive was built with a single robot, because it was expected that one would pull out samples from, say, 10,000 women who developed breast cancer after 10 years, and 10,000 controls to compare the people who do and don't develop a particular cancer. This is a case-control approach. If what you're interested in is studying a particular disease, that is very cost effective because instead of analyzing all half a million people, you measure the people with the disease you're interested in and some matched controls.
But, after we had recruited all the participants, and we were discussing this with our science advisory board, they said, “If you're trying to build a resource for researchers around the world to use for all kinds of different research, then a strategy where you assay the samples [test them for substances or biological markers] as if you're doing it for a single researcher isn't actually the best approach. The thing to do is to wait until you can assay all of the samples for some particular type of assay.” So, until you can genotype the whole cohort [profile each individual’s DNA], leave [the] samples alone. Until people have developed disease, during the first 10 years when they're relatively healthy, there is no value in assaying the samples. Our watchword was defer: don't do anything now that you could do better later. Rather than doing case-control, better to wait until you can genotype the whole cohort. That turned [the Biobank, rather than being] a star study of a particular disease, into a resource for researchers to do all kinds of different research.
How did you manage to build an institution that has “defer” as a watchword? Because I think many listeners will be surprised and impressed at an institution that was set up with the ability to pass the marshmallow test.
They say that the definition of an intelligent man or woman is one who can hold two opposing views at the same time, don't they? In a way, that was the situation. I think we all want quick wins. But we know there are no quick wins in a prospective study. So it was really just reinforcing the messaging. Every five years the funders say, “What have you done?” The first 10 years, we were building, recruiting, and then starting to turn samples into data. We were linking people to their health records, but it was a very slow buildup of use of the resource. At the ten-year review, I was asked, “What had UK Biobank done, for which it was set up?” It was a one-word answer: “Nothing.” There was a quiet pause at that point, but that was the truth. It was a resource that was in gestation.
I went on to say that it will be the next 5, 10, 15, 20 years where it really starts to deliver on the investment that has been made. The thing that transformed UK Biobank from a UK to an international study was when the UK government offered some funding to get the genotyping done. Affymetrix in California won the tender. We were able to tell researchers that there was genotyping data available on half a million participants. That's where we were moving to a scale that had never been done before. Researchers had been combining studies of a few hundred or a few thousand people to try to understand the genetic determinants of, say, blood pressure or height or cardiovascular disease. Here [we were] now with genotyping data on half a million people. That was what put Biobank on the researchers’ map and we started to see an increase in use of the resource internationally, particularly in North America and mainland Europe.
As the Biobank began to recruit those half a million people in the second half of the 2000s, there was a lot of public interest. Did that public or parliamentary scrutiny lead to significant changes in the operations of the Biobank?
Before anything happened, the Wellcome Trust and MRC did a widespread consultation with the research community about the things that people wanted to study. A lot of different working groups were set up to determine what questions or measurements one would make to be able to understand, say, diet or activity. There was a whole set of pilot studies done by researchers to look at how you collect and assay samples in the future. So, how would you collect and process a sample so that when the technology caught up, and you could do that at the scale of half a million people, your sample would be the right kind?
There was a lot of consultation with patient groups around issues like feedback. In UK Biobank, there is a policy of not feeding back information to participants [about what assays of their samples reveal]. That was something that went through a lot of iterations because you imagine [participants thinking], you want me to join a study, you're going to make all these measurements, why aren't you going to tell me what those measurements are? Then you go, “We don't know what these measurements mean. Do you still want us to tell you what they are, when they may or may not be relevant to your disease?” They may be misleading, for example. Even things like single-gene disorders [diseases caused by a single gene] that are found in a population may have quite different relevance in a free-living population [who aren’t suffering particular diseases and are not part of specific medical studies]. They may be much less predictive of disease and we don't know these things.
The whole purpose of UK Biobank is to learn. The issue with feedback is it can actually cause harm, it can mislead. This policy was discussed in great depth. I think there are different positions. The one thing you can say is that if you don't have feedback, you can guarantee you won't cause someone harm. But there was a lot of ethical, legal, and participant engagement before that happened.
The area where there was controversy was that setting up a study like this is very expensive. Researchers were saying, “That's going to take away funding that could be used for my research now.” That is a perfectly reasonable concern. I think what happened was that Wellcome and the MRC put additional funding in to set it up. So I don't know that it really had that effect.
The other controversy was, “Is this what you would do if you were interested in genetics alone?” That was allowed to rumble on for longer than perhaps it should have done. I remember when I was asked to take on UK Biobank in 2005, the first thing I did was go and talk to the geneticist who had been saying, “This isn't what you do if you want to set up a genetic study.” I said, “It really isn't a pure genetic study. This is an epidemiological study where we need to study people when they're healthy, and we need to follow them long-term in order to look at the association between genes, environment, and lifestyle.” They went, “Oh, why didn't someone say that?” The same with some of the people in government. The Chair of the Health Select Committee [the parliamentary body scrutinizing health policy and spending] had also raised concerns about this. I went and talked with him. A lot of it was addressing misunderstandings of what UK Biobank was about.
Will you walk through communication with these altruistic participants? Imagine I was someone signing up for my data to be tracked anonymously in the UK Biobank. What would I be told or not told about for my measurements?
We wrote out through the NHS, inviting people who were living within about 50 miles of recruitment centers to participate. They were sent an information leaflet telling them what the study was about, and that there wouldn't be feedback on any of the subsequent analyses of the samples. When they came to the assessment center, things like their smoking history and body mass index would be recorded. They got about half a dozen pieces of information at the end of the visit. They were told about their heel ultrasound density test, which is a rather crude measure, but if you have a very low density, it can be suggestive of osteoporosis. They were told if they were overweight. That was about it. But it was very, very clear in the consent that this was not a health check and that they would not be getting feedback on their subsequent results.
We would be keeping them informed about what UK Biobank was doing in terms of the use of the data, research findings that emerged, and how they were contributing to helping to understand how to prevent and treat disease. We also had their agreement to get access to all of their medical and health-related records and [consent] to get back in touch with them to ask for more information, provided that did not constitute feedback. So we wouldn't go back to somebody and say, “You have a particular genetic abnormality and we want to get more information from you.”
Despite that strong limitation on information sharing, you hit 500,000 participants earlier than expected.
We set up a machine that worked. The IT team that set up the systems for running these assessment centers did a really good job. We were, if you like, running an airline. We had 100 seats per day in each of the half dozen centers we had open. We ran them seven days a week. If we only got 90 in a day, we would run out of money. If we went to 110 a day, the system crashed: the people ended up having to hang around waiting and they were unhappy. So the systems were put in place to monitor the process of inviting people, filling out the appointment slots, not overfilling them, monitoring to see whether there were any delays occurring in any of the centers on any day, as well as monitoring the quality of the data that was being collected. They did a very good job in managing that. So we were able to recruit a bit ahead of time and, most importantly, on budget.
It is a very impressive operational feat. I'm curious: the Biobank was established as an independent charitable company. Were there ever other institutional forms proposed?
Not really. They didn't want it to be the study of a particular institution: an Oxford or a Manchester University study. The MRC and the Wellcome Trust were keen for it to be seen as being a UK institution, with no researcher ownership, again to stress this accessibility. When I was made the Principal Investigator (PI) and Chief Executive, I made the point that people could not collaborate with us on using the data: it was their data to use. So although I am the PI, I probably published less on UK Biobank than almost anybody else. We wanted to break the mold and make it clear that the data were for researchers anywhere in the world to use for health-related research in the public interest. They made an access application, complied with the material transfer agreement for accessing those data, and applied their imaginations to learn more about how to prevent and treat disease.
What advice would you give to people who are trying to stand up an institution that can carry on complex research work in the long term?
We focused on, “What do we need to do now? What do we need to put in place so that we can do what we want to do in the future?” Don't try to do everything now. At the beginning of UK Biobank, there was a lot of pressure to say, “What's your access policy?” Our point was, we haven't got anything to access. We're focusing on getting something to access. We're doing that in such a way that we will be able to allow the data to be used in the way that we want it to be used, but we're not going to develop the access policy. We want to focus on, “What do we ask the participants for consent for? Do we get the information from the participants? Do we collect the samples in ways that can be used in 10 years' time?”
One of the first things I did when I was made the CEO was to cancel an order for biochemistry analyzers. I said, “Let's recruit the people, let's collect the samples. We'll think about what we analyze when we've got the samples.”
Defer what you don't need to do now. That means you can focus your attention on the things that you do need to do now, and focus your resources on the things you can do now. Hopefully in the future, the things that you want to do will be easier to do and cheaper to do. If we had genotyped all the samples as we were collecting them, which some studies do, it would've been a lot more expensive than when we did them.
Perhaps more importantly, the quality would be worse, because you would be collecting [the samples] in the order of the participants you collected. If there were shifts in the type of participant and in the assay methodology, you'd find it very difficult to untangle those. Whereas if you wait until you recruit everybody and then analyze the samples in a quasi-random order, you get rid of any systematic difference between the participants. And you are using better technology because it's later, you use a smaller volume [of the samples], it's at lower cost.
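The point about assay order can be made concrete with a toy example. This is a sketch under assumptions rather than a description of the Biobank's actual laboratory procedures: the participant trait that drifts over the recruitment period, the cohort size, and the variable names are all hypothetical. If samples are assayed in recruitment order, a sample's position in the assay run is strongly correlated with that trait, so any drift in the assay over time becomes tangled up with it. Shuffling the order removes the correlation.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

rng = random.Random(1)
n = 10_000  # hypothetical cohort size, kept small so the sketch runs quickly

# Hypothetical participant trait that shifts over the recruitment period
# (index i is the order in which participants were recruited)
trait = [i / n + rng.gauss(0, 0.2) for i in range(n)]

# Option 1: assay each sample as it arrives, so assay position = recruitment order
sequential_position = list(range(n))

# Option 2: wait until recruitment ends, then assay in a shuffled order
random_position = list(range(n))
rng.shuffle(random_position)

r_seq = pearson(sequential_position, trait)
r_rand = pearson(random_position, trait)

print(f"assay position vs trait, recruitment order: r = {r_seq:.2f}")
print(f"assay position vs trait, shuffled order:    r = {r_rand:.2f}")
```

With the shuffled order, any change in reagents, instruments, or technique during the assay run is spread evenly across participants instead of lining up with who was recruited when.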
You defer, you focus on what you have to do, and make sure that you are planning things in a way that allows you to do the things you want to do in the future. But don’t necessarily do them. People struggle with that sometimes. What's the quick win? The quick win is not doing something.
If you could go back to 2005 when you first became Principal Investigator and CEO, would you do anything differently this time around?
I would try to work out how we could get a million people for the same money. That would be number one.
What would that give you statistically, that 500,000 people will not?
You'll get more people developing disease within a shorter period of time. It’s the number of people who develop a condition that drives the [statistical] power [to robustly identify the causes of a disease]. I think the age range was right. We got quite a lot of diversity for the UK. One of the arguments that’s always made against the UK Biobank is, there aren't a lot of people from Africa or from India or from Asia in UK Biobank, but at the time there weren't [that many people from those regions] in the UK.
If you want to study a wider range of peoples, you need to set up studies like UK Biobank in carefully selected parts of the world. In fact, we set up a parallel study in China, the China Kadoorie Biobank, at the same time in order to address that. We also worked with the Mexican Ministry of Health to support a large-scale study in Mexico. So you can't get some kinds of diversity in a population like the UK, but we have diversity in terms of socioeconomic status, in terms of rural, urban, et cetera. But larger numbers would give you greater power to study more conditions and rarer conditions.
I wouldn't have done the biochemistry assays when we did them. We had to do hematology with fresh blood, but with the biochemistry there was a lot of pressure to do the assays of things we know about like cholesterol and various cancer markers. But I actually think it was a lot of work, quite expensive at the time, used quite a bit of sample and probably one could do that now with some of the newer technologies with less effort and less cost. So that was something we could have deferred. But in general, it has worked out well. The access model has resulted in much more widespread use of the resource and many more findings than I had anticipated at this pace.
Maybe this is a silly way of framing this question, but if I was thinking of setting up a US Biobank, with say 600,000 participants, what would the value of that be when the UK Biobank already exists?
Under President Obama, the All of Us study was set up. I was one of the people asked to advise on that study, given our experience in UK Biobank, and I think it is going to be a very important study because it includes a much wider range of people than UK Biobank: very large numbers of people from various ethnic groups that will be interestingly different genetically and in terms of lifestyle and environment. It's important to have a few such studies in different parts of the world in order to understand the full range of genetic, lifestyle, and environmental risk factors, and a wide range of different diseases. If you go to different parts of the world, you have different rates of disease, and so studies that allow you to look at the widest range of exposures and diseases can't be achieved in a single country. UK Biobank demonstrates the value of having such studies and of making that data as widely available as possible.
I think our experience of doing so has been positive in two ways. One, it showed that if you make data available to researchers a lot of research findings come out. The second thing is that there's been a lot of external investment in making UK Biobank even more valuable. The UK government funded the genotyping, but it was actually industry that funded the exome sequencing, a large part of the whole-genome sequencing, and they're now funding proteomic assays on the data. The government supported imaging in UK Biobank: we've imaged a hundred thousand of the participants. But now industry and philanthropic organizations are funding repeat imaging. So every dollar of Wellcome Trust and Medical Research Council funding of UK Biobank has leveraged about $12 of external investment in enhancing the resource and then those data being made available to that very wide range of researchers. So that's allowed them to do even more research than otherwise would have been possible. Accessibility generated researchers using the data, generated inward investment, generated researchers being able to do even more with that enhanced data.
I think that that experience has helped other similar studies, like All of Us, consider the value of that kind of access model. The Mexico City Prospective Study has just been made available to researchers around the world. It was exome sequenced by industry. Those data are now available to all researchers, having been first made available to Mexican researchers to give them a head start for their contribution to having set it up.
One of your overarching lessons is that we underestimate the value of open scientific data that's available to a broad range of researchers. Although the UK Biobank cost a lot of money to set up, in relative terms it's actually not that much. As science and markets evolve, that infrastructure can become incredibly valuable in the future. Beyond assets like the UK Biobank, what other kinds of very large, open-access scientific infrastructure would you want to see?
We've seen so much over the last 20 years of the value of data. But the value of data is only [realized] if people can use the data. People are often more concerned about the sins of commission than the sins of omission. “What could go wrong if we make data accessible?” The risks are given great emphasis, whereas the risks of not making data available are not considered seriously enough. I think UK Biobank helps to redress that: 5,000 peer-reviewed publications last year alone, by making the data available.
The benefit of that is enormous. We're already seeing new ways of identifying who’s at risk, decades before disease develops, and therefore identifying ways in which we could target interventions to prevent people from developing those diseases in the first place. [We’re seeing] better ways of screening for disease: you can imagine precision public health, using things like breast cancer or colorectal or prostate cancer screening of people who are at risk, in a much more precise way. [We’re seeing] better ways of using cholesterol lowering and other cardiovascular protective treatments in people at high risk, based on the genetic and other risk factors data that are coming out of UK Biobank. So perhaps UK Biobank helps to redress that balance and understand the risks of not allowing data to be used, particularly in the health arena.
The ability now to have data on secure platforms, which is where we've got UK Biobank data, and having researchers come to the data also democratizes access. You don't need to have a big computer in your institution, because you can just come to the data, which is sitting on a cloud-based platform. The data are more secure, but more importantly, the data are more accessible in terms, not only being able to get at them, but [being] able to have the compute to analyze them wherever you are. One of the things that we've been trying to do is help support researchers in less well-resourced parts of the world, in Africa and India, but also Eastern Europe and South America, to take advantage of their brains by making the data accessible to them.
Any other lessons that we should try and get to for an American policymaker audience?
I think that one has to have a very long view. UK Biobank was set up 20 years ago and it's now a mature resource. It's now starting to generate important findings. There's real value in setting up studies now [that are] not competing with UK Biobank, they're complementary. They need to be interestingly different. I think the All of Us study is a very interestingly different population in the US that will extend the range of genetic, lifestyle, and environmental risk factors that can be studied. It will really start to come into its own in the next 10 years or so. The question is not where we want to be now. The question is where we want to be in the future, so thinking about a strategy of having a series of these large-scale studies and how they can be accessible and [can be analyzed] together to understand better how to prevent and treat disease.
People say to me sometimes, “What's the sunset clause for UK Biobank?” I say, “This is a sun that never sets. This is a resource that gets more and more valuable as time goes by.” And that will be true of studies that are set up now. They will become more and more valuable as time goes by. We're just touching the surface at the moment. Sequencing is just the beginning. The proteomic assays that we're doing in UK Biobank are just the beginning of the kinds of detailed proteomic assays that will be possible in the next five or ten years. I think that what we're going to see is a set of these studies that, over the next 20 or 30 years, will help us to understand how to prevent and treat a huge range of diseases. No single study is going to be sufficient.
Well, this has been enormously enlightening for me. I want to close with a personal question. You've spent two decades in this role as CEO of UK Biobank, which is a long time for anybody to be in any one role. I'm curious, as you reflect on those two decades, has it been as much fun as it seems like?
It's been an absolute blast. I'm learning all the time. I find myself thinking not, “What have we done?” But more, “What have we not done? What are the opportunities that we haven't yet taken and how do we take advantage of those to make the resource better?”
Just seeing how the research community has responded to having access to data has been very interesting. We started out with this project where people said, “You mean we can access the data? Really? We can access the samples?” And we would say, “Yes.” Once they got past the, “What’s the catch?” element of it, we found ourselves turning into a utility. We have to learn how to function as a utility and think about how we can make UK Biobank even more valuable for the research community. That's what we're focusing on now. But there's always the next thing we should be doing.