How to Run Good DARPA Programs
"DARPA literally invented the modern world"
Is it fair to say that the Defense Advanced Research Projects Agency, or DARPA, is having a moment? You may know DARPA as the organization that helped invent the internet, or GPS, or features of modern computing like the computer mouse and Siri. Lately, government reformers have wondered whether we could take the model of DARPA and build other, similarly effective government agencies. Recently, IFP proposed research directions for the newest ARPA, focused on health. And last month, I interviewed Jason Matheny on his leadership of IARPA, the ARPA for the intelligence community.
But will those ARPAs succeed? Can DARPA’s special sauce be replicated? What makes DARPA succeed, and when does it fail? And is DARPA itself at risk of losing its edge?
What You’ll Learn:
Why program managers get so much freedom
How a PM can fail
DARPA’s great weakness
Why VCs have misaligned incentives
Our guest today, Dr. Joshua Elliott, was a Program Manager at DARPA's Information Innovation Office from 2017-2023. While at DARPA, he led successful programs across a wide range of fields: better understanding human intelligence, using AI to digitize paper maps of geologic data and identify sites for critical mineral exploration, creating better crisis prediction tools, and more. Elliott joined DARPA from the University of Chicago Computation Institute, where he led research projects on socio-technical change, decision-making under uncertainty, and environmental impacts on food security. He holds a Doctorate and Master of Science in Physics from McGill University, has founded a startup, and was a Fellow at Argonne National Laboratory.
Subscribe to get one Statecraft interview in your inbox each week.
You had a long academic career before you arrived at DARPA. How did you end up at DARPA?
My background is all over the place: climate and economics, energy systems and agriculture and hydrology. And my PhD is in theoretical particle physics. I've never actually taken a computer science class in my entire life. I had gotten fed up with academia and planned to take a one-year leave of absence from the university to run my startup full-time.
A month before that was supposed to start, I got a call from a DARPA program manager I'd met one time, randomly. He asked me if I'd be interested in interviewing there. My first reaction was literally, “This is Joshua Elliott from the University of Chicago. It's a pretty common name.” But he convinced me that he was looking for me, so I jumped on the opportunity. I've been obsessed with DARPA since I was a 12-year-old computer nerd, so it wasn't a terribly hard sell. DARPA was a mythical place.
As a young nerd, what put DARPA on the map for you?
DARPA literally invented the modern world in so many different ways. Oftentimes it was just willing to put funding behind an idea that was nascent.
Somebody pointed me to “the mother of all demos” that Doug Engelbart did back in the late 1960s. It's the most fascinating thing. Engelbart was a researcher who started the human-computer interaction group at Stanford Research Institute. He was funded by DARPA, largely by J. C. R. Licklider during Licklider’s time at DARPA.
Engelbart set out to imagine what an interface between a computer and humans would be, at a time when computers were giant boxes and interfaces were punched cards. He sat down and invented the keyboard, the mouse, the first proto-word processing capability, the first proto-spreadsheet capability, where you could actually do calculations in a spreadsheet mode. And he did this amazing demonstration at this computing conference in San Francisco, they call it the mother of all demos. It's absolutely fantastic. The videos are amazing, you should go and watch them.
To make this demo happen, he got access to one of the first projectors that ever existed, which used this windshield wiper capability that would wipe things onto a screen. Not only did he demonstrate all of these things in one demonstration – the first mouse, the first modern keyboard, a monitor screen interface, word processing and spreadsheets and all this different stuff – he then demonstrated the first Skype call. They had dedicated network capacity between Palo Alto and San Francisco, and he video-called one of his research assistants from his Palo Alto lab, and they collaborated together on a document that he was working on. The guy's face goes up in the corner of the screen, it's wild. He literally demonstrated Skype in 1968. All of these things we now think of as modern computing, and it took decades before some of them became mainstream.
That blew my mind. Everybody in the computing world thought this guy was a complete and total lunatic, and DARPA believed in him, pushed him forward, and went to these great lengths to make this amazing thing happen.
Sometimes DARPA programs are the brainchild of a DARPA program officer pushing it forward. Sometimes there's some crazy person that no one else will fund, and DARPA comes in and funds them. Engelbart's case is a hybrid of the two. But DARPA is often behind the scenes. It's not that DARPA comes out and says “Hey, we invented all this.” It’s not a big splashy organization. It has the opposite profile to NASA.
As a PM at DARPA, when you came in, did you have a clear sense of the projects you wanted to lead? Or were you closer to the Engelbart case, you got convinced that some things were worth funding?
I had a pretty clear idea. I ended up doing a ton of things I was not planning on doing, because everybody comes into DARPA with what they think is a clear idea of what they want to do. It's even how we do the interviews. You come in and basically give a practice pitch, where you pitch a program, and that's the biggest part of the interview process.
But I would say 9 times out of 10, you end up either doing a modified version, or you might not actually do that at all, because when you get there, there's so much energy and ideas flying around. In many cases, the environment itself can take your thinking and ambition to a whole new level.
Give me an example of that. What does that mean in practice?
I pitched AI for scientific modeling and data analysis as this integrated platform. I ended up doing a couple of small pilot programs on it, and finally a big program called Automating Scientific Knowledge Extraction and Modeling (ASKEM). Before I even sold that program, I’d been really excited about human-machine teaming, which was, especially six years ago, a critical new area.
I had been talking to folks in cognitive psychology and organizational psychology, because another obsession of mine is institutional design, and the concept of collective intelligence, this idea that a good team of humans is not just the sum of its parts. There's something more about it, and we were trying to figure out how to measure that. You can put the three smartest people in the world on a team and they can be a terrible team, right? In the DOD, this is of course extremely well known, because we have lots of small, high-performing teams from Special Forces and all these other areas. Teaming is a critical aspect of how everything gets done across the US military.
I became obsessed with what makes humans able to team at various scales, from small teams all the way up to when you're driving on a highway. When you’re driving, you’re in some sense in a team with all the other people on the road, and your goal is to get to some place without crashing and dying, right? We would call this broad category of things social intelligence.
At an individual level, the best mental models of it are around what we call theory of mind. Humans have this ability to interpret lots of data coming from another human being, build a model of the latent states of that human, what are their beliefs, what are their goals, and then use that model to actually do forward prediction: to predict what they're going to do next.
That theory of mind is this incredibly powerful tool, and it's also incredibly human. At that time, we'd really never been able to demonstrate anything you might call “machine theory of mind,” in anything but caricature settings, like a little block moving in a two-dimensional surface.
So you took that idea and you managed a program around it?
I designed, created, and launched a program called Artificial Social Intelligence for Successful Teams (ASIST). That was about creating AI with enough characteristics of social intelligence that it can understand the dynamics of a small team such that it can collaborate with that team, coach it, and help it stay synced and operate effectively.
What did creation and management look like? Were you hiring people from outside and actively overseeing their work?
Yes. So the DARPA model is very program manager-driven. We come up with ideas, build those ideas into programs, “sell” those programs, basically pitch those programs to our leadership, and, if they sign off on it, then you have your budget, your mandate, and you're off to the races.
You then put out a call for proposals based on your idea. You get a bunch of proposals, you select the ones you want to fund within your budget, and then that is your program. So the program will have multiple technical areas. For example, in ASIST, one technical area was developing AI agents. One technical area was cognitive, psychological, and social science. And we had a third technical area, which was integration and evaluation. That was a single team we were looking for, that could build out the test bed in a virtual gaming environment, design all of our experiments, do our evaluations, design all of the APIs necessary to connect all these complex pieces together, etc.
The program is built like that, out of technical areas. We fill those technical areas with the best possible proposals. Hopefully you end up with a fully formed program with all the components. If you don't get a good enough proposal in one area to fill a gap, you might find alternative means.
Once the program is created, we have a very actively managed model. There’s a kickoff meeting when the program starts; everybody gets together over several days. Then we have in-person meetings with all of the principals in the program. Sometimes that’s as many as 200 people coming to these meetings every six months, which is a very aggressive cycle.
We're having working group calls five times a week. The program managers are deeply engaged, often traveling around the country doing site visits, where you spend an entire day digging into the details of what every performer is doing. It’s a very active-management approach to innovation.
As I understand it, the program manager is relatively lightly managed, right? You have broad autonomy to structure and manage the program the way you want.
Once the program's approved, that is true. We have broad latitude and control over how we do it. Once we select the programs we want to fund, we do this source selection meeting with our office leadership. We say, “I'm going to fund this and this,” and they sign off on it.
After that, it depends on what level of review process we’re talking about, but generally we'll have annual review talks. You'll come in and give a briefing on the status of your program, either to your office leadership, or to DARPA director-level leadership.
In a previous interview, we talked to Jason Matheny about IARPA. He talked about what might stop a proposal from getting through review. But once a program has been approved by your higher-ups, what are the ways it can fail?
Wow. That's a big question. The first thing worth pointing out is that this is always a struggle; it's a struggle at any organization.
But we try really hard to be failure-tolerant, as you know. If no programs are ever failing, then we're not taking enough risks: that's what we say at DARPA. There should be some high fraction of failures if we're taking the appropriate level of risk. But on the other side, very few programs fully fail, in the sense that they don't produce any meaningful advancements. Even programs that don't reach their initial stated goal often advance fields in different ways.
One construct I like for thinking about how things can fail is borrowed from my friend Adam Russell: You may not fully understand a problem, or you may not be asking exactly the right question. It's a failure of knowledge, right? Sometimes that's because of the timing of the program: the timing just wasn't right. It may be that there's a failure of understanding the advances that are possible in a space. That’s more of a technical failure.
How might a PM misunderstand the state of play of a particular technology?
An interesting example comes from the recent AI revolution, because most people didn't really see advances from LLMs and foundation models coming. It certainly caused me to pivot in a couple of my programs, which you might call a type of failure, right?
In fact, I had one program where I'd invested probably $10–12 million over the course of four years in doing old-school natural language processing and knowledge graph construction, in order to do causal knowledge discovery and delivery for analysis.
And when foundation models came along, I basically ripped out all of that complex, bespoke, computationally intensive infrastructure, and replaced it with a fine-tuned LLM. You could ask causal questions and it worked just as well, in some cases a lot better, on our first pass.
That's an example where it's not necessarily that the premise was originally flawed. Five or six years ago, when I pitched it, it wasn't flawed. But I didn't foresee how rapidly the evolution of technology in this space would happen. So it's a technical failure, in the sense that I didn’t recognize that background evolution within the AI space would make what I was doing unnecessary.
DARPA spends a lot of time recruiting, because PMs are term-limited. Obviously, DARPA is looking for area experts, but how much of the recruiting process focuses on managerial skills?
That's a fascinating question. I have a side project called Brains that I'm working on with Ben Reinhardt from Speculative Technologies. It was originally an acronym for “the Breakthrough Research Accelerator for Innovative Nonprofit Science,” but I was told acronyms are not cool outside of government anymore. So we shortened it to Brains. It’s an accelerator for training and mentoring ARPA PMs, or people who want to start a Focused Research Organization (FRO) with Convergent, or to start their own nonprofit startup. It's like Y Combinator but for pre-commercial nonprofit science. So I've been thinking a whole lot about what makes a good program manager.
I would say that the traditional model (and this has varied depending on leadership) did not have a ton of specific characteristics. The way we talked about hiring a PM was: PMs need to have big ambitious ideas, have a fire under them, and be able to communicate. That's it.
Management experience was not something that we thought a lot about in particular, because we're often hiring academics, and academia is just about the opposite of management experience.
Is that a flaw in the model? It sounds like you see management experience as a need.
There's a bunch of different models or templates of program managers, and none of them are necessarily better or worse than others, but they fit more squarely into specific roles. A certain type of person is better at doing a certain type of program or in advancing a certain type of technology, right?
Some program managers are these big, inspiring thought leaders. They're not getting into the nitty-gritty of much of anything. They're coming out with big ideas and they're pushing people to be ambitious. They're visionaries. There are a bunch of quintessential examples, and those people are really good at the brand-new stuff DARPA is probably best known for. Licklider is probably a good example, from way back in history.
But you also have lots of examples of extremely successful program managers who are the management type, who could take a nascent idea and get it over the finish line, into transitionable technologies that could revolutionize a field.
There's tons of examples of those in the computing field, especially in hardware and microprocessors: stuff where DARPA hasn't necessarily created a whole new model for computing (although I'm sure we've done that too). There's these kooky ideas out there about new microprocessor designs, but none of the companies are willing to pick them up because they're way too early and pre-commercial. These things take tons of money and investment and time to develop. A really ambitious program manager, who's a really good planner and has a vision but is fundamentally a get-sh*t-done type, can organize and push these things over the line.
I don't think one is better than the other. But it’s an interesting thought exercise: whether certain programs should be operated by certain types of program managers.
It seems that there's a tension between DARPA’s non-hierarchical, non-bureaucratic structure, and the fact that some of DARPA’s most successful projects were run by folks with a keen grasp of modern managerialism. Is that fair to say? It’s like DARPA’s trying to pick out the best parts of the business world and the hands-off DARPA ethos.
The tension often shows up most when you’re trying to transition technologies, whether it's into the DOD or more broadly.
When you’re trying to get a technology transitioned – it can either be fully developed into a capability, or maybe it's already a capability and you want to transition it directly within the DOD, etc – that requires an amazing ability to navigate the government bureaucracy. It requires someone willing to do the meticulous, hard work of knocking on a thousand doors, someone who can navigate the complex politics of the bureaucracy and these systems. The goal of DARPA is to do big, ambitious, high-risk ideas that have the potential to change the world. But at the same time, we’re trying to deliver capabilities. And transitioning capabilities often requires a very different skill set than creating and launching world-changing ideas. There is definitely a tension there.
You're working on tools to help more folks do ARPA-like stuff. In my world, there are often calls for “an ARPA for X,” or “an ARPA for Y.” How many ARPAs should there be? Are there fields where the ARPA model is just not applicable?
It's a great question, and it has multiple layers. There was an article a few weeks ago by John Pashkowitz, former DARPA PM, titled No, We Don’t Need Another ARPA. His point is that ARPAs are great for doing this applied research translation, but what we're really lacking is the next step in the value chain: these translation organizations that can actually scale technologies, get them into commercialization pathways, and turn these big scientific or engineering achievements into advantages for the US economy or military. That's his argument.
I take a somewhat more political look at this question, which is that we need more research, right? We can all agree that we need more R&D. And if it's more politically viable for Congress to fund an education ARPA than it is to just give more money to education research, then fine. If it's more politically viable for Congress to fund an ARPA for infrastructure than it is to give the Department of Transportation funding for transportation research, then fine.
If there's some political magic in those four letters, then I say we should take advantage of it.
For better or worse, ARPA has a great brand, and it seems to be able to unlock Congressional gridlock. That has to be a net good, even if we can argue about decreasing returns to scale, or whether the ARPA model fits perfectly into all of these different areas.
Many folks in venture capital see themselves as playing that translational role, of taking a concept and shepherding it to commercialization. Are they wrong? What's missing from that ecosystem that makes your colleagues say, “No, we need other kinds of organizations”?
It's a great question. The venture model, it turns out, doesn't work nearly as well for nearly as many things as we may have thought. The pressure it puts on the rapid scaling of a company is not always optimal.
In particular, it's not clear that startups and VCs are the pathway for technologies with massive potential implications for the government, whether at DOD or otherwise. To be clear, I helped launch dozens of startups from my programs, and I think that's an incredibly positive mechanism for impact.
I was working in AI, which is obviously extremely fertile ground for it because it's software, it's scalable, etc. It fits the traditional model of VCs. But if you’re trying to scale up new companies that disrupt the mining industry with new technologies for mineral processing, that use biotech or extreme electrochemistry or whatever, that’s more difficult, because it's a more capital-intensive industry, and an extremely difficult competitive landscape dominated by extremely big players. And until very recently, it’s not been an area that VC traditionally focused on, so there's not a lot of experience or expertise in the VC community.
VC is one part of the ecosystem. But if you don't have the support for companies to scale these things, then the industries can struggle. Sometimes that means government funding, sometimes it means support in other ways: the appropriate infrastructure and incentives for these technologies to scale.
One line I've heard before is that DARPA’s position within DOD gives it advantages over other equivalent organizations. How much do you think that's true? How much is DARPA sui generis?
We talk about that question a lot. Of course, the answer is that it's complicated. But the fact is, it offers a variety of different advantages. The most obvious one is that you have a built-in customer, one with very deep pockets. That helps enable advances in DARPA to be propagated and expanded.
It also helps having really clear mission guidance, even if sometimes that mission can creep quite a lot in terms of what “national security” is. But you have a clear motivating mission because of that, that you don't often get elsewhere.
People that were involved in the Homeland Security ARPA, HSARPA, when it existed briefly and then collapsed, will tell you that a big part of the problem was that there was no clear mission. A clear mission is really important for helping you make efficient decisions, and know which risks to take and which to avoid. A mission is something that you very clearly have within the DOD.
What mistakes did you make as a PM?
Oh my gosh, I feel like I made so many. A huge part of program management is people management. Designing a program so that it's not just positioned to be technically successful, but so that the program as a social organism is positioned to be effective at accomplishing its goals. It definitely took me longer to learn and fully internalize that than I would have liked.
Effective management, right? The amount of attention you really need to pay to people and the organizational interactions that are necessary to make collaboration and communication work in a program. That's a hard problem, and one that I certainly wish I'd learned earlier.
Another mistake I made was spreading myself too thin across lots of different programs. I have a habit of trying to do way too much. While there's no theoretical limit to how many programs you should be running at DARPA, when you have this model of extremely active management, even if you have a fantastic small support team around you of SETA advisors, there's a certain limit where you start to see things falling through the cracks, simply because you're trying to manage eight programs and you have fourteen different tech transitions.
You're working with different parts of the government, military and private sector. And you're trying to further extend different networks in order to communicate X, Y, and Z and design different experiments.
And honestly, the motivation at DARPA is to constantly do more and more, right? DARPA PMs are fundamentally highly ambitious people. As soon as you sell one program, you're already thinking about the next big idea you want to sell, because you only have four to six years at DARPA, and you want to get as much done as you can.
Is there competitive pressure among PMs?
That can be a bit cyclical. There's definitely competitive pressure between offices, but it often depends on the way that the budget is being designed in the building. Directors like to do it in different ways. Sometimes they like to say, “Each office has a budget. Here's your budget: create programs in this budget.” And some directors like to say, “Nope, there's one budget at DARPA, and the best programs get the budget,” and offices end up getting to spend what they can. In that model, obviously there ends up being competitiveness amongst the offices and an incentive to try and get programs up as fast as possible so that you can fill up the available budgetary space.
Between PMs, within an office, it's an extremely collaborative, congenial environment. That was one of my favorite things about working there: I could just walk into anybody's office and start talking to them, and often walk out with a crazy idea for a new program. Often we explicitly collaborated with people on programs. We would all put effort into mentoring new program managers when they came in and helping them get started and learn the ropes and pitfalls.
What do you disagree with your colleagues about?
Too often at DARPA, we count on there being some sort of magic. Maybe we do that less often than other institutions, but it happens. We think, “we'll just take smart people and slap them in this role and give them this drive and ambition and the possibility of big budgets and they'll do huge world changing things.” To some extent, that's true. But DARPA, like any institution, is a collection of human beings. Human beings have their own incentives, biases, personal ambitions, risk preferences, all these different things.
That can create this drift towards institutional scar tissue that every organization faces. DARPA relies on the fact that this magic that is DARPA will come in, and innovation will happen, and we don't need to worry about that stuff.
But we do need to worry about that stuff. I've had a lot of conversations with my colleagues and leadership, especially over the last year as I was wrapping up at DARPA, about needing to rethink how we define our objectives and our processes so that we can actually consider the human dynamics of institutions, and not just assume that if you set something up right to begin with, it will always be right forever.
Give me an example. What would instantiating that look like in practice?
The only good analogy I've ever had for it is: Every organization needs processes and bureaucracies, right? Those are important. But the problem is that humans come in and they add new processes and rules, and they layer on each other and have nonlinear effects. At a certain threshold, you have so many different processes that you start to get deep contradictions amongst them, and it creates these massive bottlenecks in actually getting things done.
DARPA has a habit of building up things over time, and then somebody comes in and sweeps a lot of stuff away and starts over from scratch. Which is great, because most places don't even have the ability to do that. But that might take decades.
I've been trying to convince people that we need to take a more constitutional model of institutional management, where we say, “Here are our principles, almost like a bill of rights and responsibilities, and here are the implications of those principles, this constitution. We're going to come back and reassess our processes and our bureaucracy based on this constitution that we wrote down. If it doesn't line up, if suddenly we're taking too long to do things or our risk tolerance has declined, we'll use that mandate to reassess our bureaucracy and processes to get us back in line with our originally stated principles.” That's something I strongly believe in, although that sort of stuff is way above my pay grade.