Statecraft will be off for the week of Christmas. We’ll see you in 2025!
Today, we talk to Jennifer Pahlka and Andrew Greenway about their new paper on state capacity. It’s called “The How We Need Now: A Capacity Agenda for 2025 and Beyond.”
We discuss:
What is “state capacity?”
Why is there fresh interest in the topic in the UK?
How did the model of a “government digital service” spread to the US?
How do you fix unemployment insurance?
Andrew, you're British. Jen, you're American. But you're both interested in “state capacity.” Help me understand what that is.
Jennifer Pahlka: The academic definition is just “the ability of a government to achieve its policy goals.” It's relevant now because we're seeing our lack of ability to do that, even with a robust state and enormous budgets.
Andrew Greenway: Over here, state capacity particularly manifests as the translation of political and democratic intent into tangible outcomes that the public can actually feel and see.
In undergrad, state capacity was a political science concept for me, an academic term. But recently, the phrase state capacity has come up more often, like it's something the public is actually interested in. Do you notice that?
That may be a stretch. I'm not sure the public is that interested.
Let’s say sections of the public are interested — the Ezra Klein-listening public.
Jennifer: It is having a bit of a moment. If it’s relevant to the average person these days, it's due to frustration with not feeling tangible outcomes. It’s the disjuncture between hearing, “We're going to be investing in new infrastructure for clean energy or EV charging stations,” or, “During the pandemic, you're going to get your unemployment insurance” and then it doesn't happen, or it doesn't happen in a way that the people who wrote that law really intended.
So although I don’t think most of the public knows the phrase, I think they would say it’s important to them if they understood the meaning of it, because at the end of the day, that’s what we all care about.
Andrew: In the UK, I think there's probably a bit less discourse around “state capacity” as a phrase.
But it’s definitely coming through, and one reason is that people have gotten quite tired of other phrases. We talk a lot about “civil service reform” and “public service reform” here. Those phrases have maybe not become discredited, but there's an element of weariness with those concepts, partly because people have heard them trotted out for decades and not really seen or felt the difference.
You both wrote a paper together on state capacity: it’s called “The How We Need Now: A Capacity Agenda for 2025 and Beyond.” What's the point of writing a public policy paper that's partially British and partially American?
Jennifer: I wanted to write about state capacity in the US, but it feels to me like we can’t imagine some of the things that we need to imagine without being able to see it somewhere else.
Most of Andrew’s team at Public Digital were part of the Government Digital Service (GDS) in the UK, which came about around 2011, and I’ve always found that they were a bit ahead. They’re actually able to show examples of some of the things we need to do, like closing the loop between policy and implementation. I wanted Andrew to join me so we could move the Overton window for what people are thinking about in the US.
Andrew: Us Brits tend to be quiet and polite, so I always feel slightly embarrassed on our behalf when Jen says we’re way ahead. It sometimes doesn't feel like that, but the UK has played a leading role in some of this thinking in the past.
As readers may know, we've got a new government here, the first Labour administration in 14 years. They have come to the realization quicker than some governments have, that the system of government is not quite ready or quite fit to deliver the level of ambition that they have.
In fact, there was a speech last week by a senior cabinet minister that was laying out some of his intentions around this.
What's going on politically and culturally with the new Labour government on this question of state capacity?
Andrew: The new Labour government came into power in July, so it's still relatively early days. One of the ways that they've set themselves up and framed the government was describing themselves as being “mission driven.” They’re leaning on some of the work of Mariana Mazzucato and that “entrepreneurial state” side of things. They’ve set these five missions: things like growth, health, crime and so on. But strongly implicit within that was that they were going to approach how they went about things in quite a different way.
We wrote a report back in the spring called The Radical How. It put some meat on those bones of what mission-driven government might mean. Looking back primarily in the UK at the good, bad, and ugly of governments delivering complex things, often some common threads appeared in the successful examples: things like taking iterative test and learn approaches, having multidisciplinary teams, and so on.
The political context here is, effectively, that this government will have to show people quite quickly that government can do things. This is not so much the usual UK fight of centre-left vs. centre-right, Labour and Conservatives. There's a growing sense that people don't buy either of those shades, because neither of them has stepped up and delivered what people perceive as real progress.
This government is staking quite a lot on delivering something tangible, in effect to rebuild trust in politics and institutions and to guard against populism. We have a harder right party, Reform, which is gaining in polls, and is essentially trading on this inability of government, left or right, to do things. This government is framing a lot of its work around Reform, countering that and showing the difference.
Jen has talked a lot about how in the US context, civil service reform more often pops up on the right: see Schedule F, or DOGE.
Andrew, in the UK has civil service reform been a similarly right-of-center interest historically?
Andrew: Yeah, the work I did as a public servant helping set up the Government Digital Service came in under a centre-right government, and our minister at the time, a guy called Francis Maude, drove that. But he approached that problem with quite an intentionally cross-party perspective.
Effectively you had the three biggest parties in the UK, either in government or with very recent experience in government, that had all collectively come to the same conclusion that the state of affairs wasn't working.
We’d just spent the thick end of 12 billion on a healthcare IT system that spectacularly broke. That's not a partisan issue. That's just a sense that the system is not delivering here. And I think here as well, the debate is perhaps less framed in terms of left and right. I think there's quite a strong center around this that spans the parties, but it's a debate almost about people and systems.
So there's a guy here, Dominic Cummings, who's got flavors of Musk about him, who I think would argue that “Actually, there are thousands of officials who've just been throwing sand into the gears. We need to get rid of a hell of a lot of people, just properly shake this up, effectively cull quite a lot of the public service. We need some more technologists in the room and let's move it forward.” Versus those who go, “That's not wholly untrue, but equally there's some fundamental structural things we need to do about incentives within the system that get the most out of the talent that we've got in public service.”
Jen, you wrote an essay earlier this year called, “The Brits are way ahead of us, again.” What exactly did you mean?
Jennifer: Before I answer that, I just want to say one quick thing about your last question: I don't think Schedule F is civil service reform. We need civil service reform of the kind they're talking about in the UK: making it possible to hire on the basis of merit, fire underperformers, promote the right folks. That’s all a merit-based system. I think Schedule F is very different territory.
But about the Brits: A pillar in this paper is that we have to be able to close the loop between policy and implementation. That's illustrated by this metaphor of a waterfall vs. an agile loop.
The way it works today is you pass a law, write a policy, and it goes into this machinery of government. It's a hierarchy in which work descends through the layers. There's really no affordance for feedback in the middle, as you're doing the implementation, both for checking with the folks who wrote it about their intent, but also for the teams to explain to the folks who wrote it, “This is what we're learning while we're implementing it. If we do this thing that seemed like the right thing, in fact, we're going to get a very perverse outcome.”
Andrew’s report The Radical How has a really fantastic example of a team in the UK closing the loop. You have this quite complex policy, Universal Credit, that has to do with combining a number of social benefits together into one.
Their small cross-disciplinary team did a hard reboot on it in a way that's far beyond what we've been able to see in the US so far. They had the minister responsible for the department in the room with implementers. Those implementers had done user research earlier that week showing that a particular policy edge case created a benefits cliff for somebody who needed this benefit [a benefits cliff is a sudden decrease in benefits that occurs when a person's income increases]. They showed that artifact of user research to the minister and the minister said, “Yes, that's wrong. Let's change it.”
That is closing the loop. That is testing and learning in a way that's going to get you a policy that actually gets the outcome you intended. I really want us to be aspiring to that level of tight feedback loops.
Andrew: Universal Credit was the combination of a bunch of working-age benefits into one. It was effectively the biggest domestic policy of the administration from 2010 to 2015.
And to clarify for Americans, the idea was you'd consolidate a bunch of existing benefits programs in one program that was administered through one central entity.
Andrew: That's exactly right. They initially approached that problem through a very typical traditional government lens: The minister had an idea. He gave it to some smart people. They turned that into a lot of policy requirements and regulations. They then tried to convert them into a bunch of technical requirements. They then chucked them over the wall to the usual outsourcers to try and build this thing.
After three years, they'd spent about half a billion dollars and got quite a lot of paperwork, and not a single user had tested this thing. There were red flags all over the place.
There was a hard reset called at that point. They started again: New building, new team, multidisciplinary, not that many people, maybe 30-odd people in the room, even fewer to start with. They started by building an end-to-end prototype of the service, and tested it with a very small number of people in a specific part of the country.
They looked at that test, learned a bunch of stuff, went back out, and incrementally tested it with larger populations of people over greater regions, and eventually scaled the service nationally. The real proof of the pudding in Universal Credit was during COVID in 2020: overnight, there were millions of additional people claiming this benefit. Universal Credit saw a 12x increase in the scale of demand without falling over. But they were also changing the policy, changing the actual rules behind that service, as well as the operations on a daily basis in those first three or four weeks when it was really crazy. They could do that because they set the whole thing up on that footing from the get-go. To use a Sherlock Holmes metaphor, it's one of the dogs that didn't bark during COVID. It worked, and people didn't notice, because people expected it to work.
Jennifer, will you compare how Universal Credit performed under the stress of COVID to how our benefits programs performed? Maybe you can do the bragging that Andrew is loath to do.
Jennifer: Let me not throw our country under the bus entirely: we had some benefits programs that did pretty well. We offered SNAP benefits to kids who couldn’t get meals from school. Many states chose to deliver those automatically without making people apply, and I think that worked relatively well.
But, yes, famously, a big one was unemployment insurance because the pandemic threw so many people out of work. Every state developed a pretty enormous backlog of claims. I happened to work on one, the unemployment insurance crisis in California, where we had a 1.3 million claim backlog. Pulling up the hood and looking at how those systems really worked was shocking.
Say more. When you opened up the hood on unemployment insurance in California, what did you see?
Jennifer: Most people think the problem here is that a lot of these systems are written in COBOL. That is somewhat true: there is COBOL in the bottom archaeological layers of these IT systems, and layers that have accrued on top of that over the years.
But the problem really is just outdated processes that don't make sense anymore and really have never been challenged. Those archaeological layers of technology all map to archaeological layers of policy and process that have come down from the Department of Labor, the state itself, the courts, and the decisions they've made.
One of the things we found out is that, in order to be competent at processing a claim in California, you need to have worked there for something like 17 years. So they’d hired all of these people. I think at the time that I had started, they'd hired 5,000 contractors, mostly through Deloitte, to come in and help process these claims. They didn't seem to be aware that not only practically but legally, those people could not process claims.
Just explain that to me. What was the legal requirement?
Jennifer: The legal part of it is that there are provisions in state law, and I think this is true in many states, that reserve that kind of work for what's called merit staff. You're not allowed to have somebody outside come and do that, for two reasons. One is, it's a safeguard against people trying to do something that's crazy complex without the training. And the other thing is, I think the unions put that in, essentially for job security. But these people have great job security! We were pulling people back out of retirement as fast as we could, to get people in the door there.
But on a practical basis, you're talking about having to know many distinct systems that don't work in intuitive ways whatsoever, and a lot of rules and hacks and workarounds to process a claim.
Now, if a claim was deemed to be valid, it actually sailed through. I don't want to undersell that we got out some ridiculously high number of claims, but the ones that were the problem are the ones that had to get handled by claims processors in a non-automatic way.
The really scary thing about it was how we decided, or how the system decided, whether you got that special handling. If there was a concern that you were not who you said you were, you got that handling. The way the system ran, you got flagged for additional identity verification if there was any mismatch between your name, Social Security number, date of birth, and what's in the Social Security database and other databases that we check against.
Think about the IDs that match perfectly. They're the ones auto-submitted by folks who have gotten lots of data off the dark web. When an application is submitted by computers, it's absolutely going to say “Jennifer L. Pahlka” because it's going to get my Social Security number right. The ones that got flagged for additional processing were the ones in which people forgot, say, to put their middle initial. Humans make mistakes. Computers don't.
We saw all of these very valid claims getting flagged for additional processing, whereas the ones that went in perfectly because they were submitted by bots, through fraudulent crime rings, sailed right through. This is a big reason we had that enormous problem with not only poor service delivery where legitimate claimants were waiting many months for their check, but also enormous fraud.
What's the British idea that would have helped in the unemployment case?
Jennifer: I think unemployment insurance is just an example of literally decades of neglect. People who run unemployment insurance systems in states have been subject to enormous criticism over the years. Essentially, whenever there's a dip in economic activity, which creates more unemployment and they have a surge in claims, they get highly criticized for backlogs. But in the normal years, nobody wants to invest in them. Nobody wants to spend the time understanding how they're broken.
If they get any attention, it's for a fraudulent claim. So the culture, and you could really feel it in the Employment Development Department in California, is so afraid of fraud. The sad and ironic thing is that fear of fraud pushes them to be very defensive of the systems that they use today, these processes like flagging you if you miss your middle initial. They do that in order to avoid the criticism of having opened this up to fraud, which actually opens us up to fraud. We get more fraud because we make public servants so defensive, and the only thing they have to stand behind to defend themselves is existing processes that don't work.
But we really need to get in there with the air cover from elected officials, from leaders with power saying, "Let's stop the blame game and care, not just about the policy that's been passed, but about the mechanics and the talent and the procedures that we're relying on to deliver these policies." We just completely tune those out between crises.
My mental model for this stuff is that the public cares a ton about fraud. Pressure from the public that services be delivered fairly and without cheating seems like a fundamental feature of democracies.
If you assume that there will always be some sources of pressure on civil servants from that perspective, how do you avoid putting them in this position where they're creating more bureaucratic process and difficulty for the correct recipients of these benefits?
Jennifer: I think we think about fraud fundamentally as incompetence on the part of a state agency, instead of the result of years of layering on mandates and constraints. It shouldn't take 17 years to learn how to process a claim for unemployment insurance. In its basic conception, it’s just not that hard.
Legislators need to learn to subtract as much as they add, and they need to learn to think differently about oversight. Oversight's really important: I'm not calling for less oversight. I'm calling for a very different kind of oversight. Not, "Oh, look, we got a bad outcome. Let's go have an outrage hearing." But, "Let's understand the many mandates and constraints we've placed on this agency, which makes it so hard for them to operate in a way that's competent, in a way that can actually deal in a 2024 kind of way with fraud."
Andrew: The Brit in me is desperate to qualify how good we are, by explaining that it's not all that great. One of the really important things about Universal Credit is that it got to a point where a crisis was declared. There was a point at which everybody realized, both politically and officially, "We've got to get ourselves out of this."
I say that because it's certainly still true here in the UK that this kind of working does happen here now and again, but it’s by exception. It's because of exceptional circumstances and usually exceptional leaders who drive reform through despite the wider structural pressures, be they the perceptions around fraud, be they even the natural inclinations around internal governance where the inclination is, "Oh, we'll have a program steering board on this thing every six weeks and a lot of papers will get written for it."
Instead, let's get the minister and the senior officials who are responsible for this whole thing in the room with the team, and the team can show them what they've built this week, and then they can show them the user research and you make the decisions based on that.
Both of you talk about the value of multidisciplinary teams, as opposed to having single generalists come in. Andrew, I'd love to hear about your experience in the UK's Government Digital Service, which is modeled on this team approach. Then I want to hear about what Jen picked up there that helped you stand up the American version.
Andrew: I was incredibly lucky to join the Government Digital Service at the time that I did, because, hands up, I'm a generalist. I'm one of those dangerous people. I was a young, ambitious civil servant. I could have easily gone to the dark side.
I guess the simplest way of describing the GDS team members is that they were just formidably smart, but in completely different ways. You were sitting alongside developers, service designers, people who wrote well for the web, user researchers, genuine deep practitioners in what they did and what they do. And you were collectively brought around an outcome that you were there to deliver. My job as a more policy guy wasn't to write the policy paper. My job was to work with the other people in my team to deliver the overall outcome that we were shooting for.
That seems super basic really, and it is. There's a lot of this recognized practice in many parts of the private sector, but that just didn't happen in government. That's partly because obviously you get these tribes in the silos, things getting tossed over walls to the other.
But there's also some quite interesting stuff about power dynamics between those worlds too. Certainly in the UK, the generalists and the policy types or the Treasury economist finance types — they were the first among equals. They sat at the top.
If you're involved in tech or even more so in operations, actually doing stuff and being on the front line, you were “below stairs,” really. You just got these gaps between these different tribes that reality failed to intrude upon until it was too late. One of the more profound organizational changes that GDS espoused was bringing those kinds of worlds together around an outcome in the same team.
Beyond delivering more effectively, it was transformative to the experience of being a public servant. Rather than telling your parents, "I went to a meeting or two and I wrote some stuff" — it's "No, I did that thing. It's easier to get a passport now, and part of that was me." It's a really transformative experience to be a part of.
Jen, you visited the UK GDS, and you came back here and said, “We should do the same thing.”
Jennifer: Yes. I was running Code for America at the time and had briefly met two of the folks who started the GDS — Mike Bracken, and the minister responsible, Francis Maude, who'd come by Code for America.
About nine months later, I happened to be in the UK and got invited to see this team as it started to evolve. The experience of being in the office that day in November 2011 just blew my mind. You could see people working in a different way. They had this amazing thing called “the Wall of Done.” You would see somebody get up and take something from a different place and put it on the Wall of Done.
What they were doing at the time was taking this incredibly fragmented set of bad websites and putting them all in gov.uk, taking out useless content and stuff that was confusing and making it clear, simple, and easy for people to use. You had this sense of "Oh, that's happened. That's happened." The feeling was so different from what a typical government feels like. I was like a kid in a candy store.
While I was in that office, Todd Park, at the time the US chief technology officer, called me to ask if I would come work in the federal government. He wanted me to run the Presidential Innovation Fellows, based a little bit on Code for America. Code for America was doing it at the local level, and he had started this program for federal agencies. I said to him, "I can't come to DC, unfortunately, but wow, do you need to see this!" And there's a very long story that proceeds from that. But over the course of the year, I did end up going, and we got USDS set up as our version of the Government Digital Service.
What is the status of USDS right now?
Jennifer: It's done some super amazing work. Andrew just mentioned online passports. I don't want to take away from the team at the Bureau of Consular Affairs, but there's a little bit in the paper about how USDS got that to be far better than it would have been if they hadn't been there to bring all the practices that Andrew's just been talking about and share them.
I think it's a good example — there are many other examples of great work they've done in the past couple of years. But I like that one because you can see the ways that Consular Affairs is now saying, "Oh, we don't want to go back. This is so much better. We can deploy changes every day, not every two months. We can actually test this with real users and understand before it launches what complaints people are going to have. We can instrument this site so that we know when there's a problem instead of waiting for it to fail."
That sense that Andrew just talked about, of being proud to be a public servant — I think you also see that when you have a great engagement between USDS and an agency. When the USDS gets redeployed to a different federal agency, you see the agency they just worked with sticks with those new practices, and also spreads them to other teams. They have no tolerance now for what used to be just the normal way of doing business.
That's a huge win that has far greater impact than just getting the passport renewal online. It's going to affect visas. It's going to affect everything else. Somebody at Consular Affairs talks to a different part of State and someone in State talks to somebody else over at Education. That's how these practices spread.
They've also done a whole lot on some of the core stuff like hiring and procurement practices. But they've really gotten their funding reduced. It used to be that they had their own appropriation and they could just send a team over. Now they have been functioning largely under a pass-the-hat mechanism. If you want them to work with, say, Consular Affairs, they have to say “Give us money for it,” which takes time and energy. Then they feel more like consultants, and they have to please the client, which is not the engagement that I think has gotten these really transformational results.
I'm really hoping that in the next administration, we can return to resourcing both USDS and the other groups that work in this way the way they should be resourced.
Will you each give me the object-level places in the British and American governments where you most clearly notice the lack of state capacity?
Andrew: In the UK, there's a couple of areas which GDS quite intentionally didn't tackle when I was working there, because we thought they were such big problems they would swallow us up and kill us.
There's a real need for more of this within our healthcare sector. The NHS (the National Health Service here) is not really an organization. It's a lot of different organizations under a brand, and health and care costs continue rising as we have a bunch of demographic issues very similar to the States.
We're spending more and more on it, but the quality of service and health outcomes that we're getting are way off the pace of what is needed. That's one area where an uplift in state capacity could be totally transformative.
The other is local government. The UK is a very centralized system, much more than the US, and the whole of the UK public sector has gone through a period of austerity and budget cuts over the last 10-15 years. But that's really been felt at the local government level.
There are plenty of critiques of GDS, but one I think that's valid is that we focused squarely on central government and we didn't get down into that local level. And government is government to people. They don't care if it's national or local. They just want it to get done, and the services at that level are hurting for lack of that state capacity.
Jennifer: I'll start with some places where it is working pretty well, just as we don't want to paint everything with such a broad brush. I manage my grandmother's Social Security benefits, and the site works great. It's pretty simple. For folks who needed to file their taxes for free this year, Direct File has incredible customer satisfaction reports.
I live part of the year in a California region where you have to burn your outdoor waste in the winter. You apply for your burn permit and it just comes right back at you. It's an automatic thing they send to you immediately.
Now, in Australia, if you want to put solar on your house, you do it the same way I get my burn permit — you type in a little thing and it comes back into your email the minute you hit submit. We don't have that here. All of the work that we are doing to try to transition to a low-carbon economy: enabling solar, wind, new transmission facilities — all of this stuff is really encumbered by that procedural bloat that we also tackle in this paper.
It is just really hard to put solar on your house. It's really hard to build new transmission lines. It's really hard to build housing. There are digital ways of working that I think help fix this — I don't necessarily mean to say there are great digital tools we can bring to the table, though that's also true. But when Andrew talks about a digital way of working, it's one that doesn't accept that all the steps in the process that exist today should just be digitized. We don't need a lot of those steps in the process!
You guys have been great. As we close, I'd love to hear a quick-fire set of recommendations from your paper. What do we need to do, in your view, to fix American state capacity?
Andrew: We've talked about the personnel side: hiring the right people and being able to exit the wrong people. There's a certain brutality to that, but it’s a really common thread across the UK and US. Both countries have hiring processes that have a hard time finding any people who would fit the roles that are urgently needed by the state. Neither gives public servants the tools to deal effectively with poor performance. Speaking as a former public servant, there's nothing more frustrating than poor performance just sitting there and adding to the rot in the feel of the place.
We talk about “stop people” and “go people.” We're not saying the government should not do compliance stuff: Clearly, that is mad. You need those people in the system. But equally, you've got to find the balance of those forces of people keen to drive things forward and push things through, as well as those making sure that's being done responsibly and appropriately.
That demands a certain kind of workforce planning that we think could come through more strongly from the Office of Personnel Management (OPM). The other one we touched on is closing the loop around outcomes. That means aligning funding in a way that supports working in those test-and-learn, more experimental ways, and allowing teams to test their riskiest assumptions first. And you know what? Maybe stop stuff if it doesn't work and turn the tap off.
One of the things that we certainly found in the UK is it's basically as easy to go to our finance ministry and ask for a hundred million dollars as it is for a million dollars. So obviously you're going to ask for a hundred million dollars.
We need to put mechanisms in place that allow for more agile and iterative funding, and to build an oversight system that supports that. Rather than building up to your big set-piece meeting where 50 people sit around the table every two months and check the work, how do you get that more incremental oversight and course correction from senior folks? So those are the two: closing the loop, and making sure that you're paying close attention to personnel.
Jennifer: We've got to reduce the burdens on civil servants, which means tackling this procedural bloat that they have to work under. Obviously, we're at a moment when LLMs and other forms of AI can help with that. We're going to see that coming out of DOGE. But I would also like to see it coming out of Congress and the agencies of their own accord.
But you really just have to find ways to right-size these procedures that are downstream of that procedural bloat.
We have to reduce the surface area for attack by adversarial legalism, essentially, where we've created a real no-ocracy [NB: Jen means that there are too many veto points in our society. “No-ocracy” should not be confused with “noocracy,” or rule by the wisest]. That is a big part of our not being able to build housing or green infrastructure. It's just really easy to sue to stop things that really need to happen.
We have to look at the Administrative Procedure Act and other cross-cutting laws that affect all areas of government. We have to tackle trade-off denial. We're always saying, "Oh, but this would be a good thing." Yes, it would be a good thing in the abstract, but combined with everything else, it's just too much.
Lastly, this idea of investing in digital and data infrastructure — we talked about the need to fund USDS. We also really want Congress to take seriously the notion that the way we fund these projects is actually the original sin of their big, bloated, mega-project development model. We have to move from what we call a project model to a product model. A product model has both the right team in the way that Andrew's talked about, but also is done in an incremental, agile way where you can learn along the way.
The funding model has to match the development model. We have a lot of work to do to educate authorizers, appropriators, and people who do oversight in what this model looks like. We hope that people take it seriously and can pick up whatever part really resonates for them.