OK, this is one I’ve been waiting to share for a *long* time – the first ever demonstration of deep reinforcement learning on a nuclear fusion research device! www.nature.com/articles/s41586-021-04301-9
Nuclear fusion is the holy grail of clean energy - abundant fuel, small footprint, runs 24/7, zero meltdown risk or long-lasting waste. But despite 70 years of work, it has yet to become a reality. It is more familiar from science fiction than real life.
The most mature approach, magnetic confinement fusion, works by confining a plasma in a donut-shaped magnetic bottle called a tokamak. Keeping this plasma stable is incredibly complex - plasma is a 3D self-organizing fluid that must be contained at extremely high temperatures.
If you don’t have active control on these machines, the plasma can disrupt - in the worst case, causing serious damage to the machine. Historically, these control algorithms were based on classical control and had to be carefully designed by experts.
www.youtube.com/watch?v=CUfR819hIDg
We set out to replace these classical control algorithms with deep neural networks trained by reinforcement learning. Because these machines can only be run a few dozen times a day, we trained these policies in simulation and transferred them to a real machine.
With our fantastic partners at @EPFL, we ran these deep RL policies on the Tokamak à Configuration Variable (TCV). This is an excellent machine for testing out new control methods, because it is capable of forming exotic plasma shapes that no other machine can run.
While we don’t set any records for fusion power (TCV is a relatively small machine), we might set a record for the *weirdest* plasma stabilized - a configuration called the “droplet”, which is really two plasmas on top of each other that we keep stable essentially indefinitely.
With our deep RL approach validated on a real machine, we are hopeful that these methods can be used to push the envelope of what magnetic confinement fusion is capable of, taming plasma configurations that had previously been considered too challenging to pursue.
My role in this project was primarily as a catalyst - I was the one who organized a hackathon on this at DeepMind and invited control experts from EPFL to join back in 2018. It’s been incredibly gratifying to see this mature into a real project from the seeds that I helped plant.
But of course the real credit goes to the people who worked on this day in and day out: @317070 @ffelici Jonas Buchli Michael Neunert @brendantracey @FrancescoCarpa9 @timoewalds Roland Hafner and many others.