Scientists run into a lot of tradeoffs trying to build and scale up brain-like systems that can perform machine learning. For instance, artificial neural networks are capable of learning complex language and vision tasks, but the process of training computers to perform these tasks is slow and requires a lot of power.
Training machines to learn digitally but perform tasks in analog—meaning the input varies with a physical quantity, such as voltage—can reduce time and power, but small errors can rapidly compound. An electrical network that physics and engineering researchers from the University of Pennsylvania previously designed is more scalable because errors don’t compound in the same way as the size of the system grows, but it is severely limited as it can only learn linear tasks, ones with a simple relationship between the input and output.
Now, the researchers have created an analog system that is fast, low-power, scalable, and able to learn more complex tasks, including “exclusive or” (XOR) relationships and nonlinear regression. The system is called a contrastive local learning network; its components evolve on their own based on local rules, without knowledge of the larger structure. Physics professor Douglas J. Durian compares it to how neurons in the human brain don’t know what other neurons are doing and yet learning emerges.
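To see why XOR is a meaningful step beyond linear tasks, consider the small sketch below. It is an illustration in ordinary software, not the researchers' circuit: it fits a best-possible linear model to the four XOR cases and shows that the fit collapses to a constant guess, while adding a single nonlinear term (here, a hypothetical product feature chosen only for illustration) makes the task exactly learnable.

```python
import numpy as np

# The four XOR cases: output is 1 only when exactly one input is 1.
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([0, 1, 1, 0], dtype=float)


def least_squares_predictions(features, labels):
    """Fit weights plus a bias by least squares and return the fitted outputs."""
    design = np.hstack([features, np.ones((len(features), 1))])
    weights, *_ = np.linalg.lstsq(design, labels, rcond=None)
    return design @ weights


# A purely linear model cannot separate XOR; its best fit is a constant.
linear_pred = least_squares_predictions(inputs, targets)

# Assumption for illustration: one nonlinear feature, the product x1 * x2.
product = inputs[:, :1] * inputs[:, 1:]
nonlinear_pred = least_squares_predictions(np.hstack([inputs, product]), targets)

if __name__ == "__main__":
    print("linear fit:   ", np.round(linear_pred, 3))
    print("nonlinear fit:", np.round(nonlinear_pred, 3))
```

The product feature stands in for any source of nonlinearity; the physical network gets its nonlinearity from its circuit elements rather than from a hand-built feature like this one.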
“It can learn, in a machine learning sense, to perform useful tasks, similar to a computational neural network, but it is a physical object,” says physicist Sam Dillavou, a postdoc in the Durian Research Group and first author on a paper about the system published in Proceedings of the National Academy of Sciences.
“One of the things we’re really excited about is that, because it has no knowledge of the structure of the network, it’s very tolerant to errors, it’s very robust to being made in different ways, and we think that opens up a lot of opportunities to scale these things up,” engineering professor Marc Z. Miskin says.
“I think it is an ideal model system that we can study to get insight into all kinds of problems, including biological problems,” physics professor Andrea J. Liu says. She also says it could be helpful in interfacing with devices that collect data that require processing, such as cameras and microphones.
In the paper, the authors say their self-learning system “provides a unique opportunity for studying emergent learning. In comparison to biological systems, including the brain, our system relies on simpler, well-understood dynamics, is precisely trainable, and uses simple modular components.”
This research is based on the Coupled Learning framework that Liu and postdoc Menachem (Nachi) Stern devised and published in 2021. In this paradigm, a physical system that was not designed for a particular task adapts to applied inputs to learn that task, using only local learning rules and no centralized processor.
Dillavou says he came to Penn specifically for this project, and he worked on translating the framework from simulation into its current physical design, which can be made using standard circuitry components. “One of the craziest parts about this is the thing really is learning on its own; we’re just kind of setting it up to go,” Dillavou says. Researchers feed in only voltages as the input, and then the transistors that connect the nodes update their properties based on the Coupled Learning rule.
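The flavor of this kind of learning can be sketched in a few lines of simulation. The example below is a simplified stand-in, not the transistor circuit described in the paper: it assumes a small network of ideal linear resistors, a made-up task (make the output node sit at one quarter of the input voltage), and the commonly cited form of the Coupled Learning update, in which each edge compares its own voltage drop in a "free" state and in a "clamped" state where the output is nudged toward the desired value. All names and parameter values (`train`, `eta`, `alpha`, the five-node layout) are illustrative assumptions.

```python
import itertools
import numpy as np


def solve_voltages(n_nodes, edges, k, fixed):
    """Solve Kirchhoff's laws: voltages at all nodes given some fixed ones."""
    lap = np.zeros((n_nodes, n_nodes))
    for (i, j), kij in zip(edges, k):
        lap[i, i] += kij
        lap[j, j] += kij
        lap[i, j] -= kij
        lap[j, i] -= kij
    fixed_idx = sorted(fixed)
    free_idx = [n for n in range(n_nodes) if n not in fixed]
    v = np.zeros(n_nodes)
    v[fixed_idx] = [fixed[n] for n in fixed_idx]
    rhs = -lap[np.ix_(free_idx, fixed_idx)] @ v[fixed_idx]
    v[free_idx] = np.linalg.solve(lap[np.ix_(free_idx, free_idx)], rhs)
    return v


def train(target_gain=0.25, steps=2000, eta=0.1, alpha=0.05, seed=0):
    """Train edge conductances so that V_out approaches target_gain * V_in."""
    rng = np.random.default_rng(seed)
    n_nodes, src, gnd, out = 5, 0, 1, 4
    edges = list(itertools.combinations(range(n_nodes), 2))
    k = np.ones(len(edges))  # every edge starts with the same conductance
    for _ in range(steps):
        x = rng.uniform(0.5, 1.5)  # random input voltage for this step
        # Free state: only the inputs are imposed.
        free = solve_voltages(n_nodes, edges, k, {src: x, gnd: 0.0})
        # Clamped state: the output is nudged a fraction eta toward the target.
        nudged = free[out] + eta * (target_gain * x - free[out])
        clamped = solve_voltages(n_nodes, edges, k, {src: x, gnd: 0.0, out: nudged})
        # Local rule: each edge uses only its own two voltage drops.
        dv_free = np.array([free[i] - free[j] for i, j in edges])
        dv_clamped = np.array([clamped[i] - clamped[j] for i, j in edges])
        k += (alpha / (2 * eta)) * (dv_free**2 - dv_clamped**2)
        k = np.clip(k, 0.05, None)  # keep conductances positive
    return edges, k


def measured_gain(edges, k, n_nodes=5, src=0, gnd=1, out=4):
    """Output voltage for a 1 V input, i.e. the learned gain."""
    return solve_voltages(n_nodes, edges, k, {src: 1.0, gnd: 0.0})[out]


if __name__ == "__main__":
    edges, k = train()
    print("gain before training:", measured_gain(edges, np.ones(len(edges))))
    print("gain after training: ", measured_gain(edges, k))
```

Note that no step in the loop computes a global gradient or looks at the network's layout; the update for each edge depends only on the two voltage drops across that edge, which is the sense in which the rule is local.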
“Because the way that it both calculates and learns is based on physics, it’s way more interpretable,” Miskin says. “You can actually figure out what it’s trying to do because you have a good handle on the underlying mechanism. That’s kind of unique because a lot of other learning systems are black boxes where it’s much harder to know why the network did what it did.”
Durian says he hopes this “is the beginning of an enormous field,” noting that another postdoc in his lab, Lauren Altman, is building mechanical versions of contrastive local learning networks.
The researchers are currently working on scaling up the design, and Liu says there are a lot of questions about the duration of memory storage, effects of noise, the best architecture for the network, and whether there are better forms of nonlinearity.
“It’s not really clear what changes as we scale up a learning system,” Miskin says. “If you think of a brain, there’s a huge gap between a worm with 300 neurons and a human being, and it’s not obvious where those capabilities emerge, how things change as you scale up. Having a physical system which you can make bigger and bigger and bigger and bigger is an opportunity to actually study that.”
Sam Dillavou is a postdoc in the Durian Research Group.
Douglas J. Durian is the Mary Amanda Wood Professor of Physics and Astronomy in the School of Arts & Sciences.
Marc Z. Miskin is an assistant professor of electrical and systems engineering in the School of Engineering & Applied Science.
Andrea J. Liu is the Hepburn Professor of Physics in the School of Arts & Sciences.
Other authors are Benjamin D. Beyer and Menachem Stern of the Department of Physics and Astronomy in the School of Arts & Sciences.
This research was supported by the National Science Foundation (MRSEC/DMR-1720530, MRSEC/DMR-2309043, and DMR-2005749), the Simons Foundation (327939), and the U.S. Department of Energy (DE-SC0020963).