(From left) Doctoral student Hannah Yamagata, research assistant professor Kushol Gupta, and postdoctoral fellow Marshall Padilla holding 3D-printed models of nanoparticles.
(Image: Bella Ciervo)
In the race to develop AI that understands complex images like financial forecasts, medical diagrams, and nutrition labels—essential for AI to operate independently in everyday settings—closed-source systems like ChatGPT and Claude currently set the pace. But no one outside their makers knows how those models were trained or what data they used, leaving open-source alternatives scrambling to catch up.
Now, researchers at Penn Engineering and the Allen Institute for AI (Ai2) have developed a new approach to training open-source models: using AI to create scientific figures, charts, and tables that teach other AI systems how to interpret complex visual information.
Their tool, CoSyn (short for Code-Guided Synthesis), taps open-source AI models’ coding skills to render text-rich images and generate relevant questions and answers, giving other AI systems the data they need to learn how to “see” and understand scientific figures. The research is detailed in a paper for ACL 2025, a global AI conference.
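The idea of code-guided synthesis can be illustrated with a minimal sketch. This is not CoSyn's actual implementation: the "model" here is a hard-coded string standing in for code that a language model would write, and every name (`GENERATED_CODE`, `render_image`, `make_qa`, `synthesize_example`) is hypothetical. The point it shows is that because the image comes from code and known data, questions and answers can be derived from that same data rather than labeled by hand.

```python
# Hypothetical stand-in for rendering code written by a language model.
# A real pipeline would request this code from a model; here it is fixed.
GENERATED_CODE = '''
def render(data):
    parts = ['<svg xmlns="http://www.w3.org/2000/svg">']
    parts.append(f'<text x="0" y="12">{data["title"]}</text>')
    for i, (label, value) in enumerate(data["values"].items()):
        y = 20 + i * 30
        parts.append(f'<rect x="120" y="{y}" width="{value * 10}" height="20"/>')
        parts.append(f'<text x="0" y="{y + 15}">{label}: {value}</text>')
    parts.append("</svg>")
    return "".join(parts)
'''


def render_image(code: str, data: dict) -> str:
    """Run the generated code and return the rendered image as an SVG string."""
    namespace: dict = {}
    exec(code, namespace)  # assumption: the code is trusted; sandbox it otherwise
    return namespace["render"](data)


def make_qa(data: dict) -> list[dict]:
    """Derive question-answer pairs from the data behind the image."""
    values = data["values"]
    top = max(values, key=values.get)
    return [
        {
            "question": f"Which item has the highest value in '{data['title']}'?",
            "answer": top,
        },
        {
            "question": "What is the total across all items?",
            "answer": str(sum(values.values())),
        },
    ]


def synthesize_example(data: dict) -> dict:
    """Pair one rendered image with its question-answer annotations."""
    return {"image": render_image(GENERATED_CODE, data), "qa": make_qa(data)}


if __name__ == "__main__":
    sample = {
        "title": "Sugar per serving (g)",
        "values": {"Cereal A": 12, "Cereal B": 7},
    }
    example = synthesize_example(sample)
    print(example["image"])
    for pair in example["qa"]:
        print(pair["question"], "->", pair["answer"])
```

In this toy version the chart is a plain SVG bar chart, chosen only so the sketch needs no third-party libraries; the article does not say which rendering tools CoSyn uses.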
“This is like taking a student who’s great at writing and asking them to teach someone how to draw, just by describing what the drawing should look like,” says Yue Yang, co-first author and research scientist at Ai2’s PRIOR: Perceptual Reasoning and Interaction Research group.
“Training AI with CoSyn is incredibly data efficient,” says Mark Yatskar, assistant professor in CIS and Yang’s doctoral co-advisor. “We’re showing that synthetic data can help models generalize to real-world scenarios that could be unique to a person’s needs, like reading a nutrition label for someone with low vision.”
By building CoSyn entirely with open-source tools, the researchers hope to democratize access to powerful vision-language training methods without the ethical and legal challenges surrounding web scraping and copyrighted content.
Read more at Penn Engineering Today.
Ian Scheffler