From smartphones and fitness trackers to social media posts and COVID-19 cases, the past few years have seen an explosion in the amount and types of data that are generated daily. To help make sense of these large, complex datasets, the field of data science has grown, providing methodologies, tools, and perspectives across a wide range of academic disciplines.
But the challenges that lie ahead for data scientists and engineers, from developing algorithms that don’t exacerbate biases to ensuring privacy protections, are equally complex and, in some instances, require entirely new ways of thinking.
As part of its $750 million investment in science, engineering, and medicine, the University has committed to supporting the future needs of this field. To this end, the Innovation in Data Engineering and Science (IDEAS) initiative will help Penn become a leader in developing data-driven approaches that can transform scientific discovery, engineering research, and technological innovation.
“The IDEAS initiative is game-changing for our University,” says President Amy Gutmann. “This new investment allows us to boost our interdisciplinary efforts across campus, recruit phenomenal additional team members, and generate an even more sound foundation for discovery, experimentation, and design. This initiative is a clear statement that Penn is committed to taking data science head-on.”
Building on a foundation of existing expertise
Led by the School of Engineering and Applied Science, the IDEAS initiative builds upon the steadily gathering momentum of its data-centric research. The Warren Center for Network and Data Sciences has been a major catalyst for this type of work, generating foundational research on ethical algorithms and data privacy, as well as collaborations that have drawn in faculty from the Wharton School, Law School, Perelman School of Medicine, and beyond. In addition, Wharton’s Department of Statistics and Data Science is an active partner in research and teaching initiatives that apply statistical modeling across a wide variety of fields.
“One of the unique things about data science and data engineering is that it’s a very horizontal technology, one that is going to be impacting every department on campus,” says George Pappas, Electrical and Systems Engineering Department chair. “When you have a horizontal technology in a competitive area, we have to figure out specific areas where Penn can become a worldwide leader.”
To do this, IDEAS aims to recruit new faculty across three research areas: artificial intelligence (AI) to transform scientific discovery, trustworthy AI for autonomous systems, and understanding connections between the human brain and AI.
Penn already has a strong foundation in using AI for scientific discovery thanks in part to investments in basic research facilities such as the Singh Center for Nanotechnology and the Laboratory for Research on the Structure of Matter. Additionally, there are centers focused on connecting researchers from different fields to address complex scientific questions, including the Center for Soft and Living Matter, Center for Engineering Mechanobiology, and Penn Institute for Computational Science.
Developing “trustworthy” algorithms, ones that work reliably outside the situations in which they were trained, is another key component of the IDEAS initiative. Ongoing research at the Penn Research in Embedded Computing and Integrated Systems Engineering (PRECISE) Center and the General Robotics, Automation, Sensing & Perception (GRASP) Lab, along with DARPA-funded projects on the safety of AI-based aircraft control, provides a starting point for furthering Penn’s research portfolio on safe, explainable, and trustworthy autonomous systems.
In neuroscience, and particularly in the study of how the human brain resembles AI and machine learning approaches, research from PIK Professor Konrad Kording and Dani Bassett’s Complex Systems lab exemplifies the types of cross-disciplinary efforts that are essential for addressing complex questions. By recruiting additional faculty in this area, IDEAS will help Penn make strides in bio-inspired computing and in future life-changing discoveries that could address cognitive disorders and nervous system diseases.
Fostering a data science ecosystem
In addition to the IDEAS initiative, Penn is also growing its existing data science ecosystem with the construction of Amy Gutmann Hall. Slated for completion in 2024, the University’s new data science hub will physically centralize resources, including software, hardware, and intellectual expertise, and will also provide a space for members of the campus community to connect with experts, get advice, and find potential collaborators.
“Penn has an incredible opportunity to do data science in a different way, and I see IDEAS and Amy Gutmann Hall as very much being a part of that,” says Zack Ives, Computer and Information Science Department chair. “One of the great things about Penn is that we have strong engagement across campus, whether it’s with medicine, humanities, physical or life sciences, and that will allow us to maximize the impact of these new resources and connections.”
Data-driven discoveries in Arts & Sciences
In addition to being a partner in the IDEAS initiative, the School of Arts & Sciences has launched The Data Driven Discovery initiative (DDD), a central pillar of its strategic plan. Current activities coordinated by DDD include funding for data science-focused postdoctoral fellows in the natural and social sciences; Data Science for Social Good seed grants, which aim to fund early-stage projects that connect faculty, students, and agencies through data-driven projects; and the Summer Undergraduate Data Science Hangout to bring together students doing data-driven research.
Astronomy professor Bhuvnesh Jain, who will co-direct the DDD initiative with Greg Ridgeway, says that the growth of research on complex datasets across a diverse set of fields, from the humanities to the sciences, has helped spur campus-wide efforts like IDEAS.
“Along the engineering-science connection, there are constantly emerging data science tools that we’re all excited to apply. We want to create the right environment to learn and share,” Jain says. “This is a new mode of research, where we tackle common problems involving data across very diverse disciplines.”
Penn and the future of data science
Jain is looking forward to the “exciting possibilities” that IDEAS and the growth of the data science ecosystem can provide to the Penn community. “We definitely believe that, whether it’s criminology and sociology on one side and astronomy and engineering on the other, we can make it happen on a much bigger scale and place Penn at the cutting edge,” he says.
Vijay Kumar, Nemirovsky Family Dean of Penn Engineering, sees IDEAS and Amy Gutmann Hall as being pillars for the future of Penn Engineering and its role in the University’s mission to translate new knowledge into action for the public good.
“This is a truly transformative initiative that will solidify Penn as the premier place for data science,” Kumar says. “The IDEAS initiative and Amy Gutmann Hall can make Penn a catalyzer for innovation impact in the area of data science and engineering, not only across campus but also across the region, through partnerships with Philadelphia schools and educational programs. By bringing data-driven thinking and learning to everyone, we can truly tackle the problems of the 21st century.”
Bhuvnesh Jain is the Walter H. and Leonore C. Annenberg Professor in the Natural Sciences in the Department of Physics and Astronomy in the School of Arts & Sciences at the University of Pennsylvania.