
As the first UC campus to establish a data science major, UC Irvine has been an educational leader in this growing field for close to a decade. In addition to launching the major in 2015, UCI recently started hosting summer programs for both high school and undergraduate students interested in data analytics and biostatistics, and its SoCal Data Science program helps recruit, train and dispatch a diverse workforce of STEM and data science majors.

Now, the SoCal Data Science program is scaling up, thanks to a $1.3 million Pathways Development grant from the California Education Learning Lab.

The SoCal Data Science Program
The SoCal Data Science program kicked off in 2022 with a three-year $1.5 million grant from the National Science Foundation. The program is a collaborative effort between faculty from UCI, California State University Fullerton (CSUF) and Cypress College, and the inaugural cohort of 32 fellows, drawn from all three campuses, attended a one-week training bootcamp and conducted a six-week research project at UCI last summer.

With the program welcoming 30 new fellows later this month, Statistics Professor Babak Shahbaba of UCI’s Donald Bren School of Information and Computer Sciences (ICS) is already looking to the future. “As we are working with this new cohort, finishing the second year, we’re [thinking] about continuity and sustainability,” says Shahbaba, the program lead at UCI. “We don’t want the program — and the benefit that it provides — to end when the [NSF] grant ends.”

The new four-year Pathways Development grant guarantees that the benefits will continue.

Creating a Pipeline of Talent
The grant proposal, titled “PIPE-LINE” (Programs for Institutional Pathway Engagement — acceLerating INfrastructure and Education), outlines ways to develop pathways and overcome equity gaps in data science learning. CSUF will serve as the host organization, with Professor Jessica Jaynes as project lead partnering closely with faculty from UCI, Riverside City College (RCC) and Rio Hondo College (RHC).

“This grant will help us continue and expand our activities around data science education at the undergraduate level,” says Shahbaba. “And this actually goes beyond the students. It includes developing training materials and hosting workshops for instructors to build the required infrastructure.”

The SoCal Data Science program aligns perfectly with the California Learning Lab’s Grand Challenge on Building Critical Mass for Data Science. “[The Learning Lab] wanted to establish this relationship between different tiers of academic institutes in California, and we were already doing that with a UC, Cal State and community college,” says Shahbaba.

The PIPE-LINE proposal has three overarching goals:

  • create pathways in data science between the three institutional tiers by establishing new courses and programs;
  • address equity gaps in data science education by expanding access to technology resources, incorporating culturally relevant pedagogy, and applying a “content-with-context” approach; and
  • support a diverse group of faculty and students in data science, resulting in a robust community of data science learners.

The proposal doesn’t extend the number of fellows (120) expected to be funded with the NSF grant; rather, it helps ensure other students have access to the same type of data science education outside of the program. “It’s more about building the infrastructure to replicate the training that we’ve been providing to these fellows and to scale it up,” says Shahbaba.

As part of this work, ICS will host training workshops and an annual Data Science Education Conference, helping instructors establish data science courses at community colleges and building a strong sense of community. “And at Cal State Fullerton,” adds Shahbaba, “we will look into setting up a data science major and minor.”

This creates a pipeline of talent not only for industry but also for graduate programs. “We provide this program to help the students, but also it helps UCI in a way,” explains Shahbaba. Two of the fellows from Cal State Fullerton in last year’s cohort were among the 11 students accepted into UCI’s Statistics Ph.D. program this year. “So, it’s a two-way gateway. We transfer the knowledge and experience that we have developed over many years, and, in return, we’re recruiting these highly diverse, skilled and motivated students.”

PIPE-LINE thus aims to arm the next generation of data scientists with the knowledge and tools needed to tackle tomorrow’s biggest challenges. Students well trained in data science at the undergraduate level strengthen data analytics in all industries, from healthcare to transportation. They also support graduate-level statistical research that will advance our understanding of issues ranging from disease prevention to climate change and lead to novel solutions.

Shani Murray