Sponsored by the University of Connecticut Statistical Data Science Lab, the UCSAS 2024 USOPC Data Challenge asked entrants to develop an analytical model that would identify two teams of five gymnasts, one of men and one of women, that would maximize the United States’ chances of success in gymnastics at the upcoming Paris Olympics.
Shilong Ren, a master’s student in analytics at USC Viterbi’s Daniel J. Epstein Department of Industrial & Systems Engineering, was selected as one of three finalists. The University of Connecticut flew Ren out to attend its sports analytics conference, where he presented his findings.
“When I received an email saying I was a finalist, I was shocked,” said Ren. “But after reflecting on my work, I felt confident and excited to present my outcomes at the conference.”
For the men, Ren’s model predicted that the following five American athletes would form the best men’s team: Dallas Hale, Samuel Mikulak, Frederick Richard, Shane Wiskus, and Alec Yoder. For the women, he identified: Simone Biles, Sunisa Lee, Kaliya Lincoln, Konnor McClain, and Mykayla Skinner.
“Beyond these findings, what is more interesting and important is the whole algorithm and how we leveraged different analytical techniques to create the optimized teams,” said Ren.
The Olympics is structured such that athletes compete in several different events. Not every athlete competes in all of them. Thus, Ren had to choose a unique combination of athletes, rather than simply the best five individual athletes, to predict a team capable of winning the most medals.
With the data from the athletes’ previous competitions, Ren first created probability densities for individuals’ scores in each event. Ren then cleaned these distributions by regenerating and resampling as much data as possible to ensure the fairest outcome. He also streamlined the process.
“As I tried to optimize our algorithm, I found an interesting thing: the time we used to generate one score and a thousand scores is the same,” said Ren. “So, I modified the program to generate 1000 scores at once and store them in a stack. This approach is much faster than generating a new score each time we need one.”
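The batching idea Ren describes can be sketched roughly as follows. This is a minimal illustration, not his actual program: the class name `ScoreStack`, the normal distribution, and the example mean and standard deviation are all assumptions made for the sketch. It uses only Python's standard library, so the batch is filled in a loop. In a vectorized numerical library, the whole batch would typically come from a single call, which is where the kind of speedup Ren mentions would arise.

```python
import random


class ScoreStack:
    """Hands out simulated scores one at a time, refilling in batches.

    Hypothetical helper: the article does not describe the real data structure
    beyond "generate 1000 scores at once and store them in a stack".
    """

    def __init__(self, mean, stdev, batch_size=1000, seed=None):
        self.mean = mean
        self.stdev = stdev
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.stack = []
        self.refills = 0  # number of batches generated so far

    def _refill(self):
        # Assumption: scores follow a normal distribution. The article only
        # says that a probability density was fitted per athlete and event.
        self.stack = [
            self.rng.gauss(self.mean, self.stdev) for _ in range(self.batch_size)
        ]
        self.refills += 1

    def next_score(self):
        # Generate a new batch only when the stack has run dry.
        if not self.stack:
            self._refill()
        return self.stack.pop()


# Illustrative parameters only; these are not real athlete statistics.
vault = ScoreStack(mean=14.2, stdev=0.3, seed=0)
simulated = [vault.next_score() for _ in range(2500)]
print(vault.refills, len(vault.stack))
```

Drawing 2,500 scores this way triggers only three batch generations rather than 2,500 separate ones, and the unused scores stay on the stack for later simulations.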
Using the refined probability distribution for each athlete on each apparatus, Ren employed a genetic algorithm to form teams with the highest expected medal count. Mimicking natural selection in biology, a genetic algorithm begins with an initial population (in Ren’s case, the five best-performing athletes) and simulates that population’s success over multiple generations. As more generations are simulated, the teams become stronger, since the weakest teams are eliminated before they can algorithmically reproduce. The result: a team of athletes with the most historical success.
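For readers curious about what such a search looks like in code, here is a small sketch of a genetic algorithm that picks a five-person team. It is not Ren's implementation, and several pieces are assumptions: the athletes and scores are synthetic placeholders, the initial population is a set of random teams, and the fitness function (the sum of the three best average scores on each apparatus) is a simplified stand-in for the expected medal count described in the article. The function names `team_fitness` and `evolve_team` are hypothetical.

```python
import random


def team_fitness(team, scores, counted=3):
    """Sum, over all apparatuses, of the `counted` best scores on the team.

    Stand-in objective: the article's model maximized expected medal count.
    """
    n_apparatus = len(next(iter(scores.values())))
    total = 0.0
    for a in range(n_apparatus):
        best = sorted((scores[name][a] for name in team), reverse=True)[:counted]
        total += sum(best)
    return total


def evolve_team(scores, team_size=5, pop_size=40, generations=60,
                mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    athletes = sorted(scores)
    # Initial population: random candidate teams.
    population = [frozenset(rng.sample(athletes, team_size)) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half, so the weakest teams never reproduce.
        ranked = sorted(population, key=lambda t: team_fitness(t, scores), reverse=True)
        survivors = ranked[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            # Crossover: a child team draws its members from two parent teams.
            mom, dad = rng.sample(survivors, 2)
            child = set(rng.sample(sorted(mom | dad), team_size))
            # Mutation: occasionally swap one member for an outside athlete.
            if rng.random() < mutation_rate:
                outsiders = [a for a in athletes if a not in child]
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice(outsiders))
            children.append(frozenset(child))
        population = survivors + children
    return max(population, key=lambda t: team_fitness(t, scores))


# Synthetic placeholder data: 12 made-up athletes, 4 apparatuses each.
data_rng = random.Random(1)
scores = {
    f"athlete_{i:02d}": [round(data_rng.uniform(12.5, 15.0), 2) for _ in range(4)]
    for i in range(12)
}
best = evolve_team(scores)
print(sorted(best), round(team_fitness(best, scores), 2))
```

Because only a few scores count on each apparatus, a team of specialists can outscore the five athletes with the best overall averages, which is why the search works on whole teams rather than ranking individuals.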
Although he didn’t win the competition, Ren said he was honored to make it as far as he did.
Born in Guangzhou, China, Ren studied math and applied mathematics for his bachelor’s degree at the Jinan University-University of Birmingham Joint Institute. Aiming to enhance his job market prospects, he began researching master’s programs in analytics for further specialization.
“After I looked at the classes and structure at USC, I chose it,” said Ren. “I would learn about data analytics, visualization techniques, database management and more. Plus, it doesn’t hurt being in LA.”
Ren credits the courses and guidance he received from Bruce Wilcox, an analytics expert and senior lecturer of industrial and systems engineering, among others, for helping him succeed in the competition and for preparing him for future professional success.
“Without Bruce, I wouldn’t have joined this competition and done as well as I did,” said Ren. “The professors and classes at USC have prepared me greatly for my future career, and I find myself using the tools I’ve picked up from this school frequently during my summer internship.”
Published on July 24th, 2024
Last updated on July 24th, 2024