
STEM Program
Big Data Analytics and Predictive Modeling: Using RStudio to Predict Game Outcomes
Faculty Advisor: Technical Consultant for DoD; Former Professor, Systems Engineering, United States Military Academy
What is RStudio and R Programming?
RStudio is an integrated development environment (IDE) for R, a programming language for statistical computing and graphics. RStudio can help users analyze data, build models, execute code, debug, and manage their workspace.
R is a popular language for data science. It is widely used in business analysis, statistical analysis, social science research, machine learning and deep learning, AI products, and Kaggle competitions.
Research Program Introduction
The dramatic rise in team participation in “virtual” worlds, specifically in the Multiplayer Online Battle Arena (MOBA) game League of Legends, presents an opportunity to explore high-density, long-range data on teamwork at a relatively large scale. Today's MOBA environments, where groups may persist over long time scales, give a sense of the trajectory and possibilities for team instrumentation. This extensive data may be used to develop new methods and empirical results on team performance through predictive modeling.
This program will teach students how to leverage RStudio to load data, conduct statistical analysis, and perform predictive modeling to evaluate teams' performance. Can we anticipate the winner of a given match? Can we develop proxy measures to foresee group performance and match outcomes?
Students will also learn general and subject-specific research and academic writing methods used in universities and scholarly publications. Each student will focus on an individual topic and complete their own work product by the end of the program.
Project Topics
What can the MOBA game League of Legends teach us about team performance?
How do we analyze large sets of data for predictive modeling?
How can we leverage RStudio for analysis and modeling?
How can we predict team performance and game outcomes?
Program Details
Cohort size: 3 to 5 students
Workload: Around 4 to 5 hours per week (including class and homework time)
Target students: 9th to 12th graders interested in business analysis, statistics, data science, machine learning, deep learning, or game design. This project is best for students with genuine curiosity, diligence, and initiative. Prior coding experience (Python or R) is a plus.