STEM Program

Big Data Analytics and Predictive Modeling: Using RStudio to Predict Game Outcomes

Faculty Advisor: Technical Consultant for DoD; Former Professor, Systems Engineering, United States Military Academy

What is RStudio and R Programming?

RStudio is an integrated development environment (IDE) for R, a programming language for statistical computing and graphics. RStudio helps users analyze data, build models, execute and debug code, and manage their workspace.

R is a widely used data science tool. It is applied in business analysis, statistical analysis, social science research, machine learning and deep learning, AI products, and Kaggle competitions.

Research Practicum Introduction

The dramatic rise in team participation in “virtual” worlds, specifically the Multiplayer Online Battle Arena (MOBA) game League of Legends, presents an opportunity to explore high-density, long-range data on teamwork at a relatively large scale. Today's MOBA environments, where groups may persist over long time scales, give a sense of the trajectory and possibilities for team instrumentation. The extensive data available there may be used to develop new methods and empirical results on team performance through predictive modeling.

This program will teach students how to leverage RStudio to load data, conduct statistical analysis, and carry out predictive modeling in order to evaluate the performance of teams. Can we anticipate the winner of a given match? Can we develop proxy measures to foresee group performance and match outcome? 
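The load, analyze, and predict workflow described above might look roughly like the following sketch in base R. It is an illustration under stated assumptions, not the program's actual material: the data are synthetic, and the column names (gold_diff, kill_diff, blue_win) are hypothetical stand-ins for whatever a real League of Legends match data set would provide. Logistic regression is used here only as one common way to predict a win/loss outcome.

```r
# Illustrative sketch only: synthetic data, hypothetical column names.
set.seed(42)
n <- 500

# Made-up early-game features for each match (not real game data).
matches <- data.frame(
  gold_diff = rnorm(n, mean = 0, sd = 2000),  # assumed: blue-team gold lead
  kill_diff = rnorm(n, mean = 0, sd = 3)      # assumed: blue-team kill lead
)

# Simulate outcomes so that larger leads make a blue-team win more likely.
win_prob <- plogis(0.001 * matches$gold_diff + 0.2 * matches$kill_diff)
matches$blue_win <- rbinom(n, size = 1, prob = win_prob)

# Hold out 20% of the matches to check predictions on unseen data.
test_idx <- sample(n, size = n * 0.2)
train <- matches[-test_idx, ]
test <- matches[test_idx, ]

# Fit a logistic regression: probability that the blue team wins.
model <- glm(blue_win ~ gold_diff + kill_diff, data = train, family = binomial)

# Predict win probabilities for the held-out matches and score them.
pred_prob <- predict(model, newdata = test, type = "response")
pred_win <- as.integer(pred_prob > 0.5)
accuracy <- mean(pred_win == test$blue_win)

print(summary(model)$coefficients)
cat("Held-out accuracy:", round(accuracy, 3), "\n")
```

With real match data, the data.frame() step would be replaced by loading a file (for example with read.csv()), and the choice of predictors would itself be part of the research question about proxy measures for team performance.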

Students will also learn the general and subject-specific research and academic writing methods used in universities and scholarly publications. Each student will focus on an individual topic and complete their own work product by the end of the program.

Project Topics

  • What can the MOBA game, League of Legends, teach us about team performance?

  • How can large data sets be analyzed for predictive modeling?

  • How can RStudio be leveraged for analysis and modeling?

  • How can team performance and game outcomes be predicted?

Program Detail

  • Cohort Size: 3-5 students

  • Workload: Around 4-5 hours per week (including class time and homework time)

  • Target Students: 9th-12th grade students interested in Business Analysis, Statistics, Data Science, Machine Learning and Deep Learning, or Game Design. This project is best for students with genuine curiosity about the subject, diligence, and initiative. Prior coding experience (Python or R) is a plus.