A Data Mining Project
A deliverable from my Data Mining Foundations and Practice Certificate
7/1/2026


The goal of this project was to create an algorithm that would classify NHL players into their optimal lines on a Stanley Cup-winning team. I created three algorithms, with varying success, each of which classified players into one of the following sets of categories:
First, Second, Third, and Fourth Lines
First and Second Lines and a Bottom Six
Top and Bottom Six
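As a minimal sketch of how the three category sets relate to one another, the snippet below maps a four-line label onto each coarser scheme. The label strings, the `SCHEMES` dictionary, and the `relabel` helper are hypothetical names chosen for illustration; they are not taken from the project's actual code.

```python
# Hypothetical label names; the project's real labels are not shown in this write-up.
FOUR_LINES = ("first", "second", "third", "fourth")

# Each scheme maps a four-line label to the category it falls under.
SCHEMES = {
    "four_lines": {line: line for line in FOUR_LINES},
    "two_lines_plus_bottom_six": {
        "first": "first",
        "second": "second",
        "third": "bottom_six",
        "fourth": "bottom_six",
    },
    "top_six_bottom_six": {
        "first": "top_six",
        "second": "top_six",
        "third": "bottom_six",
        "fourth": "bottom_six",
    },
}


def relabel(line: str, scheme: str) -> str:
    """Translate a four-line label into the category used by `scheme`."""
    return SCHEMES[scheme][line]
```

For example, `relabel("third", "top_six_bottom_six")` gives `"bottom_six"`, while the same player keeps the label `"third"` under `"four_lines"`.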
The algorithms were trained on a dataset made up of the rosters of the past ten Stanley Cup-winning teams. This ensured that the models' standards for what separates a first-line player from a fourth-line player matched those of a winning team, rather than those of a losing team.
To evaluate the models, I used datasets of players from teams that lost in the Finals, such as the 2025-2026 Vegas Golden Knights. A second dataset, made up of players who changed line placements before going on to win a Cup, served as an additional evaluation set. On both datasets, accuracy increased as the number of categories decreased.
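One reason accuracy tends to rise with fewer categories can be sketched in a few lines: a player predicted on the fourth line who really belongs on the third counts as a miss with four categories, but as a hit once both lines are merged into a bottom six. The example below is only an illustration. The `accuracy` and `coarsen` helpers and the toy labels are made up for this sketch and are not the project's data or results. It also assumes the coarser labels are derived from the same four-line predictions, in which case accuracy can never go down; separately trained models, like the three built here, carry no such guarantee.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def coarsen(lines, groups):
    """Replace each line number (1-4) with its category under `groups`."""
    return [groups[line] for line in lines]


# Hypothetical groupings of line numbers into coarser categories.
THREE_CATEGORIES = {1: "first", 2: "second", 3: "bottom_six", 4: "bottom_six"}
TWO_CATEGORIES = {1: "top_six", 2: "top_six", 3: "bottom_six", 4: "bottom_six"}

# Made-up toy labels for six players, for illustration only.
true_lines = [1, 2, 3, 4, 3, 2]
pred_lines = [1, 1, 4, 4, 3, 3]

four_way = accuracy(true_lines, pred_lines)  # 3 of 6 correct
three_way = accuracy(
    coarsen(true_lines, THREE_CATEGORIES), coarsen(pred_lines, THREE_CATEGORIES)
)  # 4 of 6 correct
two_way = accuracy(
    coarsen(true_lines, TWO_CATEGORIES), coarsen(pred_lines, TWO_CATEGORIES)
)  # 5 of 6 correct
```

Because part of the gain comes from the easier task rather than from a better model, it can help to compare each score against the baseline for its own number of categories.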
This project was a great learning experience, as I was able to manipulate a large dataset and search it for insights while attempting to solve a real problem. In the future, I hope to build on this project by refining the algorithm and adapting it to specific use cases, such as trade targeting or prospect scouting.