When I was about 10 years old, my sister gave me a book on the Baseball Hall of Fame for Christmas. I absolutely devoured it. I memorized the players, the stories, and most of all, the stats. I know, a kid who loved baseball stats and now loves data, how original! Still, there's something beautiful about the amount of data out there on this simple game, and even as I've branched into many other fields, I often find myself coming back to baseball and asking new questions. Below you'll find some projects that grew out of a few of these questions.
After scraping the results of every baseball game going back to 1900 and figuring out every team's regular season series win percentage (see project below for more information), I wanted to build a simple way to share the data I had compiled with the world. To me, the clearest way to share data is to make it fun and easy to visualize, so I used Streamlit to build an interactive dashboard. I then deployed the dashboard on AWS. I hope you enjoy playing around with the final product below!
This project started with what I thought would be a fairly simple question to answer: which teams in MLB history have won the highest percentage of their regular season series? Unfortunately, I was unable to find any sources that tracked the regular season series success of MLB teams directly. So, using Beautiful Soup, I scraped the result of every baseball game going back to 1900 from Baseball-Reference's website, compiled and cleaned the results, and then wrote a function to calculate each team's regular season series win percentage. Below you can find a notebook walking through my code for scraping, cleaning, and compiling the data, and above you can find a dashboard I built out of the resulting data.
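To give a rough idea of the series calculation, here is a minimal sketch. It is not the code from the notebook. It assumes a hypothetical input format of one `(home_team, away_team, home_runs, away_runs)` tuple per game, ordered so that games from the same series sit next to each other (for example, sorted by home team and then by date). It also assumes that a series is any run of consecutive games with the same home and away team, and that a split series counts as played but not won. The function name `series_win_pct` is made up for illustration.

```python
from collections import defaultdict
from itertools import groupby


def series_win_pct(games):
    """Return {team: series won / series played}.

    Assumption: `games` is a list of (home_team, away_team, home_runs,
    away_runs) tuples, ordered so that each series' games are adjacent.
    """
    won = defaultdict(int)
    played = defaultdict(int)

    # Assumption: consecutive games with the same home/away pair form one series.
    for (home, away), series in groupby(games, key=lambda g: (g[0], g[1])):
        home_wins = away_wins = 0
        for _, _, home_runs, away_runs in series:
            if home_runs > away_runs:
                home_wins += 1
            elif away_runs > home_runs:
                away_wins += 1

        # Both teams played the series; only the team with more wins gets credit.
        played[home] += 1
        played[away] += 1
        if home_wins > away_wins:
            won[home] += 1
        elif away_wins > home_wins:
            won[away] += 1
        # Assumption: a split series counts as played but not won by either team.

    return {team: won[team] / played[team] for team in played}


# Made-up scores, purely for illustration.
sample_games = [
    ("BOS", "NYY", 5, 3),
    ("BOS", "NYY", 2, 4),
    ("BOS", "NYY", 6, 1),
    ("BOS", "TBR", 1, 2),
    ("BOS", "TBR", 3, 7),
    ("NYY", "TBR", 3, 2),
    ("NYY", "TBR", 1, 5),
]
print(series_win_pct(sample_games))
```

Real schedules are messier than this (doubleheaders, postponed games, series interrupted by makeup games), so any real definition of a "series" would need more care than the adjacency rule used here.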
This was a group final project undertaken for Programming for Data Scientists as part of the UVA MSDS program. I worked with Stephen Whetzel and Jason Wang to explore baseball's latest controversy: the use of sticky foreign substances by pitchers to increase the spin rate of their pitches. We explored how the league as a whole, as well as individual teams and players, reacted to Major League Baseball's crackdown on the use of these substances.