Wayde Herman's

Portfolio

Expected Goals Model

An end-to-end football expected goals model. The model predicts the probability that a shot will result in a goal.

Interesting aspects of this project include its structure and the evaluation of how well calibrated the predictions are. The code is structured as a pipeline and is intended to be reusable and modular rather than a once-off project. Models are tuned over a predefined hyperparameter space stored in dictionaries.
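As a rough illustration of a hyperparameter space stored in dictionaries, here is a minimal sketch. The model names, parameter names and values are illustrative assumptions rather than the project's actual configuration, and `expand_grid` is a hypothetical helper written for this example.

```python
from itertools import product

# Hypothetical search spaces keyed by model name (values are illustrative only).
PARAM_SPACES = {
    "logistic_regression": {"C": [0.01, 0.1, 1.0], "penalty": ["l2"]},
    "random_forest": {"n_estimators": [100, 300], "max_depth": [4, 8, None]},
}


def expand_grid(space):
    """Return every combination of values in a {param: [values]} dictionary."""
    names = sorted(space)
    return [dict(zip(names, combo)) for combo in product(*(space[n] for n in names))]


for model_name, space in PARAM_SPACES.items():
    candidates = expand_grid(space)
    print(f"{model_name}: {len(candidates)} candidate settings")
```

A dictionary of this shape can also be passed as `param_grid` to scikit-learn's `GridSearchCV`, which performs the same expansion internally.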

The models are evaluated on both their predictive performance and the calibration of their predicted probabilities. Calibration is assessed by visually inspecting calibration curves, which is usually advised over relying on a single metric.
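The points on a calibration curve can be computed roughly as follows. This is a sketch under stated assumptions: the shot outcomes and probabilities are made up, the equal-width binning is one common choice, and `calibration_points` is a hypothetical helper, not the project's code.

```python
def calibration_points(y_true, y_prob, n_bins=5):
    """Bin predictions into equal-width bins and return, for each non-empty bin,
    (mean predicted probability, observed goal rate)."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
        bins[index].append((label, prob))
    points = []
    for members in bins:
        if members:
            mean_pred = sum(prob for _, prob in members) / len(members)
            goal_rate = sum(label for label, _ in members) / len(members)
            points.append((mean_pred, goal_rate))
    return points


# Made-up shots: 1 = goal, 0 = no goal, paired with a model's predicted probability.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.05, 0.15, 0.30, 0.35, 0.60, 0.70, 0.75, 0.90]

for mean_pred, goal_rate in calibration_points(y_true, y_prob, n_bins=4):
    print(f"predicted {mean_pred:.2f} -> observed {goal_rate:.2f}")
```

Plotting these points against the diagonal gives the calibration curve: a well-calibrated model stays close to the line where predicted probability equals observed goal rate. scikit-learn's `calibration_curve` (in `sklearn.calibration`) performs similar binning.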

Keywords: Python (Pandas, Matplotlib) / Logistic Regression / SVM / Random Forest / Gradient Boosting / XGBoost / Calibration / Model Pipeline / D3.js

Kaggle CareerCon 2019

My top 2% solution to Kaggle’s CareerCon 2019 competition.

The most interesting aspect of this project is the unconventional approach used to overcome the noise in the data. The sequences provided were too noisy to classify accurately on their own, so a novel method of chaining sequences together was used. This method was based on the sequences’ orientation data.

Each sequence was first labelled using a random forest classifier. The sequences were then chained together, and each sequence ‘voted’ for the classification of the entire chain. The chain's winning label then overruled each individual sequence’s original label.
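The voting step can be sketched as a simple majority vote per chain. This assumes the chain assignments and per-sequence predictions are already available; the chain ids, labels and the helper `vote_within_chains` are hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption, not necessarily what the solution used.

```python
from collections import Counter


def vote_within_chains(chain_ids, predicted_labels):
    """Replace each sequence's label with the majority label of its chain."""
    votes = {}
    for chain, label in zip(chain_ids, predicted_labels):
        votes.setdefault(chain, Counter())[label] += 1
    # most_common(1) returns the label with the most votes in each chain.
    winners = {chain: counts.most_common(1)[0][0] for chain, counts in votes.items()}
    return [winners[chain] for chain in chain_ids]


# Hypothetical example: five sequences in two chains, with per-sequence predictions.
chain_ids = ["a", "a", "a", "b", "b"]
predicted = ["x", "y", "x", "z", "z"]
print(vote_within_chains(chain_ids, predicted))
```

In this example the single "y" prediction in chain "a" is outvoted and relabelled "x", which is how a noisy individual prediction gets corrected by its chain.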

Keywords: Python / Time Series / Scikit-Learn / Random Forest

          import pandas as pd
          from category_encoders import BinaryEncoder

          # Create example dataframe with numbers ranging from 1 to 5:
          df = pd.DataFrame([1,2,3,4,5], columns=['example'])

          # Binary-encode the 'example' column:
          example_binary = BinaryEncoder(cols=['example']).fit_transform(df)

          example_binary.head()

Categorical Feature Encoding Tutorial

Tutorial on categorical feature encoding for machine learning.

This tutorial covers several alternatives to traditional approaches.

Keywords: Python / Machine Learning / Tutorial

Flowchart Generator

Created a tool to generate company flow charts.

Keywords: D3.js / JavaScript / jQuery / Frontend / HTML / CSS

Gender Inequality Data Visualization

D3.js data visualization exploring gender and age in Southern African politics.

Keywords: D3.js / JavaScript / HTML / CSS