Jonathan Zhao

About Me

I am an Application Engineer working in the computer hardware industry, and I am looking for opportunities to transition my career into Data Science/Data Analytics.
I completed a 6-month intensive Data Science Bootcamp with Thinkful, and I continue to build up my Data Science skills.
I am also a gamer and I main Support and Tank in most games.

Featured Projects

Detect Pneumonia using Chest X-Ray Images

According to the World Health Organization (WHO), pneumonia kills 2 million children under 5 years old each year, and WHO reported that 95% of childhood clinical pneumonia cases occur in developing countries (Kermany, 2018). X-ray images are one of the key elements for diagnosing pneumonia because they can be obtained as part of standard care procedures and help differentiate between types of pneumonia. However, specialists who can interpret the images are not always available, especially in low-resource areas. A convolutional neural network can therefore act as a primary screening tool to detect pneumonia from chest X-ray images.
This data set contains more than 5,000 chest X-ray images. Even so, data augmentation was needed to increase the sample size and create an unbiased training data set. I built a Convolutional Neural Network model to classify the X-ray images, and the model achieved 86.86% validation accuracy.

Narrative analysis for League of Legends

On November 11, 2015 (Patch V5.22), Rift Herald was introduced to the game as a new monster. The intention was to have an in-game monster equivalent to dragons in the early game. Once a player kills the Rift Herald, that player can summon it, and the summoned monster can deal 40% of its current health as damage to towers. This increases the chance of taking down a tower in the early game and could potentially shorten the game length.
This project conducted an A/B test to check whether Rift Herald shortened the average game length in Season 6 (with Rift Herald) compared to Season 5 (without Rift Herald). In conclusion, Season 6 has a shorter average game length than Season 5. However, this conclusion does not apply to all regions.
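The summary above does not say which statistical test was used. As a minimal sketch, assuming Welch's t-test on per-game lengths, the comparison between two seasons could look like the following. The function name `welch_t` and the sample values are hypothetical and are not the project's data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se_a, se_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical game lengths in minutes (illustrative only).
season5 = [36.0, 38.5, 35.0, 40.0, 37.5, 39.0]
season6 = [33.0, 35.5, 34.0, 36.0, 32.5, 35.0]

t_stat, dof = welch_t(season5, season6)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive t statistic here means the first sample (Season 5) has the longer average game. Running the same comparison separately per region is one way to see why a conclusion can hold overall but not in every region.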

Predict Win Placement Percentage in PlayerUnknown’s Battlegrounds (PUBG)

PUBG (PlayerUnknown’s Battlegrounds) is an online multiplayer battle royale game. Each game can have up to 100 players, who parachute onto an island, look for weapons and items to kill others, and try to avoid getting killed at the same time.
This project used more than 4 million game records to build a regression model with LightGBM that predicts Win Placement Percentage for individual players. I also filtered out cheaters from the data set as part of the data cleaning process. In the end, the regression model achieved a Mean Absolute Error of 0.06717.
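To make the Mean Absolute Error figure concrete, here is a minimal sketch of how the metric is computed. It assumes the placement target is scaled from 0 (last place) to 1 (first place); the helper name and the sample values are hypothetical and do not come from the project.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical placement percentages (illustrative only).
actual = [0.00, 0.25, 0.50, 1.00]
predicted = [0.05, 0.20, 0.60, 0.90]

mae = mean_absolute_error(actual, predicted)
print(f"MAE = {mae:.3f}")
```

Under that scaling assumption, an MAE of 0.06717 would mean predictions are off by roughly 6.7 percentage points of placement on average.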

Detect Sarcasm in News Headlines

This project processed and cleaned 20k+ news headlines with the Python modules tldextract, spaCy, and NLTK. TF-IDF was used to vectorize the headlines. Classification models (Random Forest and MLP) were built and compared against a clustering algorithm (K-Means). In conclusion, MLP achieved 75.5% accuracy and outperformed the other algorithms.
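As a rough illustration of the TF-IDF step, the sketch below computes weights in plain Python using the textbook definition (term frequency times the log of inverse document frequency). The summary does not name the vectorizer used in the project, and libraries such as scikit-learn apply a smoothed variant, so exact numbers will differ. The function name `tf_idf` and the headlines are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical headlines (illustrative only).
headlines = [
    "local man wins lottery",
    "local council approves budget",
    "man bites dog",
]

weights = tf_idf(headlines)
for headline, vector in zip(headlines, weights):
    print(headline, {term: round(w, 3) for term, w in vector.items()})
```

Terms that appear in fewer headlines, such as "lottery", receive a higher weight than terms shared across headlines, such as "local", which is what makes the vectors useful as classifier input.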

Exploring College Graduates' Salaries

This is a practice Exploratory Data Analysis of college graduates' salaries. I mainly used Plotly for this analysis because I did not have much experience with it while I was studying at Thinkful. In this analysis, I produced a bar chart, a stacked bar chart, box plots, a line plot, and a pie chart using Plotly.