Welcome to Wenyi's Site!

A data enthusiast eager to turn data analytics into valuable insights that help businesses grow.
An analytical thinker who codes in SQL, Python, and R.
Able to build datasets and pipelines for ETL, visualization, and reporting.
Familiar with statistical and machine learning models for data analysis and prediction.
Comfortable working with cross-functional teams. Always willing to teach and learn from others.

What I can do

Along my path in the data field, I have worked on projects on ...

Revenue Prediction by Shipping Analysis

Keywords: XGBoost Regression, Optimization

Predicted the actual delivery time of products manufactured in different countries and shipped by different methods
Calculated revenue for the next three months based on the predicted delivery dates
Delivered the model to the finance team and kept its error under 15%

Stock Price Prediction in Big Recessions

Keywords: Feature Engineering, Random Forest

Identified recession periods in the past two decades
Translated classic trading rules into technical language
Built a Random Forest model in a time-series manner
Improved the 3-day prediction accuracy by 9%

Daily News Headlines

Keywords: Text Mining, nltk

Scraped headlines from major news websites
Classified news into 10 topics
Saved the internal consulting team an hour of reading time

Small Business in the 2010 Winter Olympics

Keywords: Time Series, Spectral Clustering

Analyzed public economic data from the Canadian government
Identified structural breaks in small business growth in British Columbia
Characterized the general relationship between the Olympic Games and small business growth

Squirrel Reporting Web App

Keywords: Django, SQL

Built a web app that lets users report squirrel sightings in the Central Park area
Let users check the real-time distribution of squirrel sightings on the website
An engaging web app with a user-friendly interface

Box Office Revenue Prediction

Keywords: Web Scraping, Feature Engineering, Decision Tree

Scraped data on over 30,000 movies from the TMDB website
Explored the relationship between movie features and box office revenue
Combined internal movie features with external economic indicators to increase prediction accuracy by 5%

Indoor 3-D Positioning of Wireless Communication Base Stations

Keywords: Least Squares Regression

Proposed a linear regression model mapping the measured coordinates of mobile terminals to their actual coordinates, achieving an error of less than 1 meter
Reduced the number of necessary base stations from 30 to 3

Forest Fire Detection

Keywords: SVM, BP Neural Network

Extracted 15 image features from satellite pictures
Identified forest fires by analyzing image color, brightness, and smoke texture
Achieved a prediction accuracy of 91.2%

More of My Stories

It took me a long journey to find my passion for data science. If you are interested in the fields that I have explored, I will be more than happy to tell you my story!

A Chip Designer

I am the designer of two high-speed communication chips and the author of an IEEE conference paper.
I was the lead designer of a 600,000-yuan research project.

A Potential Entrepreneur

I won an IT entrepreneurship competition and was awarded a 200,000-yuan venture fund.

An Investment Banker

I worked for two well-known securities firms in China and won 2nd place among 568 teams in a national investment research competition.