Product Sales Analysis

This project is associated with the Master of Science in Business Analytics program (Class of 2024) at the University of Washington, Seattle.

What I’m most proud of is helping my team navigate an unexpected difficulty: processing a large amount of data. By developing Python scripts for an ETL data pipeline, I helped reduce data processing time from ~1 hour to roughly 5 minutes.

Project Context

Olist is a Brazilian e-commerce company that provides a marketplace platform connecting small businesses and consumers. The company has provided real, anonymised commercial data on Kaggle, including 100K orders made from 2016 to 2018. (Data Source: Kaggle)

100K orders made from 2016 to 2018

$3.2M in sales revenue

9 data files

Business Problem

Project Goal: understand which product categories are more prone to customer dissatisfaction and how sales of those categories can be improved.

To achieve this goal, we need to:

  • Gather and explore data from the 9 data sources provided
  • Extract relevant business insights: identify underperforming product categories and conduct root cause analysis
  • Recommend actionable strategies to improve sales

Technical Solution

To produce business insights on product quality, I:

  • Took the initiative to outline the execution strategy and delegate tasks to team members based on their strengths and preferences.
  • Wrote a Python ETL pipeline to process the 9 data tables. The pipeline:
    • Built the data schema in MySQL Workbench
    • Extracted and transformed data from the source files
    • Loaded the data into the correct destination tables
    • Reduced overall processing time from ~1 hour to roughly 5 minutes, so my team quickly had data ready in MySQL for analysis. ✨
  • Used SQL to analyze the data for insights, following exploratory analysis and hypothesis brainstorming.
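
The extract, transform and load steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual scripts: Python's built-in sqlite3 module stands in for MySQL so the example runs on its own, and the table name, column names and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical schema: the table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS order_reviews (
    order_id     TEXT PRIMARY KEY,
    category     TEXT NOT NULL,
    review_score INTEGER NOT NULL
);
"""


def extract(source):
    """Yield each CSV row as a dict keyed by the header line."""
    yield from csv.DictReader(source)


def transform(rows):
    """Normalise text fields, cast scores to int, and skip incomplete rows."""
    for row in rows:
        order_id = (row.get("order_id") or "").strip()
        category = (row.get("category") or "").strip().lower()
        score = (row.get("review_score") or "").strip()
        if not (order_id and category and score.isdigit()):
            continue  # drop rows that cannot be loaded cleanly
        yield order_id, category, int(score)


def load(conn, records, batch_size=1000):
    """Insert records in batches and return the number of rows loaded."""
    # INSERT OR REPLACE is SQLite syntax; MySQL would need its own upsert form.
    insert_sql = "INSERT OR REPLACE INTO order_reviews VALUES (?, ?, ?)"
    batch, loaded = [], 0
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size:
            conn.executemany(insert_sql, batch)
            loaded += len(batch)
            batch.clear()
    if batch:
        conn.executemany(insert_sql, batch)
        loaded += len(batch)
    conn.commit()
    return loaded


def run_pipeline(conn, source):
    """Create the schema if needed, then stream the source into the table."""
    conn.executescript(SCHEMA)
    return load(conn, transform(extract(source)))


# Made-up sample rows, used only to exercise the sketch.
SAMPLE_CSV = """order_id,category,review_score
o1,Furniture,2
o2,furniture,4
o3,Toys,5
o4,Toys,
"""

conn = sqlite3.connect(":memory:")
rows_loaded = run_pipeline(conn, io.StringIO(SAMPLE_CSV))

# Example follow-up query: average review score per category, lowest first.
avg_by_category = conn.execute(
    "SELECT category, AVG(review_score) FROM order_reviews "
    "GROUP BY category ORDER BY AVG(review_score)"
).fetchall()
```

Batching the inserts with executemany, rather than issuing one statement per row, is one common way such a pipeline saves time, though the source does not say which optimisations produced the speed-up reported above. Pointing the sketch at MySQL would mean swapping the connection for a MySQL driver and adapting the placeholder and upsert syntax to that driver.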

Workflow: Raw Data (CSV files, processed with Python) → Database (MySQL) → Analytics Insights → Recommendation

Results