Coffee Sales Data Analysis & Forecasting

Data Analyst / Machine Learning Engineer · 2025 · 3 weeks · 1 person

Built an end-to-end analytics and forecasting system that identified top-performing products and seasonal trends, and achieved 94.1% forecast accuracy for the last week using a Random Forest model.

Overview

An exploratory data analysis and machine learning project focused on understanding coffee sales behavior and revenue trends, and on predicting future sales from historical transaction data.

Problem

The business lacked clear insight into sales patterns, high-performing products, and future demand, which made inventory planning and promotional decisions inefficient.

Constraints

  • Limited to a single CSV dataset with no external context
  • Incomplete card/payment information
  • No direct customer identifiers for segmentation
  • Time-series data with strong seasonal and weekly patterns
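
With only a transaction log available, the weekly and seasonal patterns noted above have to be derived from the timestamps themselves. A minimal sketch of one way to do that is shown here. The column name `date`, the choice of daily transaction counts as the target, and the specific lag windows are assumptions for illustration, not details taken from the project.

```python
import pandas as pd


def build_daily_features(transactions: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw transactions to daily counts and add calendar and lag features.

    Assumes a hypothetical `date` column holding one timestamp per transaction.
    """
    dates = pd.to_datetime(transactions["date"]).dt.normalize()

    # One row per calendar day; days with no transactions become 0 sales.
    daily = (
        transactions.assign(date=dates)
        .groupby("date")
        .size()
        .rename("sales")
        .asfreq("D", fill_value=0)
        .to_frame()
    )

    # Calendar features capture weekly and seasonal structure.
    daily["day_of_week"] = daily.index.dayofweek
    daily["month"] = daily.index.month
    daily["is_weekend"] = (daily["day_of_week"] >= 5).astype(int)

    # Lag features use only past values, so they are safe for forecasting.
    daily["lag_1"] = daily["sales"].shift(1)
    daily["lag_7"] = daily["sales"].shift(7)
    daily["rolling_7"] = daily["sales"].shift(1).rolling(7).mean()

    # Drop the warm-up rows that do not have a full week of history.
    return daily.dropna()
```

Shifting before taking the rolling mean keeps the current day's sales out of its own features, which avoids leaking the target into the inputs.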

Approach

Cleaned and engineered features from raw transaction data, performed exploratory data analysis to uncover trends, and evaluated multiple machine learning models to forecast future sales.
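
The model evaluation step can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the source names only Random Forest, so the linear regression baseline, the 7-day holdout, the hyperparameters, and the `sales` target column are all hypothetical choices.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error


def compare_models(
    features: pd.DataFrame, target: str = "sales", holdout_days: int = 7
) -> dict:
    """Fit candidate models on the past and score them on the most recent days.

    Expects a date-ordered frame of numeric features plus the target column.
    """
    X = features.drop(columns=[target])
    y = features[target]

    # Chronological split: never shuffle time-series rows before splitting.
    X_train, X_test = X.iloc[:-holdout_days], X.iloc[-holdout_days:]
    y_train, y_test = y.iloc[:-holdout_days], y.iloc[-holdout_days:]

    # Assumed candidates: a linear baseline and a non-linear ensemble.
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=42),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    return scores
```

Holding out the final days rather than a random sample mirrors how the forecast would be used in practice, since the model only ever sees data from before the period it predicts.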

Tech Stack

  • Python
  • pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • Jupyter Notebook

Result & Impact

  • 94.1%
    Forecast Accuracy (Last Week)
  • 0.56
    R² Score
  • 2.95 sales
    Mean Absolute Error

Provided actionable insights into product performance and seasonal demand, enabling more informed inventory planning and marketing decisions.
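
For reference, the three metrics reported above could be computed along these lines. The source does not define how forecast accuracy was measured, so the definition used here (100% minus the relative error on the period's total sales) is an assumption; R² and mean absolute error use the standard scikit-learn functions.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score


def forecast_report(y_true, y_pred) -> dict:
    """Summarize a forecast with accuracy (assumed definition), R², and MAE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    total_actual = y_true.sum()
    if total_actual == 0:
        raise ValueError("Accuracy is undefined when actual sales sum to zero.")

    # Assumed definition: how close the predicted total is to the actual total.
    accuracy_pct = 100.0 * (1.0 - abs(y_pred.sum() - total_actual) / total_actual)

    return {
        "accuracy_pct": accuracy_pct,
        "r2": r2_score(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
    }
```

Under this definition, accuracy on a period total can be high while R² stays moderate, because daily over- and under-predictions partly cancel in the total but still count against R² and MAE.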

Learnings

  • In this project, feature engineering had a larger impact on forecast quality than model complexity
  • Non-linear models such as Random Forest were better suited to the sales patterns in this dataset
  • Visualization is critical for communicating analytical insights to non-technical stakeholders