Sales Forecasting API with Flask and Machine Learning

1. Introduction

This project implements a Flask-based REST API for sales forecasting, built with Flask, scikit-learn, and pandas. The system fetches historical sales records from a Laravel API, aggregates them by month, trains a Linear Regression model, and returns next-month sales predictions with confidence intervals, growth rate analysis, and accuracy metrics.

Key Features

  • Automated Data Pipeline: Retrieves and aggregates sales records by month
  • Machine Learning Model: Linear Regression predicts trends, with a typical MAE-based accuracy score of 85-95% depending on data quality
  • Confidence Intervals: Provides expected high/low sales ranges
  • Growth Rate Analysis: Calculates month-over-month revenue changes
  • Error Handling: Validates input and handles API failures and edge cases
  • Model Persistence: Saves trained models to avoid retraining

Technical Stack

  • Backend: Flask (Python)
  • Data Processing: Pandas, NumPy
  • Machine Learning: scikit-learn (Linear Regression, StandardScaler)
  • Deployment: Docker-ready

Outcome

The API delivers actionable sales forecasts in JSON format, enabling businesses to:

  • 📈 Anticipate revenue trends
  • 📉 Identify potential downturns
  • 🔄 Optimize inventory & staffing

1.2 Objectives

  • Develop a RESTful API for sales forecasting
  • Fetch and process sales data from an external Laravel-based system
  • Implement machine learning (Linear Regression) for sales prediction
  • Provide confidence intervals, growth rate, and accuracy metrics
  • Ensure scalability and error handling for production use

2. System Architecture

2.1 High-Level Design

graph LR
    A[Laravel API] -->|Sales Data| B[Flask Backend]
    B --> C[Data Processing]
    C --> D[Model Training]
    D --> E[Forecast Generation]
    E --> F[API Response]

2.2 Key Components

  1. Flask Blueprint (sales_forecast_bp)
    • Handles /sales-forecast endpoint
    • Implements CORS for cross-origin requests
  2. Data Pipeline
    • Fetches sales data from Laravel API
    • Aggregates transactions by month
    • Cleans and structures data for ML
  3. Machine Learning Model
    • Linear Regression for forecasting
    • StandardScaler for feature normalization
    • Train-Test Split (80/20) for validation
  4. Response Format
    • Actual historical sales
    • Next month's forecast
    • Confidence intervals
    • Growth rate and model accuracy

3. Implementation

3.1 Data Flow

  1. API Request Handling
    • The /sales-forecast endpoint:
      • Takes a GET request
      • Defines date range (current year to present)
      • Fetches data from Laravel API
  2. Data Processing
    • Converts raw sales data into a pandas DataFrame
    • Aggregates by month:
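      # 'M' bins by calendar month-end (pandas 2.2+ prefers the alias 'ME');
      # created_at must already be a datetime column.
      monthly_sales = df.resample('M', on='created_at').agg({
          'grand_total': 'sum',  # Total revenue
          'id': 'count'          # Transaction count
      }).rename(columns={'grand_total': 'total_sales', 'id': 'transaction_count'})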
      
  3. Model Training & Prediction
    • Features: Time-indexed months as a 2-D column, which scikit-learn requires (X = np.arange(len(sales_data)).reshape(-1, 1))
    • Target: Monthly sales (y = sales_data['total_sales'])
    • Scaling: StandardScaler normalizes features
    • Evaluation: Mean Absolute Error (MAE) on the held-out test set
  4. Forecast Output

    Returns:

    {
        "status": "success",
        "data": {
            "actual_sales": [...],
            "forecast": {
                "next_month_forecast": 15000.50,
                "confidence_level": 92.5,
                "expected_range_low": 12000.00,
                "expected_range_high": 18000.00,
                "growth_rate": 5.2,
                "model_accuracy": 88.3
            }
        }
    }
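
The training and prediction step can be sketched roughly as follows. This is a minimal version under stated assumptions, not the project's actual code: the function name `forecast_next_month` is hypothetical, the input numbers are made up, and the split is chronological (`shuffle=False`), which the write-up does not specify.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def forecast_next_month(monthly_totals):
    """Fit a linear trend on month indices and predict the following month."""
    y = np.asarray(monthly_totals, dtype=float)
    X = np.arange(len(y)).reshape(-1, 1)  # month index as a 2-D feature column

    # 80/20 split; shuffle=False keeps the most recent months as the test set
    # (an assumption, chosen so the model is never tested on the past).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, shuffle=False
    )

    # Fit the scaler on training data only, then reuse it everywhere else.
    scaler = StandardScaler().fit(X_train)
    model = LinearRegression().fit(scaler.transform(X_train), y_train)

    mae = mean_absolute_error(y_test, model.predict(scaler.transform(X_test)))

    next_index = scaler.transform([[len(y)]])  # the month after the last one
    forecast = float(model.predict(next_index)[0])
    return forecast, mae


# Made-up, perfectly linear history: sales grow by 500 each month.
forecast, mae = forecast_next_month([1000 + 500 * i for i in range(10)])
```

On a perfectly linear series like this one the forecast simply continues the line and the MAE is essentially zero. Real sales data will be noisier, which is what the accuracy and range figures in the response are meant to convey.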
    

4. Key Features

4.1 Machine Learning Integration

  • Linear Regression
    • Predicts sales trends based on historical data
    • Simple baseline for time-series forecasting: it captures a linear trend but not seasonality
  • Confidence Intervals
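    • Calculates an expected range as the forecast plus or minus 1.5 standard deviations of the test-set residuals (about 87% coverage if the residuals are roughly normal):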
      std_dev = np.std(y_test - y_pred)
      range_low = forecast - 1.5 * std_dev
      range_high = forecast + 1.5 * std_dev
      
  • Growth Rate Calculation
    • Computes month-over-month growth as a percentage (undefined when previous is zero):
      growth_rate = ((current - previous) / previous) * 100
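
To make the two formulas above concrete, here is a small self-contained sketch. The helper names `expected_range` and `growth_rate`, the `k` parameter, and the residual values are illustrative assumptions. Returning None for a zero baseline is one possible convention, not necessarily what the project does.

```python
import numpy as np


def expected_range(forecast, y_test, y_pred, k=1.5):
    """Return (low, high) as forecast +/- k standard deviations of the residuals."""
    residuals = np.asarray(y_test, dtype=float) - np.asarray(y_pred, dtype=float)
    std_dev = float(np.std(residuals))  # population std (ddof=0), NumPy's default
    return forecast - k * std_dev, forecast + k * std_dev


def growth_rate(current, previous):
    """Percentage change from previous to current; None when previous is zero."""
    if previous == 0:
        return None  # assumption: report "undefined" rather than divide by zero
    return (current - previous) / previous * 100


# Made-up test-set values; the residuals are 200 and -400, so std_dev is 300.
low, high = expected_range(14200.0, y_test=[13000, 14000], y_pred=[12800, 14400])
rate = growth_rate(14200.0, 13500.0)
```

With a forecast of 14200 against a latest month of 13500, the growth rate comes out at about 5.2%. That matches the sample output in section 5.2, which is consistent with reading `current` as the forecast and `previous` as the most recent actual month.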
      

4.2 Error Handling

  • API Failures: Returns 502 if Laravel API is unreachable
  • Data Validation: Checks for empty datasets
  • Edge Cases: Handles cases with insufficient data (<3 months)
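
One way these checks could be expressed is a single validation step that runs before any training. The sketch below is hypothetical: the function name, the use of None to signal an upstream failure, the 400 status for bad data, and the assumption that `created_at` is an ISO-style string starting with "YYYY-MM" are all illustrative. Only the 502 for an unreachable Laravel API and the three-month minimum come from the list above.

```python
def validate_sales_records(records, min_months=3):
    """Return (http_status, error_message); error_message is None when valid."""
    if records is None:
        # Assumption: the caller passes None when the Laravel request failed.
        return 502, "Sales API is unreachable"
    if not records:
        return 400, "No sales data available"

    # Count distinct calendar months using the "YYYY-MM" prefix of created_at.
    months = {str(record["created_at"])[:7] for record in records}
    if len(months) < min_months:
        return 400, f"Need at least {min_months} months of data, got {len(months)}"
    return 200, None


two_months = [{"created_at": "2025-01-03"}, {"created_at": "2025-02-10"}]
status, message = validate_sales_records(two_months)
```

Failing early keeps the model code free of special cases: by the time training starts, there are always enough months to fit a line and hold out a test set.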

4.3 Model Persistence

  • Joblib Serialization
    • Saves trained models to disk:
      joblib.dump({'model': model, 'scaler': scaler}, model_path)
      
    • Prevents retraining on every request
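
A load-or-train helper is one plausible way to use the saved bundle. In this sketch, `load_or_train` and `train_fn` are hypothetical names. Only the `joblib.dump` call with a `{'model', 'scaler'}` dictionary mirrors the snippet above.

```python
import os

import joblib


def load_or_train(model_path, train_fn):
    """Load a saved {'model', 'scaler'} bundle, or train and save a new one."""
    if os.path.exists(model_path):
        return joblib.load(model_path)

    model, scaler = train_fn()  # hypothetical callable returning (model, scaler)
    bundle = {"model": model, "scaler": scaler}
    joblib.dump(bundle, model_path)
    return bundle
```

Saving the scaler alongside the model matters because predictions must be scaled exactly as the training data was. A cached model also goes stale as new months arrive, so some invalidation rule is needed, for example retraining when the file is older than the latest month of data.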

5. Results & Performance

5.1 Model Accuracy

  • Evaluated using Mean Absolute Error (MAE)
  • Accuracy derived from test-set performance:
    accuracy = max(0, 100 - (mae / np.mean(y_test)) * 100)
    
  • Typical accuracy: 85-95% (depends on data quality)
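
A short worked example may help in reading this score. The function below restates the formula in plain Python. The name `mae_accuracy` and the input numbers are made up for illustration.

```python
def mae_accuracy(y_test, y_pred):
    """100 minus the MAE expressed as a percentage of mean actual sales, floored at 0."""
    mae = sum(abs(a - p) for a, p in zip(y_test, y_pred)) / len(y_test)
    mean_actual = sum(y_test) / len(y_test)
    return max(0.0, 100 - (mae / mean_actual) * 100)


# Both predictions miss by 700 against mean actual sales of 14500.
score = mae_accuracy([14000, 15000], [13300, 15700])
print(round(score, 2))  # 95.17
```

Because the error is divided by the mean of the test set, the score is relative to the size of the sales figures. When the MAE exceeds mean sales, the score is clamped to 0 rather than going negative.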

5.2 Sample Output

{
    "actual_sales": [
        {"month": "Jan", "total_sales": 12000, "transaction_count": 45},
        {"month": "Feb", "total_sales": 13500, "transaction_count": 50}
    ],
    "forecast": {
        "next_month_forecast": 14200.00,
        "confidence_level": 90.0,
        "expected_range_low": 12500.00,
        "expected_range_high": 15900.00,
        "growth_rate": 5.2,
        "model_accuracy": 89.5
    }
}

6. Future Improvements

  1. Advanced Models - Experiment with ARIMA or Prophet for time-series forecasting
  2. Automated Retraining - Schedule periodic model updates
  3. Dashboard Integration - Visualize forecasts using Plotly/D3.js
  4. Real-Time Data Streaming - Use Kafka or WebSockets for live updates

7. Conclusion

This project demonstrates a production-ready sales forecasting API using Flask and scikit-learn. Key achievements:

  • Automated data fetching & processing
  • Accurate predictions with confidence intervals
  • Scalable for large datasets
  • Easy integration with frontend dashboards

GitHub Repository

🔗 https://github.com/skarnov/flask-bi

Appendix

  • Requirements: flask, pandas, scikit-learn, joblib, python-dotenv
  • Deployment: Docker-ready (Dockerfile provided)

Author

Shaik Obydullah

Published on August 5, 2025