Retail-Sim is a Python package for easily creating synthetic datasets of a retail store.

Overview

Retailer's Sale Data Simulation

Simulation Model

The simulator consists of an environment (env) that generates simulated retail store data.

Modelling Plan

Products

Create fake products and the relationships between them. The relationship between products (categories, to be more precise) consists of "exchangeability" and "complementarity". Products have many attributes, such as

  • Base Price
  • Base Cost
  • Volume
  • Attractiveness
  • Category
  • Price elasticity
  • Relative Consumption rate
  • Loyalty
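As a rough illustration, the attributes listed above could be held in a single record per product. This is only a sketch: the class name `Product`, the field names, and the example values are hypothetical and are not taken from the Retail-Sim source.

```python
from dataclasses import dataclass


@dataclass
class Product:
    """Hypothetical container for the product attributes listed above."""

    name: str
    category: str
    base_price: float
    base_cost: float
    volume: float                      # how much of a need one unit subtracts
    attractiveness: float
    price_elasticity: float
    relative_consumption_rate: float
    loyalty: float


# Example values are made up for illustration only.
shampoo = Product(
    name="shampoo-01",
    category="hair care",
    base_price=6.0,
    base_cost=3.5,
    volume=1.2,
    attractiveness=0.8,
    price_elasticity=1.5,
    relative_consumption_rate=0.4,
    loyalty=0.05,
)
```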

Volume indicates how much satisfaction a product provides to the customer (that is, how much of a need it subtracts). Volume is proportional to price, and this relationship can be set with vol_price_corr.
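One possible way to tie volume to price is sketched below. It assumes that vol_price_corr behaves like a correlation coefficient in [-1, 1] between the standardized price and the log of the volume; that interpretation, the function name `sample_volumes`, and the `base_volume` and `spread` parameters are assumptions for illustration, not the package's documented behaviour.

```python
import math
import random


def sample_volumes(prices, vol_price_corr, base_volume=1.0, spread=0.5, seed=0):
    """Draw one positive volume per price, linked to price via vol_price_corr."""
    if not -1.0 <= vol_price_corr <= 1.0:
        raise ValueError("vol_price_corr must lie in [-1, 1]")
    rng = random.Random(seed)
    n = len(prices)
    mean = sum(prices) / n
    # Fall back to 1.0 when all prices are identical (zero spread).
    std = math.sqrt(sum((p - mean) ** 2 for p in prices) / n) or 1.0
    noise_scale = math.sqrt(1.0 - vol_price_corr ** 2)
    volumes = []
    for price in prices:
        z_price = (price - mean) / std
        # Standard construction of a variable correlated with z_price.
        z_volume = vol_price_corr * z_price + noise_scale * rng.gauss(0.0, 1.0)
        # Exponentiate so that volumes are always positive.
        volumes.append(base_volume * math.exp(spread * z_volume))
    return volumes
```

With vol_price_corr set to 1.0 the noise term vanishes and volume increases strictly with price; with 0.0 the two are unrelated.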

Products are grouped into discrete categories. Each category has the attributes "consumption rate", "general trend", and "seasonal trend". In real life, products such as fresh food, tissues, and bottled water would have a high consumption rate. The general trend is a random, roughly linear trend; the seasonal trend is a sales trend with a period of one year. In real life, a product like ice cream would have a summer-oriented seasonal trend.
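The two trend components can be sketched as a linear term plus a one-year cosine. The functional form, the function name `category_trend`, and the parameters `slope`, `seasonal_amplitude`, and `peak_day` are assumptions chosen for illustration; the package may model trends differently.

```python
import math


def category_trend(day, slope, seasonal_amplitude, peak_day, period=365.0):
    """Multiplier on a category's demand for a given day of the simulation."""
    # General trend: linear drift around a baseline of 1.0.
    general = 1.0 + slope * day
    # Seasonal trend: repeats every `period` days and is highest at `peak_day`.
    seasonal = seasonal_amplitude * math.cos(2.0 * math.pi * (day - peak_day) / period)
    return general + seasonal
```

In this sketch a category whose sales peak in the middle of the year would use a `peak_day` around 200, and a category with no seasonality would use `seasonal_amplitude=0.0`.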

Customers

Every customer has a random set of "needs". Just as in real life, you might need shampoo, a pair of scissors, and some spaghetti sauce (all of these are considered as one category). Customers will try to fill those needs. As in real life, customers are encouraged to buy the product that both satisfies their needs and has a high preference.

Product's Total Attractiveness

Every product comes with an Attractiveness attribute. The higher its attractiveness, the more likely it is to sell. However,

  • If the product is on discount, it will become more attractive.
  • If the product is on discount and the discount is advertised, it will become even more attractive.
  • If the product has high loyalty, it will have very high attractiveness to some customers.
  • There might be some general trend on the attractiveness.

Therefore, during simulation, total attractiveness is defined as:

$$\text{Total} = \max\left(\text{Attractiveness} + \text{elasticity} \times \text{discounted rate},\ B(\text{loyalty}) \times \infty\right)$$
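A minimal sketch of this formula follows. It assumes that B(loyalty) is a Bernoulli draw that equals 1 with probability loyalty, and it treats the product 0 × ∞ as 0 (in floating point, `0 * inf` would give NaN, so the branch is written out explicitly). The function name and argument names are hypothetical.

```python
import math
import random


def total_attractiveness(attractiveness, elasticity, discounted_rate, loyalty, rng):
    """Evaluate the total attractiveness of one product for one customer."""
    base = attractiveness + elasticity * discounted_rate
    # Assumed Bernoulli draw: True with probability `loyalty`.
    is_loyal = rng.random() < loyalty
    loyalty_term = math.inf if is_loyal else 0.0
    return max(base, loyalty_term)


rng = random.Random(42)
print(total_attractiveness(1.0, 2.0, 0.1, loyalty=0.0, rng=rng))
print(total_attractiveness(1.0, 2.0, 0.1, loyalty=1.0, rng=rng))
```

With loyalty 0.0 the result is just the discount-adjusted attractiveness; with loyalty 1.0 it is always infinite, meaning a loyal customer prefers that product over any other.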

Customer's state transition

Each customer shops with a budget n, where n is Pareto distributed across all customers. They randomly pick a category according to their current need distribution. After that, they buy a product in that category, based on the products' total attractiveness. Buying that product subtracts the product's Volume from the customer's need for that category.
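The transition described above can be sketched as a small loop. Everything here is illustrative: the function names, the dictionary layout, the Pareto parameters `scale` and `alpha`, and the stopping rules (stop when no need is left or nothing affordable exists in the chosen category) are assumptions. The sketch also assumes finite, positive attractiveness values and positive prices.

```python
import random


def sample_budget(rng, scale=10.0, alpha=2.0):
    """Pareto-distributed budget; paretovariate returns values >= 1."""
    return scale * rng.paretovariate(alpha)


def simulate_visit(needs, products, budget, rng):
    """Spend `budget` on products, mutating `needs` (category -> remaining need)."""
    basket = []
    while budget > 0:
        open_needs = {c: n for c, n in needs.items() if n > 0}
        if not open_needs:
            break
        # Pick a category with probability proportional to the remaining need.
        category = rng.choices(list(open_needs), weights=list(open_needs.values()))[0]
        affordable = [
            p for p in products
            if p["category"] == category and p["price"] <= budget
        ]
        if not affordable:
            break
        # Pick a product with probability proportional to its attractiveness.
        weights = [p["attractiveness"] for p in affordable]
        product = rng.choices(affordable, weights=weights)[0]
        budget -= product["price"]
        # Buying subtracts the product's volume from the need of its category.
        needs[category] = max(0.0, needs[category] - product["volume"])
        basket.append(product["name"])
    return basket
```

A call such as `simulate_visit({"food": 2.0}, products, sample_budget(random.Random(1)), random.Random(1))` returns the list of purchased product names and leaves the updated needs in the dictionary that was passed in.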

Owner
Corca AI
AI B2B Consulting Company