Learning objective
- Use PySpark skills to answer business queries
Overview
This scenario walks you through intermediate-level exercises in PySpark.
Story
WeDistro is a manufacturing conglomerate with thousands of products across various brands. Its supply chain is hierarchical: products are manufactured and sent to warehouses, then shipped to distribution centers, and from distribution centers to stores, where customers buy them at a point of sale (PoS).
As part of this supply chain process, WeDistro wants to understand the availability of products in stores. When products are unavailable, customers are disappointed and lose trust in the products, which in turn hurts sales. WeDistro therefore wants to analyze the following aspects of the data generated at the point of sale:
- Understand inventory levels better to support efficient forecasting
- Analyze customers and transactions to identify high-value stores
You are part of the WeDistro Data Engineering team, and it is now your responsibility to help WeDistro's senior management answer their questions.