Photo by Jess Bailey on Unsplash
My first Python project I made was the Productivity Model that I created while I worked for a company I worked for that time. Our main task as Productivity Department was to indicate demand of hours to cover all of the store operations for each department on store. The tool I created was both very accurate and efficient.
Productivity models are build mainly in Microsoft Excel. Of course, each model owner has an opportunity of using other technologies, but Excel is the most common one.
I was responsible for most of the comercial departments, such as:
It's worth mentioning that we counted the demand of hours for budget for nearly 1,000 stores in 4 countries (Czech Republic, Slovakia, Hungary and Poland). Once I took over the above-mentioned areas, I have got 4 separate Excel files with around 40MB each. However, the models had many common features.
Budgeting process was only part of our responsibilities and it was occured maximum 4 times a year. Our other responsibilities related to the models were:
With given time I have noticed that analysis, model updates or most of the calculations I was doing were taking me too much time. Besides, I did similar activities every time, which were not necessarily something fascinating - especially when I did it for the hundredth time.
Due to the above, I came up to the conclusion that I must do something with it...
My first goal, I set for myself, was to combine four models into one. It would save a lot of space on my hardrive, and most importantly - save my time, as by having all departments in one file I wouldn't have to count everything 4 times. Sounds great, isn't it?
I was briefly happy with the Replenishment Model, because I noticed that it is not as fast as it should be. There were sheets contain the calculations (formulas) which took definitelly too much memory, what made the model too slow (in my subjective oppinion). So, I had an idea that it is better to have it all calculated somewhere outside of the Excel Model. That's how I found Microsoft Access, which was supposed to contain the entire database with values converted in SQL. The Access was connected with a sheet in the Model, where all the values were supposed to be.
It worked quite well, by the time...
Hadoop has appeared in our company. Of course, I had heard about this tool before, and just after I heard we will be able to use this system, I have requested my willing for the training.
It turned out that all the data I need for a single product and for each of the Central European store will be placed in one system. Simply put, I could get it all in seconds without leaving my office. To illustrate what this change was for me, I will use a short example:
Imagine you want to know the ratio between products on pallets to products on shelves in a given area. For getting these data, I had to go to the chosen store (the more stores, the better average) and... count the quantities manually. If I was lucky enough to get sale, I could use a weighted average. Then I used a given percentage ratio for all stores.
In Hadoop, I had all of this and much more at my fingertips. Of course, it is always good to verify the data from time to time while going to a selected store and compare it with the reality on the store. Anyway, it was a huge technological leap that I could and I wanted to take advantage of.
There are two reasons why I started to thing about Python language:
I got the approval without any problems but with one condition. I had to keep the model in Excel, too.
One year tooks me building of the new Model for commercial departments. Maybe it's a long time, but I was creating and learning at the same time.
To illustrate what the change was about, I've prepared two diagrams. The first diagram shows the logic of the basic model (in Excel) that I used before the change. Majority of data kept in the Excel Model was measured and observed in stores.
The diagram below shows what the current model looks like:
As you can see, a large part of the datas are currently extracted from the system in really detailed level. There are still areas that, for obvious reasons, cannot be taken out from the system, for example:
To understand what Python offers and how it influenced of the Productivity Model, I will use a comparison between two tools, Python and Excel.
Main advantage of Excel is its simplicity and easy ramp-up. It is best when doing small and one-time analyses or creating basic visualizations quickly. It is easy to become an intermediate user relatively without too much experience due to its GUI.
Python is harder to learn because you have to download many packages and set the correct development environment on your computer. However, it provides a big leg up when working with big data and creating repeatable, automatable analyses, and in-depth visualizations. It is a much faster tool. The scripts are much lighter than their Excel counterparts. Thanks to this, we can keep the entire model (all functions, calculations and analyzes) in one small file, which, after being put into the GitHub repository, gives us many additional opportunities, such as data backup or a full history of changes you made.
My work in the Productivity Department gave me a lot of opportunities that I was eager to use. I was able to develop my skills, try new solutions and implementing them, even though no one else in the department was able to use it. I think it was a bold decision by my superiors that paid off for both sides. I have learnt new tools and found a new hobby. In the other hand my former department got a tool that offers a lot of opportunities for development.
Please feel free to visit my GitHub account where you can find all of the scripts from the Replenishment Model with their description.
1 comment
anconda - Aug. 23, 2021, 6:42 p.m.
perfecto