Candy Show Case
Advanced Analytics for dessert
At Steadforce, it is part of the daily ritual that after lunch there are a few bowls of sweet and salty snacks in the kitchen for everyone to help themselves to. This is not only delicious, but also boosts work morale. But which sweets and snacks are the most popular? Our Analytics team wanted to know exactly.
Steadforce had long been looking for a suitable showcase to demonstrate to customers how we handle data and the complex questions that come with it. The aim was to walk through the process of such a project and to identify the problems that can arise along the way. Since use cases from different customers often share a common core, we looked for a topic that could be used to clearly explain the algorithms and models of a typical data science project.
During a brainstorming session in the Steadforce kitchen, Federica and Jonathan from the Analytics Team eventually came up with the idea of analyzing and evaluating the company's consumption of sweets in detail. As the case has all the ingredients of a classic data science project, it was ideally suited as a showcase.
The declared objective was to be able to predict when which type of candy is eaten. Of course, these methods are not limited to optimizing the purchase of sweets. They are also used for purchasing and stock-keeping of, for example, medicines in hospitals, or in the area of predictive maintenance. The project was also meant as practice in handling noisy and incomplete data.
The fog in the forest of data is clearing
But first things first. Measuring the amount of candy consumed required some handiwork, because the Analytics team relied on self-built scales.
In addition to a few plywood boards, cardboard boxes, scale sensors and some solder, two Raspberry Pis were used.
Two self-made scales, on which the bowls with the sweets were placed, were connected to each of these mini-computers, which are about the size of a pack of cards. The technology behind them: every standard scale contains a sensor that bends slightly under load and outputs a voltage that changes with the weight.
With the four scales in place, the recording of the data could begin: consumption was measured from the weight differences between readings.
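To make the idea of measuring consumption from weight differences concrete, here is a minimal sketch in Python. It is not the team's actual code: the linear calibration, the function names, the 1-gram noise threshold and all sample numbers are assumptions chosen purely for illustration.

```python
from typing import Sequence


def raw_to_grams(raw: float, zero_offset: float, counts_per_gram: float) -> float:
    """Convert a raw sensor reading to grams, assuming a linear calibration.

    zero_offset and counts_per_gram are hypothetical calibration values that
    would be determined once with an empty scale and a known reference weight.
    """
    return (raw - zero_offset) / counts_per_gram


def consumption(weights: Sequence[float], min_drop: float = 1.0) -> float:
    """Sum all weight drops of at least min_drop grams between readings.

    Increases (for example a refill) and tiny fluctuations below the
    threshold are ignored.
    """
    total = 0.0
    for prev, curr in zip(weights, weights[1:]):
        drop = prev - curr
        if drop >= min_drop:
            total += drop
    return total


# Invented readings in grams: two helpings, a refill, one more helping.
readings = [500.0, 500.2, 488.0, 487.9, 470.0, 800.0, 790.0]
eaten = consumption(readings)
print(f"Eaten: {eaten:.1f} g")
```

Ignoring increases means a refill is not mistaken for negative consumption, and the threshold keeps small sensor fluctuations from being counted as candy eaten.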
Over a period of four weeks, the data were collected and the different types of sweets were grouped into several categories for better analysis. As is often the case in other projects, Federica and Jonathan started with a simple model, the so-called Baseline Model. They knew it would produce a concrete result in the end, even if not necessarily an optimal one.
The Baseline Model generally serves as the starting point for further refinement of the modeling. It shows whether a multi-layered approach offers added value or whether more extensive models only add unnecessary complexity. Once the Baseline Model has been established, further parameters, model types and algorithms can be tried in order to obtain the best possible results and thus enable a useful prediction.
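What such a baseline might look like can be sketched in a few lines of Python. The article does not say which baseline the team used, so the choice here (predicting the average consumption per weekday) is an assumption, as are the function names and the invented numbers.

```python
from collections import defaultdict
from statistics import mean


def fit_baseline(history):
    """Fit a hypothetical baseline: mean grams eaten per weekday.

    history is a list of (weekday, grams_eaten) pairs.
    """
    by_day = defaultdict(list)
    for weekday, grams in history:
        by_day[weekday].append(grams)
    return {day: mean(values) for day, values in by_day.items()}


def mean_absolute_error(model, observations):
    """Average absolute gap between the baseline prediction and reality."""
    return mean(abs(model[day] - grams) for day, grams in observations)


# Invented training data: grams eaten on past Mondays and Fridays.
history = [("Mon", 300), ("Mon", 340), ("Fri", 150), ("Fri", 170)]
baseline = fit_baseline(history)

# Invented new observations used to score the baseline.
new_days = [("Mon", 310), ("Fri", 180)]
error = mean_absolute_error(baseline, new_days)
print(baseline, error)
```

Any more elaborate model would then have to achieve a lower error than this baseline on the same data to justify its extra complexity.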
In the course of the four-week data collection, Federica and Jonathan gained an increasing understanding of how to best predict the "survival probability" of the sweets and refined the approach accordingly. This involved the use of models originally derived from actuarial and medical statistics.
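The article does not name the survival models that were used. One classic estimator from this family is the Kaplan-Meier estimator, sketched below in Python as an illustration only. Treating "the bowl is empty" as the event, and an observation that ends before the bowl is empty as censored, is an assumption, and the durations are invented.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for every time t at which at least one event occurred.

    durations: hours until a bowl was empty or until observation stopped
    observed:  True if the bowl was actually emptied, False if censored
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    leaving = Counter(durations)  # every bowl leaves the risk set at its duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            # Fraction of the bowls still at risk that survive past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Invented example: five bowls, two of them censored (observation ended early).
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"after {t} h: survival probability about {s:.2f}")
```

With these invented numbers, the estimated probability that a bowl still holds candy falls to roughly 0.8, 0.6 and 0.3 after 2, 3 and 5 hours. Censored bowls still count as "at risk" up to the moment they drop out, which is what lets this kind of model cope with incomplete observations.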
Problems are there to be solved
Of course, there were also unforeseeable factors during the recording, for example when more than a dozen people came to Steadforce for a customer appointment and all of them reached into the bowls. That generally makes us happy, because: Mi candy es su candy!
However, visitors were not the only influence: a wide range of external factors played a decisive role in the project.
For example, on Mondays there are typically more colleagues in the office than on Fridays, which naturally has an impact on the data. Some colleagues brought cake into the office for their birthday, which meant that less candy was consumed. Steadforce also has a pasta day and a salad day every week, which you can join if you wish. Here Federica and Jonathan could see without a doubt that the staff eat fewer sweets than usual after a pasta day and more sweets after a salad day.
Some colleagues also put our Analytics Team to the test by tampering with the setup in small ways. After all, this may well happen in "real cases".
For example, bowls were taken off the scales from time to time or the various sweets were poured together.
But our analysts were of course able to uncover the manipulations and clean up most of them.
Spikes in the measurement curves also had to be corrected. They were caused, for example, by hands reaching into the bowls, which briefly put a higher load on the sensors. For this purpose, the team quickly wrote its own software for smoothing the curves.
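The team's smoothing software is not described in more detail. One common way to remove such short spikes is a rolling median filter, sketched below in Python. The choice of filter, the window size and the sample readings are assumptions for illustration.

```python
from statistics import median


def rolling_median(values, window=5):
    """Centered rolling median; the window shrinks at both ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


# Invented readings in grams: a hand pressing on the bowl causes the 650 spike.
weights = [500, 500, 650, 499, 499, 498, 498]
smoothed = rolling_median(weights, window=3)
print(smoothed)
```

A median is used here rather than a moving average because a single extreme reading does not drag the median upward, so a short spike disappears while the real, lasting drop in weight is preserved.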
Milk chocolate and gummy bears usually only survive for a short time
The fact that there is always something to eat at Steadforce naturally delights the colleagues. But how many sweets do our employees really eat?
"I think it’s staying within healthy limits," confirms Jonathan. "But you actually see a small bend in our data just before people go home. Then a much larger amount of candy is eaten. Some strengthening for the way home is necessary after all."
With regard to the different types of sweets, Federica and Jonathan noted that milk chocolate and gummy bears of all kinds are particularly popular. Foam gummies and dark chocolate, on the other hand, did not do so well.
The project was definitely a success: in the end, it was possible to predict candy consumption with a prediction quality of around 70%. And the buyers now know exactly how to make themselves popular with their colleagues.