Sales Forecast Monte Carlo Simulation

Click the link below to download a sample Excel spreadsheet I created that can develop sales forecast using a Monte Carlo simulation:

Monte Carlo Simulation Worksheet

(If you haven’t already, see my post “Why most sales forecasts suck…and how Monte Carlo simulations can make them better” for some background.)

Here’s a breakdown on how it works:

Traditional Method:

The first sheet illustrates a simplified version of how retail sales forecasts are traditionally developed. Basically, it looks at a couple of historical data sets (sales for the same week the prior year and previous iterations of the promotion we plan to run this week) and develops a forecast as follows:

  1. Start with base sales from the same time period last year
  2. Apply the current sales trend (which will be determined in this case by average of the previous 10 week comps). This method may vary from retailer to retailer, but this is the general principle.
  3. Add the average of the incremental results of previous iterations of the promotion
  4. Voilà! This is the sales forecast.

Of course, this number is impossibly precise and the analysts who generate it usually know that. However, those on the receiving end tend to assume it is absolutely accurate and the probability of hitting the forecast is close to 100%.

Which is pretty much impossible.

It would be far more accurate to forecast a range of sales with accompanying probabilities. And that’s where the Monte Carlo simulation comes in.

Base Business:

The second sheet in the file deals with the Base Business.

Inputs

On left side of the sheet are the inputs to the Monte Carlo simulation. Fields highlighted in yellow can be changed to modify the structure of the model. (NOTE: Please don’t change fields not highlighted in yellow as these fields all contain calculations.)

In this simple model, there are three input variables: Traffic, Conversion and Average Order Size. You’ll want to estimate the values and the probabilities of those values for each variable based on what you know about your business.

For example, we might look at five iterations of a particular promotion and determine that we got about 75,000 new visitors three time, 100,000 visitors once and 50,000 visitors once. Traditionally, we would average those and assume we’ll always get 75,000 new visitors. But those variances in this variable combined with variances in other variables, such as average ticket and conversion rate, can lead to wildly different sales. Instead, we can say we have a 60% probability of getting 75,000 new visitors, a 20% probability of 50,000 new visitors and a 20% probability of 100,000 new visitors. We’ll do something similar for average ticket and conversion rate.

Important: make sure your probabilities add up to 100%.

Once you get familiar with how the model works, feel free to add more input variables as suits your business. Email me if you have questions about how to do this.

Outputs

As you enter your input variable probabilities, you’ll be able to see the results in the charts and the columns to the right.

The pie chart is a simple binary chart to quickly show the overall probability that this week’s base business will be positive or negative compared to the same week’s base business the prior year. In this case, it looks like there’s about a 73% chance our base business will comp negatively to last year (which is probably really an indication that we hit on the high end of the scale last year).

While pie chart is useful as a quick look, it’s not really enough to base a decision on. The line charts show the probabilities at various comp percentage increments and dollar increments to give us a better sense of where the base business will fall, with about a 35% chance of being around 2% down and another 13% chance of being down 10%. The columns on the right under the Probabilities heading give all the breakdowns.

You’ll note there are yellow highlighted fields in the output section. Those fields won’t change the model, but changing them allows for showing probabilities in different ranges as might be appropriate for your business.

At the top right, you’ll see a section entitled “Probability Weighted Expected Value.” This is a single number that is the sum of possible sales numbers multiplied by their respective probabilities. It should be used with caution. It can be useful for comparing different options and for trending of multiple periods, but it should not be used as a specific forecast for a specific time period.

The actual Monte Carlo simulation, with its 10,000 iterations, can be seen below the charts. See below for details on how it works, if you’re interested.

Incremental promotion

Now that we’ve determined probabilities for where our base business might fall, we can layer in a promotion we might run to increase our odds of growing sales year over year.

The third sheet in the spreadsheet, entitled “Incremental Promo” uses a second Monte Carlo simulation and set up inputs to determine our complete forecast.

Inputs

This sheet works pretty much like the Base Business sheet, but now we’re feeding a simulation that incorporates the variables for the base business and anything we input here that is generated incrementally from the promotion.

Because we’ve likely only run any given promotion a relatively small number of times previously, we have to be really careful about the estimates we make as a small sample size can yield unpredictable results. We should allow for the fact that the promotion could actually cause at least some of our variables to decrement rather than increment. (I read that one of the problems that occurred in our recent economic meltdown was that many analysts built models that had no allowances for housing prices to go down. Let’s not make the same mistake with our models.) You’ll see in the example data that I’ve allowed for the possibility of both conversion and average order size to be lower than any possibility in the Base Business worksheet.

All of the inputs are now combined with the base business variables to create a new simulation, which can be seen on its own worksheet called “Simulation.”

Outputs

In the pie chart, we can see that the incremental promotion will increase our probability of generating a positive comp for the week, but we’re still slightly more likely to have a negative comp. Basically, though, it’s a coin toss. We can see with the line charts the various probabilities at different increments of where we’re likely to end up based on what we know. And now that we have a better understanding of our risks and potential rewards, we can make a better decision.

How the model works

The model relies on the RAND() and VLOOKUP functions of Excel. The RAND() function randomly generates a decimal between 0 and 1. We feed that random number into the VLOOKUP function, which references the appropriate “Engine Bins” to determine the variable value it uses.  VLOOKUP references the first column of a range (range: Two or more cells on a sheet. The cells in a range can be adjacent or nonadjacent.) of cells (Engine Bins, in our example), and then returns the value from the value column on that row. If an exact match is not found in the Engine Bin, the next largest value that is less than random number lookup_value is returned.

So, in our Base Business example, if the RAND() function returns 0.62467547 for one of the traffic iteration variables the VLOOKUP function will reference the Engine Bins for Traffic, see it doesn’t have an exact match and then reference the Engine Bin with the next largest value that is less than 0.62467547. In this case, it chooses the Engine Bin “0” and returns a traffic variable of 300,000. Because the Engine Bin ranges are based on our input probabilities, we will select the 300,000 about 70% of the time, 275,000 about 15% of the time and 325,000 about 15% of the time. You can see how this plays out in the simulation.

In this particular model, we then determine sales by multiplying traffic by conversion rate by average ticket.

The simulation then runs the calculations 10,000 times in order to give us a very statistically valid sample size. There’s no real magic to 10,000, but it’s a large enough data set to give us a reliable result without being so large that it takes a long time to calculate.

We use this data set to generate our probabilities.

Caveats

  • The mode is definitely dependent on the quality of the inputs, but those are generally easier to estimate than the outputs
  • Using the RAND() function will cause the entire spreadsheet to recalculate each time you load the spreadsheet, enter a new value or hit F9. This means all the probability outputs will change slightly each time. But they won’t change much. This really illustrates how we can tame the uncertainty of the future with ranges and probabilities, but it also shows how impossible it is to be extremely precise. All that said, I wanted the model to work for you as you enter new values, so I left the RAND() functions in. In actual practice, you might want to cut and paste the values after into another sheet after you’ve run the calculation just so you don’t have to deal with slightly fluid numbers.
  • It could take a few iterations of learning to get the inputs right
  • Running Monte Carlo forecasts in parallel with the current method can help to refine the inputs

16 Comments

  • By Varun Thakur, June 12, 2012 @ 6:38 pm

    hi

    Could you please tell me in the base business tab, what does the conversion table is based on.

    Cheers
    varun

  • By Kevin Ertell, June 13, 2012 @ 9:55 am

    Hi Varun,

    Thanks for your comment. The conversion data there is just sample data. To use it yourself, you should replace the rates and probabilities with the occurances you’ve seen in your own data. For example, you might want to look at the various conversion rates you’ve seen historically for the time period in question combined with your most recent trends to get a sense of the various conversion rates that might reasonably happen in the time period you’re forecasting. From that, look at what’s been happening to determine the probability of it happening again. From that, you can set your probabilities. I hope that makes sense.

    Thanks again,
    Kevin

  • By Shuchi, September 10, 2013 @ 2:38 am

    Hi Kevin,

    This is really helpful! I have somewhat of an open-ended question but are there applications of Monte Carlo simulation pertaining to Direct Marketing and Campaign Management?

    Cheers,
    Shuchi

  • By Kevin Ertell, November 6, 2013 @ 2:00 pm

    Hi Shuchi,

    Sorry for my slow response. I missed your comment when it first came in.

    You could certainly use Monte Carlo simulations with Direct Marketing and Campaign Management. Any time you’re trying to predict the future, I think simulations can be helpful. It’s just a matter of determining what input variables you want to feed to the model and how you want to calculate those variables to generate a random data set. I hope that helps.

    Thanks,
    Kevin

  • By Thang Tran, February 24, 2014 @ 1:22 am

    Hi Kevin,
    This is really helpful!. I have some questions though. Can i use Monte carlo simulation when I only have in hand historical sales? because I want to compare it with forecast bottom up from sale team to figure out the risk and opportunity in the sale forecast
    Thank you

  • By Kevin Ertell, February 24, 2014 @ 9:59 am

    Hi Thang,

    Good question. You don’t necessarily have to have historical sales to use a Monte Carlo simulation. You can really choose whatever inputs you want to put into the model. I like to use historical sales data when I can so that I have a justification for the levels of probability that I assign to each input. However, if you don’t have solid historicals, you can make a guess at the probabilities. The total quality of the results the model produces is dependent on those initial probabilities, so you’ll want to use some care with them. But even if you input several different potential variables and assign even probabilities to each, you’ll still get a more realistic picture of the possible results that you would if you just assumed a specific number for each variable. Does that make sense?

  • By Thang Tran, April 7, 2014 @ 10:33 am

    Hi Kevin,
    Sorry I didn’t notice your response to my questions.
    I think you may misunderstood me. What i have in hand is only historical sale. And I want to base from that “only”, to create future forecast?
    On the other hand, thinking about your response. We do have market data in our company, such as market growth rate, previous incremental promotion.I will dig on this options.
    But can you help me when there’re only historical sale?

  • By Kevin Ertell, April 11, 2014 @ 8:51 am

    Hi Thang,

    Sorry, you’re right that I misread your question. It’s actually great that you have historical sales data. Do you have any variables beyond that sales totals (i.e. traffic, conversion, etc.)? Any historical data like that can be used to generate inputs at future probabilities. Does that make sense?

  • By Thang Tran, April 11, 2014 @ 9:43 am

    Hi Kevin,
    Actually, we have number such as store coverage (number of stores has sold our products), promotion discounted scheme with historical record volume,… Can that help to calculate the probability

  • By Kevin Ertell, May 2, 2014 @ 9:54 am

    Hi Thang,

    Sure, that can definitely help. You’ll want to first choose the metrics that are your Key Performance Indicators. These will be the numbers that mostly impact your overall sales. Then look at those over time and see how they change. From that historical look, you can start to see the most common occurrences. If discount is a factor, you might try to determine how often a particular discount increases sales by say 10%, 20%, etc. Use the frequency of those increases (or decreases) as your probability inputs. You might want to play around with it a bit until it feels right, but that’s the general idea. I hope that helps.

  • By Thang Tran, May 4, 2014 @ 3:43 am

    Hi Kevin,
    Thank you for your inputs. I will work further on this topic and share with you my updated work later on

  • By boggs, March 11, 2015 @ 7:48 am

    Hello Kevin,
    Can I use this for retailing, to forecast a products demand or performance through a period? and which columns should i change? Which Sheet? sorry for asking but thank you…

    sincerely, boggs…

  • By roger, February 25, 2016 @ 10:09 pm

    great article! i recently got an anova cirulator and the results have been great.

    i’m in the market for a container now and this article was really helpful.

    i was thinking about getting the smallest possibe container that would fit my meat and machine – the idea being that if i can maximize the volume of the continer with meat vs. water then it will heat up faster. basically trying to maximize the water to meat ratio.

    was wondering if you had any opinions on this idea before i go ahead and buy one.

    thanks again,

  • By Ammar, November 3, 2016 @ 10:26 am

    Hi Kevin,

    not to try to solve the impossible but what if I have historical data on several inputs, like market trends, pay day effect, travel season effect, stock level and seasonality etc…

    I tried to introduce more inputs but formulas became much complicated and almost impossible to complete the cases!
    any helpful ideas for introducing new dimension of variants ? this module will work or we have to use other solutions ?

Other Links to this Post

  1. Why most sales forecasts suck...and how Monte Carlo simulations can make them better | Retail: Shaken Not Stirred — January 12, 2010 @ 11:50 am

  2. Why Most Sales Forecasts Suck…and How Monte Carlo Simulations Can Make Them Better « MarketingTypo.com — January 19, 2010 @ 3:53 pm

RSS feed for comments on this post. TrackBack URI

Leave a comment

Retail: Shaken Not Stirred by Kevin Ertell


Home | About