Category: metrics

How are retail sales forecasts like baby due dates?

Q. How are retail sales forecasts like baby due dates?

A. They both provide an improper illusion of precision and cause considerable consternation when they’re missed.

Our first child was born perfectly healthy almost two weeks past her due date, but every day past that imprecise date was considerably more frustrating for my amazing and beautiful wife. While her misery was greater than anything most of us endure in retail sales results meetings, we nonetheless experience more misery than necessary because improperly specific forecast numbers create unrealistic expectations.

I believe there’s a way to continue to provide the planning value of a sales forecast (and baby due date) while reducing the consternation involved in the almost inevitable miss of the predictions generated today.

But first, let’s explore how sales forecasts are produced today.

In my experience, an analyst or team of analysts will pull a variety of data sources into a model used to generate their forecast. They’ll feed sales for the same time period over the last several years at least; they’ll look at the current year sales trend to try to factor in the current environment; they’ll take some guidance from merchant planning; and they’ll mix in planned promotions for the time period, which also includes looking at past performance of the same promotions. That description is probably oversimplified for most retailers, but the basic process is there.

Once all the data is in the mix, some degree of statistical analysis is run on the data and used to generate a forecast of sales for the coming time period — let’s say it’s a week. Here’s where the problems start. The sales forecasts are specific numbers, maybe rounded to the nearest thousand. For example, the forecast for the week might be $38,478k. From that number, daily sales will be further parsed out by determining the percentage of the week each day represents, and each day’s actual sales will be measured against those forecast days.

And let the consternation begin because the forecast almost never matches up to actual sales.

The laws of statistics are incredibly powerful — sometimes so powerful that we forget all the intricacies involved. We forget about confidence intervals, margins of error, standard deviations, proper sampling techniques, etc. The reality is we can use statistical methodologies to pretty accurately predict the probability we’ll get a certain range of sales for a coming week. We can use various modeling techniques and different mixes of data to potentially increase the probability and decrease the range, but we’ll still have a probability and a range.

I propose we stop forecasting specific amounts and start forecasting the probability we’ll achieve sales in a particular range.

Instead of projecting an unreliably specific amount like $38,478k, we would instead forecast a 70% probability that sales would fall between $37,708k and $39,243k. Looking at our businesses in this manner better reflects the reality that literally millions of variables have an effect on our sales each day, and random outliers at any given time can cause significant swings in results over small periods of time.
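A probability-and-range forecast like this can be sketched with a simple normal approximation. Everything below is illustrative: the sales history, the assumption that weekly sales are roughly normally distributed, and the z-score of about 1.04 that bounds a central 70% interval.

```python
import statistics

# Hypothetical trend-adjusted sales (in $k) for the same week in prior
# periods -- invented numbers for illustration, not from the article.
weekly_sales_k = [36_900, 38_100, 37_500, 39_200, 38_900, 37_800, 38_400]

mean = statistics.mean(weekly_sales_k)
sd = statistics.stdev(weekly_sales_k)

# z-score bounding the central 70% of a normal distribution (~1.036)
z_70 = 1.036

low, high = mean - z_70 * sd, mean + z_70 * sd
print(f"70% probability that sales fall between ${low:,.0f}k and ${high:,.0f}k")
```

A real model would use far more inputs than prior-year sales, but the output shape is the point: a probability attached to a range, not a single number.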

Of course, that doesn’t mean we won’t still need sales targets to achieve our sales plans. But if we don’t acknowledge the inherent uncertainty of our forecasts, we won’t truly understand the size of the risks associated with achieving plan. And we need to understand the risks in order to develop the right contingency and mitigation tactics. The National Weather Service, which uses similar methods of forecasting, explains the reasons for their methods as follows:

“These are guidelines based on weather model output data along with local forecasting experience in order to give persons [an idea] as to what the statistical chance of rain is so that people can be prepared and take whatever action may be required. For example, if someone pouring concrete was at a critical point of a job, a 40% chance of rain may be enough to have that person change their plans or at least be alerted to such an event. No guarantees, but forecasts are getting better.”

Imagine how the Monday conversation reviewing last week’s sales would change if we had the probability and range forecast suggested above and actual sales came in at $37,805k. Instead of focusing on how we missed a phantom forecast figure by 1.7%, we could quickly acknowledge that sales came in as predicted and then focus on what tactics we employed above and beyond what was fed into the model that generated the forecast. Did those tactics generate additional sales or not? How did those tactics affect or not affect existing tactics? Do we need to make strategic changes, or should we accept that, even though our strategy can be affected by millions of variables in the short term, it’s still on track for the long term?

Expressing our forecasts in probabilities and ranges, whether we’re talking about sales, baby due dates or the weather, helps us get a better sense of the possibilities the future might hold and allows us to plan with our eyes wide open. And maybe, just maybe, those last couple weeks of pregnancy will be slightly less frustrating (and, believe me, every little bit helps).

What do you think? Would forecasts with probabilities and ranges enhance sales discussions at your company? Do sales forecasts work differently at your company?

True conversion – the on-base percentage of web analytics?

I just finished re-reading one of my all-time favorite business books, Moneyball by Michael Lewis. While on the surface Moneyball is a baseball book about the General Manager of the Oakland A’s, Billy Beane, I found it to be more about how defying conventional wisdom (a topic I’ll no doubt return to over and over in this space) can be an excellent competitive advantage. In retail, we can be just as prone to conventional wisdom and business as usual as the world of baseball Lewis encountered, and site conversion rate is an excellent example of how we’re already traversing that path in the relatively young world of e-commerce.

In Moneyball, Michael Lewis tells the story of Beane defying the conventional wisdom of longtime baseball scouts and industry veterans. Rather than trust scouts who would literally judge a player’s prospects by how he physically looked, Beane went to the data as a disciple of Bill James’ Sabermetrics theories. By following James’ approach, Beane was able to put together consistently winning teams while working with one of the lowest payrolls in the Major Leagues.

Lewis describes how James took a new look at traditional baseball statistics and created new statistics that were actually more causally related to winning games. Imagine that! For example, James found on-base percentage, which includes walks when calculating how often a player gets on base, to be a much more reliable statistic than batting average, which ignores walks (even though we’re always taught as Little Leaguers that a walk is as good as a hit). I won’t get into all the details, but suffice it to say on-base percentage is more causally related to scoring runs than batting average, and scoring runs is what wins games.

So why is batting average still so prevalent and what does this have to do with retail?

Basically, an English statistician named Henry Chadwick developed batting average as a statistic in the late 1800s and didn’t include walks because he thought they were caused by the pitcher and therefore the batter didn’t deserve credit for not swinging at bad pitches. Never mind that teams with batters who got on base scored more runs and won more games. But batting average has been used so long that we just keep on using it, even when it’s been proven to not be very valuable.

OK, baseball boy, what about retail?

As relatively young as the e-commerce space is, I believe we are already falling prey to conventional wisdom in some of our metrics and causing ourselves unnecessary churn. My favorite example is site conversion rate. Conversion is a metric that has been used in physical retail for a very long time, and it makes good sense in stores where the overwhelming purpose is to sell products to customers on their current visit.

I’ll argue, though, that our sites have always been about more than the buy button, and they are becoming more all-purpose every day. They are marketing and merchandising vehicles, brand builders, customer research tools (customers researching products and us researching customers), and sales drivers, both in-store and online. Given the multitude of purposes of our sites, holding high a metric that covers only one purpose not only wrongly values our sites, but also causes unnecessary churn when we implement features or marketing programs that drive traffic valuable to our overall businesses without necessarily producing an online purchase on a particular day.

We still need to track the sales generating capabilities of our sites, but we want to find a causal metric that actually focuses on our ability or inability to convert the portion of our sites’ traffic that came to buy. We used our site for many purposes at Borders, so we found that changes in overall site conversion rate didn’t have much to do at all with changes in sales.

If we wanted a metric that tracked our selling success, we needed to focus on the type of traffic that likely came with an intent to buy (or at least eliminate the traffic that came for other reasons). We knew through our ForeSee Results surveys that customers who came with an intent to buy on a given visit were only a percentage of our total visitors; the rest came for other reasons like researching products, finding stores, checking store inventory, viewing video content, etc.

So, how could we isolate our sales conversion metrics to only the traffic that came with an intent to buy?

Our web analyst Steve Weinberg came up with something we called “true conversion”: adds to cart divided by product page views, multiplied by orders divided by checkout process starts. This true conversion metric was far more strongly correlated with orders than anything else, so it was the place to focus initially as we tried to determine whether we could turn the correlation into causation. We still needed to do more work matching the survey data to path analysis to further refine our metrics, but it was a heckuva lot better than overall site conversion, which was basically worthless to us.
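The true-conversion formula described above can be expressed directly in code. This is a minimal sketch; the traffic numbers are made up for illustration and are not Borders’ actual data.

```python
def true_conversion(adds_to_cart, product_page_views, orders, checkout_starts):
    """'True conversion': (adds to cart / product page views) multiplied by
    (orders / checkout process starts), focusing on traffic with buying
    intent rather than all site visits. Returns 0.0 if a denominator is 0."""
    if product_page_views == 0 or checkout_starts == 0:
        return 0.0
    return (adds_to_cart / product_page_views) * (orders / checkout_starts)

# Illustrative weekly numbers (invented):
rate = true_conversion(adds_to_cart=4_000, product_page_views=50_000,
                       orders=1_500, checkout_starts=2_000)
print(f"True conversion: {rate:.1%}")  # 0.08 * 0.75 = 6.0%
```

Note that the two ratios isolate different failure points: the first measures whether product pages persuade, the second whether checkout completes. Overall site conversion blurs both together with traffic that never intended to buy.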

Every site is different, so I don’t know that all sites could take the exact same formula described above and make it work. It will take some work from your web analyst to dig into the data to determine customer intent and the pages that drive your customers’ ability to consummate that intent. For more ideas, I highly recommend taking a look at Bryan Eisenberg’s excellent recent post, How to Optimize Your Conversion Rates, where he explores some of these topics in more detail.


Whether or not you buy into everything written in Moneyball or all of Billy Beane’s methods, I believe the main lesson to be culled from the book is that it’s critically important that we constantly re-evaluate our thinking (particularly when conventional wisdom is assumed to be true) in order to get at deeper truths and clearer paths to success.

How is overall site conversion rate working for you? Do you have any better metrics? Where have you run into trouble with conventional wisdom?

How the US Open was like a retail promotion analysis

Last week’s US Open golf tournament had a surprise leader going into the final round in Ricky Barnes, who came out of relative obscurity to record the best 36-hole score in US Open history, beating out golf greats like Tiger Woods and Phil Mickelson.

The media played it up, talking about Barnes as finally coming into his own and really blossoming. Was he the next big golf star? From CBS Sports:

“Until this week at the 109th U.S. Open, keeping up with the big boys has always been [a] difficult prospect (for Barnes), full of disappointment, figurative bloody noses and scabby knees.

‘I know he hates losing,’ brother Andy said. ‘Maybe because he did a lot of it when he was younger.’

And more than a bit as a young adult, too, which is what made his record-setting start at Bethpage Black all the more surprising. In a field full of the household names with whom Barnes has been so desperately trying to compete, he’s finally atop the leaderboard.”

As Barnes faltered during the final rounds and Woods and Mickelson improved, it was all about who was handling the pressure well and who wasn’t. Barnes’ score dropped off significantly over the final two rounds while the bigger names improved.

Was it the pressure? I would argue that what we really saw was what statisticians call a regression toward the mean (or average). Basically, Woods and Mickelson began the tournament with rounds that were well below their averages, but with each round they began to score closer to what they would normally be expected to score. Barnes basically did the opposite. When continuously measured over time, Tiger Woods is still clearly the world’s top golfer. This can be clearly seen in the world golf rankings where Woods is #1 and Barnes is #153.
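A quick simulation can show regression toward the mean without invoking pressure at all. This is a toy model under stated assumptions: each round is a player’s true average plus independent noise, and the parameters below are invented, not real tour statistics.

```python
import random

random.seed(42)

# Model each round as a player's true average score plus round-to-round noise.
TRUE_MEAN, NOISE_SD, N = 71.0, 3.0, 10_000

# Simulate N pairs of consecutive rounds for the same player.
pairs = [(random.gauss(TRUE_MEAN, NOISE_SD), random.gauss(TRUE_MEAN, NOISE_SD))
         for _ in range(N)]

# Select only the unusually hot first rounds (3+ strokes better than average).
good_starts = [(r1, r2) for r1, r2 in pairs if r1 <= TRUE_MEAN - 3]
avg_first = sum(r1 for r1, _ in good_starts) / len(good_starts)
avg_next = sum(r2 for _, r2 in good_starts) / len(good_starts)

# The following round regresses toward the mean -- no "pressure" required.
print(f"Avg score in hot first rounds: {avg_first:.1f}")
print(f"Avg score in the round after:  {avg_next:.1f}")
```

The hot starts average several strokes below the player’s true mean, yet the very next round averages right back at the mean, purely because the noise that produced the hot start doesn’t repeat.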

So, why is this like a retail promotion analysis?

Because we retailers have a tendency to look at each short-term promotion result in isolation and then draw concrete conclusions and kick off immediate modifications. Come Monday morning, we’re looking to see how the weekend sale did, and we’re ready to change next weekend’s sale if this past one didn’t perform to expectations. We don’t take into account the possibility that we might have witnessed an outlier result that is not really indicative of the actual effectiveness of the promotion but is actually just the result of random luck — good or bad. After a single test, we could be ready to declare the promotion equivalent of Ricky Barnes the world’s greatest and the promo Tiger Woods an also-ran. And the next time we run the Barnes promotion and it’s a dog, we’ll revert back.

An old colleague of mine used to call this the “full accelerator, full brake” syndrome. The net effect of all of this short term measurement and immediate reaction is a steady reduction in the average effectiveness of our promotions.

Instead, we should measure the effectiveness of promotions over a much longer period of time and over many instances. Because of the massive number of variables that can affect a promotion (including obvious, visible variables like weather and road construction, and less obvious, invisible variables like an unusual number of people happening to plan family picnics at the same time and therefore not shopping like they normally would), we simply cannot count on a short-term measurement to provide the accuracy we need to make a wise decision. Short term, sometimes the promotions will show improvements and sometimes they won’t, just as Tiger Woods does not win every golf tournament he enters. Over time, though, we will come closer and closer to determining their true value.
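The value of measuring over many instances can be illustrated with a toy simulation. The true lift figures and noise level below are invented assumptions, not real promotion data:

```python
import random

random.seed(0)

# Two hypothetical promotions with true average lifts of 5% and 4%; any
# single measurement is swamped by noise (standard deviation = 10%).
def measured_lifts(true_lift, runs, noise_sd=0.10):
    return [random.gauss(true_lift, noise_sd) for _ in range(runs)]

single_a = measured_lifts(0.05, 1)[0]
single_b = measured_lifts(0.04, 1)[0]
avg_a = sum(measured_lifts(0.05, 200)) / 200
avg_b = sum(measured_lifts(0.04, 200)) / 200

print(f"One weekend:  A={single_a:+.1%}  B={single_b:+.1%}  (could rank either way)")
print(f"200-run avg:  A={avg_a:+.1%}  B={avg_b:+.1%}  (converging on true lifts)")
```

After one weekend, the weaker promotion can easily look like the winner; averaged over many runs, the estimates settle near the true lifts. In practice 200 runs is a luxury, which is exactly why single-weekend verdicts deserve so much skepticism.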

This requires patience and courage that will be difficult in the fast-paced retail environment, especially for public companies. However, it will produce a lot less churn and increase efficiency and effectiveness overall. And in an economic time when we’re trying to maximize the effectiveness of the staff we have left, less churn can go a long way.

What do you think? How are promotion analyses handled in your company? Do you measure over the long haul?

Retail: Shaken Not Stirred by Kevin Ertell
