Click the link below to download a sample Excel spreadsheet I created that builds probabilities for project timelines and budgets using a Monte Carlo simulation:
(If you haven’t already, see my post Why most project estimates suck…and how Monte Carlo simulations can make them better for some background.)
This model takes inputs for five milestone timeline and budget estimates and generates 5,000 possible outcomes. From those outcomes, a series of outputs metrics including overall timeline and budget probabilities are calculated.
Here’s how the file works:
On the left of the sheet are the inputs to the Monte Carlo simulation. Fields highlighted in yellow can be changed to modify the structure of the model. (NOTE: Please don’t change fields not highlighted in yellow as these fields all contain calculations.)
Payroll cost per hour: In this simple model, I’ve assumed a single cost per hour for all employees involved in the project. For actual usage, you would want to build upon this to consider different cost levels.
# of people: Again, this is overly simplified to assume the same number of people are working full time on all milestones. This number is used to convert hours into days in orders to calculate project dates.
Start date: The date the project is scheduled to start. The model uses this date to calculate project end dates.
For Each Milestone
# of hours: Enter up to four possible duration in hours estimated for each milestone. To the left of the number of hours, enter a probability of completing the milestone in that number of hours. The probability percentage of the final duration estimate for each milestone is calculated automatically to ensure the probabilities all four possible durations add up to 100%. In the example, the probabilities and durations are largely the same for each milestone. In reality, these are likely to vary significantly depending on the word involved and the general confidence of the estimate for each.
Non-payroll $: Payroll costs are automatically calculated in the model based on other entries, so here we’re just looking for non-payroll costs associated with the milestone. These could include computer hardware, software licenses or general material. Enter up to four possible non-payroll costs estimated for each milestone. To the left of the number of hours, enter a probability of completing the milestone with that costs. The probability percentage of the final duration estimate for each milestone is calculated automatically to ensure the probabilities for all four possible costs add up to 100%. In the example, the probabilities and durations are largely the same for each milestone. In reality, these are likely to vary significantly depending on the word involved and the general confidence of the estimate for each.
As you enter your input variable probabilities, you’ll be able to see the results in the charts and the columns in the Outputs section to the right.
The line charts show the probabilities for potential timelines and budgets. The possibilities are broken down in deciles between the minimum and maximum possible values.
At the top right of both the timeline and the budget sections, you’ll see an “expected” value. This is a single number that is the sum of possible durations costs multiplied by their respective probabilities. It can provide useful in comparing different potential investment options.
Monte Carlo Simulation
The actual Monte Carlo simulation, with its 5,000 iterations, can be seen on the tab called Simulations.
The model relies on the RAND() and VLOOKUP functions of Excel. The RAND() function randomly generates a decimal between 0 and 1. We feed that random number into the VLOOKUP function, which references the appropriate “Engine Bins” (hidden) to determine the variable value it uses. VLOOKUP references the first column of a range (range: Two or more cells on a sheet. The cells in a range can be adjacent or nonadjacent.) of cells (Engine Bins, in our example), and then returns the value from the value column on that row. If an exact match is not found in the Engine Bin, the next largest value that is less than random number lookup_value is returned.
NOTE: Because of the nature of the RAND function, the simulation and its results will rerun every time you load the spreadsheet or change anything within the spreadsheet. This means the output numbers and charts will change slightly with each load, which also proves the point about the effects of randomness. You can also hit F9 to recalculate at any time.
So, in our example, if the RAND() function returns 0.62467547 for one of the non-payroll cost iteration variables the VLOOKUP function will reference the Engine Bins for Non-payroll $ for that milestone, see it doesn’t have an exact match and then reference the Engine Bin with the next largest value that is less than 0.62467547. Because the Engine Bin ranges are based on our input probabilities, we will select the 1.0 about 60% of the time, 0.5 about 20% of the time, 1.5 about 10% of the time, and (0.5) about 10% of the time. You can see how this plays out in the simulation.
Now we’re ready to run some simulations. For each potential scenario, we add all the selected hours for each milestone to determine duration for that scenario. For each milestone, we multiple the number of hours by our inputted payroll cost per employee to get the payroll cost for the milestone. We then add each milestone’s payroll cost to its selected non-payroll cost estimate and sum them all together for get the scenario’s cost estimate.
The simulation then runs the calculations 5,000 times in order to give us a very statistically valid sample size. We use this data set to generate our probabilities.