I have run 1000 scenarios using monte carlo simulation. The top row represents the years: 2008, 2009, 2010………. Second row represents the number of scenarios in that year in which a certain event happened e.g. assets were depleted: 0,0,0,2,5,16,25,…………. I want to find out the 95th percentile of when the assets are depleted. For example, 2 scenarios had assets depleted in year 4, 5 depleted in year 5, 16 depleted in year 6, to make the following array: 4 4 5 5 5 5 5 6 6 . . . Is there a way to calculate the 95th percentile without having to form this array? The reason is that it will take me a lot of time to hand type the array and I don’t know how to create a macro for this. Thanks.
not sure if i understand, but dont you just sort each column. and then find the 95th number in each column as the 95th percentile?
=PERCENTILE(A1:A1000,0.95) if your data is in column A, from rows 1 through 1000
oh wait, that’s not how your data is presented… let me think about this
can’t you just do it the old fashion way: (1000 observations + 1)* 0.05 = 50.05. so the 95th percentile is the 50th observation from the end. – or to be technical, your k = 50th, but your d (fraction) = .05, so you need some adjustments or something (i forget)… if you have the final, say, 4 columns of data: row1: 2025 2026 2027 2028 row2: 15 16 25 18 the 95th percentile will just be the linear interpolation between years 2026 and 2027… so say, that’ll be May 2026. or you can just use year 2026 as the 95th percentile.
Say i run 100 scenarios and my trigger is when assets are depleted. So lets say scenario 1 asses depleted year 4, Scenario 2 year 7, scenario 3 year 10 and so on. i can easily take an average by doing =sumproduct(year serial number, countif(assets<0)/1000. But how do I calcualte the 95th percentile year? I can calculate the average year the assets are depleted for all 1000. But how do I do this for percentiles? Now lets say the columns are 2008, 2009, 2010 2, 4, 5 how this reads is 2 of 1000 depleted in 2008, 4 of THE REMAINING (1000 - 2) in 2009 and so on. so the weights are 2/1000 x year 1 + 4/1000 x year 2 etc… this is confusing me.
Take cumulative sum 2008,2009,2010 2,6,11 and so on. using your above example Now take the location where the cumulative value straddles the 95% of the assets so if 1000 were the total assets - where the value goes above 950… that would be the year you need to look at Does it make sense
needhelp: i think you’re over thinking this problem… just find the 50th observation from the right (i.e., 950th observation from the left). find that observation. now ask yourself, where does that one sit in your data. what year is it? a systematic way to do it is to do what CPK said above, do a cumulative sum in row3: A3 = A2, B3 = A3 + B2, C3 = B3 + C2, D3 = C3 + D2, etc… then add an IF function in row 4 A4 = IF(A4>950,A4&" is the 95th percentile of depletion","") A5 = IF(A5>950,A5&" is the 95th percentile of depletion","") etc…
Is the problem that they aren’t in order? What is the problem? Why would anyone do a sum to find a percentile?
This line from needhelp above in his post is what led me to think of the cumulative sum approach: " 2008, 2009, 2010 2, 4, 5 how this reads is 2 of 1000 depleted in 2008, 4 of THE REMAINING (1000 - 2) in 2009 and so on. so the weights are 2/1000 x year 1 + 4/1000 x year 2 etc… " 4 of the remaining 998 and so on…
thanks. i wil try the above solutions and think on it more. Joey et al, the question I am trying to answer is this: on average my assets will deplete in year 8. this is the average of 1000 scenarios over 20-25 years. Question is: at 95th percentile when will my assets deplete? the answer will be a ‘number of years’. when assets deplete for a scenario i assume they never get positive again. so when assets reach 0, i mark that cell as a ‘1’ and then note the year that happened. lets say for a scenario it happened in year 10 so i note down 10. then i take average of all the 1000 scenarios, which is the easy part. the tough part is ok at the 95th percentile when do the assets deplete?