Calculating IRR, DPI, RVPI, and TVPI, Part II
In the previous part, I used sample data on transactions, prices, and FX rates to calculate market values, cash flows, and DPI, RVPI, and TVPI ratios for Private Equity funds. The output was a time series with every cash flow and valuation date, and data calculated in two currencies, the Instrument Currency and the Portfolio Currency.
In this part, I will use SQL Server to calculate internal rate of return (IRR) for some simple sample data. In the next part, Part III, I will build an inline table-valued function (TVF) to calculate IRRs in two currencies for the funds and the total portfolio using the data prepared in Part I. The function will have two date parameters to limit the calculation period, if required, to enable reporting a specific time period. But, let's focus for a while on calculating the IRR with a set-based SQL query.
The whole SQL script is available in my GitHub by the name IRR Simple.sql.
Internal Rate of Return, IRR
By definition, internal rate of return (IRR) is the "rate of return that sets the net present value (NPV) of all cash flows (both positive and negative) from the investment equal to zero" (Wikipedia):
In the above, C0 is the initial investment, Cn are the cash flows, r is the IRR, and tn are the timings of the cash flows (in years, from the initial investment). The formula can be adjusted so that the initial investment is just another cash flow, at time zero. Also, if IRR is to be calculated for some sub-period, like for all calendar years separately, the value of the investment at the beginning of the period can be entered as a negative cash flow, and the value at the end of the period as a positive cash flow.IRR, in general, cannot be solved analytically, so a numerical, iterative method is required. I am using the secant method to iterate closer and closer to the solution.
Cash Flows Are What Matters
Let's first set up some simple cash flows and a couple of variables. Let's say we invested 1 mln back in 2015, and got money back in 2019 (800k) and 2023 (1.2 mln), making a 1 mln profit:
-- set up some sample cash flows declare @cashflows table ( ToDate date not null primary key, CashFlow numeric(19,2) not null ); insert into @cashflows values ('2015-06-30', -1000000), ('2019-03-15', 800000), ('2023-09-30', 1200000); -- set variables declare @from_date date = (select min(ToDate) from @cashflows); declare @guess float = 0.1; -- first guess as a starting point for the iterative process
What was the IRR? If you enter three dates and cash flows into Excel and use the XIRR function, you get 11.93 % as the result. Now let's use SQL Server to calculate that in a single query.
Guess What?
We need to start with a guess, and figure out a better and better guess until we find the answer, to a reasonable accuracy. The way to iterate towards a solution in an SQL Server query is a recursive CTE (Common Table Expression). First, an anchor member is required:
-- define recursive tree, calculating Net Present Value of cash flows with each successive IRR estimate, and estimating the new IRR. with iter_data as ( -- anchor member, with the first guess as IRR. select convert(int, 0) as IterNr, -- iteration counter convert(varchar(50), 'Starting') as IterStatus, convert(float, 0) as IRR_Previous, -- 0 % as "previous IRR" convert(float, @guess) as IRR, -- starting from guess as IRR convert(float, sum(cf.CashFlow)) as NPV_Previous, convert(float, sum(cf.CashFlow / power(1e0 + @guess, cft.CashFlowTerm))) as NPV, -- net present value with initial guess convert(smallint, 1) as row_nr from @cashflows cf cross apply (select convert(float, datediff(day, @from_date, cf.ToDate)) / 365 as CashFlowTerm) cft -- calculate cash flow term in years
The anchor member just initializes the first four columns and the row_nr column with some values, and calculates the NPV of all the cash flows on the first cash flow date with two different discount rates (i.e. the IRR). It does not matter that much, which interest rates we choose, although there are situations where there are multiple solutions, and then the solution might depend on the starting IRRs. The IRR_Previous is set at zero, so the NPV_Previous is just the sum of all cash flows (i.e. +1 mln). The first guess IRR is defined by variable @guess, above set at 10 %, so the first NPV will be 107,981.45, using the formula above. Note that I am using cross apply to calculate the time (tn in the formula), in years, from the initial investment date to each cash flow date.
Getting Recursive
Our CTE then needs to continue with a union all select that defines the recursive member:
union all -- recursive member, using iter_data itself in from clause. -- combining previous round IRR estimates with cash flows to calculate new IRRs and NPVs. select iter_set.IterNr, q.IterStatus, iter_set.IRR_Previous, iter_set.IRR, iter_set.NPV_Previous, iter_set.NPV, iter_set.row_nr from ( select id.IterNr + 1 as IterNr, id.IRR as IRR_Previous, n.IRR, -- the new IRR iteration result id.NPV as NPV_Previous, case when n.IRR <= -0.99 then null else sum(convert(float, cf.CashFlow) / power(1e0 + n.IRR, cft.CashFlowTerm)) over () -- use window function to sum over all cash flows end as NPV, -- net present value with new IRR convert(smallint, row_number() over (order by (select null))) as row_nr -- row number needed to pick just one of [count of cashflows] identical rows from iter_data id cross join @cashflows cf cross apply (select convert(float, datediff(day, @from_date, cf.ToDate)) / 365 as CashFlowTerm) cft -- calculate cash flow term in years cross apply ( -- iterate towards the solution by estimating the new IRR select case when id.NPV - id.NPV_Previous = 0 then id.IRR -- no change in NPV, keep the result when id.IRR - (id.IRR - id.IRR_Previous) * id.NPV / (id.NPV - id.NPV_Previous) < -0.99 then -0.99 -- avoid <= -100 % IRR and the resulting error else id.IRR - (id.IRR - id.IRR_Previous) * id.NPV / (id.NPV - id.NPV_Previous) end as IRR ) n where -- continue until result is final or an error status encountered id.IterStatus in ('Starting', 'Iterating') ) iter_set cross apply ( -- check if this was the final IRR - either accurate enough, maximum number of iterations reached, or no result select convert(varchar(50), case when abs(iter_set.NPV) <= 0.01 then 'Final' -- break loop when NPV close enough to zero when iter_set.IterNr >= 20 then 'Error: max recursive loops' when iter_set.IRR > 100 then 'Error: IRR > 10,000 %' when iter_set.IRR <= -0.99 then 'Error: IRR <= -99 %' else 'Iterating' end) as IterStatus ) q where iter_set.row_nr = 1 )
The recursive member is quite a bit longer, but not too complex. First, in the from clause in the middle, I use data from iter_data, which you can see as the name of the whole CTE (in the anchor member box). This self-reference is what kicks off the recursion. The query is, essentially, calling itself.
This is the first iteration, so iter_data will bring us the data from the anchor member. This single row is joined to @cashflows, which has three rows in our simple example, with a cross join. Thus, we have three rows, and again I am using cross apply to calculate the tn parameter (this could of course be part of the @cashflows data as well).
Another cross apply is used to calculate the next guess for IRR. The formula
IRR(new) = IRR - (IRR - IRR_Previous) x NPV / (NPV - NPV_Previous)
is a simple way of using the two points (in the first step, 0 % and 10 % IRRs, and their respective NPVs), drawing a line through them, and taking the IRR with zero NPV on that line as the next guess:
In the chart above, you can see the initial "previous IRR" at 0 % / 1,000,000 NPV, and the first guess IRR at 10 % / 107,981.45 (the blue dots). The new guess calculated as above is simply the IRR on the orange "Linear Estimate" line that makes the NPV zero, zero being our target, here 11.21 % (the green circle). However, the true NPV with 11.21 % IRR is not necessarily zero, as IRR is not linear, but must be calculated. The true NPV (the blue dotted line) seems to reach zero at an IRR only slightly less than 12 % (the green arrow). Also, a couple of special cases (NPV being equal to previous NPV, and IRR estimate getting too close or below -100 %) must be taken care of as well.
In the select clause, I am calculating the true NPV by summing up the three cash flows using the new guess for IRR: each cash flow's present value is
cf.CashFlow / power(1e0 + n.IRR, cft.CashFlowTerm)
The cash flows need to be summed up using a window function (the over() part of the sum()), because in SQL Server, the recursive member may not use group by or scalar aggregate functions. Distinct and top are also forbidden. For some reason, however, window functions are allowed - but then we have too many rows, one for each cash flow. I am using another window function, row_number(), to separate them and later in the where clause picking just one: row_nr = 1.
The select part of the recursive member will pick the new IRR and the new NPV based on that IRR, and set the "Previous" versions of both equal to the old ones. It will also increment our IterNr counter by one.
Stop Before It's Too Late
The loop will go on forever, or at least until the database default MAXRECURSION of 100 iterations. That's not necessary. That's why, besides for the need to pick just one result row, I have another select wrapped around the one doing the calculations. A cross apply then figures out the IterStatus; if NPV is close enough to zero, set it to "Final", if our loop counter passes 20, or IRR is out of bounds, set it to an error message, otherwise keep "Iterating". In the inner select, I only keep adding data if IterStatus in ('Starting', 'Iterating'), so when Final (or error) the recursion will stop and the loop will finish.
Finally, Some Results
The recursive CTE iter_data will now contain data for all iterations, one row per each. We can get the full results by
select * from iter_data order by IterNr;
Note that this cannot be executed on its own, but only with the whole with clause. The results are as follows (with some formatting for readability):
0 Starting 0 0.1 1,000,000.00 107,981.45 1
1 Iterating 0.1 0.112105 107,981.45 38,429.55 1
2 Iterating 0.112105 0.118794 38,429.55 2,456.11 1
3 Iterating 0.118794 0.119251 2,456.11 60.38 1
4 Iterating 0.119251 0.119262 60.38 0.10 1
5 Final 0.119262 0.119262 0.10 0.00 1
As you can see, the first row (0, Starting) is just the anchor member. Then the recursion goes through five iterations, with the last one being close enough (the NPV of the last round is ~0.0000039) to zero NPV for it to stop. In iteration 1, the linear estimate for IRR was 11.21 %, but the calculated NPV for it was not zero but 38,429.55. In iteration 2, then, the linear estimate for zero NPV based on this and the previous IRR (11.21 % and 10 %, respectively) was 11.88 %, but the calculated NPV was 2,456.11, and so forth, with every iteration using the preceding iteration as a starting point.
The IRR get's closer and closer to the final answer, 11.93 %, by every round. That's what we got from Excel's XIRR, as well.
Of course, if we were not interested in all the intermediate results and NPVs etc., we could use the following instead, to just get the final result:
select IRR from iter_data where IterStatus not in ('Starting', 'Iterating');
(I could just ask for 'Final', but then I would miss any possible error messages.)
In the next part, I will modify the IRR code above to suit the needs of my test database, and produce multiple IRRs by one function call.
Comments
Post a Comment