Performance Attribution Cumulative Time Series

Part IV: Cumulative Time Series of Attribution Effects

In parts I and II, we set up a test database with performance calculations, and in part III, we calculated the daily Brinson-Fachler performance attribution results. Now, I will add another inline user-defined table function that calculates cumulative time series of all the contribution and attribution effects using Cariño scaling. These can then be used alongside a chart of cumulative portfolio and benchmark returns to explain the difference between them. And, of course, taking just the last row of each cumulative time series gives you the full-period results.

The function is quite long, some 250 lines of code, so the full version will only be available in my GitHub repository. However, much of the length is simple repetition: the statement must use subqueries to calculate values in phases, and there are a lot of descriptive fields to "carry over" from one subquery to the next.

Scaling required

The basic Brinson-Fachler attribution formulas perfectly explain the excess return of the portfolio - i.e. the sum of allocation, selection, and interaction effects for a calculation period equals the return difference between the portfolio and the benchmark. However, this only applies to a single calculation period, which must be the same for attribution calculations as for portfolio and benchmark returns. In the olden days, monthly, or even quarterly, valuation of portfolios might have been used; nowadays, daily is the standard.
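As a reminder, in one common formulation (W = weight, R = return, p = portfolio, b = benchmark, i = sector), the daily Brinson-Fachler effects of sector i are:

	Allocation(i)  = (Wp_i - Wb_i) x (Rb_i - Rb)
	Selection(i)   = Wb_i x (Rp_i - Rb_i)
	Interaction(i) = (Wp_i - Wb_i) x (Rp_i - Rb_i)

Summed over all sectors, these add up exactly to the period's Rp - Rb.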

Unfortunately, the attribution effects do not easily combine over time into series that would still match the time series of portfolio excess returns. Consider the following simplistic example (see also Attribution Scaling.xlsx in GitHub):
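Period   Portfolio return   Benchmark return   Allocation   Selection
1        2 %                1 %                0.6 %        0.4 %
2        2 %                1 %                0.6 %        0.4 %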

Above, there are two periods of returns, where for both periods the portfolio return is 2 %, the benchmark return 1 %, the allocation effect 0.6 %, and the selection effect 0.4 % (for simplicity, let's assume the interaction effect is zero). Cumulative returns are, obviously, calculated as the product of the daily return coefficients, e.g. (1 + 0.02) x (1 + 0.02) - 1 = 4.04 % for the portfolio.

For both periods, the sum of attribution effects (1 %) fully explains the excess return (1 %). However, the cumulative excess return is 4.04 % - 2.01 % = 2.030 %, while the cumulative sum of attribution effects is only 2.005 % (compounding each effect separately over the two periods: 1.006 x 1.006 - 1 = 1.2036 % for allocation plus 1.004 x 1.004 - 1 = 0.8016 % for selection). As stated in Carl Bacon's history review of performance attribution (p. 15), "No obvious way exists, however, to combine attribution effects over time." A number of smoothing and linking algorithms exist, however, to make these match. And there is also the geometric attribution methodology, which bypasses the problem altogether - but it has gained limited popularity, as people tend to focus on the arithmetic return difference (above, 2.030 %) rather than the geometric one (above, that would have been (1 + 0.0404) / (1 + 0.0201) - 1 = 1.990 %).

Cariño scaling

The Cariño scaling method is, in the end, quite simple. Denoting R as the arithmetic return and r as its natural logarithm equivalent [in Excel, r = LN(1+R), and in SQL Server, r = log(1+R)], the Cariño method first scales all the attribution effects so that their sum matches the difference in log returns instead of the difference in arithmetic returns. E.g. in the example above, the difference in arithmetic returns is 2 % - 1 % = 1 % for both periods, and the difference in log returns is log(1 + 0.02) - log(1 + 0.01) = 0.98523 %. The scaling factor for the attribution effects is then 0.98523 % / 1 % = 0.98523.
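As a quick sanity check, the daily factor can be computed directly in SQL Server (log() is the natural logarithm; the example returns are hard-coded as literals):

	select (log(1.0 + 0.02) - log(1.0 + 0.01)) / (0.02 - 0.01) as AttrScalingDaily	-- 0.98523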

Now, applying the scaling factor to the attribution effects (i.e. multiplying by it) in both periods makes their sum (0.985 %) match the excess log return of the portfolio. And since the cumulative log return is simply the sum of the daily log returns, summing up the scaled daily attribution effects (1.970 %) also matches the cumulative excess log return (1.970 %) - so now the cumulative time series match as well! The only thing remaining is to scale those attribution effects back to arithmetic so that they match the arithmetic excess return. Quite handily, we already have the required scaling factor, calculated with the same formula but based on the cumulative returns: 1.970 % / 2.030 % = 0.97067. Only now, as we are scaling "backwards", we need to use it as a divisor: 1.970 % / 0.97067 = 2.030 %, and similarly for the allocation and selection effects.
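The same kind of check works for the period-level factor and for the back-scaling, again with the example's cumulative figures as literals:

	declare @ScaledEffectSum float = 2.0 * (log(1.0 + 0.02) - log(1.0 + 0.01))	-- two days of scaled effects: 1.970 %
	declare @PeriodScaling float = (log(1.0 + 0.0404) - log(1.0 + 0.0201)) / (0.0404 - 0.0201)	-- 0.97067
	select @ScaledEffectSum / @PeriodScaling as ExcessReturnArithmetic	-- 0.0203, i.e. 2.030 %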

There is a special case where the formula excess log return / excess return cannot be used: when the excess return is zero. In that case, we'll need to use 1 / (1 + Rp) instead, where Rp is the arithmetic portfolio return (this is the limit of the formula as the excess return approaches zero; e.g. if both returns are 2 %, the factor is 1 / 1.02 = 0.98039).

Similar scaling factors must be calculated separately for the return contribution effects, using only portfolio returns (or only benchmark returns) instead of excess returns.
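For example, the daily contribution scaling factor for a 2 % portfolio return would be:

	select log(1.0 + 0.02) / 0.02 as ContribScalingDaily	-- 0.99013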

It's a Sequel

A practical issue when using Cariño scaling in cumulative time series is that the scaling factor used to convert the scaled attribution effects back to arithmetic effects depends on the upper-level portfolio and benchmark returns, which may well continue to change even after the time series of some specific item stops. For example, when calculating the attribution effects of three sectors within equities, let's say one of the sectors is an off-benchmark bet, and it is completely sold in the middle of the period. Firstly, for the attribution effects to continue to explain the excess return of equities, the cumulative effects of this sector must be taken into account even after it was sold (and there is no daily data left for it). Secondly, the scaling will still depend on the cumulative equity returns of the portfolio and the benchmark, so the cumulative time series of that sector's attribution effects will keep changing even after it has been sold. (Some other scaling methods handle this differently from Cariño; it can be considered one of the weaknesses of the Cariño method that the cumulative effects still change even after the item no longer has any data.)

When building the inline user-defined table function uftPerformanceAttributionCumulative, I first define the items, so that we know all the time series we need data for.


with items as (

	-- get a list of all items and their data starting date
	-- needed to make sure all time series continue from starting date to @to_date for daily sums of cumulative sums to work correctly
	select

		StructureIK,
		LevelCnt,
		LevelNr,
		Level1Name,
		Level1,
		Level2Name,
		Level2,
		Level3Name,
		Level3,
		Level4Name,
		Level4,
		min(Todate) as minDate

	from 

		PerfTest.uftPerformanceAttribution(@portfolioik, @bm_portfolioik)

	where 

		Todate between @from_date and @to_date

	group by

		StructureIK,
		LevelCnt,
		LevelNr,
		Level1Name,
		Level1,
		Level2Name,
		Level2,
		Level3Name,
		Level3,
		Level4Name,
		Level4

)

Then, this is joined to the Calendar table to get all the dates required for each time series, starting from minDate, i.e. the beginning date of each item, and going all the way to @to_date, i.e. the end date parameter of the function.

from 

	-- join all items to calendar so that we have all time series from their start date up to function @to_date
	items i
	inner join PerfTest.Calendar c on c.Todate >= i.minDate
	-- get actual attribution daily data for the dates found
	left outer join PerfTest.uftPerformanceAttribution(@portfolioik, @bm_portfolioik) d on d.Todate = c.ToDate
		and d.StructureIK = i.StructureIK 
		and d.LevelNr = i.LevelNr 
		and (d.Level1 = i.Level1 or (d.Level1 is null and i.Level1 is null))
		and (d.Level2 = i.Level2 or (d.Level2 is null and i.Level2 is null))
		and (d.Level3 = i.Level3 or (d.Level3 is null and i.Level3 is null))
		and (d.Level4 = i.Level4 or (d.Level4 is null and i.Level4 is null))
	-- if no data, we need upper level performance data for scaling use
	left outer join PerfTest.uftPerformanceTimeSeries(@portfolioik, @bm_portfolioik) t on t.Todate = c.ToDate and d.Todate is null
		and t.StructureIK = i.StructureIK 
		and (
			-- top level: use itself as upper node
			(t.LevelNr = 0 and i.LevelNr = 0) 
			-- other levels: upper level LevelNr = item LevelNr-1, and match all upper level identifiers
			or (
				i.LevelNr > 0
				and t.LevelNr = i.LevelNr - 1
				and (i.LevelNr <= 1 or t.Level1 = i.Level1)
				and (i.LevelNr <= 2 or t.Level2 = i.Level2)
				and (i.LevelNr <= 3 or t.Level3 = i.Level3)
			)
		)

The actual daily attribution data are joined to these time series using a left join. Another left join is required to get the upper-level data for the time series ending before @to_date - like Equities, if one of the sector time series below it ends halfway through. Thanks to the left joins, the time series will still continue all the way to @to_date even if there is no daily data for parts of the date interval.

Next, I convert all currency returns to the float data type for maximum accuracy in the calculations, and null out the returns for the beginning date, i.e. @from_date, so that the cumulative time series will neatly begin from zero (the final isnull() turns these into zeros). Also, when the upper-level returns are missing from the first join (because an item's time series has stopped), I use the second join to fill in those data (note: only part of the statement rows are shown here):

cross apply (
	select
		-- null out the beginning date so the cumulative series starts from zero (the final isnull() turns it into 0)
		case when d.Todate = @from_date then null else d.ReturnPCLog end as ReturnPCLog,
		-- ....
		-- fall back to the upper level data from the second join when the daily data has stopped
		convert(float, coalesce(d.up_ReturnPC, t.ReturnPC, 0)) as up_ReturnPC,
		-- ....
) f

Then, it is time to add the daily Cariño scaling factors: one for portfolio contribution, one for benchmark contribution, and one for attribution - calculated as explained above:

cross apply (
	-- calculate Cariño scaling factors. Simply scaling the daily percentage factors to log factors --> sum of log factors will match the explained log returns. 
	-- R = return (perc)	r = return (log)	W = Weight	U = upper level	p = portfolio	b = benchmark
	select
		-- daily contribution scaling factors: by default rUp / RUp
		case
			when f.up_ReturnPC = 0 or f.up_lag_MarketValuePC = 0 then cast(1.0 as float)
			when f.up_ReturnPC / f.up_lag_MarketValuePC <= -1.0 then cast(1.0 as float)
			else log(1.0 + f.up_ReturnPC / f.up_lag_MarketValuePC) / (f.up_ReturnPC / f.up_lag_MarketValuePC)
		end as ContribScaling,
		case
			when f.up_bm_ReturnPC = 0 or f.up_bm_lag_MarketValuePC = 0 then cast(1.0 as float)
			when f.up_bm_ReturnPC / f.up_bm_lag_MarketValuePC <= -1.0 then cast(1.0 as float)
			else log(1.0 + f.up_bm_ReturnPC / f.up_bm_lag_MarketValuePC) / (f.up_bm_ReturnPC / f.up_bm_lag_MarketValuePC)
		end as bm_ContribScaling,

		-- daily attribution scaling factor: by default (rUp - rUb) / (RUp - RUb)
		-- if RUp = RUb: 1 / (1 + RUp)    (not using exactly equal but difference < 1.0E-12 - infinitesimal differences cause issues with division by zero)
		-- if no sensible data, 1
		case
			when f.up_lag_MarketValuePC = 0 then cast(1.0 as float)
			when f.up_bm_lag_MarketValuePC = 0 then cast(1.0 as float)
			when f.up_ReturnPC / f.up_lag_MarketValuePC <= -1.0 then cast(1.0 as float)
			when abs(f.up_ReturnPC / f.up_lag_MarketValuePC - f.up_bm_ReturnPC / f.up_bm_lag_MarketValuePC) < 1.0E-12 then cast(1.0 as float) / (1.0 + f.up_ReturnPC / f.up_lag_MarketValuePC)
			else (log(1.0 + f.up_ReturnPC / f.up_lag_MarketValuePC) - log(1.0 + f.up_bm_ReturnPC / f.up_bm_lag_MarketValuePC))
				/ (f.up_ReturnPC / f.up_lag_MarketValuePC - f.up_bm_ReturnPC / f.up_bm_lag_MarketValuePC)
		end as AttrScaling
) s

Again, as in the previous parts, I am using currency returns where possible to avoid issues when there are returns but no market value.

What we now have are the daily returns and the contribution and attribution effects from the uftPerformanceAttribution function, accompanied by the daily Cariño scaling factors for each item. In addition, all the time series that end in the middle of the period now continue to the end.

It is time to calculate the cumulative period returns and effects, in logarithmic terms:

select 
	c.Todate,
	-- [item data like sector name etc. removed]
	-- window functions to calculate cumulative period factors, scaled to log returns
	sum(f.ContribPerc * s.ContribScaling)		over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as ContribPerc_Cumul,
	sum(f.bm_ContribPerc * s.bm_ContribScaling)	over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as bm_ContribPerc_Cumul,
	sum(f.Allocation * s.AttrScaling)		over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as Allocation_Cumul,
	sum(f.Selection * s.AttrScaling)		over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as Selection_Cumul,
	sum(f.Interaction * s.AttrScaling)		over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as Interaction_Cumul,
	-- window functions to calculate cumulative period Log returns, upper level needed for scaling the above factors back to match percentage returns
	sum(f.ReturnPCLog)				over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as ReturnPCLog_Cumul,
	sum(f.bm_ReturnPCLog)				over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as bm_ReturnPCLog_Cumul,
	sum(f.up_ReturnPCLog)				over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as up_ReturnPCLog_Cumul,
	sum(f.up_bm_ReturnPCLog)			over (partition by i.StructureIK, i.LevelNr, i.Level1, i.Level2, i.Level3, i.Level4 order by c.Todate rows between unbounded preceding and current row) as up_bm_ReturnPCLog_Cumul

To get from daily data to cumulative time series, I am using window functions to sum up all dates up to c.Todate. Each daily contribution and attribution effect is scaled by the appropriate scaling factor calculated above, and the cumulative returns are calculated in the same fashion. What we now have is a set of cumulative time series where the contribution and attribution effects sum up to the log returns (for contribution) or the log excess returns (for attribution) of each item's upper level. Next, we need to scale these back to arithmetic returns and excess returns.

select

	a.Todate,
	-- [item data like sector name etc. removed]

	isnull(exp(a.up_ReturnPCLog_Cumul) - 1.0, 0.0) as up_ReturnPCPerc_Cumul,
	isnull(exp(a.up_bm_ReturnPCLog_Cumul) - 1.0, 0.0) as up_bm_ReturnPCPerc_Cumul,
	isnull(exp(a.up_ReturnPCLog_Cumul) - exp(a.up_bm_ReturnPCLog_Cumul), 0.0) as diff_up_ReturnPCPerc_Cumul,
	isnull(exp(a.ReturnPCLog_Cumul) - 1.0, 0.0) as ReturnPCPerc_Cumul,
	isnull(exp(a.bm_ReturnPCLog_Cumul) - 1.0, 0.0) as bm_ReturnPCPerc_Cumul,
	isnull(exp(a.ReturnPCLog_Cumul) - exp(a.bm_ReturnPCLog_Cumul), 0.0) as diff_ReturnPCPerc_Cumul,

	isnull(a.ContribPerc_Cumul	/ nullif(sp.ContribScalingPeriod, 0), 0.0) as ContribPerc_Cumul,
	isnull(a.bm_ContribPerc_Cumul	/ nullif(sp.bm_ContribScalingPeriod, 0), 0.0) as bm_ContribPerc_Cumul,
	isnull(a.Allocation_Cumul	/ nullif(sp.AttrScalingPeriod, 0), 0.0) as Allocation_Cumul,
	isnull(a.Selection_Cumul	/ nullif(sp.AttrScalingPeriod, 0), 0.0) as Selection_Cumul,
	isnull(a.Interaction_Cumul	/ nullif(sp.AttrScalingPeriod, 0), 0.0) as Interaction_Cumul

from

	dat as a
	cross apply (
		-- Scale log factors back to percentage factors using period log and percentage returns.
		-- R = return (perc)	r = return (log)	W = Weight	U = upper level	p = portfolio	b = benchmark
		select
			-- contribution scaling factors for the period: by default rUp / RUp
			case
				when exp(a.up_ReturnPCLog_Cumul) - 1.0 = 0 then cast(1.0 as float)
				else a.up_ReturnPCLog_Cumul / (exp(a.up_ReturnPCLog_Cumul) - 1.0)
			end as ContribScalingPeriod,
			case
				when exp(a.up_bm_ReturnPCLog_Cumul) - 1.0 = 0 then cast(1.0 as float)
				else a.up_bm_ReturnPCLog_Cumul / (exp(a.up_bm_ReturnPCLog_Cumul) - 1.0)
			end as bm_ContribScalingPeriod,

			-- attribution scaling factor for the period: by default (rUp - rUb) / (RUp - RUb)
			-- if RUp = RUb: 1 / (1 + RUp)    (not using exactly equal but difference < 1.0E-12 - infinitesimal differences cause issues with division by zero)
			case
				when abs(a.up_ReturnPCLog_Cumul - a.up_bm_ReturnPCLog_Cumul) < 1.0E-12 then cast(1.0 as float) / exp(a.up_ReturnPCLog_Cumul)
				else (a.up_ReturnPCLog_Cumul - a.up_bm_ReturnPCLog_Cumul) / (exp(a.up_ReturnPCLog_Cumul) - exp(a.up_bm_ReturnPCLog_Cumul))
			end as AttrScalingPeriod
	) sp

Starting from dat, which represents the cumulative time series, I calculate another set of Cariño scaling factors, this time based on the cumulative returns. Then, instead of multiplying by the factor, the log effects are divided by it, in effect returning the data to arithmetic.
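With the function in place, the cumulative time series can be queried like the other functions in this series - assuming the parameters are in the same order as used above, a call could look like this:

	select *
	from PerfTest.uftPerformanceAttributionCumulative(@portfolioik, @bm_portfolioik, @from_date, @to_date)
	order by StructureIK, LevelNr, Level1, Level2, Level3, Level4, Todate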

Further Thinking

Since the function is quite long, I will try to summarize below what happens within it, phase by phase (a condensed toy-data sketch of the same pipeline follows the list):

  1. Get all the items and their time series beginning dates from uftPerformanceAttribution.
  2. Join the items with the Calendar table to get a date series for every item from its beginning date to @to_date, the required end date.
  3. Get the daily attribution data for each item and date from uftPerformanceAttribution. Where a time series has no daily data, fill in the upper level data needed for scaling from uftPerformanceTimeSeries.
  4. Calculate the required daily Cariño scaling factors for each item, separately for portfolio return contribution, benchmark return contribution, and attribution effects.
  5. Calculate the daily cumulative time series of logarithmic returns and scaled contribution and attribution effects using window functions.
  6. Calculate the Cariño scaling factors based on the cumulative returns, and use these to scale the contribution and attribution time series "back" to arithmetic cumulative effects.
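To make the phases concrete, here is a minimal, self-contained sketch of the same pipeline run on the two-period example from the beginning of this post. The table variable and its column names are made up for this illustration, and the gap filling and the zero-excess-return special case are left out:

	declare @toy table (Todate int, Rp float, Rb float, Allocation float, Selection float)
	insert into @toy values (1, 0.02, 0.01, 0.006, 0.004), (2, 0.02, 0.01, 0.006, 0.004)

	;with daily as (
		-- phase 4: daily Cariño factor (rp - rb) / (Rp - Rb), where r = log(1 + R)
		select *,
			(log(1.0 + Rp) - log(1.0 + Rb)) / nullif(Rp - Rb, 0.0) as AttrScaling
		from @toy
	), dat as (
		-- phase 5: cumulative log returns and scaled effects using window sums
		select Todate,
			sum(log(1.0 + Rp)) over (order by Todate) as rp_cum,
			sum(log(1.0 + Rb)) over (order by Todate) as rb_cum,
			sum(Allocation * AttrScaling) over (order by Todate) as Alloc_cum,
			sum(Selection * AttrScaling) over (order by Todate) as Sel_cum
		from daily
	)
	-- phase 6: divide by the period factor (rp - rb) / (Rp - Rb), now based on cumulative returns
	select Todate,
		exp(rp_cum) - exp(rb_cum) as ExcessReturn_Cumul,
		Alloc_cum / ((rp_cum - rb_cum) / (exp(rp_cum) - exp(rb_cum))) as Allocation_Cumul,
		Sel_cum / ((rp_cum - rb_cum) / (exp(rp_cum) - exp(rb_cum))) as Selection_Cumul
	from dat

On the second date this reproduces the numbers of the worked example: the scaled-back allocation (1.218 %) and selection (0.812 %) sum up to the cumulative excess return (2.030 %).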

Now, clearly, it does not seem optimal for efficient processing to access heavy functions like uftPerformanceAttribution (which itself uses a function and some views behind it) three times. The only reason for this is that the time series may have gaps, while the calculation does not allow gaps if everything is to be accounted for. The gaps are actually created on purpose, in the view vMarketValueAndCashFlow: it drops out data when the Units of an instrument go to zero. While it would be easy to remove this and have every time series, once started, continue indefinitely, that may not be practical. In the real world, many portfolios may exist for dozens of years, with instruments like commercial paper, bonds, or FX forwards expiring frequently; one would end up with far more all-zero time series than actual data.

In addition to the above, I have not stored any intermediate results for this exercise. For anything at scale, like an asset management company with thousands of portfolios, it would not make sense to calculate everything from transactions and prices every time something is reported. Usually, some levels of intermediate results are stored, which makes subsequent processing much faster - with the obvious downsides of using storage space and having to make sure that all changes in the underlying data are re-processed and updated in the intermediate results. In a database, triggers and stored procedures would be some of the tools to consider for that purpose. On the other hand, since the Cariño scaling is dynamic, i.e. depends on the period chosen, it does not make much sense to process and store the final, "scaled back" time series.

Next: An Excel tool to look at the results.
