This package is a collection of the methods developed in (Bodnar, Parolya, and Thorsén 2021). It constructs sequences of estimated Global Minimum Variance (GMV) portfolios. The concept of a portfolio is most often used in finance as a way to invest in financial assets (e.g. stocks), which is the application we will have in mind. There are other applications of these types of estimators, such as signal processing. The simplest example of a portfolio is the equally weighted (EW) portfolio, which says that you should invest an equal amount in every asset. If you have p assets, then each asset gets 1/p of your money.
To install the package, run
devtools::install_github("Statistics-In-Portfolio-Theory/DOSPortfolio")
which requires that devtools is installed. To load the package, run
library(DOSPortfolio)
The main interface to the package is DOSPortfolio::DOSPortfolio, which acts as a wrapper for DOSPortfolio::wGMVOverlapping and DOSPortfolio::wGMVNonOverlapping. The main advantage of using the interface is that you get input validation and more informative errors. You could call the two functions directly; they will most likely not throw an error (except in specific degenerate cases), but the results might not make sense. Let's simulate some data. In our simple example we assume the (log) returns follow a t-distribution with 5 degrees of freedom.
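The simulation step can be sketched as follows. This is a minimal sketch, not the original code: the dimensions n = 50 and p = 15 are taken from the row count and the output shown in this document, while the seed is a hypothetical choice, so the numbers will not reproduce the printed weights.

```r
set.seed(1)  # hypothetical seed, used only to make the sketch reproducible
n <- 50      # number of observations (rows), as referred to in the text
p <- 15      # number of assets (columns), as in the printed output

# i.i.d. draws from a t-distribution with 5 degrees of freedom,
# arranged so that each column is the return series of one asset
data <- matrix(rt(n * p, df = 5), nrow = n, ncol = p)
dim(data)
```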
Notice that the data is a matrix in long format: each column is assumed to be the time series of one asset. We assume that the returns are ordered in time, so the observation in the first row was observed before the observation in the second row, and so forth. To construct the DOS portfolio we need to specify the number of reallocation points and when they occur. Since data is a matrix in long format, we work with the row index. The DOS portfolio is then constructed as
reallocation_points <- c(25, 42)
# use the first subsample to estimate the relative loss
(portfolios <- DOSPortfolio(data, reallocation_points))
#> $weights
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 0.06510282 0.06573879 0.06462835 0.06513924 0.06737282 0.06552301
#> [2,] 0.06466248 0.06547752 0.06405441 0.06470915 0.06757165 0.06520099
#> [,7] [,8] [,9] [,10] [,11] [,12]
#> [1,] 0.06799579 0.07042752 0.06718712 0.06615079 0.07082025 0.06596092
#> [2,] 0.06837005 0.07148649 0.06733367 0.06600553 0.07198981 0.06576220
#> [,13] [,14] [,15]
#> [1,] 0.06688559 0.06761335 0.06345362
#> [2,] 0.06694724 0.06787992 0.06254890
#>
#> $shrinkage_type
#> [1] "non-overlapping"
#>
#> attr(,"class")
#> [1] "DOSPortfolio"
The first parameter we configure is the number of reallocation points and when they happen. These are given in the reallocation_points vector, which specifies when you recompute the weights. In this case we reweight our portfolio at the 25th and the 42nd observation. Notice that we do not include the last 8 observations: there are only 2 reallocation points, and they do not include the last observed value. If you wanted to include a third reallocation point, just add the number of rows (50 in this case) to the reallocation_points vector. However, as we will see later on, this will not work in our example, though there is a simple workaround.
Each row of portfolios$weights is a shrunk GMV portfolio. At each reallocation point the portfolio is computed as a convex combination involving the previously estimated portfolio; at the first transition, the target portfolio takes that role. This is the “transition”, which is made optimally and is specifically tailored to work in high dimensions, when the number of assets is very large and possibly close to the number of observations.
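The idea of a convex combination can be illustrated in base R. This is only a sketch under stated assumptions: it assumes that the new estimate entering the combination is the sample GMV portfolio, the helper names gmv_weights and shrink_towards are hypothetical, and the shrinkage intensity psi is a fixed illustrative value, not the optimal intensity that the package computes.

```r
# Sample GMV weights: S^{-1} 1 / (1' S^{-1} 1), with S the sample covariance
gmv_weights <- function(x) {
  S <- cov(x)
  w <- solve(S, rep(1, ncol(x)))  # solves S w = 1 without forming S^{-1}
  w / sum(w)                      # normalise so the weights sum to one
}

# Convex combination of a new estimate and the previously held portfolio
shrink_towards <- function(w_new, w_prev, psi) {
  stopifnot(psi >= 0, psi <= 1)
  psi * w_new + (1 - psi) * w_prev
}

set.seed(1)
x <- matrix(rt(25 * 5, df = 5), nrow = 25, ncol = 5)  # 25 observations, 5 assets
b <- rep(1 / 5, 5)  # equally weighted target portfolio
psi <- 0.3          # hypothetical shrinkage intensity, for illustration only
w1 <- shrink_towards(gmv_weights(x), b, psi)
sum(w1)
```

Because both inputs sum to one, any convex combination of them also sums to one, so the result is again a valid portfolio.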
The string in portfolios$shrinkage_type states which type of shrinkage estimator was used to make the transitions. There are two options, "non-overlapping" and "overlapping". These can handle quite different scenarios, which we can illustrate by extending the reallocation_points vector to
reallocation_points <- c(25, 42, 50)
# This will not work
try(portfolios <- DOSPortfolio(data, reallocation_points))
#> Error : Non-overlapping estimator can not handle concentration ratios above one.
#> Consider excluding one (or more) break point(s) or provide more data.
However, this will work
(portfolios <- DOSPortfolio(data, reallocation_points, shrinkage_type = "overlapping"))
#> $weights
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 0.06510282 0.06573879 0.06462835 0.06513924 0.06737282 0.06552301
#> [2,] 0.05645876 0.06061000 0.05336169 0.05669649 0.07127602 0.05920153
#> [3,] 0.05578955 0.06021294 0.05248943 0.05604286 0.07157821 0.05871213
#> [,7] [,8] [,9] [,10] [,11] [,12]
#> [1,] 0.06799579 0.07042752 0.06718712 0.06615079 0.07082025 0.06596092
#> [2,] 0.07534247 0.09121541 0.07006391 0.06329931 0.09377896 0.06205997
#> [3,] 0.07591125 0.09282479 0.07028663 0.06307855 0.09555641 0.06175796
#> [,13] [,14] [,15]
#> [1,] 0.06688559 0.06761335 0.06345362
#> [2,] 0.06809569 0.07284610 0.04569367
#> [3,] 0.06818937 0.07325121 0.04431871
#>
#> $shrinkage_type
#> [1] "overlapping"
#>
#> attr(,"class")
#> [1] "DOSPortfolio"
The issue concerns what is called the “concentration ratio”, which we denote $c_i = p/n_i$, where $n_i$ is the size of the $i$th subsample. In our case the subsample sizes are 25, 17 and 8. The “non-overlapping” shrinkage estimator is derived under the assumption that $c_i \in (0, 1)$ for every $i$, because the sample covariance matrix of each subsample needs to be non-singular. For the “overlapping” estimator we do not need such a rigid assumption: we only need $c_1 \in (0, 1)$, and the remaining subsamples may have $p > n_j$ for $j = 2, ..., T$. The non-overlapping method moves in windows of size $n_i$, whereas the overlapping method extends the data to include all observations so far, using $N_I=\sum_{i=1}^I n_i$ observations to estimate the portfolio weights at the $I$th reallocation point.
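The concentration ratios for this example can be checked with a few lines of base R. The variable names below are illustrative and not part of the package.

```r
p <- 15                                # number of assets
reallocation_points <- c(25, 42, 50)

# Non-overlapping: each subsample contains only the rows since the last point
n_i <- diff(c(0, reallocation_points))  # subsample sizes 25, 17, 8
c_non_overlapping <- p / n_i            # the third ratio is 15 / 8 > 1

# Overlapping: each estimate uses all rows up to the reallocation point
N_i <- cumsum(n_i)                      # cumulative sizes 25, 42, 50
c_overlapping <- p / N_i                # all ratios stay below one

round(c_non_overlapping, 3)
round(c_overlapping, 3)
```

The third non-overlapping window has fewer observations than assets, which is why the call with three reallocation points fails for "non-overlapping" but works for "overlapping".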
The last two parameters that can be configured are the target portfolio, which we denote $b$, and an initial seed/estimate for the relative loss parameter. The target portfolio can be any portfolio whose weights sum to one. The default is the EW portfolio mentioned previously, but you can supply any other portfolio of your choice.
reallocation_points <- c(25, 42)
new_target <- runif(p, -1, 1)
new_target <- new_target / sum(new_target)
(portfolios <- DOSPortfolio(data, reallocation_points, target_portfolio = new_target))
#> $weights
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 0.08857778 0.07660415 0.1091866 0.1046776 0.04919528 0.07723626 0.04885424
#> [2,] 0.09291820 0.08069763 0.1079243 0.1006843 0.05099107 0.08311377 0.04476377
#> [,8] [,9] [,10] [,11] [,12] [,13]
#> [1,] -0.003160989 0.06368204 0.07418163 -0.02312972 0.08016004 0.06388415
#> [2,] -0.005106790 0.06010254 0.07547120 -0.01899556 0.08033195 0.06313534
#> [,14] [,15]
#> [1,] 0.05465194 0.1353990
#> [2,] 0.05140521 0.1325631
#>
#> $shrinkage_type
#> [1] "non-overlapping"
#>
#> attr(,"class")
#> [1] "DOSPortfolio"
The initial estimate for the relative loss parameter demands some more detail. The relative_loss argument refers to the quantity $r_0 = \mathbf{1}^\top \Sigma^{-1} \mathbf{1} \, b^\top \Sigma b - 1$, which describes how the variance of our target portfolio, $b^\top \Sigma b$, relates to the variance of the GMV portfolio, $1/(\mathbf{1}^\top \Sigma^{-1} \mathbf{1})$. A value of 0 would indicate that the target portfolio has the same variance as the GMV portfolio, that is, the target portfolio is the GMV portfolio. The argument is NULL by default, which implies that we use a simple estimate of the quantity: we use the sample covariance matrix $S$ from the first subsample to estimate $\Sigma$, and $(1 - c)S^{-1}$ as a consistent estimate of its inverse. The factor $1 - c$ comes from the fact that the inverse sample covariance matrix is biased when $p$ is allowed to grow together with $n$, that is, when $p \to \infty$ such that $n > p$. We can use any other estimate, though we need to make sure that it is consistent, i.e. that the estimate $\hat{r}_0$ converges to $r_0$ in probability.
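The relative loss and its simple plug-in estimate can be sketched in base R. This is an illustration of the formulas described above, not the package's internal code; the helper name relative_loss_value and the simulated data are assumptions made for the example.

```r
# r0 = (1' Sigma^{-1} 1) * (b' Sigma b) - 1
relative_loss_value <- function(Sigma, Sigma_inv, b) {
  ones <- rep(1, length(b))
  gmv_precision <- as.numeric(t(ones) %*% Sigma_inv %*% ones)  # 1 / GMV variance
  target_variance <- as.numeric(t(b) %*% Sigma %*% b)
  gmv_precision * target_variance - 1
}

p <- 5
b <- rep(1 / p, p)  # equally weighted target

# Population check: with Sigma = I the EW portfolio is the GMV portfolio,
# so the relative loss is zero up to floating point error
r0_identity <- relative_loss_value(diag(p), diag(p), b)

# Plug-in estimate from a (simulated) first subsample
set.seed(1)            # hypothetical seed
n <- 25
x <- matrix(rt(n * p, df = 5), nrow = n, ncol = p)
S <- cov(x)
c_ratio <- p / n       # concentration ratio of the first subsample
r0_hat <- relative_loss_value(S, (1 - c_ratio) * solve(S), b)
r0_hat
```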
The statistical model we use to derive the methods is $Y_{n_i} = \mathbf{1}_{n_i}\mu^\top + X_{n_i}\Sigma^{1/2}$ for $i = 1, 2, ..., T$, where $T$ is the number of reallocation points in the model. A reallocation point is simply a point in time when we reweight the portfolio, that is, transition from one existing portfolio to another. Note that we can only observe $Y_{n_i}$, but when constructing GMV portfolios we are interested in estimating $\mu$ and $\Sigma$. We make the following assumptions
The first assumption is standard in finance, but it is up to the end-user to verify that it holds. The second assumption implies that we do not tell you when you should reweight a portfolio; we simply take the reallocation points as given. This assumption has little practical relevance; it is more philosophical and hard to test. The third assumption is also somewhat philosophical. The target portfolio is chosen by the end-user, but we need the quantity $b^\top \Sigma b$ to be finite for our methods to work. This places restrictions on how the true covariance matrix $\Sigma$ relates to $b$. It is an important assumption that is hard to verify!
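To make the statistical model concrete, one subsample $Y_{n_i} = \mathbf{1}_{n_i}\mu^\top + X_{n_i}\Sigma^{1/2}$ can be generated in base R as sketched below. The values of mu and Sigma are hypothetical, and drawing the entries of X as independent standard normals is an assumption made only for this illustration.

```r
set.seed(1)  # hypothetical seed
n_i <- 25    # size of one subsample
p <- 3       # number of assets

mu <- c(0.01, 0.02, 0.03)                    # hypothetical mean vector
Sigma <- matrix(c(1.0, 0.3, 0.1,
                  0.3, 1.0, 0.2,
                  0.1, 0.2, 1.0), nrow = p)  # hypothetical covariance matrix

# Symmetric square root of Sigma via its eigendecomposition
eig <- eigen(Sigma, symmetric = TRUE)
Sigma_sqrt <- eig$vectors %*% diag(sqrt(eig$values)) %*% t(eig$vectors)

# Noise matrix X with i.i.d. entries (assumed standard normal here)
X <- matrix(rnorm(n_i * p), nrow = n_i, ncol = p)

# Observed returns: every row is shifted by mu and coloured by Sigma^{1/2}
Y <- matrix(1, nrow = n_i, ncol = 1) %*% t(mu) + X %*% Sigma_sqrt
dim(Y)
```

Only Y would be available in practice; mu, Sigma and X are unobserved, which is why the portfolio weights have to be estimated.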
The sequences of portfolios are constructed to make as smooth transitions as possible from one portfolio to another. The transitions are made to be “optimal” when you have many assets p in comparison to observations n, which is usually characterized by the concentration ratio c = p/n ∈ (0, 1). When c is close to 1 we have very little data in comparison to how many parameters we need to estimate. Note that we do not cover c > 1 for non-overlapping estimators, though our aim is to do so in the future.