Nonparametric Estimation of the Family of Risk Measures Based on Progressive Type II Censored Data

Tail risk analysis plays a central strategic role in risk management and focuses on measuring risk in the tail regions of extreme losses. Among the tasks of tail risk analysis, the measurement of tail risk variability is less addressed in the literature: neither the theoretical results nor the inference methods are fully developed, which makes model implementation difficult. Practitioners therefore lack measurement methods to understand and evaluate tail risks, even when they have large amounts of valuable data in hand. In this paper, we derive nonparametric estimation methods for a class of variability measures based on distances among proportional hazards models, using progressively Type-II censored data, and we establish some properties of these estimators. Simulation studies assess the effectiveness of the proposed methods, and a real data set is analyzed for illustrative purposes. Some well-known variability measures, such as the Gini mean difference, Wang's right tail deviation, and the cumulative residual entropy, belong, up to a scale factor, to this class.


Introduction
In actuarial science, the study of large losses that occur with very small probability is referred to as right tail risk analysis. In this framework, the value-at-risk (VaR), one of the most popular risk measures, is still widely used by insurance companies and financial institutions owing to its conceptual simplicity. However, VaR has been criticized because it is not (in general) subadditive, and hence not coherent. One approach to this problem is to use the tail conditional expectation (TCE), the conditional expectation of the losses above the VaR; this measure is coherent and gives the expected size of losses exceeding the VaR. Since tail events are subject to variability, attention in the actuarial and financial literature has turned to combining tail-loss measures (such as VaR and TCE) with measures of tail variability. [1] combine TCE and different versions of tail variances into a single measure and derive explicit expressions in the framework of multivariate elliptical distributions. [2] consider additional tail variability measures to produce the so-called Gini shortfall. [3] obtain a new risk measure by combining TCE and the shortfall deviation. Following this approach, we estimate a wide class of variability measures based on distances among proportional hazards models. This family, besides including some of the previously cited measures, also contains other variability measures that sometimes perform better, such as Wang's right tail deviation ([4]) and the cumulative residual entropy ([5]). [6] constructed a distortion risk measure based on a loss variable and a benchmark variable in extreme scenarios to study the co-movement of the two variables. [7] extended the tail Gini functional of a univariate random variable to a bivariate random vector as a tail risk variability measure that incorporates both the marginal risk severity of the loss variable and the tail dependence structure of the two variables. We study nonparametric estimation for these measures.
Censoring schemes arise naturally in survival, reliability, and medical studies. The progressive Type-II censoring scheme is one of the most popular and has received great attention in recent decades. It can be described as follows. Assume that $n$ identical units $X_1, X_2, \ldots, X_n$ are placed on a life-testing experiment, each with lifetime distribution having pdf $f$ and cdf $F$. Suppose the censoring scheme is $(m; R_1, R_2, \ldots, R_m)$, where $R_i$ denotes the number of units withdrawn from the $n - i - \sum_{j=1}^{i-1} R_j$ surviving units at the $i$-th ($i = 1, 2, \ldots, m$) stage of censoring, and $m$ is the prefixed number of observed failures. Under this scheme the progressively censored order statistics $X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n}$ are obtained. In such a life-testing experiment we successively observe the failing units, but at the $i$-th failure, $R_i$ units are randomly withdrawn from the remaining units; the sample size is thus reduced progressively by $R_i + 1$ (one failed unit and $R_i$ removed items). This scheme contains the conventional Type-II right censoring scheme ($R_1 = R_2 = \cdots = R_{m-1} = 0$, $R_m = n - m$) and the complete sampling scheme ($R_1 = R_2 = \cdots = R_m = 0$, $m = n$) as special cases. More details about progressive censoring may be found in [8]. Let $U_{i:m:n} = F(X_{i:m:n})$ be the $i$-th progressively censored order statistic from the uniform $(0,1)$ distribution. Now define
$$V_i = \frac{1 - U_{m-i+1:m:n}}{1 - U_{m-i:m:n}}, \qquad i = 1, 2, \ldots, m,$$
with $U_{0:m:n} = 0$. It is clear that the $V_i$'s are independent random variables with $V_i \sim \mathrm{Beta}\!\left(i + \sum_{j=m-i+1}^{m} R_j,\, 1\right)$. It can be seen that
$$U_{i:m:n} = 1 - \prod_{j=m-i+1}^{m} V_j.$$
The rest of the paper is structured as follows. Different nonparametric methods of estimation, together with some theoretical properties, for a wide class of variability measures among proportional hazards models based on Type-II progressive censoring are developed in Section 2. Simulation experiments and analyses of real data are presented in Section 3. Section 4 contains conclusions.
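The sampling scheme above can be sketched in code. The following is a minimal illustration (not the paper's implementation) of the standard Balakrishnan–Sandhu transformation for simulating a progressive Type-II censored sample from any distribution with a known quantile function, built directly on the independent $V_i \sim \mathrm{Beta}(a_i, 1)$ variables described above.

```python
import numpy as np

def progressive_type2_sample(n, R, quantile, rng=None):
    """Simulate X_{1:m:n} <= ... <= X_{m:m:n} under censoring scheme R.

    Transformation: with W_i ~ U(0,1) independent, set
    V_i = W_i**(1/a_i), where a_i = i + R_m + ... + R_{m-i+1}, so that
    V_i ~ Beta(a_i, 1); then U_{i:m:n} = 1 - V_m * V_{m-1} * ... * V_{m-i+1},
    and finally X_{i:m:n} = F^{-1}(U_{i:m:n}).
    """
    rng = np.random.default_rng(rng)
    R = np.asarray(R)
    m = len(R)
    assert n == m + R.sum(), "scheme must satisfy n = m + sum(R)"
    W = rng.uniform(size=m)
    a = np.arange(1, m + 1) + np.cumsum(R[::-1])   # a_i = i + sum of last i removals
    V = W ** (1.0 / a)                             # V_i ~ Beta(a_i, 1), independent
    U = 1.0 - np.cumprod(V[::-1])                  # increasing uniform order statistics
    return quantile(U)

# Example: n = 20 units, m = 10 observed failures, one unit removed per failure;
# exponential lifetimes via the quantile F^{-1}(u) = -log(1 - u)
x = progressive_type2_sample(20, [1] * 10, lambda u: -np.log(1 - u), rng=0)
assert len(x) == 10 and np.all(np.diff(x) > 0)
```

Taking `R = [0] * (m - 1) + [n - m]` recovers conventional Type-II censoring, and `R = [0] * n` with `m = n` the complete sample.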
[9] considered the following class of variability measures. Let $X$ be a non-negative random variable with distribution function $F$ and survival function $\bar F = 1 - F$. Given $\alpha, \beta > 0$ with $\alpha < \beta$, we have
$$D_{\alpha,\beta}(X) = \int_0^{\infty} \left[ \bar F^{\alpha}(x) - \bar F^{\beta}(x) \right] dx. \qquad (1)$$
In particular, $2 D_{1,2}(X)$ is the Gini mean difference, $D_{1/2,1}(X)$ is Wang's right tail deviation, and the cumulative residual entropy arises as the limit of $D_{1,\beta}(X)/(\beta - 1)$ as $\beta \to 1$.
We develop various nonparametric estimates of the quantity $D_{\alpha,\beta}(X)$ based on Type-II progressively censored samples. [9] represented $D_{\alpha,\beta}(X)$ as
$$D_{\alpha,\beta}(X) = E\!\left[ X \left( \alpha \bar F^{\alpha-1}(X) - \beta \bar F^{\beta-1}(X) \right) \right], \qquad (2)$$
which follows by integration by parts. Also, it can easily be seen that $D_{\alpha,\beta}(X)$ can be expressed as
$$D_{\alpha,\beta}(X) = \int_0^1 \left[ (1-u)^{\alpha} - (1-u)^{\beta} \right] dF^{-1}(u). \qquad (3)$$
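As a quick numerical sanity check, assuming the family takes the integral form $D_{\alpha,\beta}(X) = \int_0^\infty [\bar F^\alpha(x) - \bar F^\beta(x)]\,dx$, the Gini mean difference and Wang's right tail deviation of a unit exponential both equal 1, since $\int_0^\infty (e^{-\alpha x} - e^{-\beta x})\,dx = 1/\alpha - 1/\beta$:

```python
import numpy as np

# Sketch: numerically evaluate D_{a,b}(X) = ∫_0^∞ [Fbar(x)**a - Fbar(x)**b] dx
# (assumed form of the family) for X ~ Exp(1), where Fbar(x) = exp(-x).
def D(alpha, beta, sf, upper=80.0, num=400_000):
    x = np.linspace(0.0, upper, num)
    dx = x[1] - x[0]
    s = sf(x)
    return np.sum(s**alpha - s**beta) * dx     # simple Riemann sum

exp_sf = lambda x: np.exp(-x)
gmd = 2 * D(1.0, 2.0, exp_sf)   # Gini mean difference: 2 * (1/1 - 1/2) = 1
rtd = D(0.5, 1.0, exp_sf)       # Wang's right tail deviation: 1/(1/2) - 1 = 1
```

Both quantities evaluate to 1 up to discretization error, matching the closed form.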

Moment approximation method
In this method, the difference operator proposed by [15] for estimating the entropy is employed; it was also used for estimating the extropy by [16] and [17]. The method is based on approximating the derivative of the quantile function in (3) by differences of consecutive progressively censored order statistics. Based on [16], $\nu(X)$ is a variability measure if it satisfies the following intuitive properties:

1. Nonnegativity: for every risk $X$, $\nu(X) \ge 0$.
2. Translation invariance: $\nu(X + c) = \nu(X)$ for every constant $c$.
3. Positive homogeneity: $\nu(cX) = c\,\nu(X)$ for every $c > 0$.

For all risks $X$, we have $\hat D_{\alpha,\beta}(X) \ge 0$.

Proof. When $m \to n$ (with all $R_i = 0$), the progressively Type-II censored sample becomes the complete sample, and $\hat D_{\alpha,\beta}(X)$ converges to the complete-sample estimator of $D_{\alpha,\beta}(X)$, which was proved to be consistent.
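To make these properties concrete, the following sketch checks them empirically for the Gini mean difference (a member of the class up to the factor 2), using the well-known order-statistics identity for its empirical version. This is an illustration of the axioms, not the paper's estimator.

```python
import numpy as np

# Empirical Gini mean difference, GMD(X) = E|X - Y| for X, Y iid, used here to
# illustrate the properties a variability measure nu should satisfy:
# nu(X) >= 0, nu(X + c) = nu(X), nu(cX) = c * nu(X).
def gmd(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    # identity: (1/n^2) * sum_{i,j} |x_i - x_j| = (2/n^2) * sum_i (2i - n - 1) x_(i)
    return 2.0 / n**2 * np.sum((2 * i - n - 1) * x)

rng = np.random.default_rng(0)
x = rng.exponential(size=5000)
assert gmd(x) >= 0                               # nonnegativity
assert abs(gmd(x + 3.0) - gmd(x)) < 1e-10        # translation invariance
assert abs(gmd(2.0 * x) - 2.0 * gmd(x)) < 1e-10  # positive homogeneity
```

Translation invariance holds exactly here because the coefficients $2i - n - 1$ sum to zero.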

Kernel-based method
Suppose that the density $f(x)$ is estimated by a kernel function $K$ as
$$\hat f(x) = \frac{1}{mh} \sum_{i=1}^{m} K\!\left( \frac{x - X_{i:m:n}}{h} \right),$$
where $h > 0$ is a bandwidth and $K$ is a smooth, symmetric kernel satisfying $\int K(u)\, du = 1$ and $\int u K(u)\, du = 0$. Our estimate is proposed taking the kernel to be the standard normal density function.
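A minimal Gaussian-kernel density estimate of this kind can be sketched as follows; the bandwidth rule used here (Silverman's rule of thumb) is our assumption for illustration, not necessarily the paper's choice.

```python
import numpy as np

# Gaussian-kernel density estimate evaluated at points x from the observed
# values (treated here simply as data points). Silverman's rule of thumb is
# an assumed bandwidth choice for the sketch.
def kde_gaussian(x, data, h=None):
    data = np.asarray(data, dtype=float)
    n = len(data)
    if h is None:
        h = 1.06 * data.std(ddof=1) * n ** (-1 / 5)
    u = (np.asarray(x)[..., None] - data) / h
    return np.exp(-0.5 * u**2).sum(axis=-1) / (n * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
data = rng.normal(size=20_000)
grid = np.linspace(-3, 3, 7)
est = kde_gaussian(grid, data)
# the estimate should be close to the standard normal pdf on the grid
true = np.exp(-0.5 * grid**2) / np.sqrt(2 * np.pi)
assert np.max(np.abs(est - true)) < 0.05
```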

Monte Carlo estimation method
The third estimator of $D_{\alpha,\beta}(X)$ in (2) is proposed as follows, based on Monte Carlo estimation.
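A sketch of the Monte Carlo route, assuming the expectation representation $D_{\alpha,\beta}(X) = E[X(\alpha \bar F^{\alpha-1}(X) - \beta \bar F^{\beta-1}(X))]$ obtained by integration by parts: averaging this transform over simulated draws estimates the measure, checked here against the exponential closed form $1/\alpha - 1/\beta$.

```python
import numpy as np

# Monte Carlo sketch based on the (assumed) representation
#   D_{a,b}(X) = E[ X * (a * Fbar(X)**(a-1) - b * Fbar(X)**(b-1)) ].
def D_mc(alpha, beta, x, sf):
    s = sf(x)
    return np.mean(x * (alpha * s**(alpha - 1) - beta * s**(beta - 1)))

rng = np.random.default_rng(2)
x = rng.exponential(size=500_000)                # X ~ Exp(1), Fbar(t) = e^{-t}
est = D_mc(1.0, 2.0, x, lambda t: np.exp(-t))
# for Exp(1), the integral form gives D_{1,2} = 1/1 - 1/2 = 0.5
```

In the censored, nonparametric setting, $\bar F$ would itself be replaced by an estimate; the sketch uses the true survival function only to validate the representation.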

Simulation study
In this section, we present results of a Monte Carlo simulation study assessing the performance of the risk measure estimates under different censoring schemes, including one-step-from-the-left, one-step-from-the-right, and mixed schemes with equal removals. We compare the performance of the proposed estimates in terms of the mean square error (MSE), based on a Monte Carlo simulation with 10,000 iterations. Estimates depending on the window size were computed using $m = \sqrt{n} + 0.5$, the choice used for extropy estimation by [15]. For a given $n$, $m$, and censoring scheme, we generate a progressively censored sample from the exponential distribution with mean 1. In each case we compute all the estimates $\hat D^{(j)}_{\alpha,\beta}$, $j = 1, 2, 3$, of the risk measures, replicate the process 10,000 times, and compute the MSE of each estimate. Table 1 shows that the risk measure estimates are affected, to different degrees, by the sample size, the censoring scheme, and the parent distribution of the data. As expected, the MSE decreases for all estimates as the sample size increases. For the exponential distribution, Table 1 shows that all risk measure estimates are satisfactory and perform almost equally under the MSE criterion, with one exception: the kernel-based estimates produce high MSE values under the one-step-from-the-right and mixed-with-equal-removals censoring schemes, while one of the other estimates produces the lowest MSE values under almost all censoring schemes, except the one-step-from-the-left schemes, where a slightly lower MSE is attained by another estimate. Accordingly, if the data come from an exponential distribution, that estimate mostly performs better than the others.
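The replication-and-MSE protocol can be illustrated in miniature. The sketch below uses complete (uncensored) samples and the empirical Gini mean difference purely to show how the loop works and how the MSE shrinks with the sample size; it is not the paper's study.

```python
import numpy as np

def gmd_hat(x):
    """U-statistic estimate of E|X - Y| via the order-statistics identity."""
    x = np.sort(x)
    n = len(x)
    i = np.arange(1, n + 1)
    return 2.0 / (n * (n - 1)) * np.sum((2 * i - n - 1) * x)

rng = np.random.default_rng(3)
true, reps = 1.0, 2000                 # GMD of Exp(1) equals 1
mse = {}
for n in (20, 50, 100):
    est = np.array([gmd_hat(rng.exponential(size=n)) for _ in range(reps)])
    mse[n] = np.mean((est - true) ** 2)
# the MSE should decrease roughly like 1/n as the sample size grows
```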

Numerical Example
To illustrate the ideas discussed above, we consider a data set presented by [17] (p. 156). The amounts are inflation-adjusted (to 1981, using the U.S. Residential Construction Index) hurricane losses from 35 hurricanes that occurred between 1949 and 1980 and caused losses in excess of $5,000,000; the numbers shown are the amounts in excess of $5 million, in units of $1,000. We treat these numbers as outcomes of independent and identically distributed non-negative random variables $X_1, \ldots, X_{35}$. The lognormal distribution fits the data adequately; its parameter MLEs are $\hat\mu = 11.0447$ and $\hat\sigma = 1.6828$. Using these MLEs, the risk measures of $X$ estimated from the complete data are given in Table 2. The results in Table 2 are consistent with those of the simulation study for moderate and large samples: the kernel-based method performs better than the other estimates when the data come from a lognormal distribution, while the other estimates also provide satisfactory results in some cases.
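The lognormal fit reduces to Gaussian maximum likelihood on the log scale. The sketch below uses synthetic losses with parameters close to the reported MLEs (an assumption for illustration; the hurricane amounts themselves are in [17]).

```python
import numpy as np

# Lognormal MLE: mu_hat and sigma_hat are the mean and (ddof=0) standard
# deviation of log(X). Synthetic data stand in for the hurricane losses.
rng = np.random.default_rng(4)
losses = rng.lognormal(mean=11.0447, sigma=1.6828, size=35)
logx = np.log(losses)
mu_hat, sigma_hat = logx.mean(), logx.std()   # ML estimates on the log scale
```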

Discussion and conclusions
In this paper, we have considered the problem of estimating a class of risk measures based on Type-II progressively censored samples. Nonparametric methods involving moment approximation, kernel-based estimation, and Monte Carlo estimation have been discussed. The best estimate of a risk measure evidently depends on the parent model of the data, the sample size, and the censoring scheme. Generally, the kernel-based estimates are competitive with the other estimates in most of the considered cases. In future work, we will examine nonparametric methods for estimating tail variability measures in the multivariate case, to capture the risk exposure caused by the co-movement of several financial variables. Estimating tail variability while incorporating a tail dependence structure is a challenging and interesting task.