Environmental Performance Index 2008 [BETA]

Sensitivity Analysis
Michaela Saisana and Andrea Saltelli, Econometrics and Applied Statistics Group, Institute for the Protection and Security of the Citizen, Joint Research Centre of the European Commission

Assessing the robustness of the 2008 EPI results requires evaluating both the uncertainties underlying the Index and the sensitivity of the country scores and rankings to the methodological choices made during the development of the Index. To test this robustness, the EPI team has continued its partnership with the Joint Research Centre (JRC) of the European Commission in Ispra, Italy. A summary of the JRC sensitivity analysis follows. A more detailed version is available for download as the full report.

Any composite index like the EPI requires subjective judgments, including the selection of indicators, the data treatment, the choice of aggregation method, and the weights applied to the indicators. Because the quality of the EPI depends on the soundness of its assumptions, good practice requires evaluating confidence in the Index and assessing the uncertainties associated with its development process. To ensure the validity of the policy conclusions extracted from the EPI, it is important that the EPI’s sensitivity to alternative methodological assumptions be adequately studied. Sensitivity analysis permits the examination of the framework of a composite index by looking at the relationship between information flowing in and out of it (Saltelli et al. 2008). Using sensitivity analysis, we can study how variations in EPI scores and ranks derive from different sources of variation in the assumptions. Sensitivity analysis also demonstrates how each indicator depends upon its input information. It is thus closely related to uncertainty analysis, which aims to quantify the overall uncertainty in a country’s score (or rank) as a result of the cumulative effect of uncertainties in the index construction. A combination of uncertainty and sensitivity analyses can help to gauge the robustness of the EPI results, to increase the EPI’s transparency, to identify the countries that improve or decline under certain assumptions, and to help frame the debate around the use of the Index.

The validity of the EPI scores and respective rankings is assessed by evaluating how sensitive the EPI is to the assumptions that have been made about its structure and the aggregation of the 25 underlying indicators. The sensitivity analysis carried out for the EPI mainly concerns: (1) the measurement error of the raw data, (2) the choice of capping at selected targets for the 25 indicators, (3) the choice to correct for skewed distributions in the indicator values, (4) the weights assigned to the indicators and/or to the subcomponents of the index, and (5) the aggregation function at the policy level.

The main conclusions are summarized below.

How do the EPI ranks compare to the ranks under alternative methodological approaches?

The frequency table of a country’s rank summarizes the positions that the country can take anywhere on the 149-rank ladder (grouped in blocks of ten) when different combinations of the five types of uncertainty mentioned previously are taken into account. A total of 40,000 simulations were run in order to cover the space of uncertainties present in the 2008 EPI. We discuss ranks rather than scores because non-parametric statistics are more appropriate given the non-normal character of the data and the scores. In the relevant literature, the median rank is proposed as a summary measure of a rank distribution. The median rank across all combinations of assumptions indicates that, for 1 out of 2 countries in the EPI, the difference between the EPI rank and the most likely (median) rank is less than 15 positions (recall that 149 countries are studied). Thus, for half of the countries studied, the modest sensitivity of the EPI ranking to the five assumptions (possible measurement error in the raw data, the correction of skewed data distributions, the use of target values, the weighting of the indicators, and the aggregation function at the policy level) implies a reasonably high degree of robustness. For the remaining half of the countries, EPI performance is highly sensitive to the methodological choices in the Index and should thus be considered merely indicative.

A discussion of the top performing countries is illustrative. The top ten performing countries in the EPI include Switzerland, Sweden, Norway, Finland, Costa Rica, Austria, New Zealand, Latvia, Colombia and France. However, the Monte Carlo simulations indicate that most of these countries should be positioned much lower. Switzerland, for example, has only a 31% probability of being ranked in the top ten, and the probabilities are even lower for Austria, Latvia and France. In 98% of our simulations, New Zealand scores in the top ten, followed by Finland, Costa Rica and Colombia. Panama, whose EPI rank is 32, should actually be considered a top ten performing country, given that its score is among the top ten 73% of the time.
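The two summary statistics used above, the median rank and the share of simulations in which a country lands in the top group, can be illustrated with a minimal sketch. The function names, the three made-up countries and their scores are all hypothetical; this is not the JRC code, only one plausible way of summarizing a set of simulated scores.

```python
import statistics

def rank_scores(scores):
    """Return {country: rank}; rank 1 is the highest score (ties broken by name)."""
    ordered = sorted(scores, key=lambda c: (-scores[c], c))
    return {country: position for position, country in enumerate(ordered, start=1)}

def summarize_ranks(simulated_scores, top_n=10):
    """simulated_scores: list of {country: score} dicts, one dict per simulation."""
    ranks = {}
    for scores in simulated_scores:
        for country, r in rank_scores(scores).items():
            ranks.setdefault(country, []).append(r)
    return {
        country: {
            "median_rank": statistics.median(rs),
            # Share of simulations in which the country is ranked within top_n.
            "p_top": sum(r <= top_n for r in rs) / len(rs),
        }
        for country, rs in ranks.items()
    }

# Three hypothetical simulation runs for three made-up countries.
runs = [
    {"A": 90.0, "B": 85.0, "C": 70.0},
    {"A": 80.0, "B": 88.0, "C": 75.0},
    {"A": 91.0, "B": 86.0, "C": 60.0},
]
summary = summarize_ranks(runs, top_n=1)
```

In this toy example country A is ranked first in two of the three runs, so its median rank is 1 and its estimated probability of being the top performer is 2/3.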

Which are the most volatile countries and why?

There are several countries with a relatively large difference between their best and worst rank. A very high volatility of more than 80 positions is found for Hungary (rank: 23), Denmark (25), Albania (27), Ireland (34), Uruguay (36), Bosnia & Herzegovina (48), Belgium (57), El Salvador (65), Laos (101) and Tanzania (113). The volatility of those countries is due to the combined effect of all five assumptions, although the most influential input factors are (1) the use of a geometric versus arithmetic average aggregation function at the policy level and (2) the use of equal weighting versus Factor Analysis weights at the indicator level.

What if measurement error is incorporated?

A normally distributed random error term, with mean zero and a standard deviation equal to the observed standard deviation of each indicator, was added to the raw data. The country most affected by this assumption is Luxembourg (rank: 31), whose rank would drop by 53 positions. At the other extreme, the Philippines (rank: 61) would improve its rank and be placed in the 10th position. Overall, the introduction of measurement error in the raw data has a median impact of 9 ranks and a 90th percentile impact of 29 positions. In other words, this assumption leaves 1 out of 2 countries almost unaffected (a change of less than 9 positions), but 1 out of 10 countries would shift by more than 29 positions.
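The perturbation described above can be sketched as follows. The data are random placeholders, the composite score is a plain unweighted mean of the indicators (an assumption made only to keep the example short, not the EPI aggregation), and all function names are hypothetical.

```python
import random
import statistics

def rank_positions(scores):
    """Rank 1 = highest score; returns a list of ranks aligned with scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

def perturb(data, rng, scale=1.0):
    """Add N(0, scale * sd_j) noise, where sd_j is the observed sd of indicator j."""
    n_ind = len(data[0])
    sds = [statistics.pstdev([row[j] for row in data]) for j in range(n_ind)]
    return [[row[j] + rng.gauss(0.0, scale * sds[j]) for j in range(n_ind)]
            for row in data]

def rank_shifts(data, rng, scale=1.0):
    """Absolute rank change per country between the original and the noisy data."""
    base = rank_positions([statistics.mean(row) for row in data])
    noisy = rank_positions([statistics.mean(row) for row in perturb(data, rng, scale)])
    return [abs(b - n) for b, n in zip(base, noisy)]

rng = random.Random(0)
# 30 made-up countries x 5 made-up indicators.
data = [[rng.uniform(0, 100) for _ in range(5)] for _ in range(30)]
shifts = rank_shifts(data, rng)
median_shift = statistics.median(shifts)
p90_shift = statistics.quantiles(shifts, n=10)[-1]  # 90th percentile cut point
```

The median and 90th percentile of `shifts` play the same role as the "9 ranks" and "29 positions" reported above, although the numbers from this toy data have no relation to the EPI results.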

What if skewed distributions are not winsorized?

Winsorization was not found to have a significant impact on the EPI ranking. In the best case, South Africa (rank: 97) improves its position by 16 ranks, while in the worst case, Botswana (rank: 98) declines by 21 ranks. For 1 out of 2 countries, the impact of this assumption is only 5 positions, while 1 out of 10 countries shifts by more than 11 positions, but not more than 21.
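For readers unfamiliar with the term, winsorization replaces the most extreme values of a skewed indicator with the nearest less extreme value. The sketch below trims a fixed number `k` of values at each end; the cut-off actually used for the EPI is not stated here, so `k` and the sample values are purely illustrative.

```python
def winsorize(values, k=1):
    """Replace the k smallest and k largest values with the nearest remaining value."""
    if 2 * k >= len(values):
        raise ValueError("k is too large for this sample")
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    # Clamp every value into the [low, high] interval, preserving the input order.
    return [min(max(v, low), high) for v in values]

raw = [1.0, 2.0, 3.0, 4.0, 250.0]   # made-up indicator with one extreme value
clean = winsorize(raw, k=1)
```

With `k=0` the data are returned unchanged, which corresponds to the "not winsorized" scenario tested above.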

What if capping at target values for the indicators is not undertaken?

Luxembourg (rank: 31) and Laos (rank: 101) would see the greatest shift in their ranks (a decline of 12 and 15 positions respectively). In the best case, El Salvador (rank: 65) would improve by 9 positions. Overall, for 1 out of 2 countries, the impact of this assumption is only 3 positions, while 1 out of 10 countries shifts by more than 7 positions, but not more than 15. Thus, capping at the indicators’ performance targets has only a small impact on the EPI ranking.
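The effect of capping can be illustrated with a minimal sketch that assumes a 0 to 100 proximity-to-target scale, where 100 means the target is met. The function name, the `worst` reference value and the numbers are hypothetical and chosen only for illustration.

```python
def proximity_to_target(value, worst, target, cap=True):
    """Scale value so that worst -> 0 and target -> 100; optionally cap at 100."""
    score = 100.0 * (value - worst) / (target - worst)
    # With capping, exceeding the target earns no extra credit.
    return min(score, 100.0) if cap else score

capped = proximity_to_target(120.0, worst=0.0, target=100.0)               # 100.0
uncapped = proximity_to_target(120.0, worst=0.0, target=100.0, cap=False)  # 120.0
```

Without capping, a country that exceeds the target on one indicator can offset weaker results elsewhere, which is why removing the cap can move some countries in the ranking.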

What is the impact of alternative weighting schemes?

Four alternative weighting schemes, each with its own implications and advantages, are considered the most representative in the literature on composite indicators and were tested in the current analysis:
  • current weighting vs. FA-derived weights at the indicator level;
  • current weighting vs. equal weighting at the indicator level;
  • current weighting vs. equal weighting at the subcategory level;
  • current weighting vs. equal weighting at the policy level.

The simulation study showed that all of these scenarios have a significant influence on the EPI ranking. The scenarios with the biggest effect were equal weighting at the policy level, equal weighting at the indicator level, and Factor Analysis derived weights at the indicator level. In any of these three cases, 1 out of 2 countries shifts by fewer than 15 positions with respect to the original EPI ranking, while 1 out of 10 countries shifts by more than 50 positions.
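A small example shows how a change of weights alone can reverse the order of two countries. The indicator values and the "current" weights below are invented for illustration and do not correspond to the actual EPI weights.

```python
def weighted_score(indicators, weights):
    """Weighted arithmetic mean of the indicator scores."""
    return sum(w * x for w, x in zip(weights, indicators)) / sum(weights)

# Two made-up countries with three made-up indicator scores each.
countries = {"A": [90.0, 40.0, 60.0], "B": [60.0, 70.0, 65.0]}
current = [0.6, 0.2, 0.2]   # hypothetical "current" weights
equal = [1.0, 1.0, 1.0]     # equal weighting

under_current = {c: weighted_score(x, current) for c, x in countries.items()}
under_equal = {c: weighted_score(x, equal) for c, x in countries.items()}
```

Country A leads under the hypothetical current weights because they favour its strongest indicator, while country B leads under equal weighting.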

What if the aggregation function is geometric instead of arithmetic?

When a non-compensatory aggregation is performed at the policy level using the geometric mean function instead of the arithmetic mean, the effect on the EPI ranking is moderate. Sri Lanka, Peru and Egypt improve their ranks by 18 positions or more, while the greatest decline is observed for Uruguay (down more than 51 positions). Overall, for 1 out of 2 countries, the impact of this assumption is merely 5 positions, while 1 out of 10 countries shift by more than 18 positions (up to 51 positions).
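The difference between the two aggregation functions can be sketched as follows, assuming two policy-level scores per country with equal weights. The scores are invented, the function names are hypothetical, and the geometric mean as written requires strictly positive scores.

```python
import math

def arithmetic(scores, weights):
    """Weighted arithmetic mean: low scores are fully offset by high ones."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def geometric(scores, weights):
    """Weighted geometric mean (scores must be > 0): penalizes unbalanced profiles."""
    total = sum(weights)
    return math.exp(sum(w * math.log(s) for w, s in zip(weights, scores)) / total)

weights = [1.0, 1.0]
balanced = [70.0, 70.0]     # made-up country with even performance
unbalanced = [95.0, 50.0]   # made-up country strong on one policy objective only
```

Under the arithmetic mean the unbalanced country scores higher (72.5 against 70), whereas under the geometric mean it scores lower (about 68.9 against 70), so a country with very uneven policy-level performance can lose rank when the aggregation function changes.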

Download: Sensitivity Analysis full report (.pdf)
