Assessing Countries Sustainability: A Group Multicriteria Decision Making Methodology Approach

Sustainability is a complex and abstract concept. However, policy-makers and representatives of global and regional associations need to assess and track the sustainable development of countries and regions to define a sustainability strategic path. The objective of this research is to propose and validate a methodology to define a simple but proper sustainability index that serves as a proxy for the identification of the segments of most and least advanced countries according to their achievement of the sustainable development goals defined by the United Nations (UN). Several well-known quantitative methodologies are used to first define a summarized index of sustainable development. Second, multicriteria decision-making methods are applied to determine the relative importance of the elements or dimensions comprising the sustainability concept. Then, the simulated judgments of a group of experts are used to compute a group weight vector by applying the Fuzzy Analytic Hierarchy Process (FAHP). Different aggregation methods are used to compute the importance that decision-makers assign to the several dimensions of sustainability. Finally, segments of countries generated with the clustering algorithm k-means are rated to identify sustainability benchmark segment(s) and groups of countries in need of support to attain the UN sustainability goals.

growth which drive the transformative actions required to assure an inclusive, sustainable and resilient future for people (UN, 2020). The SDGs can be divided into three broad categories: 1) extensions of MDGs, 2) inclusiveness, infrastructure and industrialization, and 3) environmental protection and sustainable urbanization.
The Sustainable Development Report (UN, 2019) presents national and regional SDG indexes and dashboards that summarize the assessment of countries based on the distance to the SDG targets. However, the elusiveness of the sustainability concept and the interdependence of its components contribute to the imprecision of these global indexes and of other indexes and indicators that have been proposed to assess SD (Van de Kerk & Manuel, 2008). Campagnolo, Eboli, Farnia and Carraro (2018b) and Mensah (2019) also noticed that the limited availability of data for all countries, and the interrelationships between the SD dimensions, represent a problem for public administrators who must monitor the sustainability progress of countries and regions to define strategies that satisfy the expectations of different stakeholders (e.g., local governments and global organizations). Mensah (2019) argued that further clarification of the SD concept, as well as the identification of key indicators associated with its three main dimensions, can help to develop a better global SD index.
Sustainability is an abstract concept or latent variable whose magnitude needs to be assessed with the use of tangible and measurable indicators. Under this perspective, measuring SD requires identifying the dimensions that comprise the concept and defining a minimum number of indicators related to each dimension. The resulting SD index is a proxy variable of sustainable development, and its validity depends on the clarity of the concept definition, how well it is distinguished from other concepts (e.g., economic development vs. sustainability), and the extent to which the indicators used to measure SD are logically and highly correlated with its dimensions. In certain cases, the sub-dimensions or components of the main dimensions comprising a concept are also latent variables that need to be finally expressed in terms of observable indicators (Bauldry, Bollen, & Adair, 2015). Due to the large number of indicators that may be associated with the latent variables, the identification of key ones helps to overcome the difficulty of data availability and simplifies the measurement model proposed to operationalize the variable.
The identification of regions that face similar challenges in attaining the SDGs is relevant to international and local organizations that need to identify which regions require major attention, monitor their progress, and allocate resources to decrease the distance to the SDG targets at the same rate as other regions. The objective of this research is to propose and validate a methodology to define a simple but appropriate sustainability index that serves as a proxy for the identification of segment(s) of most and least advanced countries in terms of the achievement of the sustainable development goals defined by the United Nations. The proposed methodology is based on well-known quantitative methods to first define a brief but meaningful SD index that represents the intuitive and abstract dimensions of sustainable development. Then, countries are rated and classified according to this summarized SD index. Finally, the multidisciplinary perspective of experts is taken into consideration to identify the segment(s) of countries that, according to the decision-makers responsible for monitoring sustainable development, require(s) more attention.
This work is organized as follows: in the second section following this introduction, a review of the sustainability indexes that have been proposed is presented. This review also presents a summarized discussion of the group multi-criteria methods that can be used to support the group decision of quickly identifying the more advanced and disadvantaged segments of countries in terms of their sustainable development. The third section describes the methodology proposed to assist decision-makers and the databases used to demonstrate its applicability. The fourth section demonstrates the applicability of the methodology by using the indicators available in the databases of the Sustainable Development Report 2019 (UN, 2019) and the Sustainable Society Index (SSI, 2016). Although only the indicators available in these two databases were used to exemplify the proposed methodology, the procedure represents a generic approach to cluster entities based on their sustainability status, be these entities countries, cities, or organizations. Finally, the last section states conclusions, limitations and future work.

Sustainability Indexes
According to Mori and Christodoulou (2012) and Campagnolo, Carraro, Eboli, Farnia, Parrado, and Pierfederici (2018a), the conceptual requirements of a sustainability index are: 1) to consider the three key dimensions of sustainability, namely economic, social, and environmental; 2) to capture the external impact of SD beyond the city/region/country; 3) to define tangible indicators to assess the sustainability dimensions; and 4) to be applicable worldwide to enable comparisons and directions for improvement. However, developing a sustainability index or SD proxy is a difficult task due to the vagueness of the sustainability concept (Mensah, 2019). The Index for Sustainable Economic Welfare (ISEW) (Daly & Cobb, 1989), as well as the Genuine Progress Indicator (GPI) (Cobb, Goodman, & Wackernagel, 1999), are centered on the economic dimension of sustainability, and particularly on the improvement of the gross domestic product (GDP). In contrast, the Commitment to Development Index (CDI-2006), developed by the Center for Global Development and published yearly since 2003 by an independent not-for-profit organization in the USA (Stapleton & Garrod, 2008), is mainly focused on human wellbeing. Specifically, the CDI reviews the level of support given by 21 countries to poor countries such that they may attain prosperity, good governance, and security. This composite index comprises indicators related to aid, investment, environment, security, and technology that are associated with six equally weighted dimensions.
Another outstanding set of sustainable development indicators is the Commission on Sustainable Development (CSD) Indicators, which also resulted from a collaborative process between the Division for Sustainable Development (DSD) and the Statistics Division, both within the United Nations Secretariat (DESA, 2007). The initial set of 134 indicators comprising the CSD was voluntarily pilot-tested by 22 countries from 1996 to 1999. Most participant countries concluded that the number of indicators was too large. Consequently, a reduction to 58 indicators covering policy-oriented themes was presented in 2001. The CSD indicators provide a relevant framework for the discussion of how to achieve the SDGs based on national indicators. The overlap between the sets of indicators comprising the CSD and the MDGs has created some confusion among policy-makers and professionals. However, their general purpose is different: the CSD indicators only provide a reference to track the progress toward national goals related to SD, while the MDGs are focused on monitoring the progress toward the achievement of global goals. Additionally, the CSD addresses a broader range of issues to cover the three main dimensions of sustainable development, whereas the MDGs have a more limited coverage biased towards human welfare.
The Sustainable Society Index (SSI) is a simplified index that integrates the three main dimensions of sustainability in a simple and transparent way. The SSI comprises only 22 indicators, grouped into 5 categories, and is based on the definition of sustainable development of the Brundtland Commission. The SSI has been published and refined since 2006 and gives an appropriate insight into the sustainability level of 150 countries (Van de Kerk & Manuel, 2008). This index represents an effort to capture the broad concept of sustainability with a manageable number of indicators grouped into seven sub-dimensions that represent a further decomposition of the three main dimensions of sustainability. Although the SSI and other existing indexes do not provide a completely valid measure of SD, the SSI represents a simple and quick proxy of sustainability.

Group Multi-Criteria Decision-Making Methods
Multi-Criteria Decision Methods (MCDM) can provide the support required for the multidisciplinary management of decisions such as the ones that representatives of pro-sustainability organizations face when deciding which countries to support and how. There is an extensive variety of MCDM methods that can be used; among the most popular are ELimination Et Choix Traduisant la REalité (ELECTRE), Multi-Attribute Utility Theory (MAUT), Analytical Hierarchy Process (AHP) and its variants, Fuzzy Analytical Hierarchy Process (FAHP) and Analytic Network Process (ANP), Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), and Preference Ranking Organization Method for Enrichment Evaluations (PROMETHEE). No method can be considered ideal for every decision problem, but the use of an unsuitable method can represent a potential risk. Therefore, when selecting an MCDM method, it is necessary to carefully analyze the decision context and match the settings with the qualities and disadvantages of available methods (Wątróbski et al., 2019). According to Guarini, Battisti, and Chiovitti (2018), this requires the identification of exogenous and endogenous variables related to the decision-making problem. Exogenous variables are determined by the decision context, while endogenous variables are defined after the analysis of the MCDM literature.
Any MCDM method is generally structured in two macro-phases according to Guarini et al. (2018). We extended this process to the following three macro-phases:
1) Identification of the different alternatives and their evaluation (performance) based on a set of criteria and sub-criteria. This phase results in the specification of an evaluation matrix.
2) Processing of the entries of the evaluation matrix to rate (describe), sort, rank, or select alternatives. The procedure to follow and its results depend on the method being used, and may vary from a simple linear combination of criteria (i.e., a compensatory approach) to a complex knowledge-driven approach that considers tradeoffs between criteria (i.e., outlining the Pareto frontier) or the attainment of ideal solutions.
3) Aggregation of the judgments of the decision-makers (DMs) in the case of group decision-making problems. Group decision-making is a common method when the desired decision needs to fulfill the expectations of different stakeholders or the solution affects the interests of the participant DMs.
For several decision contexts (e.g., characterization of marketing segments, selection of research projects for financial support, ranking of suppliers according to their performance) important exogenous variables are recognized. For the decision problem of selecting which countries with similar sustainability problems require most attention, two exogenous variables at the first macro-phase were identified: the operationalization of a set of criteria that may be highly complex and imprecise, and the large dimensionality of the decision space, i.e., the size of the evaluation matrix. In the second macro-phase, the exogenous variables identified are the expected solution and the technical support. Finally, in the third macro-phase, meaningful exogenous variables are the diversity in the preferences, experiences, and backgrounds of the DMs, and the vagueness of their judgments regarding the relative importance of one criterion over another.
The first exogenous variable, the expected solution, is stated as "identifying the target segments that require particular attention" either because their sustainability strategy is outstanding or because they are worst ranked in SD. Regarding the second exogenous variable (size of the evaluation matrix), the main problem is the availability of reliable data for all countries of the world. This justifies the construction of a sustainability index based on a selected set of commonly available indicators. At the second macro-phase, we recognized that the technical support available for decision making may be low. Thus, easy-to-apply multi-criteria methods that can be implemented with the support of basic tools are preferred, particularly in the case of developing countries. Additionally, if the number of evaluation criteria is too large (over five), the number of pairwise comparisons required by some well-known multi-criteria methods (e.g., AHP and its extensions) becomes excessive for practical purposes and frequently results in inconsistencies.

Methodology
The MCDM classification framework proposed by Wątróbski et al. (2019), which organizes methods according to their appropriateness to solve different decision-making problems, and the practical guidelines offered by Guarini, Battisti, and Chiovitti (2018) allowed the identification of the following difficulties associated with group multi-criteria decision-making (GMCDM): 1) the multi-dimensionality and latent nature of sustainability require a simplified index to be used as a proxy; 2) the large number of alternatives to evaluate (195 countries in the world) required to only n-1 while FAHP requires n(n-1)/2 pairwise comparisons (Chen et al., 2011), which may become excessive when the number of criteria (n) is too large. Additionally, Fuzzy LinPreRa assures consistency of judgments, and it is more convenient for acquiring the judgments of DMs through an online questionnaire. Thus, the method is a simpler and more convenient alternative to EAM-FAHP (Herrera-Viedma, Herrera, Chiclana, & Luque, 2004). We compared the two approaches to support the recommendation of using Fuzzy LinPreRa instead of EAM-FAHP.
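The difference between the two comparison counts quoted above can be made concrete with a short sketch. The function names are hypothetical and introduced only for illustration; the formulas n(n-1)/2 and n-1 are the ones stated in the text.

```python
def full_pairwise_comparisons(n: int) -> int:
    """Judgments needed to fill the upper triangle of an n x n comparison matrix."""
    return n * (n - 1) // 2


def linprera_comparisons(n: int) -> int:
    """Judgments requested from each DM under Fuzzy LinPreRa (n - 1, as stated in the text)."""
    return n - 1


# Print both counts for a few hypothetical numbers of criteria.
for n in (3, 5, 7, 10):
    print(n, full_pairwise_comparisons(n), linprera_comparisons(n))
```

The gap grows quadratically with n, which is why the reduced scheme becomes attractive once more than a handful of criteria are compared.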
b. The linguistic pairwise comparisons are transformed into triangular fuzzy numbers (TFNs) according to Table 1. For example, if the linguistic judgment of DM1 when comparing dimension 1 (HW) versus dimension 2 (EW) is "very strong importance", then entry A(1,2) of the individual decision matrix A1 is the TFN (5, 7, 9). Using this assignment scheme, the entries of the upper triangular part of matrix A are found. The entries of the lower triangular part of matrix A are simply the reciprocals of the corresponding TFNs. For example, A(1,2) = (5, 7, 9) while A(2,1) = (1/9, 1/7, 1/5).
c. The modified extent analysis method (EAM) (Chang, 1992; Kabir & Hasin, 2011) is then applied to obtain synthetic extent values corresponding to the relative weights or priorities assigned to the criteria by each DM.
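Steps b and c can be illustrated with a minimal sketch. It builds a reciprocal matrix of TFNs and then applies the standard extent analysis procedure (fuzzy synthetic extents, degrees of possibility, normalization); this is an assumption made for illustration and not the modified variant used in the study. Only A(1,2) = (5, 7, 9) comes from the text; the other two judgments and all function names are hypothetical.

```python
def reciprocal(tfn):
    """Reciprocal of a TFN (l, m, u) is (1/u, 1/m, 1/l)."""
    l, m, u = tfn
    return (1.0 / u, 1.0 / m, 1.0 / l)


def build_matrix(n, upper):
    """Fill an n x n fuzzy matrix from upper-triangle judgments {(i, j): TFN}."""
    one = (1.0, 1.0, 1.0)
    A = [[one] * n for _ in range(n)]
    for (i, j), tfn in upper.items():
        A[i][j] = tuple(float(x) for x in tfn)
        A[j][i] = reciprocal(tfn)
    return A


def possibility(s2, s1):
    """Degree of possibility V(s2 >= s1) for two TFNs."""
    l1, m1, _ = s1
    _, m2, u2 = s2
    if m2 >= m1:
        return 1.0
    if l1 >= u2:
        return 0.0
    return (l1 - u2) / ((m2 - u2) - (m1 - l1))


def extent_analysis_weights(A):
    """Crisp weights from a fuzzy comparison matrix (standard extent analysis)."""
    n = len(A)
    # Component-wise row sums of the TFNs.
    rows = [tuple(sum(A[i][j][k] for j in range(n)) for k in range(3))
            for i in range(n)]
    total = tuple(sum(r[k] for r in rows) for k in range(3))
    # Synthetic extent: row sum multiplied by the inverse of the grand total.
    S = [(r[0] / total[2], r[1] / total[1], r[2] / total[0]) for r in rows]
    # Each criterion keeps its smallest degree of possibility over the others.
    d = [min(possibility(S[i], S[k]) for k in range(n) if k != i)
         for i in range(n)]
    norm = sum(d)
    return [x / norm for x in d]


# A(1,2) = (5, 7, 9) is taken from the text; the other judgments are hypothetical.
upper = {(0, 1): (5, 7, 9), (0, 2): (3, 5, 7), (1, 2): (1, 1, 3)}
A = build_matrix(3, upper)
print(A[1][0])
print(extent_analysis_weights(A))
```

With strongly one-sided judgments such as these, the unmodified procedure may assign a zero weight to dominated criteria, so the output should be read as an illustration of the mechanics rather than of the weights reported in the study.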
The alternative approach to compute crisp weights is Fuzzy LinPreRa (Chen et al., 2011). The main steps of this methodology are summarized as follows:
a. Each DM expresses his/her judgments through only (n-1) linguistic pairwise comparisons between the n dimensions at the same level of the hierarchy. For example, at the second level of the hierarchy there are n = 7 sub-dimensions. Then only six comparisons are required, while FAHP requires 7(6)/2 = 21 comparisons. Thus, the LinPreRa method considerably reduces the number of required pairwise comparisons.
b. The linguistic pairwise comparisons are transformed into TFNs. Following the procedure described in Chen et al. (2011), a transformation function $g(a_{ij}) = \tfrac{1}{2}(1 + \log_{11} a_{ij})$ is used to compute the entries of the A matrix. For example, the A(1,2) entry of the individual matrix A1 (associated with DM1) after applying the transformation function becomes (0.84, 0.91, 0.96).
d. The transformation function $f(x^{K}) = (x^{K} + c)/(1 + 2c)$, with $K = L, M, U$, is used to prevent negative fuzzy numbers while preserving reciprocity and additive consistency.
e. Finally, a defuzzification method is used to compute the individual weight vectors. Common methods for defuzzification are the mean of maximum (MOM), center of area (COA), and α-cut methods. We applied the COA method because it is simple and practical (Talon & Curt, 2017), thus resolving the difficulty of limited technical support.
jms.ccsenet.org Journal of Management and Sustainability Vol. 10, No. 1
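The two transformation functions and the defuzzification step can be sketched as follows. The logarithm base 11 and the TFN (5, 7, 9) come from the text. Two points are assumptions made only for this illustration: c is taken to be the largest amount by which an entry falls outside [0, 1], and COA is implemented as the centroid (l + m + u)/3 of a triangular fuzzy number. All function names are hypothetical.

```python
import math


def g(a, base=11.0):
    """Map a multiplicative judgment a to a preference value 0.5 * (1 + log_base a)."""
    return 0.5 * (1.0 + math.log(a, base))


def to_preference_tfn(tfn):
    """Apply g to the lower, middle, and upper values of a TFN."""
    return tuple(g(x) for x in tfn)


def rescale(x, c):
    """f(x) = (x + c) / (1 + 2c): maps the interval [-c, 1 + c] onto [0, 1]."""
    return (x + c) / (1.0 + 2.0 * c)


def coa(tfn):
    """Centroid of a triangular fuzzy number (assumed form of the COA method)."""
    l, m, u = tfn
    return (l + m + u) / 3.0


# "Very strong importance" judgment from the text.
p12 = to_preference_tfn((5, 7, 9))
print([round(x, 2) for x in p12])  # [0.84, 0.91, 0.96]

# Hypothetical c = 0.2: an out-of-range value is pulled back into [0, 1],
# and a reciprocal pair x, 1 - x still sums to 1 after rescaling.
c = 0.2
print(rescale(-0.2, c), rescale(0.3, c) + rescale(0.7, c))

print(coa(p12))
```

The first printed line reproduces the entry (0.84, 0.91, 0.96) reported in step b, which confirms that the base of the logarithm in the transformation is 11.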
The weight vectors of DM1, computed with the extent analysis method of Fuzzy AHP and with Fuzzy LinPreRa, are shown in Table 2. An overall good correspondence between the weights or priorities computed with each method is observed for all sustainability sub-dimensions (the geometric mean of the percentage differences equals 8% and the median is 8.5%). The largest percentage difference (45.00%) corresponds to the priority assigned to the well-balanced society sub-dimension of Human Wellbeing. Regarding the main dimensions, the weights assigned to Economic Wellbeing registered the largest discrepancy. These differences may be explained by the forced consistency implicit in the Fuzzy LinPreRa method. But because Fuzzy LinPreRa reduces the number of pairwise comparisons and prevents inconsistencies, its application is recommended to facilitate the comparison process and computations.
The next step of the methodology is the aggregation of the individual weight vectors to get a group weight vector. When priorities are similar, namely when there is consensus among the DMs regarding the importance of each dimension, the usual procedure is to average the weights by using the geometric mean (Forman & Peniwati, 1998). However, when individual priorities are heterogeneous, other methods have been proposed. In this work, we applied the following aggregation methods: the Weighted Geometric Mean Data Envelopment Analysis (WGMDEA) method, the MEDINT method, and the Adopted Extreme Values (ADEXTREME) method. The first method, WGMDEA, is a hybrid method that combines the weighted geometric mean method (WGMM) with the data envelopment analysis (DEA) method (Wang & Chin, 2009). The Median Interval (MEDINT) method (Grošelj et al., 2011) and the ADEXTREME method apply a different computational approach based on the use of interval comparison matrices.
MEDINT uses values below and above the median for constructing the lower and upper bounds of the interval, while ADEXTREME aggregates individual judgments into a group interval that reflects all individual judgments but in which the minimum and maximum values have the highest influence. Table 3 summarizes the results obtained by applying the three aggregation methods. WGMDEA is not directly compared with the other two methods because it is based on a linear programming (LP) approach, while the other methods use a different rationale; they are, however, simpler and thus attractive for practical purposes. The comparison of the three aggregation methods shows that the ranking of importance of the three main dimensions and seven sub-dimensions comprising the sustainability concept is preserved. The group of DMs assigns the highest priority to Human Wellbeing; this result agrees with the MDGs. Regarding the sub-dimensions' priorities, satisfying the basic needs of the world population (access to education, nutrition, sanitation, etc.) is the most important component of Human Wellbeing. Meanwhile, preserving natural resources and biodiversity has the highest priority among the sub-dimensions of Environmental Wellbeing. For the last sustainability dimension, Economic Wellbeing, keeping a steady economic growth was judged the most relevant component.
Minor percentage differences occur when WGMDEA is used to aggregate the individual weight vectors computed by using EAM-FAHP and Fuzzy LinPreRa (discarding the cases where weights are equal, the geometric mean of the percentage differences equals 5.52%). Differences between the MEDINT and ADEXTREME methods are observed in the case of sub-dimensions. Again, discarding the cases where weights are equal, the geometric mean of the percentage differences is 20.88%. Because the difference among the three aggregation methods is below 30% (a bound considered in statistics as low variability), any one of them could be a reasonable choice. However, the final recommendation is to use the WGMDEA method because the other two methods only offer possibilities that some alternatives are better than others, and therefore they are expected to be more imprecise.
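For reference, the simplest aggregation mentioned above, the geometric mean used when the DMs largely agree, can be sketched in a few lines. This is not WGMDEA, MEDINT, or ADEXTREME; it only illustrates the consensus case. The three individual weight vectors are hypothetical and the function name is introduced for illustration.

```python
import math


def geometric_mean_aggregate(weight_vectors):
    """Element-wise geometric mean of individual weight vectors, renormalized to sum to 1."""
    k = len(weight_vectors)
    n = len(weight_vectors[0])
    raw = [math.prod(w[i] for w in weight_vectors) ** (1.0 / k) for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical weights of three DMs for the three main dimensions
# (Human, Environmental, Economic Wellbeing).
individual = [
    [0.50, 0.30, 0.20],
    [0.45, 0.35, 0.20],
    [0.55, 0.25, 0.20],
]
print(geometric_mean_aggregate(individual))
```

The renormalization step is needed because the element-wise geometric means of normalized vectors do not, in general, sum to one.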
The next step of the methodology is to segment the 154 countries into clusters with similar sustainability indicators. K-means is a simple unsupervised machine learning algorithm that requires only basic technical support to be implemented. This method has been extensively used in several areas with satisfactory empirical results (Jain, 2010). The rationale of the algorithm is to find a partition of the alternatives such that the variability between the clusters' centroids (vectors of sub-dimension averages), SSB, is maximized while the variability within the clusters, SSE, is minimized. If each alternative is assigned to its own cluster (k = m countries), then SSE = 0; thus the goal is to identify a small value of k that still provides a low SSE.
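The two quantities that k-means trades off can be computed directly for any given partition. The sketch below uses four hypothetical two-dimensional points instead of the country data, and the function names are introduced only for illustration.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse_ssb(points, labels):
    """Within-cluster (SSE) and between-cluster (SSB) sums of squares."""
    overall = centroid(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    sse = ssb = 0.0
    for members in clusters.values():
        c = centroid(members)
        sse += sum(sq_dist(p, c) for p in members)
        ssb += len(members) * sq_dist(c, overall)
    return sse, ssb


# Hypothetical data: two tight pairs of points.
points = [(0, 0), (0, 1), (4, 4), (5, 4)]
print(sse_ssb(points, [0, 0, 1, 1]))  # two clusters
print(sse_ssb(points, [0, 1, 2, 3]))  # one cluster per point, so SSE is 0
```

Because SSE + SSB equals the total sum of squares for any partition, minimizing SSE and maximizing SSB are the same objective for a fixed data set.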
To define the number of clusters we applied the elbow method, which consists of plotting the number of clusters against the sum of squares of error (SSE) and identifying the point k where SSE stabilizes or does not decrease substantially (Syakur, Khotimah, Rochman, & Satoto, 2018). The application of this method resulted in a partition of the alternatives (countries) into k = 10 clusters, a solution which was judged appropriate because the groups are homogeneous and well separated. Table 4 shows the centroids of each cluster.
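A minimal sketch of the elbow computation is given below. It is a plain implementation of Lloyd's k-means with a deterministic farthest-first initialization, which is a design choice made here so that the example is reproducible and is not taken from the study. The nine two-dimensional points are hypothetical and form three well-separated groups.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Pick k starting centers: the first point, then repeatedly the farthest one."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_sse(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the final within-cluster sum of squares."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        new_centers = [
            tuple(sum(col) / len(grp) for col in zip(*grp)) if grp else centers[i]
            for i, grp in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Hypothetical data: three groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0), (10, 1), (11, 0)]
curve = {k: kmeans_sse(points, k) for k in range(1, len(points) + 1)}
for k, sse in curve.items():
    print(k, round(sse, 2))
```

In this toy data set the SSE drops sharply up to k = 3 and changes little afterwards, which is the pattern the elbow method looks for; with real indicator data the bend is usually less pronounced and calls for judgment.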
The cluster centroids represent the (Euclidean) distance to the associated sustainability goal, whose ideal value is 1. The larger the distance, the smaller the progress of the segment in achieving the SDG for which the sub-dimension serves as proxy. Authors such as Stapleton and Garrod (2008) have shown there is little justification for relaxing the equal weights assumption in the specific case of the Commitment to Development Index (CDI). Following this approach, the global sustainable development score of each cluster is simply the sum of the sub-dimension scores, that is, $G_{gs} = \sum_i SD_i$. According to this index, the most advanced cluster of countries is C10 with a score of 2.80 (the smallest average distance with respect to the ideal of 1.0) and the least advanced cluster is C8 with a global SD score of 4.79. Figure 4 provides a graphical representation of the clusters in the space of the three main dimensions of sustainability, namely Human Wellbeing, Environmental Wellbeing, and Economic Wellbeing. The stars, which are visually well separated, are the clusters' centroids while the dots are the countries evaluated. In Figure 4, it can also be appreciated that, as expected, the maximum Euclidean distance between clusters corresponds to the "extreme" clusters, i.e., C10-C8. However, according to Table 3, there are important differences in the priorities the group of DMs assigned to the SD sub-dimensions and consequently to the main dimensions of sustainability. Therefore, a weighted global score that takes into consideration the judgments of the group of DMs regarding which sustainability aspects require more attention is finally proposed as the last step of the methodology (see Figure 1).
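The equal-weights score and a weighted alternative can be sketched as follows. The unweighted sum follows the definition given above. The weighted form, a weighted sum of the same sub-dimension distances using the group weight vector, is an assumption made only for illustration. The centroid values, the weights, and the cluster labels are hypothetical and are not those of Tables 3 and 4.

```python
def global_score(sub_scores):
    """Equal-weights global score: the sum of the sub-dimension distances."""
    return sum(sub_scores)


def weighted_global_score(sub_scores, weights):
    """Assumed weighted form: sum of w_i * SD_i with group weights w_i."""
    return sum(w * s for w, s in zip(weights, sub_scores))


# Hypothetical centroids (distance to each of seven goals) for two clusters.
centroids = {
    "cluster A": [0.30, 0.35, 0.40, 0.50, 0.45, 0.40, 0.40],
    "cluster B": [0.75, 0.70, 0.65, 0.60, 0.70, 0.70, 0.69],
}
# Hypothetical group weight vector over the seven sub-dimensions (sums to 1).
weights = [0.25, 0.20, 0.10, 0.15, 0.10, 0.10, 0.10]

for name, sd in centroids.items():
    print(name, round(global_score(sd), 2), round(weighted_global_score(sd, weights), 3))

# A larger score means a larger distance to the goals, so the most advanced
# cluster is the one with the smallest score.
ranking = sorted(centroids, key=lambda name: weighted_global_score(centroids[name], weights))
print(ranking)
```

Since both scores measure distance to the goals, the ranking is read in ascending order, and the weights only change the ordering when clusters differ in the sub-dimensions that the DMs consider most important.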

Clusters ar weight or p
Notice we resulting f "disutility" Table 5 rep four aggre clusters ca coincidenc recommen New ieved faced environmental disasters and civil wars that affect their performance in other sustainability dimensions. More specifically, some of these countries (e.g., Haiti and Sudan) are characterized by chronic widespread poverty and food insecurity that prevent fulfilling the basic needs of the population (access to education, public health, etc.), which is the most important sub-dimension of Human Wellbeing (Berridge, 2020; Gibson, 2020; Hendriks, Reis, Sostakova, & Berckmoes, 2020; Kwan et al., 2020; Swesi, El-Anis, & Islam, 2020).

Conclusions
The methodology proposed in this study offers a practical approach to assess the sustainability of different entities in order to design strategies and assign resources to advance sustainable development. The use of well-known and relatively easy-to-implement quantitative methodologies allows the reduction in the number of indicators required to operationalize the sustainability concept; enables the computation of group weight vectors associated with each of the dimensions and sub-dimensions comprising sustainability; identifies segments of countries with similar degrees of advance in the set of sustainability proxies; and ranks the segments to identify extreme groups.
The comparison of different approaches to obtain individual crisp priorities for the sustainability dimensions and to aggregate the ambiguous judgments of DMs with different perspectives and backgrounds indicates that Fuzzy LinPreRa and WGMDEA are suitable methods that offer practical and computational advantages. Finally, the segmentation of countries reduces the number of alternatives to evaluate and simplifies the identification of countries that face similar challenges and thus can follow the same sustainability path.
The main limitations of this study are: a) the proposed methodology was demonstrated only for the specific case of assessing the advance of the world's countries in the attainment of the sustainability goals established by the United Nations, and b) the SD index is based only on the indicators associated with the SDG and SSI indexes. Therefore, extensions to this work include the use of additional indicators to improve the content validity of the SD index and the application of the methodology to other cases where the sustainability of different entities needs to be assessed as part of a GMCDM problem in which the desired solution is the generation of ordered clusters that consider the preference degree between sustainability aspects.