Evaluation of Performance Appraisal Methods through Appraisal Errors by Using Fuzzy VIKOR Method

Performance appraisal is vitally important both for employees’ motivation and for organizations’ effectiveness. However, an effective performance appraisal cannot be attained without a true and equitable performance appraisal method that is free of appraisal errors. The aim of this study is to evaluate performance appraisal methods with regard to appraisal errors and to rank them according to how free they are from these errors. To this end, the evaluations made by 29 Human Resources managers of 11 performance appraisal techniques against 8 potential appraisal errors are dealt with in this study. These evaluations were analyzed by the fuzzy VIKOR method, and a ranked list of the performance appraisal methods was obtained. According to the findings of the study, the most accurate alternative was the Graphic Rating Scales Method, while the least accurate was the Comparison Method. It is suggested that human resources managers choose the most appropriate appraisal method for their organizations by following the steps presented in this study.


Introduction
Although there is some hesitation about the usefulness of performance appraisal (Deming, 1986), it is of outstanding importance in management (Judge & Ferris, 1993; Murphy & Cleveland, 1995) owing to the feedback it provides, which is needed to guide managerial practices effectively and fairly.
In addition to its inherent nature as a tool of awareness for both employers and employees, performance appraisal can also enhance management effectiveness by integrating the aims and efforts of the employees, the employers, and the organization as a whole, since it helps establish precise communication linkages among them so as to achieve a synergetic outcome.
However, an effective performance appraisal is rarely achieved in actual business life, since its effectiveness depends mostly on the degree to which it provides equity. Unless the performance appraisal is perceived as highly equitable, its results tend to be meaningless and unusable for the employees, and they cause undesired results instead of effectiveness. Applying a Performance Appraisal Method (PAM) that is perceived as accurate and fair enhances its effectiveness and acceptance as well. That is why research on ratee and rater reactions to performance appraisal processes, including specifically designed rating formats, has seen a dramatic increase in the literature (e.g., Levy & Williams, 2004; Murphy & Cleveland, 1995; Hedge & Teachout, 2000; Jelley & Goffin, 2001; Nathan & Alexander, 1988; Wagner & Goffin, 1997).
Besides, because the fairness of performance appraisal is closely related to the evaluator, it is necessary to decrease the evaluators' errors, whether they result from intentional or unintentional attitudes and behaviors. One way of doing this is to choose the most appropriate PAM, that is, the one whose structure and/or methodology bears the minimum level of errors. In this study, it is accepted that a helpful and useful method can be obtained by evaluating PAMs with regard to the degree to which they decrease the rater's errors.
Rather than evaluating PAMs separately, as aimed in the present study, previous studies that compared PAMs have grouped them into two main categories, absolute vs relative performance rating methods (Roch et al., 2007; Goffin et al., 1996; Jelley & Goffin, 2001; Nathan & Alexander, 1988; Wagner & Goffin, 1997; Heneman, 1986), or into one category alone (Blume et al., 2009). Despite the dominance of the two-category grouping in research, there is no consensus among researchers on which category is more effective, although some researchers have stated that relative performance rating methods may be more effective (e.g., Heneman, 1986; Nathan & Alexander, 1988; Wagner & Goffin, 1997; Landy & Farr, 1980). Also, none of them have compared absolute and relative performance rating methods with regard to appraisal errors.
Although different reasons can be discussed for the decrease in the effectiveness of performance appraisal, the present study focuses on the appraisal errors made by evaluators and aims to rank PAMs by their perceived effect in decreasing appraisal errors.
In this study, the importance of employees' perception of fairness, the PAMs, and the appraisal errors are first explained in the literature review. Afterwards, the procedure of the study, including a brief explanation of the fuzzy VIKOR method, is provided, followed by the presentation of the findings and a discussion of the managerial and academic implications.

Performance Appraisal
The results of performance appraisal are used both for administrative and for developmental purposes. As an administrative tool, performance appraisal is used for: (i) determining pay adjustments (e.g., bonus pay); (ii) employee feedback and development; (iii) making job placement decisions on promotion, career development, transfer, and demotion; (iv) employee disciplinary actions. As a development tool, performance appraisal is a primary and most accurate way of obtaining information and feedback that often play a key role in employees' development and career decisions.
To this end, insufficiency and/or inaccuracy in performance appraisal causes problems for the two overarching goals of performance appraisal (Fisher et al., 1996): (i) to encourage high levels of employee motivation; (ii) to provide accurate information to be used in managerial decision making.
Especially for employee motivation, an accurate and fair performance appraisal plays a vital role, as stated in several theories of motivation: expectancy theory, equity theory, procedural justice theory, and goal-setting theory (George & Jones, 2012, p. 217). The following is a brief review of these motivation theories in terms of the importance of employees' perception of the fairness of appraisal results.
According to expectancy theory, expectancy (the perceived connection between effort and performance) and instrumentality (the perceived connection between performance and outcomes) are two main determinants of motivation. If evaluators (managers) appraise employees' performance accurately, employees are likely to have higher levels of expectancy, instrumentality, and performance.
In terms of equity theory, if employees perceive that they are receiving a proper outcome compared with their contribution to the job, they will be better motivated. This theory implies that employees who believe their performance is accurately evaluated will be motivated to perform at a higher level.
According to procedural justice theory, if employees believe that the evaluators' appraisals of performance are biased, their motivation to perform is likely to decrease. Procedural justice theory suggests that the procedures used to appraise performance must be perceived as fair and accurate in order to increase the employees' motivation.
Performance appraisal is closely related to the goals of the employees and the organization. Goal-setting theory suggests that the goals of employees have a major impact on their levels of motivation and performance, and it emphasizes the importance of accurate performance appraisal against the determined goals.
In sum, all the motivation theories mentioned above imply that a performance appraisal system that is perceived as fair and accurate by the employees plays an ultimate role in increasing the levels of employees' performance. Thus, as Roch et al. (2007, p. 303) stated, "it is in the best interest of the organization to do everything possible to maximize employees' justice perceptions." Otherwise, the ultimate effectiveness of a performance appraisal system that is not accepted and supported by employees will be limited (Cawley et al., 1998, p. 616).

Performance Appraisal Methods
Many appraisal methods can be used to evaluate employees' performance. Because so many appraisal methods exist, researchers have categorized them in different ways (e.g., Decenzo & Robbins, 1998). In the literature, the most common categorizations are, first, the two-group one (Cascio, 1991): (i) absolute appraisals; (ii) relative appraisals; and second, the three-group one (Fisher et al., 1999): (i) comparative appraisals; (ii) behavioral appraisals; (iii) output-based appraisals.
Although some studies have used the two-group categorization (e.g., Roch et al., 2007; Goffin et al., 1996; Jelley & Goffin, 2001; Nathan & Alexander, 1988; Wagner & Goffin, 1997; Heneman, 1986), it is not easy to place every PAM directly into one category. Even if the methods are forced into one category, methods in the same category may differ in terms of appraisal errors, which are chosen as the evaluation criterion for PAMs in this study. Therefore, instead of evaluating the categories of performance appraisal, it was preferred to evaluate the PAMs individually.
The PAMs determined from the literature review are shown in Table 1.

Comparison (Sorting)
In this method, the rater ranks his/her subordinates by their working performance. The working performance of the employees is compared and then sorted from the best to the worst. By placing the subordinates in rank order, the relative position of each subordinate is expressed as his/her numerical rank. Paired comparison of subordinates, which involves comparing the working performance of each subordinate with that of every other subordinate, is also a version of this method.
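The paired-comparison variant described above can be illustrated with a minimal Python sketch. The function name `rank_by_paired_comparison` and the `better` callback (which stands in for the rater's judgment on each pair) are hypothetical and introduced only for illustration; ranking by number of "wins" is one common way to turn pairwise judgments into an order, not necessarily the procedure any particular organization follows.

```python
from itertools import combinations
from typing import Callable, Dict, List


def rank_by_paired_comparison(
    employees: List[str],
    better: Callable[[str, str], str],
) -> List[str]:
    """Rank employees from best to worst by counting pairwise wins.

    `better(a, b)` is a hypothetical stand-in for the rater's judgment:
    it returns whichever of the two employees performed better.
    """
    wins: Dict[str, int] = {name: 0 for name in employees}
    # Every subordinate is compared with every other one: n(n-1)/2 pairs.
    for a, b in combinations(employees, 2):
        wins[better(a, b)] += 1
    # More wins means a higher rank; ties keep their input order.
    return sorted(employees, key=lambda name: wins[name], reverse=True)
```

With n subordinates the rater makes n(n-1)/2 judgments, so the effort grows quickly with group size; for 10 subordinates it is already 45 comparisons.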

Forced Distribution
This is an appraisal method that requires the subordinates to be assigned to a limited number of categories. In this method, employees (subordinates) are necessarily evaluated according to the normal distribution. For example, 10% of employees are placed at the very top of the scale, 20% at the top, 40% in the middle, 20% at the bottom, and 10% at the very bottom.
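As an illustration of the example above, the following sketch assigns an already ranked list of employees to the five categories using the 10/20/40/20/10 shares from the text. The function name and the rounding rule for group sizes that do not divide evenly are assumptions made for this sketch only.

```python
from typing import Dict, List, Tuple

# Category shares (in percent) taken from the example in the text.
SHARES: List[Tuple[str, int]] = [
    ("very top", 10),
    ("top", 20),
    ("middle", 40),
    ("bottom", 20),
    ("very bottom", 10),
]


def forced_distribution(ranked: List[str]) -> Dict[str, str]:
    """Assign employees (sorted best to worst) to fixed-share categories."""
    n = len(ranked)
    result: Dict[str, str] = {}
    start = 0
    cumulative = 0
    for label, share in SHARES:
        cumulative += share
        # Integer arithmetic with round-half-up (an assumption of this sketch).
        end = (cumulative * n + 50) // 100
        for name in ranked[start:end]:
            result[name] = label
        start = end
    return result
```

For a group of 10 employees this yields 1, 2, 4, 2 and 1 employees per category, regardless of how close their actual performance levels are, which is the defining constraint of the method.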

Graphic Rating Scales
Managers evaluate the employee according to defined factors, which are the attributes printed on an evaluation form. The form lists performance levels for each attribute, expressed as numbers or scale points (e.g., very good, good, or weak), and the manager chooses one of them.
As one of the oldest and most widely used methods, graphic rating scales are forms on which the evaluator simply checks off the subordinate's working performance.

Checklist
In this method, a checklist presenting work-related descriptive statements is used for every work position. The manager, who is familiar with these statements, chooses the "Yes" or "No" option for each one to indicate effective or ineffective behavior on the job.

Forced Choice
The manager is given pre-defined expressions (a series of statements) for each item in order to evaluate the performance of the worker. The manager indicates which statements are most descriptive of the employee, without knowing the score equivalent of the expressions.

Composition (Essay)
The manager simply writes a narrative describing the performance of the employee. This composition describes the worker and identifies his/her successful and unsuccessful, weak and strong sides. The method is non-quantitative; rather than focusing on the day-to-day performance of an employee, it focuses on generally observed work behaviors in order to present a holistic view.

Critical Incidents
The manager writes down instances of extreme performance, both negative and positive. These are called critical incidents (events), and they should directly affect the success or failure of the worker. The method requires written records of highly effective and highly ineffective work behaviors. The manager maintains a log for each employee to record the critical incidents and uses these logs to evaluate the employee's performance at the end of the rating period.

360-Degree Feedback
In this method, data are collected from all sides, from multiple levels within the organization and from external sources. Employees are assessed by their superiors, subordinates, colleagues, clients, and themselves. In this way, the method gives employees an enhanced self-awareness of their work performance.

Management By Objectives
This method is based on the attainment of pre-defined objectives. Managers and employees collectively determine the objectives that employees are to meet during a specific period. Attaining an objective is more important than how it was attained. Employees are then evaluated on how far they have achieved their determined goals.

Assessment Centers
The evaluation process is performed objectively by specialists or Human Resources (HR) professionals in the center. In this center, the job of the worker is simulated and the worker is observed. Additionally, tests, social and informal events, and exercises are used to support the assessment. Some organizations prefer this method because of the difficulties they face in the appraisal process, and they tend to use an assessment center as an adjunct to their appraisal system.

Team Based Performance Appraisal
As today's work life values teamwork over individual performance, it is considered better to evaluate an individual's performance as a team member. Employees are therefore assessed not as individuals but as a team.
As there are many performance appraisal techniques/methods with different features and evaluation procedures, as presented in Table 1, it cannot be stated that only one method can be used in a given situation, sector, or organization. Even organizations that operate in the same sector and have equal numbers of employees, similar structures, and similar visions and missions may use different appraisal methods, depending on their choice rather than on their features. At this point, choosing the most effective appraisal method arises as a problem that HR practitioners face.
However, whichever method is chosen, it is more important to reach a precise evaluation at the end of the performance appraisal process. One of the most important factors in achieving this is to decrease the appraisal errors made by evaluators, or at least to minimize them by applying the most appropriate method(s) for preventing appraisal errors.

Performance Appraisal Errors
The accuracy of the results of performance appraisal depends mostly on the degree to which the evaluators are free of errors. Even if appraisal errors result partly from evaluators' attitudes, regardless of which appraisal method is used, it should be accepted that the features of the appraisal method affect the appraisal errors. Every PAM has a unique structure and procedure that allow performance appraisal errors to take effect to a certain extent.
Although it is hard to determine this extent for each PAM, it is accepted in this study that the evaluations of expert practitioners can give the most precise results.
Since the aim of this study is to evaluate PAMs with a focus on errors, the appraisal errors to be used as evaluation criteria must first be determined. Based on a literature review, the performance appraisal errors for the present study are determined as presented in Table 2.

Perceived meanings of performance standards
This error emerges from misunderstanding the performance appraisal standards stated in appraisal forms. Using a standard appraisal form consisting of the same criteria, aimed at measuring specific qualities, does not always lead to standard appraisals, because the appraisers' perceptions differ. This error results from the lack of a common understanding of performance standards.

Halo/Horn effect
The evaluator's general perception of an employee influences his/her perception of a specific dimension. This error has two opposite sides. One is the general evaluation of the employee according to his/her strengths (halo effect) while overlooking possible weaknesses. The other, the horn effect, is the opposite of the halo effect: the employee is generally evaluated according to his/her weaknesses and his/her strengths are overlooked.

Central tendency error
This error consists of ignoring the strengths and weaknesses of an employee and tending to give the personnel an average score. Rather than giving extremely poor or extremely good grades, some raters tend to give all ratees an average score even if the performance actually varies.

Positive or negative leniency error
Positive leniency is the tendency to give high evaluation points in general, usually above the deserved level. Negative leniency is the opposite: giving generally low evaluation points, regardless of the deserved level. It can be said that positive leniency is more frequent than negative leniency, since some raters are concerned about damaging a good working relationship by giving a poor or negative rating.

First impression and/or recency error
This error results from the evaluator putting too much emphasis on his/her first impression of the employee or, more commonly, from focusing on recent interactions with the employee. Since recent events or employee behaviors are more noticeable than earlier ones, recent events are weighted more heavily in performance appraisals than they should be. As a result, some raters tend to regard only the latest events and/or behavior of the employee, regardless of the employee's actual performance.

Similar-to-me error
This error results from situations in which the evaluator sees the employee's background, education, attitudes, and characteristics as very similar to his/her own and therefore gives higher grades in performance appraisals. Owing to this error, evaluators may tend to perceive others who are similar to themselves more positively than those who are dissimilar.

Contrast error
Contrast error is observed where the evaluator compares one employee with another instead of with the criteria dictated in the appraisal form. This often results in the under-evaluation of some employees because they are compared with an employee whom the evaluator sees as very successful.

Insufficient Observation
In some cases, employees are evaluated without sufficient information or observation on how they really perform in their work. Here the evaluator gives his/her evaluation points or comments based on a general perception, without a detailed idea about the employee on a specific criterion.

Method and Results
A 7-point Likert questionnaire (1 = absolutely false, 7 = absolutely true) was prepared as the evaluation tool, in which each of the PAMs presented in Table 1 was evaluated against each of the performance appraisal errors presented in Table 2. The items read as follows: "By using 360 degree PAM the effect of central tendency error in performance appraisal can be decreased." In this context, 11 (number of PAMs) × 8 (number of performance appraisal errors) = 88 items were inserted in the questionnaire in a matrix structure.
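The matrix structure of the questionnaire can be sketched in a few lines of Python. The PAM and error labels below are taken from the section headings of this paper; the item wording follows the single example quoted above, so the exact phrasing of the other 87 items is an assumption, as is the function name `build_items`.

```python
from itertools import product
from typing import Dict, List, Tuple

# Alternatives (PAMs) and criteria (appraisal errors), labeled as in the paper's headings.
PAMS: List[str] = [
    "Comparison (Sorting)", "Forced Distribution", "Graphic Rating Scales",
    "Checklist", "Forced Choice", "Composition (Essay)", "Critical Incidents",
    "360-Degree Feedback", "Management By Objectives", "Assessment Centers",
    "Team Based Performance Appraisal",
]
ERRORS: List[str] = [
    "Perceived meanings of performance standards", "Halo/Horn effect",
    "Central tendency error", "Positive or negative leniency error",
    "First impression and/or recency error", "Similar-to-me error",
    "Contrast error", "Insufficient observation",
]


def build_items() -> Dict[Tuple[str, str], str]:
    """Return one questionnaire item per (PAM, error) cell of the matrix."""
    # The template is modeled on the one example item quoted in the text (assumption).
    return {
        (pam, error): (
            f"By using the {pam} PAM, the effect of '{error}' "
            "in performance appraisal can be decreased."
        )
        for pam, error in product(PAMS, ERRORS)
    }
```

Each respondent then supplies one rating from 1 to 7 per cell, so the raw data for 29 respondents amount to 29 × 88 = 2,552 ratings.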
The questionnaire was administered to 29 HR managers who attended a 3-day performance appraisal course in which both PAMs and appraisal errors were discussed in detail. The sample of the present study can therefore be regarded as expert, with sufficient capability to evaluate the PAMs owing to their profession as well as the up-to-date information on performance appraisal they received in the three-day course. The average age of the participants is 42, ranging from 29 to 61. They come from 25 different companies in both the public and private sectors, operating in major cities in Turkey.
VIKOR (VIseKriterijumska Optimizacija I Kompromisno Resenje), which means "Multi-Criteria Optimization and Compromise Solution", was used as the multi-criteria decision making technique in this study owing to its appropriateness to the aim of the study. This method ranks the alternatives chosen for evaluation, against the selected criteria, by their proximity to the ideal solution (Opricovic & Tzeng, 2007).
The fuzzy VIKOR method results from applying the Fuzzy Set Approach (theory) to the VIKOR method. Fuzzy Set Theory was introduced by Zadeh (1965) to deal with the ambiguity of human thought, and its most important contribution is its capability of representing ambiguous data.
In the fuzzy VIKOR method, it is suggested that the decision makers use linguistic variables to rate the alternatives (performance appraisal techniques in this study) with respect to the defined criteria (appraisal errors in this study).
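For reference, the standard (crisp) VIKOR quantities can be written as follows. This is the commonly cited general formulation, not a reproduction of the formulas used in this study; the symbols $f_{ij}$ (rating of alternative $i$ on criterion $j$), $w_j$ (criterion weight) and $v$ (weight of the "majority of criteria" strategy, often set to 0.5) are introduced here for illustration, and all criteria are assumed to be benefit criteria.

```latex
\begin{align}
f_j^{*} &= \max_i f_{ij}, \qquad f_j^{-} = \min_i f_{ij} \\
S_i &= \sum_{j=1}^{n} w_j \, \frac{f_j^{*} - f_{ij}}{f_j^{*} - f_j^{-}} \\
R_i &= \max_j \left[ w_j \, \frac{f_j^{*} - f_{ij}}{f_j^{*} - f_j^{-}} \right] \\
Q_i &= v \, \frac{S_i - S^{*}}{S^{-} - S^{*}} + (1 - v) \, \frac{R_i - R^{*}}{R^{-} - R^{*}}
\end{align}
% with S^{*} = \min_i S_i, \; S^{-} = \max_i S_i, \; R^{*} = \min_i R_i, \; R^{-} = \max_i R_i
```

Here $S_i$ measures the overall (group) distance of alternative $i$ from the ideal, $R_i$ its largest single-criterion regret, and $Q_i$ the compromise between the two; a smaller $Q_i$ indicates an alternative closer to the ideal solution.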
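How linguistic ratings become fuzzy numbers can be sketched as below. The scale in `LIKERT_TO_TFN` is purely hypothetical: the values of Table 3 are not reproduced in this text, so the triangular fuzzy numbers here are placeholders chosen for illustration. Averaging across raters and centroid defuzzification are likewise common choices assumed for this sketch, not necessarily those of the study.

```python
from typing import Dict, List, Tuple

# A triangular fuzzy number (lower, modal, upper).
TFN = Tuple[float, float, float]

# HYPOTHETICAL mapping from the 7-point scale to triangular fuzzy numbers.
# These are NOT the values of the paper's Table 3.
LIKERT_TO_TFN: Dict[int, TFN] = {
    1: (0.0, 0.0, 1.0),
    2: (0.0, 1.0, 3.0),
    3: (1.0, 3.0, 5.0),
    4: (3.0, 5.0, 7.0),
    5: (5.0, 7.0, 9.0),
    6: (7.0, 9.0, 10.0),
    7: (9.0, 10.0, 10.0),
}


def aggregate(responses: List[int]) -> TFN:
    """Aggregate several raters' answers for one cell by component-wise mean."""
    tfns = [LIKERT_TO_TFN[r] for r in responses]
    n = len(tfns)
    return (
        sum(t[0] for t in tfns) / n,
        sum(t[1] for t in tfns) / n,
        sum(t[2] for t in tfns) / n,
    )


def defuzzify(tfn: TFN) -> float:
    """Convert a triangular fuzzy number to a crisp value (centroid)."""
    lower, modal, upper = tfn
    return (lower + modal + upper) / 3.0
```

Applying `aggregate` to the 29 answers for each (PAM, error) cell would give one fuzzy rating per cell, which is the kind of entry a fuzzy decision matrix holds.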
The steps of the fuzzy VIKOR method were as follows (Opricovic & Tzeng, 2004):
Step 1: The 29 HR managers mentioned above were chosen as evaluators to evaluate the PAMs presented in Table 1, on the assumption that they had familiarity with and expertise in the performance appraisal process.
Step 2: Based on a literature review, 8 appraisal errors were determined as the evaluation criteria, as shown in Table 2.
Step 3: The evaluators were given a questionnaire to evaluate each PAM, one by one, in terms of the appraisal errors included.
Step 4: The evaluators' appraisals, stated in linguistic terms in the questionnaire, were transformed into fuzzy numbers as presented in Table 3.
Step 5: The aggregated weight of each criterion and the aggregated fuzzy rating of each alternative were then calculated to construct the fuzzy decision matrix.
Step 6: The best and the worst values of all criterion ratings were determined.
Step 7: The values of S, R and Q were calculated for all alternatives, as in Table 4.
Step 8: The ranking of the alternatives in decreasing order is shown in Table 5.
Step 9: The two acceptability conditions (acceptable advantage and acceptable stability, respectively) were tested and found to be satisfied (0.13 − 0.00 = 0.13 > 0.1, where 0.1 = 1/(number of alternatives − 1)). The best alternative was therefore determined as A3 (Graphic Rating Scales) and the worst alternative as A1 (Comparison), as shown in Table 5. The other alternatives did not fulfil the acceptability conditions.
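Steps 6 to 9 can be sketched in Python as follows. This is a simplified illustration that works on crisp (already defuzzified) ratings and treats every criterion as a benefit criterion; the function names, the default `v = 0.5`, and the handling of criteria on which all alternatives are equal are assumptions of this sketch, and the study's own fuzzy computations may differ in detail.

```python
from typing import Dict, List


def vikor(scores: List[List[float]], weights: List[float],
          v: float = 0.5) -> Dict[str, List[float]]:
    """Compute S, R and Q; scores[i][j] rates alternative i on criterion j."""
    n_crit = len(weights)
    # Step 6: best and worst value of each criterion (higher is better).
    best = [max(row[j] for row in scores) for j in range(n_crit)]
    worst = [min(row[j] for row in scores) for j in range(n_crit)]

    # Step 7: group utility S and individual regret R per alternative.
    S: List[float] = []
    R: List[float] = []
    for row in scores:
        terms = [
            weights[j] * (best[j] - row[j]) / (best[j] - worst[j])
            if best[j] != worst[j] else 0.0  # criterion does not discriminate
            for j in range(n_crit)
        ]
        S.append(sum(terms))
        R.append(max(terms))

    def scale(x: float, lo: float, hi: float) -> float:
        return 0.0 if hi == lo else (x - lo) / (hi - lo)

    Q = [v * scale(s, min(S), max(S)) + (1 - v) * scale(r, min(R), max(R))
         for s, r in zip(S, R)]
    return {"S": S, "R": R, "Q": Q}


def acceptable_advantage(Q: List[float]) -> bool:
    """Condition 1: gap between the two lowest Q values is at least 1/(m-1)."""
    ordered = sorted(Q)
    return (ordered[1] - ordered[0]) >= 1.0 / (len(Q) - 1)


def acceptable_stability(result: Dict[str, List[float]]) -> bool:
    """Condition 2: the best alternative by Q is also best by S or by R."""
    def argmin(values: List[float]) -> int:
        return min(range(len(values)), key=values.__getitem__)
    q_best = argmin(result["Q"])
    return q_best == argmin(result["S"]) or q_best == argmin(result["R"])
```

With 11 alternatives the advantage threshold is 1/(11 − 1) = 0.1, which is the value against which the reported gap of 0.13 is compared in Step 9.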

Discussion
From a professional and equitable perspective, it can be suggested that almost every decision in managerial applications is to some extent a result of performance evaluation, or at least should be so. In this context, it is of utmost importance to use a PAM that provides the most accurate and just results. However, such results may be hindered by the performance appraisal errors made by the appraiser.
Since it is difficult to remove performance appraisal errors completely, it is at least necessary to determine the most appropriate PAM(s), that is, the one(s) minimally affected by these errors. This study aimed to evaluate the existing PAMs in the literature and to rank them according to the degree to which they decrease appraisal errors.
According to the findings of the study, the best alternative is the Graphic Rating Scales method, while the worst one is the Comparison method. There are several possible explanations for this result. The most obvious one is related to the structure of the methods in terms of having a concrete process. That is, if the method is firmly structured and forces the rater to follow a step-by-step evaluation of precisely determined competencies, it is accepted to be more suitable for preventing appraisal errors. On the other hand, if the method provides more flexibility and is relatively less structured, it is likely to be accepted as less appropriate for the prevention of appraisal errors.
Another explanation can be provided in terms of the degree of comparison made by the evaluators during the evaluation process. That is, if the method depends on evaluating each employee individually, without comparing him/her with other employees, it is likely to be accepted as more appropriate, while the comparison method tends to be evaluated as less appropriate in terms of decreasing appraisal errors.
However, it should be stated that the two major reasons mentioned above are closely related to cultural factors; consequently, the results of this study should also be discussed in terms of cultural dimensions. To this end, it is proper first to describe the cultural features of this study's sample and then to discuss the best and the worst determined methods in light of these cultural dimensions.
This study was conducted in Turkey, a country characterized by high power distance, a stronger uncertainty avoidance tendency, and a more feminine and "short term" orientation (Hofstede, 1984, p. 123). Owing to the high power distance, employees tend to accept the authority of their supervisors in performance evaluation; that is, they may not be so willing to be assessed by multi-source raters, as in 360-Degree Feedback, instead of by their supervisors. Also, owing to the collectivist culture, they may be unwilling to be assessed through comparison with their colleagues. As a result, they may prefer Graphic Rating Scales, a method that is more relevant to high power distance. This lack of willingness for comparison is deemed to be another result of the collectivist mindset. In addition, Graphic Rating Scales is the least initiative-driven method for the rater compared with the other methods, which decreases the rater's appraisal errors.

Limitations of the Study and Suggestions for Future Research
Although the sample of this study represents 25 different companies operating in 15 different sectors, which can be regarded as sufficient for such an initial study, it is necessary to increase the size of the sample as well as the number of sectors covered in order to obtain more general findings at the national level. On the other hand, the study was conducted in Turkey, where the abovementioned cultural features are prevalent. Hence, repeating the study in different countries is necessary to increase the acceptability of the findings.
Although the minimization of appraisal errors is essential for an organization, the sensitivity of organizations toward appraisal errors may vary. If the sensitivity is high in an organization, it becomes important to conduct a selection process for determining the accurate PAM by following the methodology of this study, because the sensitivity of employees toward performance appraisal systems has an effect on their thoughts about their own appraisals (Mert, 2011). On the other hand, if the sensitivity is low, conducting such a selection process may not be so necessary.
Although the findings of the study are presented as such, it cannot be stated that the best appraisal method overall is the Graphic Rating Scales method and the worst is the Comparison method, since the methods were evaluated only in terms of the degree to which they decrease appraisal errors. If different factors are taken into consideration as evaluation criteria, different results may be obtained.
In this study, HR managers from different organizations were selected as evaluators in order to reach a more generic conclusion. Future studies may use the methodology of this study at the organizational level. In this way, organizational culture can be considered and comparable results among organizations can be obtained.
Since the performance appraisal process is of ultimate importance for the effectiveness of organizations, and the appraisal method is the core of this process, the search for the most appropriate appraisal method deserves considerable attention in academic research. As such, studies that highlight the available methods for such a search can contribute to the literature.

Managerial Application
The ultimate purpose of managerial application is to accomplish the organizational objectives effectively by having dedicated employees. Although many factors can be suggested for improving employees' dedication to their jobs, it still depends on their trust in the PAM used in the organization. Therefore, if they perceive that they are accurately and fairly evaluated by the raters, they are more likely to be highly dedicated. Although the rater's personal fairness plays a dominant role in the performance appraisal process, the employees' perception of the applied appraisal method also has an important effect. Of these two factors (the rater's fairness and the appraisal method), the appraisal method is easier for HR managers to handle. In this connection, HR managers should be aware that, because different kinds of appraisal methods exist, choosing the most appropriate one emerges as an important necessity. This necessity also involves questioning the appraisal method currently applied in the organization. Accordingly, the following items, which are relevant to the present study's methodology, can be recommended to HR managers and/or people dealing with the performance appraisal process:
• Determining the list of applicable PAMs for the organization by examining the organization's vision, mission and objectives.
• Searching and determining the appraisal errors being made by raters in the organization during the appraisal process.
• Evaluating the determined appraisal methods in terms of appraisal errors and selecting the method that best fits the organization, either by using the methodology of this study or by different methods.
• Beginning to use the most appropriate method to obtain more effective results.
• Periodically reexamining the appraisal process by following the above items to test the appropriateness of the current appraisal method.
(v) Identification of training needs; (vi) Job redesign and other organizational interventions.

Table 2. Performance appraisal errors

Table 3. Linguistic terms and corresponding fuzzy numbers

Table 4. Indexes of R_i, S_i, Q_i and the ranks of alternatives

Table 5. The rank of performance appraisal methods