A Prediction Model for Initial Trust Formation in Electronic Commerce

This research investigates trust-building strategies that may influence transactions between individuals and unknown Internet firms, focusing on three influential components that mediate the relationship between online shoppers and online vendors. Results indicate significant direct effects for trust in the Internet infrastructure, susceptibility to the social influence of media, and the presence of influential site characteristics on user willingness to provide personal information to unknown Internet firms. This study extends the research on trust in electronic commerce by providing a prediction model that is demonstrated to calculate the probability of user willingness to provide information. The utility of the model for identifying the relative importance of factors and predicting outcomes lends insight into important issues in online trust formation. Knowledge of effective trust-building strategies guides organizations that use the Internet for selling, marketing, or servicing customers to gain maximum benefits from investments in e-commerce applications.


Introduction
The increasing rate of data breaches (ITRC, 2009) and the increasing consumer fear of identity theft (Steiner, 2008) indicate a need for guidance on investment in e-commerce applications that meet specific data collection needs of organizations and communicate a credible expression of trustworthiness. Given the current challenging economic environment, it is especially important in the context of initial trust formation that organizations adopt a web strategy that maximizes user confidence yet minimizes investment in the e-commerce application.
Two information processing models offer a theoretical foundation for examining factors that influence online information-giving behavior. The heuristic-systematic model of persuasive communication (Chaiken and Eagly, 1983) and the elaboration-likelihood model (ELM) of persuasion (Petty and Cacioppo, 1986) are concerned with changes in attitude as a result of exposure to persuasive messages. Both theories assume that, in the absence of motivation for effortful cognition, individuals process information at a minimal level. Both theories describe cognitive processing as either deep/systematic or shallow/heuristic. Heuristic processing describes a minimizing effort that is more likely to occur when there is limited knowledge or time, or when there are competing demands on cognitive ability (Chaiken, Wood and Eagly, 1996). Systematic processing describes a more effortful process that makes greater demands on cognitive resources (Chaiken, Wood and Eagly, 1996). In the process of initial trust formation, users consider a range of information and utilize a variety of cognitive processing strategies in their decision-making process relating to online information-giving behavior.
This paper provides a review of prior research, followed by sections describing the research objective, research methodology, and data analyses. The paper concludes with a discussion of important findings from this research, along with suggestions for future research.

Background and Literature Review
A review of research on user willingness to complete a transaction on the Internet reveals common themes of trust in the Internet store (Jarvenpaa, Tractinsky and Vitale, 2000), trust in the vendor (Pennington, Wilcox and Grover, 2003), trust in organizational practices (Smith, Milberg and Burke, 1996), and user perception of Web site features (Belanger, Hiller and Smith, 2002; Gefen and Straub, 2004; Pennington et al., 2003). The literature on trust includes five research streams: personality-based, cognition-based, calculative-based, knowledge-based, and institution-based trust (McKnight, Cummings and Chervany, 1998). Personality-based trust describes trust tendencies that are developed during childhood; cognition-based trust describes trust that develops as a result of first impressions and cues from the environment. Calculative-based trust is based on perceived economic outcomes, and knowledge-based trust occurs as a result of a history of interaction (Gefen, Karahanna and Straub, 2003). Institution-based trust is generated by "guarantees, safety nets, or other structures" that convey a sense of security in a situation (Gefen et al., 2003; McKnight et al., 1998).

Trust in the Internet Infrastructure
The Lee and Turban (2001) model of consumer trust in Internet shopping (i.e., trust in the computerized medium) features the perceived technical competence, perceived system performance, and user understanding of the system or the medium. There is evidence of a link between positive perceptions about the trustworthiness of the Internet and Internet purchase intentions (George, 2002, 2004), and between institution-based structural assurance and trust-related Internet behaviors (McKnight and Chervany, 2001). Structural assurance is characterized as "technological Internet safeguards" such as encryption (McKnight and Chervany, 2001, p. 5).

Web Site Characteristics
Trusted third parties (TTPs) are organizations that work to reduce consumer fear about online security and privacy and increase trust in e-commerce transactions (Palmer, Bailey, Faraj and Smith, 2000). A TTP acts as a guarantor, providing an assurance of authentication or a brand image or reputation as a foundation for trust. TTPs may be classified according to purpose or intention. Privacy seals represent certified data collection and data usage processes (TrustE, n.d.; BBB, 2010), while security symbols provide assurance that the site uses the secure sockets layer (SSL) cryptographic protocol (GeoTrust, n.d.; VeriSign, 2010). A vulnerability symbol verifies third-party scans for vulnerabilities (HackerSafe, 2010). Reliability symbols vouch for the identity of the Web site and may affirm ethical practices (BBB, 2010; SquareTrade, 2010; WebAssured, n.d.). Consumer rating symbols indicate a satisfied customer experience with the Web site (BizRate, 2009). Although e-commerce literature offers contradictory findings on the ability of TTPs to influence online users, there is evidence of the positive effect of TTPs on purchasing likelihood (Fogg, Soohoo and Danielson, 2002) and information disclosure for some users (Miyazaki and Krishnamurthy, 2002). Additionally, as symbols of expertise, the presence of these artifacts may result in less thought given to scrutiny of information about the Web vendor (Chaiken et al., 1996; Petty and Cacioppo, 1986). Recent research also found a strong correlation between use of Web assurance seals and user intention to use an online payment system (Ozkan, Bindusara and Hackney, 2010).
Web site social presence is a subjective quality based on user perception. It is defined as the perception of an interpersonal interaction due to the impression of human contact and the information richness of the medium (Gefen and Straub, 1997). Social presence features may include photographs of smiling customer service representatives as well as online chat. Although Wang and Emurian (2005) found "social cue design elements" (p. 49) to be less important in promoting trust than visual design and content design, Gefen and Straub (2004) found evidence that the perception of social presence increases trust in e-commerce.

Social Influence
Social influence, sometimes referred to as subjective norms, is frequently decomposed into relevant referent groups. For example, in research that examined the use of information technology (IT) in an organizational setting, Taylor and Todd (1995) decomposed sources of social influence into three groups: peers, superiors, and subordinates. In the context of e-commerce, Limayem, Khalifa, and Frini (2000) decomposed sources of social influence into three groups (friends, family and media), finding the social influence of media and family to have an effect on online shopping. Hwang (2005) found all three dimensions of social influence (friends, family, media) to be significantly related to online trust, while Bhattacherjee (2000) found news reports, popular press and mass media to have a large effect on subjective norms leading to intention to accept e-commerce.
The existing literature suggests that these three factors are influential components in the complex relationship that occurs between an individual and an unknown online vendor: trust in the Internet infrastructure, Web site features of institutional trust and social presence, and social influence. These factors form the framework for the research presented here.

Research Objective and Hypotheses
In view of the inherent insecurity of the Internet and user concerns for information privacy, a question that should interest organizations seeking to maximize investments in e-commerce is: What cues of institutional trust and social presence are effective in overcoming low trust in the Internet infrastructure and social/media influences to persuade first-time users to provide personal information so that online transactions are facilitated? Specifically, three research questions are addressed in the context of initial trust formation:
• Does trust in the Internet infrastructure affect user willingness to provide personal information online?
• Do Web site elements of institutional trust and social presence affect user willingness to provide personal information online?
• Does general social influence affect user willingness to provide personal information online?
The research model is presented in Figure 1.
Trust in the Internet infrastructure is defined as trust in the safety and integrity of the fundamental security measures used to protect personal information during online transactions (McKnight and Chervany, 2001). Influential Web site characteristics are defined as artifacts of institutional trust (e.g., links to privacy policies and symbols of trusted third parties), and elements of social presence (e.g., e-mail links, images of service representatives, and options to speak online with service representatives in real time). User susceptibility to social or interpersonal influence is defined as the tendency of persons to change their online information-giving behavior as a result of social pressure (McGuire, 1968). The dependent variable is willingness to provide personal information ranging from data perceived as low risk (i.e., name, email address) to data perceived as high risk (i.e., credit card number, social security number) (Miyazaki and Krishnamurthy, 2002).

Research Methodology
The research consisted of a 3×3×3 between-subjects quasi-experiment designed to test the effects of (1) trust in the Internet infrastructure, (2) social influence, and (3) Web site features of institutional trust and social presence on user willingness to provide personal information. The context of the study was anticipated patronage of an unknown Web vendor that offered a desired product at an acceptable price. The subjects were undergraduate and graduate students, considered to be reasonable proxies for online shoppers based on age and education (Drennan, Mort and Previte, 2006; Mauldin and Arunachalam, 2002). A total of 628 survey responses were included in the final analysis.
Respondents were advised that the topic of the survey was "Using the Internet for Personal Business." Using an online instrument, subjects responded to questions that assessed trust in the Internet infrastructure and susceptibility to social influence before being assigned to a media treatment. Assignment to treatment groups was accomplished with alphabetic self-selection menus. That is, based on the first letter of the last name (using self-selection), subjects were assigned to one of three media conditions: positive, negative, or none. Then, based on the first letter of the first name (using self-selection), subjects were assigned to one of three Web site conditions: low-, moderate-, or high-level. According to Shadish, Cook and Campbell (2002), this procedure is quasi-experimental in that random assignment occurred by means of self-selection.
User trust in the Internet infrastructure was evaluated using measures adapted from previous research (Lee and Turban, 2001; McKnight, Choudhury and Kacmar, 2002; George, 2004). Following assignment to a media treatment, susceptibility to social influence was measured using scales developed and validated as part of this study. The media treatments were composites of positive or negative excerpts pertaining to the safety of the Internet (selected from national magazines or newspapers and government or non-profit online sources) presented as print media. To provide and control for source credibility, both messages were presented as an article in USA Today. Following assignment to a simulated Web site on which the type and number of elements that represent guarantees, institutional assurances of trustworthiness, and social presence were varied, the effect of these elements was evaluated using measures adapted from previous research (Miyazaki and Krishnamurthy, 2002). The simulated Web site created for this experiment was "product-neutral" in that it typified a "registration" page on which new users would provide personal information to learn more about a product or service.

Data Analyses
Statistical analyses included descriptive statistics, univariate analyses of factors affecting trust in the Internet infrastructure and susceptibility to social influence, cross-tabulations and chi-square tests to evaluate differences across treatment groups, and correlational analyses among the predictor variables. Logistic regression models were constructed to examine main and interaction effects.

Descriptive Statistics
The respondents were fairly evenly split by gender (342 males, 286 females), mostly young (453 were 18 to 24 years of age, 175 were 25 and older), and racially diverse (371 White, 182 Black/African American, 75 other races).
The majority of respondents reported using the Internet extensively: 584 reported daily use; time spent on the Internet averaged 17 hours per week. The majority of respondents (449) reported having used the Internet for 7 years or more. Approximately one fourth of the respondents reported making an Internet purchase on a monthly basis, while slightly more than half reported making an Internet purchase a couple of times a year. Table 1 presents characteristics of the participants.

Logistic Regression Analyses
Based on results of chi-square tests and correlational analysis, potential predictors of willingness to provide information included demographic characteristics, trust in the Internet infrastructure, susceptibility to social influence, media treatment, and Web site treatment. This paper focuses on the results of the logistic regression analyses.
Logistic regression relates one or more continuous or categorical predictor variables to a dichotomous dependent variable by analyzing the logit, or natural logarithm of the odds, of the reference outcome, where $P_i$ denotes the probability of the event. If $P_i$ is the probability of a "Yes" response, then $1 - P_i$ is the probability of a "No" response.
The logit transformation occurs in two steps: first, the odds of the event are determined, $P_i / (1 - P_i)$; then the natural logarithm (ln) of the odds is calculated (Pampel, 2000). It is the logit, or log(odds), that serves as the dependent variable in logistic regression:

$$\log(\mathrm{odds}) = \mathrm{logit}(P_i) = \ln\left(\frac{P_i}{1 - P_i}\right)$$

A simple logistic regression equation with independent variable X takes the form:

$$\mathrm{logit}(P_i) = a + b_1 X$$

For a continuous covariate, $b_1$ gives the change in the log(odds) for an increase of one unit in X. For example, if $b_1$ = .555, this value is exponentiated to obtain the odds ratio = 1.742. (In SPSS output, this is seen as Exp(B) = 1.742.) The odds ratio minus one (1.742 - 1 = .742) means a one-unit increase in X results in a 74.2% increase in the odds of the target outcome of the dependent variable (i.e., a "Yes" response).
For a categorical covariate, $b_1$ gives the extent to which the odds in favor of one outcome are raised when X is raised from the reference level to another level. For example, if $b_1$ = 1.888, this value is exponentiated to obtain the odds ratio = 6.606. (In SPSS output, this is seen as Exp(B) = 6.606.) This means the odds of saying "Yes" (the target outcome) for those who received the negative media treatment (the comparison group) are 6.6 times the odds for those who did not receive a media treatment (the reference group).
An odds ratio greater than 1.0 signifies a positive relationship between two variables, and an odds ratio of less than 1.0 signifies a negative or inverse relationship. An odds ratio of 1.0 means there is no relationship between a predictor and the outcome (Menard, 1995). When an odds ratio is a fraction (e.g., .750), the reciprocal (1/.750 = 1.33) is interpreted such that the odds of saying "Yes" for those who did not receive a media treatment (the reference group) are 1.33 times the odds for those who received the negative media treatment (the comparison group) (Pedhazur, 1997).
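The step from a coefficient to its odds ratio can be made explicit with a short derivation. The following is a sketch that assumes only the simple model logit($P_i$) = a + $b_1$X given above; the two numeric lines reuse the illustrative coefficients from the text.

```latex
\begin{align*}
\mathrm{odds}(X) &= \frac{P_i}{1 - P_i} = e^{a + b_1 X} \\
\frac{\mathrm{odds}(X + 1)}{\mathrm{odds}(X)} &= \frac{e^{a + b_1 (X + 1)}}{e^{a + b_1 X}} = e^{b_1} \\
b_1 = .555 &\;\Rightarrow\; e^{.555} \approx 1.742 \\
b_1 = 1.888 &\;\Rightarrow\; e^{1.888} \approx 6.606
\end{align*}
```

Because the ratio does not depend on X, a single value of Exp(B) summarizes the effect of a one-unit change at any starting level of the predictor.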
The logistic regression models were constructed using a model-building strategy (Hosmer and Lemeshow, 1989) that calls for univariate analysis of each variable to select variables for multivariate analyses, with subsequent analyses considering interactions among the variables. Because chi-square statistics revealed significant differences in outcome between gender, race, media, and Web site groups, those variables were included in the initial regression model as were the continuous variables of interest (trust in the Internet, social influence of friends, social influence of family, and social influence of media). The results provided a subset of five covariates with p < .10 that were retained for further analysis: race, trust, media, site, and social influence of family. The results of the reduced model showed these five variables to be significant at the .05 level for at least one outcome variable (phone number, credit card number, social security number) or all six outcome variables. Table 2 contains the results of the reduced multivariate model.
Ten two-way interactions may be formed from the variables in the reduced multivariate model. Following the strategy suggested by Hosmer and Lemeshow (1989), further analyses examined each of these interactions with all variables retained from the reduced multivariate model. When all outcome variables are considered collectively, none of the interaction models provides a significant improvement over the main effects only model. Therefore, the main effects model was selected for further analysis using the subset of predictor variables identified as significant for willingness to provide credit card number. Those covariates are race, trust, media, site, and social influence of family. This model was selected because it shares the highest level of significance (.000) with the model identified for willingness to provide address, and it is the most inclusive model; that is, it includes all variables that are significant for the remaining outcome variables. Of the five parameters in the final model, four were statistically significant. The estimates of the main effects logistic regression model are presented in Table 3.
The most frequently used test of significance of an individual predictor is the Wald Chi-square statistic (Pampel, 2000). This value indicates the relative importance of the individual variable. The estimates shown in Table 3 indicate four covariates in the model are important factors for willingness to provide personal information on the Internet in the context of initial trust formation.
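The count of ten two-way interactions follows from choosing 2 of the 5 retained covariates. A minimal sketch of that enumeration is shown here; the short variable labels are illustrative names based on the covariates listed in the text, not identifiers from the original analysis.

```python
from itertools import combinations

# Labels for the five covariates retained in the reduced multivariate model
# (names are illustrative; social_2 denotes social influence of family).
covariates = ["race", "trust", "media", "site", "social_2"]

# Every unordered pair of distinct covariates is one candidate interaction term.
two_way_interactions = [f"{a}*{b}" for a, b in combinations(covariates, 2)]

for term in two_way_interactions:
    print(term)
print(len(two_way_interactions), "two-way interactions")
```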
Continuous variables. The Exp(B) value or odds ratio value for trust (1.463) indicates a one-unit increase in trust results in a 46.3% increase in the odds of the subject providing a credit card number. (Trust in the Internet infrastructure ranges in value from -3 to +3 in increments of 0.25.) Based on a negative coefficient and a fractional odds ratio, the .868 odds ratio for social_2 indicates a one-unit increase in social influence of family results in a 13.2% decrease in the odds that the subject will provide a credit card number (equivalently, using the reciprocal, 1/.868 = 1.152, the odds are 1.152 times lower). (Social influence ranges in value from -3 to +3 in increments of 1.0.)
Categorical variables. The reference group for race is White; race(1) compares Asian subjects to the reference group; race(2) compares Native Hispanic subjects to the reference group; race(3) compares Black/African American subjects to the reference group. Based on a negative B-value and a fractional odds ratio, using the reciprocal (1/.367 = 2.72), the .367 odds ratio for race(1) indicates the odds that an Asian subject will provide a credit card number are 2.72 times lower than the odds for a White subject (the reference group).
The reference group for site is low-level; site(1) compares the high-level treatment to the low-level treatment; site(2) compares the moderate-level treatment to the low-level treatment. The odds ratio for site(1) indicates the odds of providing a credit card number for a subject who receives a high-level site treatment are 1.6 times the odds for a subject who receives a low-level site treatment. The odds ratio for site(2) indicates the odds of providing a credit card number for a subject who receives a moderate-level site treatment are 1.7 times the odds for a subject who receives a low-level site treatment.
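The odds ratios discussed above can be reproduced by exponentiating the coefficients. The sketch below assumes the B-values are those that appear in the paper's prediction equation for z; because those values are rounded to three decimals, the computed Exp(B) values may differ from the tabled ones in the last digit. Function and variable names are illustrative.

```python
import math

# B-values as they appear in the paper's equation for z (rounded to 3 decimals).
coefficients = {
    "race(1)": -1.003,   # Asian vs. White (reference)
    "trust": 0.380,      # trust in the Internet infrastructure
    "social_2": -0.142,  # social influence of family
    "site(1)": 0.470,    # high-level vs. low-level site
    "site(2)": 0.544,    # moderate-level vs. low-level site
}

def odds_ratio(b):
    """Exp(B): factor by which the odds change for a one-unit increase."""
    return math.exp(b)

def percent_change_in_odds(b):
    """Percentage change in the odds for a one-unit increase."""
    return (math.exp(b) - 1.0) * 100.0

for name, b in coefficients.items():
    print(f"{name:9s} B={b:+.3f}  Exp(B)={odds_ratio(b):.3f}  "
          f"change in odds={percent_change_in_odds(b):+.1f}%")
```

For coefficients below zero, the reciprocal 1/Exp(B) gives the factor by which the odds are divided, which is the form used in the text for race(1).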

The Prediction Model
A logistic regression classification table shows the overall success rate in predicting the outcome (yes or no). The overall accuracy of the reduced multivariate model to predict willingness to provide credit card number is 69.4%. The positive predictive value = 55/88 = 62.5%; the negative predictive value = 370/525 = 70.6% (Pedhazur, 1997). The classification table for the logistic regression equation for estimating willingness to provide credit card number is shown in Table 4.
The probability that a subject will provide a credit card number is given by the equation:

$$P = \frac{1}{1 + e^{-z}}$$

where z = the logistic regression equation derived from Table 2. For this model:

$$z = -1.111 - 1.003 \cdot \mathrm{race(1)} + .380 \cdot \mathrm{trust} - .142 \cdot \mathrm{social\_2} + .470 \cdot \mathrm{site(1)} + .544 \cdot \mathrm{site(2)}$$

(Note: race(1) = Asian; site(1) = high-level; site(2) = moderate-level.) This prediction model can be used to calculate the probability of willingness to provide credit card number based on the subject's race, scores on trust in the Internet infrastructure and susceptibility to social influence of family, and the level of the Web site treatment (low, moderate, high) (Chan, 2004). Several examples of the utility of the prediction model are provided below.
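A minimal sketch of the prediction model as a function is given here. It assumes the standard inverse-logit form P = 1/(1 + e^(-z)) and uses the coefficients of the equation for z stated in the text; the function names, argument names, and the site-level lookup table are illustrative choices, not part of the original study.

```python
import math

# Coefficients from the equation for z reported in the text.
INTERCEPT = -1.111
B_RACE1 = -1.003    # race(1): 1 if the subject is Asian, otherwise 0
B_TRUST = 0.380     # trust in the Internet infrastructure (-3 to +3)
B_SOCIAL2 = -0.142  # susceptibility to social influence of family (-3 to +3)
B_SITE1 = 0.470     # site(1): 1 for the high-level site
B_SITE2 = 0.544     # site(2): 1 for the moderate-level site

# Indicator (dummy) coding for the Web site treatment; low-level is the reference.
SITE_DUMMIES = {"low": (0, 0), "high": (1, 0), "moderate": (0, 1)}

def linear_predictor(asian, trust, social_family, site_level):
    """Return z, the logit of the probability of providing a credit card number."""
    site1, site2 = SITE_DUMMIES[site_level]
    return (INTERCEPT
            + B_RACE1 * (1 if asian else 0)
            + B_TRUST * trust
            + B_SOCIAL2 * social_family
            + B_SITE1 * site1
            + B_SITE2 * site2)

def probability_of_yes(asian, trust, social_family, site_level):
    """Convert z to a probability with the inverse-logit function."""
    z = linear_predictor(asian, trust, social_family, site_level)
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative call: non-Asian subject, trust 1.5, family influence 1.0, moderate site.
print(round(probability_of_yes(False, 1.5, 1.0, "moderate"), 3))
```

Under these assumptions the illustrative call evaluates to roughly .465; changing only the race indicator to Asian lowers it to roughly .242.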

Example 1
Two subjects are presented with a moderate-level Web site. Each subject scores 1.5 on trust in the Internet infrastructure and 1.0 on susceptibility to social influence of family. These scores indicate the subjects are moderately trusting of the Internet and consider the opinions of family members when making decisions about providing information online or making purchases on the Internet. The first subject is non-Asian; the second subject is Asian. For the first subject, indicator (dummy) coding for race(1) = site(1) = 0, coding for site(2) = 1, and the logistic regression equation is:

$$z = -1.111 - 1.003(0) + .380(1.5) - .142(1.0) + .470(0) + .544(1) = -0.139$$

so that $P = 1/(1 + e^{0.139}) \approx .465$. For the second subject, coding for race(1) = 1, which gives $z = -0.139 - 1.003 = -1.142$ and $P = 1/(1 + e^{1.142}) \approx .242$, indicating the Asian subject is unlikely to provide a credit card number at the moderate-level Web site in the context of initial trust formation. This comparison shows that holding constant all other factors of the model, an Asian subject is much less likely to provide a credit card number than a non-Asian subject.

Example 2
Three subjects are presented with a moderate-level Web site. Each subject scores 2.0 on trust in the Internet infrastructure. However, scores on susceptibility to social influence of family vary from -1 to +1. The first subject's score on social influence of family is -1, indicating a lack of consideration for the opinions of family members when making decisions about providing information online or making purchases on the Internet. For this subject, coding for site(1) = 0 and site(2) = 1, and the logistic equation is:

$$z = -1.111 - 1.003 \cdot \mathrm{race(1)} + .380(2.0) - .142(-1) + .470(0) + .544(1)$$

For the second and third subjects, the social_2 term becomes -.142(0) and -.142(1), respectively, so z decreases by .142 with each one-unit increase in social influence of family. This comparison shows that holding constant race, trust in the Internet infrastructure, and site-level, increasing levels of susceptibility to social influence function to reduce the probability that subjects will provide a credit card number in the context of initial trust formation.
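The pattern in Example 2 can be illustrated numerically. The worked values below are a sketch under two assumptions that the example does not state explicitly: the three subjects are non-Asian (race(1) = 0), and their scores on social influence of family are -1, 0, and +1. The probabilities use the inverse-logit form $P = 1/(1 + e^{-z})$ with the coefficients of the prediction equation.

```latex
\begin{align*}
\text{social\_2} = -1: \quad z &= -1.111 + .380(2.0) - .142(-1) + .544 = 0.335, & P &= \frac{1}{1 + e^{-0.335}} \approx .583 \\
\text{social\_2} = 0: \quad z &= -1.111 + .380(2.0) - .142(0) + .544 = 0.193, & P &= \frac{1}{1 + e^{-0.193}} \approx .548 \\
\text{social\_2} = +1: \quad z &= -1.111 + .380(2.0) - .142(1) + .544 = 0.051, & P &= \frac{1}{1 + e^{-0.051}} \approx .513
\end{align*}
```

Under these assumptions, each one-unit increase in susceptibility to family influence lowers the predicted probability by about three and a half percentage points.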

Example 3
Three subjects are compared who differ only on the basis of Web site viewed. These subjects are non-Asian, have moderately high scores on trust in the Internet infrastructure (score = 2.0) and low positive scores on susceptibility to social influence of family (score = 1.0), indicating moderate consideration of the opinions of family members. The first subject viewed a low-level site, the second subject viewed a moderate-level site, and the third subject viewed a high-level site. This comparison shows that holding constant race, trust in the Internet infrastructure, and social influence of family, a moderate-level Web site results in the highest probability that a subject is willing to provide a credit card number. A high-level site results in a slightly lower probability, and the low-level site produces the lowest probability that a subject is willing to provide a credit card number in the context of initial trust formation.
In summary, the main effects model predicts, with a positive predictive value of 62.5%, that in the context of initial trust formation, Asian subjects are less likely than non-Asian subjects to provide a credit card number; increasing levels of social influence of family result in reduced probabilities that subjects will provide a credit card number; and a moderate-level Web site treatment results in the highest probability that subjects will provide a credit card number.
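The site-level comparison in Example 3 can be checked with a few lines of code. This is a sketch that assumes the inverse-logit form P = 1/(1 + e^(-z)) and the coefficients of the prediction equation in the text; the dictionary keys and the function name are illustrative.

```python
import math

# Coefficients from the prediction equation in the text (keys are illustrative).
B = {"intercept": -1.111, "race1": -1.003, "trust": 0.380,
     "social_2": -0.142, "site1_high": 0.470, "site2_moderate": 0.544}

def p_credit_card(site_level, trust=2.0, social_family=1.0, asian=False):
    """Probability of providing a credit card number for the Example 3 profile."""
    z = (B["intercept"]
         + B["race1"] * int(asian)
         + B["trust"] * trust
         + B["social_2"] * social_family
         + B["site1_high"] * int(site_level == "high")
         + B["site2_moderate"] * int(site_level == "moderate"))
    return 1.0 / (1.0 + math.exp(-z))

# Evaluate the same subject profile at each Web site level and rank the results.
probabilities = {level: p_credit_card(level) for level in ("low", "moderate", "high")}
for level, p in sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{level:8s} {p:.3f}")
```

With these inputs the ranking is moderate (about .513), high (about .494), then low (about .379), which matches the ordering described in the example.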

Discussion and Conclusion
The results of this experiment indicate trust in the Internet infrastructure, the presence of Web site features of institutional trust, and susceptibility to the social influence of media are positively related to willingness to provide personal information online in the context of initial trust formation. Additionally, significant differences in online information-giving behavior were observed between ethnic groups.
Evidence of systematic cognitive processing (Chaiken and Eagly, 1983) was provided by results that found significant differences in willingness to provide information across media treatment groups such that subjects who received the positive media treatment were more willing to provide information than subjects who received the negative media treatment. Because the media treatment required reading an article and answering manipulation check questions, these results describe systematic or "deep" cognitive processing (Chaiken et al., 1996). Evidence of heuristic cognitive processing (Chaiken and Eagly, 1983) was provided by results that found the presence of influential Web site characteristics influenced willingness to provide personal information. In the context of initial trust formation, when the online firm is unknown to the user, symbols of trusted third parties may provide brand recognition (Palmer et al., 2000). Because Web site features of institutional trust and social presence are processed as cues, these results describe heuristic or "shallow" cognitive processing (Chaiken et al., 1996). The heuristic-systematic model suggests that cognitive processing modes may occur simultaneously when motivation or capacity or both are high, and both modes of processing may have an impact on judgment. The primary difference in these two paths to attitude change lies in the amount of analysis given to the issue under consideration.
The results of this study provide insight for organizations that seek to adopt a strategy to maximize trust for new online users at the same time that they minimize investment in e-commerce. The utility of the prediction model for identifying the relative importance of factors and predicting outcomes can guide investment in Web site features that are sufficient for the specific data collection needs of the organization.
Because the quasi-experiment simulated a potential information-giving situation for a product-neutral, unknown (un-branded) Web site, the results should be interpreted within that limiting context. Also, this quasi-experiment included only one operationalization each of the media treatment and the Web site. Because media treatments were presented as an article in USA Today and the Web site was a fictional corporation, threats to construct validity include mono-operation bias such that the constructs of media influence and Web site features of institutional trust and social presence may have been underrepresented. Additionally, a threat to construct validity results from using one method of measuring outcome variables (i.e., self-report).
Although previous research found social presence to be effective in increasing trust in e-commerce (Gefen and Straub, 2004), this study found no evidence that Web site features of social presence increase user willingness to provide information. Given contradictory findings and anecdotal evidence that Web site social presence features continue to evolve with advances in multimedia technology, future research should explore the use of interactive social features such as live chat and other forms of online communication to learn how multimedia elements impact initial trust in e-commerce.
Regarding differences found in information-giving behavior between subjects who received the positive media treatment and those who received the negative media treatment and in consideration of marketing research that indicates two-sided advertising messages result in higher believability and greater purchase intentions (Golden and Alpert, 1987), future research on media influences could look at the effectiveness of Web site information features such as news links and/or blogs that present opposing media treatments to offset negative media influences.