Studying the Effects of Being Einstein: An Experiment in Social Virtual Reality



Avatars
When existing within a virtual space, a person is not physically present, and a virtual stand-in must be used instead. Avatars represent a person in a virtual setting and are often customizable. They can range from an image used for a social media profile to a fully customized three-dimensional character built using a character customization menu.
There has been increased interest recently in the use of holograms and holographic representations in virtual environments (Casagrande, 2020;Verdict Media, 2022). However, the consensus seems to be that these systems will not replace the pervasive virtual avatars in the near future (Pakanen et al, 2022). Most experts agree that holographic virtual representations have advantages in specific application areas (such as healthcare), but expect avatars to remain as the ubiquitous form of virtual representation in most online virtual environments (O'Connor, 2021;Wolf et al, 2022).
Each SocialVR platform takes a different approach to avatars, allowing varying degrees of freedom in the tools and methods users have to represent themselves (Schofield & LeRoy, 2018;Freeman et al., 2020).
In VRChat, users are by default offered a range of community-created avatars or can upload their own. There is no built-in avatar customization based on a default model; instead, VRChat avatars can take on almost any shape or form. Players can upload any avatar that falls within the platform's performance and community moderation standards. Custom assets, rigging, and imported content are allowed so long as the programmed behaviour functions in the Unity tools that VRChat provides through its website (VRChat, 2021).
As a result of the near-complete customization and freedom afforded to users, avatars using models taken from pre-existing game assets, as well as personalized, handmade avatars, are widespread across the user base in VRChat. Users can visit worlds called Avatar Worlds to browse and try on various avatars that have been uploaded by community members with just a single button press (VRChat, 2021).

Body Ownership Illusion
Though avatars may be used to represent a person within a virtual space, the sense of ownership of that virtual body is illusory in nature. Within VR environments, the phenomenon of perceived ownership of a virtual form can be seen. To clearly define and discuss the phenomena, this paper will be using the following definitions from Spanlang et al. (2014): under a variety of conditions may give rise to the subjective illusions of body ownership and agency.
Prior research has demonstrated that it is not difficult to induce a sense of illusory body ownership in healthy participants, even with avatars that bear minimal similarity to the participant (Banakou et al., 2013;Banakou et al., 2016;Barberia et al., 2018;Hoort et al., 2011;Maister et al., 2015;Tajadura-Jiménez et al., 2017). One of the most famous and most studied examples of a Body Ownership Illusion (BOI) is the rubber hand illusion. In the rubber hand illusion, a left rubber hand is placed in front of the participant at a position where it would be plausible to place their own hand. Then, two brushes stroke both the hand of the participant and the rubber hand at approximately the same speed and location. This type of stimulation is termed "visuotactile synchrony." Participants experience a drift in proprioception and will even respond to threats to the rubber hand as they would to their own (Ehrsson et al., 2004;Lloyd, 2007).
The strength of a BOI can be affected by several variables other than visuotactile synchrony, including, but not limited to:
• visuomotor synchronicity, where the virtual body visually mimics the physical movements of the user;
• the realism of the image being presented to the user, which includes elements such as skin tone and detail;
• agency, the degree to which action and control of the body feels voluntary.
In different combinations, these variables can create illusions of virtual embodiment without all of them needing to be present (Bertrand et al., 2018).
Body ownership illusions are not limited to different body parts but can include entire bodies. Unlike in the rubber hand illusion, immersive VR using an HMD can block the user's view of the outside world, including their own body. Without a visual connection to the physical reality, the perspective from which a person views the virtual world can lead to illusory ownership of an avatar body created in the virtual world.
This has been demonstrated in stereoscopic VR, where it has been shown that the user's body can be replaced by a virtual body that is approximately spatially mapped to their real body and achieve the same effects as the BOI achieved in the rubber hand illusion (Neyret et al., 2020). With inverse kinematics, a technique that uses mathematical calculations to approximate how a body should be posed based on limited data, a user wearing an HMD can look down, move their hands around, and see a proper mapping of the virtual body to their real body's position and movements, strengthening their sense of agency and the BOI itself (Parger et al., 2018).
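To illustrate what "approximating a pose from limited data" can mean, the following is a minimal sketch of inverse kinematics for a planar two-segment arm. It is an illustration only, not the method used by VRChat or by Parger et al. (2018): the function names, the segment lengths, and the restriction to two dimensions are all assumptions made for this example. Given only a hand position (as a tracked controller would provide), it recovers plausible shoulder and elbow angles.

```python
import math


def two_link_ik(x, y, upper_len, fore_len):
    """Return (shoulder, elbow) angles in radians that place the hand at (x, y).

    Hypothetical helper: the shoulder sits at the origin and the arm moves in a plane.
    """
    dist_sq = x * x + y * y
    # The law of cosines gives the elbow bend; clamp so unreachable targets
    # fall back to the nearest fully extended or fully folded pose.
    cos_elbow = (dist_sq - upper_len ** 2 - fore_len ** 2) / (2 * upper_len * fore_len)
    cos_elbow = max(-1.0, min(1.0, cos_elbow))
    elbow = math.acos(cos_elbow)
    # The shoulder angle is the direction to the target minus the offset
    # introduced by the bent forearm.
    shoulder = math.atan2(y, x) - math.atan2(
        fore_len * math.sin(elbow), upper_len + fore_len * math.cos(elbow)
    )
    return shoulder, elbow


def forward_kinematics(shoulder, elbow, upper_len, fore_len):
    """Return the hand position produced by the given joint angles."""
    elbow_x = upper_len * math.cos(shoulder)
    elbow_y = upper_len * math.sin(shoulder)
    hand_x = elbow_x + fore_len * math.cos(shoulder + elbow)
    hand_y = elbow_y + fore_len * math.sin(shoulder + elbow)
    return hand_x, hand_y


# Usage: solve for a reachable target, then confirm the pose reaches it.
angles = two_link_ik(0.4, 0.2, upper_len=0.3, fore_len=0.25)
hand = forward_kinematics(*angles, upper_len=0.3, fore_len=0.25)
```

A full-body solver has to handle three dimensions, joint limits, and several tracked points at once, but the principle is the same: a small number of known positions constrains the rest of the pose.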

Avatars Changing Behavior
Through the process of embodiment and induction of a BOI, a user may experience a sense of "being" the virtual avatar. Prior research has shown that this can be a deeper feeling than just a sense of ownership, resulting in a multitude of psychophysical effects in the user. Previous research data also suggests that the attributes of the virtual body can impact the perception, attitudes, behavior, emotions, and cognitive capabilities of the person embodying that avatar (Banakou et al., 2016;Banakou et al., 2018;Osimo et al., 2015;Pan & Steed, 2019;Patané et al., 2020;Salmanowitz, 2018).
Yee and Bailenson (2007) first reported this phenomenon in a non-VR setting, suggesting that people's behaviors can be influenced by assumptions about how they think others expect them to behave, based on their virtual appearance, and referred to it as the Proteus effect. They found that when participants embodied a virtual body with a face that appeared more attractive than their real face, participants adjusted the interpersonal distance between themselves and study confederates, and that participants in a taller avatar were more aggressive when negotiating than those in a shorter avatar. However, not all the cognitive effects of varied virtual embodiment are accurately described by the Proteus effect. The Proteus effect is based on the idea that people draw expectations for behavior from traits of their appearance that are associated with stereotypes and individuals from their previous experiences and knowledge.
Other studies have found that when adults are embodied in a small virtual body, they overestimate the size of objects (Hoort et al., 2011), and when that virtual body has a child-like appearance, their implicit attitudes, self-identification, and behaviors shift towards being more childlike. However, these additional childlike cognitive changes are not seen when the small virtual body has the appearance of a scaled-down adult (Banakou et al., 2013;Tajadura-Jiménez et al., 2017). While the childlike effects found in the study can be attributed to the Proteus effect, the difference in size estimation cannot be attributed to stereotype, individual or experience. This phenomenon falls outside of the definition of the Proteus effect but is still within the effects of virtual embodiment.
http://cis.ccsenet.org Computer and Information Science Vol. 16, No. 2;
This process of virtual embodiment has been used to help build empathy and reduce biases in participants. When white participants are represented by a black avatar, studies have found a reduction in their racial bias against black people (Peck et al., 2013;Maister et al., 2015). This effect was found to last at least one week after a short exposure to the VR interaction, showing promise for application of such techniques for societal adjustment and self-betterment (Banakou et al., 2016;Casullo & Colalongo, 2022;Chancel et al., 2022). Salmanowitz (2018) applied a similar experimental technique and examined how the influence from different-race embodiment could change implicit racial biases, in particular the outcomes in legal scenarios. It was found that five minutes of occupying an avatar exhibiting different racial characteristics was sufficient to reduce the implicit racial biases against that outgroup and resulted in more cautious legal judgements in trials with ambiguous evidence (meaning that vague evidence was evaluated to be less indicative of guilt).
Embodiment can be used to study racial in-group bias as well, where people are more likely to be biased against those who are perceived to be in a different group. Evidence of in-group bias can be found in mimicry of behavior. Hasler et al. (2017) found a powerful effect on mimicry when white participants were put in a black avatar. Participants exhibited an in-group bias in mimicry based on their embodied avatar's perceived race instead of their actual body's race. Unlike previous studies that showed a difference in attitudinal change and change in implicit biases, this study shows that the effect can also generate behavioral change.

Being a Virtual Einstein
The increase in empathic ability is not the only cognitive performance measure that can improve from virtual embodiment. Banakou et al. (2018) reported that embodying an individual who stereotypically signifies superintelligence (e.g., Albert Einstein) resulted in increases in cognitive task performance. Participants took surveys and tests evaluating their self-esteem, general intelligence and implicit associations and biases before and after testing. Banakou et al. (2018) used the Tower of London task, a psychological task based on the famous Tower of Hanoi puzzle that is used to assess executive functioning and planning abilities (Krikorian et al., 1995). One week after taking the initial tests, participants were immersed in a VR environment and embodied as either a young adult male or Albert Einstein. The participants undertook a brief stretching exercise, then a simple task involving math, both while in view of a virtual mirror to establish agency and enhance the BOI. After the experience, they took the Tower of London task again, and the results showed significantly greater improvement in Tower of London performance between the first and second trial in the participants who embodied Einstein. Additionally, for participants who were embodied as Einstein, there was a negative relationship between magnitude of Tower of London improvement and self-esteem, meaning that the lower the participant's self-esteem was, the greater the growth in performance between the first and second Tower of London tests.
In a similar manner to the studies about different race embodiment (Hasler et al, 2017), Banakou et al. (2018) also found reductions in implicit biases and associations in participants who embodied Einstein. However, instead of a reduction in racial bias, there was a reduction in age bias against older people, as the model of Einstein depicts the scientist as an older man.
The experiment undertaken by Banakou et al. (2018) consists of a single, small study; hence, there is still a need for further testing of its hypotheses, including, but not limited to:
• identifying if there was a noticeable change in self-esteem after a VR experience as Einstein;
• the importance of a sense of embodiment on the effect;
• whether or not this phenomenon is generalizable to other examples or measurable by other metrics, such as intelligence and executive control testing outside of the Tower of London task.

Porteus Maze Test
Like the Tower of London test, the Porteus Maze Test (PMT) is a non-verbal test designed to evaluate intelligence, and is useful in evaluating executive functioning, particularly planning and decision making. Designed by Stanley Porteus, it was created to assess planning capabilities in a restricted environment, with the thinking that planning is a key component of intelligent behavior (Porteus 1965).
The PMT consists of a series of eight or twelve mazes (depending on the version administered) printed on paper and given to the participant to complete in succession. Mazes are labeled as years, which increase in difficulty with each successive year. Verbal instructions vary depending on the maze year; however, participants are generally expected to complete the maze without crossing the lines of the walls of the maze, crossing their own lines, or doubling back. The PMT has two main indices for measuring performance:
• Test Age (TA), a quantitative score based on a participant's ability to complete and progress through the maze years, which measures planning and foresight (Porteus, 1965).
• Qualitative Score (Q-Score), a qualitative score based on the errors in style and strategy the participant makes during testing, which measures behavioral disinhibition, directly impulsive behaviors that impede planned task execution such as failure to follow instructions and carelessness (Tuvblad et al., 2017).
The PMT has three versions: The Vineland Revision, which contains the original series of mazes by Porteus; the Extension, and the Supplement (Tuvblad et al., 2017). The additional versions of the PMT are designed to be supplemental to, and not alternative to the original series, with the second and third versions useful for eliminating practice effects where experimenters would like to repeat the test. Tuvblad et al. (2017) created a revised and modern Porteus Maze Test Administration Manual that clarifies and expands on the ambiguities found in previous literature, utilizing all three versions of the PMT, and unifying a method for their administration.

Experimental Method
In a similar manner to Banakou et al. (2018), this experiment was a between-subjects design with a single variable, "Virtual Body", with two conditions:
• Einstein: an avatar that looked like Albert Einstein
• Control: an avatar that looked like a generic older white male
Participants were randomly and evenly assigned to one of the two conditions. Measures of self-esteem and PMT metrics are repeated-measures variables.

Participants
The experiment used 30 adult male participants recruited from players within VRChat. Only participants who were using tracked HMDs with controllers were recruited. Male participants were recruited to prevent sex or gender effects from avatar mismatch on performance. All participants had normal or corrected-to-normal vision. Exclusion criteria were based on similar previous VR embodiment studies (Banakou et al., 2018;Kocur et al., 2020) and excluded participants with: epilepsy, use of medication, recent consumption of alcohol, intellectual disability, mental health difficulties, or age <18 or >55 years.

Control Measures
A number of standard data collection metrics were used:

Demographics Questionnaire
Participants were asked their age, race or ethnicity, their level of previous VR experience, and to give an estimate of how many hours they spend playing video games per week. The self-report of previous VR experience was evaluated as a Likert scale (with 1 meaning the least experience and 7 meaning the most experience). The questionnaire was largely adapted from Banakou et al. (2018), with two changes:
• a question about race was added;
• the Likert scale identifying the average video game hours per week was removed to prevent ceiling effects, and instead the exact numbers reported were used.

Standardized Embodiment Questionnaire
The Standardized Embodiment Questionnaire (SEQ) by Gonzalez-Franco & Peck (2018) was used to measure and evaluate the different aspects of embodiment. The 16 questions (shown in Table 1) were presented with accompanying 7-point Likert scales (Strongly Disagree to Strongly Agree) to be answered by participants. Each question was prefaced by "During the experiment, after switching avatars for the second time, there were moments in which". The SEQ contains procedures for adapting the questions to suit novel experimental methods, with guidelines and possible alternative questions outlined by the creators. Due to the unusual methodology of this experiment, two further changes were made:
• Question 6 was changed from "I felt like I was wearing different clothes from when I came to the laboratory" to "I felt like I was wearing different clothes from when I put on the VR headset" to accommodate the remote testing nature of the experiment.
• The phrase "after changing avatars for the second time" was added in the header for clarity.
The answers to the SEQ are analysed into sub-scales denoting:
• Appearance (appearance association)
• Response (motor control and agency)
• Ownership (body ownership)
• Multi-Sensory (location and tactile)
These sub-scale values can be averaged together to calculate the overall Embodiment score. These sub-scale values range from 1 to 7, with 7 meaning a strong sense of what the scale measures and 1 meaning little to no sense of the measure.
An additional Agency sub-score was also calculated, for the purposes of comparison with other studies. The scores most relevant to this study are Embodiment, Agency, and Ownership, as those represent successful embodiment and are the most comparable metrics to those evaluated by other researchers.

Table 2. Rosenberg's Self-Esteem Scale (Rosenberg, 2006)
1. On the whole, I am satisfied with myself.
2. At times I think I am no good at all.
3. I feel that I have a number of good qualities.
4. I am able to do things as well as most other people.
5. I feel I do not have much to be proud of.
6. I certainly feel useless at times.
7. I feel that I'm a person of worth, at least on an equal plane with others.
8. I wish I could have more respect for myself.
9. All in all, I am inclined to feel that I am a failure.
10. I take a positive attitude toward myself.

Self-Esteem
Self-esteem was measured the same way as Banakou et al. (2018). Participants were assessed on Rosenberg's Self-Esteem Scale (Rosenberg, 2006). When using Rosenberg's Self-Esteem Scale, items 2, 5, 6, 8, and 9 are reverse scored. Scores were summed for all 10 items, as a continuous scale, with higher scores indicating higher self-esteem (Rosenberg, 2006; University of Maryland, n.d.).
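The scoring rule described above can be sketched as follows. This is a minimal illustration, not the scoring script used in the study: the function name is hypothetical, and the 1-to-4 response range is an assumption, since the response scale is not stated here. Only the set of reverse-scored items and the summing of all 10 items come from the text.

```python
# Items that are reverse scored, as listed in the text.
REVERSE_ITEMS = {2, 5, 6, 8, 9}


def rosenberg_score(responses, scale_min=1, scale_max=4):
    """Sum 10 item responses after reverse scoring the negatively worded items.

    `scale_min` and `scale_max` are assumptions; adjust them to the response
    scale actually administered.
    """
    if len(responses) != 10:
        raise ValueError("expected exactly 10 item responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if not scale_min <= value <= scale_max:
            raise ValueError(f"item {item_number} is outside the response scale")
        if item_number in REVERSE_ITEMS:
            # Reverse scoring mirrors the response around the scale midpoint.
            value = scale_min + scale_max - value
        total += value
    return total


# Usage: strong agreement with positive items, strong disagreement with negative ones.
highest = rosenberg_score([4, 1, 4, 4, 1, 1, 4, 1, 1, 4])
```

Under the assumed 1-to-4 range, totals fall between 10 and 40, with higher totals indicating higher self-esteem.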

The Porteus Maze Test
The Porteus Maze Test (PMT) was selected as the means to evaluate executive functioning as it was found to have a decent correlation and shared a planning component with the Tower of London (TOL) task used by Banakou et al. (2018), both tasks measuring executive and planning skills (Porteus 1965;Krikorian et al., 1995). A version of the PMT that measures planning and executive control was implemented in VRChat.
The PMT Vineland Revision was administered in the first part of the experiment, followed by the PMT Extension in the second part to curb practice effects (Tuvblad et al., 2017). Images of the test can be found in supplemental material. Two performance indices were used to evaluate PMT performance:
• Test Age (TA), quantitatively measuring their ability to complete mazes
• Q-Score, analyzing qualitative errors in style and strategy.
This study primarily focuses on TA performance, as Q-Score measures may not be directly comparable due to the virtual medium in which the lines are being drawn. For example, having a wavy line is considered a Q-Score error in normal PMT conditions. However, this could be attributed to tracking issues with the controllers in the VR version of the PMT. There are three types of Test Age Errors (Porteus 1965):
• Blind Alley, which occurs when a participant enters an area that is blocked at the end
• Cut Alley, which occurs when the participant cuts across from one alley to another to reach an open space without drawing around
• Inability to Complete the Maze, which occurs when the participant says, "There is no way out," or they have paused for more than five seconds or lifted their pencil. In the case of this test, this error is detected when the participant intentionally releases the button that activates the virtual pen, causing it to stop drawing.
The experiment was carried out using VR headsets in an environment built in VRChat.

Virtual Environment Implementation
The experiment was implemented as a custom, private VRChat world with four different areas. Each instance was limited to three possible players to ensure the privacy of the experimental sessions.
Buttons to display the Likert scales for the self-esteem and SEQ questions were toggled on and off as needed.
When joining the world, players initially appear in the Testing Room (shown in Figure 1). The Testing Room is a small, office-like environment with a table in the center of the room. The Control Room can be seen on the left, with the virtual informed consent forms displayed on the wall to the right, and a doorway leading to the Avatar Room at the end of the room. The Testing Room also contains a divider to the left, across which the research facilitator avatar stands in the Control Room.
On the table in the Testing Room is a piece of virtual paper on which different mazes from the PMT appear to the participant, controlled by the facilitator's control panel. Invisible cameras are positioned strategically in the virtual environment to give a clear view of the status of the maze and participant to collect data and record the session.

Control Room
Across the divider in the Testing Room is the Control Room, as shown in Figure 2 where the Testing Room can be seen through the window. The Control Room is where the researcher controls all elements of the experiment from the control panel, such as switching mazes and erasing pen marks. There is a display of the virtual paper to oversee the progress through the mazes, as well as buttons controlling which avatar pedestals are visible.

Avatar Room
Both the Testing Room and Control Room lead into the Avatar Room (shown in Figure 3), which contains an Avatar Pedestal which, upon interaction, switches the virtual body the user currently has to the one stored in the pedestal. The researcher will control which avatar pedestal is visible at any given time based on the test condition using the control panel in the Control Room.
On the other side of the Avatar Room is a virtual mirror and 10 blocks for the stretching and stacking activity (shown in Figure 4). The virtual mirror allows the participant to carry out the stretching and block-stacking exercises while being able to see themselves in their new virtual body, to try to increase their sense of embodiment.
There is also a button on the wall which places the blocks back in their original positions, used to reset the stacking exercise between trials.

Recording Room
The recording room is a small hidden room containing two screens, one displaying the status of the virtual paper and maze test progress within, the same view as seen from the Control Room. The recording room also displays a view capturing the researcher and participant in the Testing Room to capture anything that might be happening during the trial.

Experimental Implementation
A number of special interaction metaphors and tools needed to be set up to allow this experiment to proceed.

Virtual Pen
In the Testing Room, a virtual pen was used to draw on a virtual paper surface. The VRChat community had already created a range of 3D pens that work in the virtual environment (Saffo et al., 2020;Saffo et al., 2021). These virtual pens create lines visible wherever and whenever they are activated and stop creating lines when the activation button is released. An existing community-created 3D virtual pen known as QV Pen was placed near the table for participants to pick up and use to complete the PMTs. The virtual paper consists of a translucent layer on which the picture of the maze is visible, and an opaque, white backstop. A common problem identified with writing surfaces in VR is that the tip of the virtual pen may occasionally pass through the surface, rendering anything drawn beneath the surface invisible. Having an opaque backdrop gave contrast to the translucent layer, which allowed tolerances for the drawn line to be both above and below the page. Buttons from the Control Room switched between maze year images (textures) which appeared on the virtual paper.

Avatars
This experiment used three different avatars, Neutral, Control, and Einstein (Figure 5):
• The Neutral avatar was a default avatar that comes with VRChat, depicting a grey humanoid with a slightly robotic and overall minimalistic appearance (Saffo et al., 2021).
• The Control avatar, used by participants during the second half of the test, was a Microsoft Rocketbox avatar with the appearance of a generic older white male (Microsoft, 2020).
• The Einstein avatar depicts an older Albert Einstein and was a commercially available model.

Facilitation and Recording
Two PCs with at least the minimum specifications to play VRChat were used to run the test. Both were logged into VRChat accounts as players, and both account avatars were present in the testing world. One stood idle in a virtual room hidden from the participant while capture software (OBS Studio) recorded the test; the other was accessed by the researcher/facilitator through an HMD to facilitate testing.

Experimental Procedure
After having agreed to participate and having been given a brief overview of the procedures, participants joined the world at their scheduled time. Participants were presented with a digital informed consent form to agree to and were told that they may stop participation or leave at any time.
Participants were then asked to change their avatar to the Neutral avatar, after which they completed the demographics and self-esteem survey, and were then asked to go to the Avatar Room. Once in the Avatar Room, participants were led through a short stretching exercise to enhance embodiment, consisting of guided stretches and gentle movements in front of a virtual mirror. Then, participants were asked to stack blocks present in the room in front of the mirror in any way they saw fit, and to tell the researcher when they felt that the blocks had been sufficiently stacked. Participants needed to have interacted with all the blocks at least once for this task to be completed.
The participant and instructor then returned to their positions in the Testing Room, and the participants were shown a demonstration of, and given an opportunity to practice drawing, with the virtual pen on the virtual paper for up to two minutes. When comfortable with the pen, the participant was given instructions on the PMT (Tuvblad et al., 2017). Participants were told to begin whenever they were ready after the first maze appeared.
During this part of the experiment, participants were recorded by the facilitator using a VRChat account logged in on a second computer, recording the perspective of an invisible virtual camera positioned above the virtual paper with a screen recorder, giving an overhead view of the maze and the participant's progress through it. This footage was used to later evaluate and score each maze attempt (Tuvblad et al., 2017). The facilitator erased the virtual pen marks after each test, and switched to the next maze. The participants were then given a five-minute break.
After the break, participants in the Einstein group were asked to enter the Avatar Room, where they were presented with and asked to change their virtual avatar into the Einstein avatar, while the Control group changed into the Control avatar. The whole experiment was repeated, starting with the stretching and block-stacking exercises and ending with the completion of a set of mazes. After the maze trials were completed, participants were shown a labeled Likert-type scale and asked to complete the self-esteem questionnaire and the embodiment questionnaire, and were given an opportunity to give comments, and ask questions, before they were debriefed on the experiment.

Statistical Analysis
For all independent t-tests conducted, Welch's t was chosen, as it is resistant to unequal variances in a population and provides comparable results even when the equal variances assumption is not violated (Delacre et al., 2017).
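For readers unfamiliar with the difference from Student's t, the following sketch computes Welch's t statistic and its Welch-Satterthwaite degrees of freedom using only the Python standard library. It is an illustration, not the SPSS procedure used in the study; the function name and sample values are hypothetical, and obtaining a p-value would additionally require the t distribution.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variances t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Each group contributes its own variance of the mean; nothing is pooled.
    var_mean_a = variance(sample_a) / n_a
    var_mean_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / (var_mean_a + var_mean_b) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (n_a - 1) + var_mean_b ** 2 / (n_b - 1)
    )
    return t, df


# Usage with made-up scores for two groups of unequal spread.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

When both groups have equal size and equal variance, the degrees of freedom reduce to those of Student's t, which is why the two tests give comparable results in that case.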
Q-Score and TA Score results for each participant were calculated (Tuvblad et al., 2017). The repeated measures in the experiment were coded into three pre- and post-experimental intervention variables:
• Q-Score (preQ, postQ)
• TA Score (preTA, postTA)
• Self-Esteem (preEsteem, postEsteem)
Additionally, values measuring the difference in performance were calculated for both Q-Score and TA Score (Qdiff = postQ - preQ, TAdiff = postTA - preTA) to check for outliers and to aid in the statistical analysis. Extreme outliers that were more than 3*IQR from the median were removed from analysis. Only one data point was removed in the Control condition for TAdiff, and one extreme outlier in Qdiff.
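The difference scores and the 3*IQR rule can be sketched as below. This is a minimal illustration with hypothetical function names and made-up values, not the SPSS procedure used in the study. In particular, the quartile convention (`method="inclusive"`) is an assumption, and different software may compute quartiles slightly differently.

```python
from statistics import median, quantiles


def difference_scores(pre_scores, post_scores):
    """Return post minus pre for each participant (e.g. TAdiff = postTA - preTA)."""
    return [post - pre for pre, post in zip(pre_scores, post_scores)]


def extreme_outliers(values, multiplier=3.0):
    """Return the values lying more than multiplier * IQR from the median."""
    # The quartile method is an assumption; other conventions shift the cut-off slightly.
    first_quartile, _, third_quartile = quantiles(values, n=4, method="inclusive")
    iqr = third_quartile - first_quartile
    center = median(values)
    return [v for v in values if abs(v - center) > multiplier * iqr]


# Usage with made-up difference scores containing one extreme value.
flagged = extreme_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```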
To evaluate the effect and possible interaction of the Avatar Body and the repeated measures in the study, three two-way mixed ANOVAs were run, with Avatar Body as the between-subjects factor (Control, Einstein), and Intervention, as measured by Q-Score (preQ, postQ), TA Score (preTA, postTA), or self-esteem (preEsteem, postEsteem), as the within-subjects factor.
Pearson correlation tests were run comparing the VR experience and the SEQ measures to examine possible correlations. Independent Welch's t-tests were used to compare the SEQ measures by Avatar Body condition to check for differences between groups. Paired-samples Student's t-tests were run for all repeated measures variables, both split across Control and Einstein and for all participants, to see if there were significant differences in the data pre- and post-intervention.
For the purposes of exploratory analysis, participants were binned into two even groups, depending on their reported number of hours spent playing video games a week (VgHrsWeekBinned). Participants who reported hours spent playing video games per week greater than the median (Mdn = 20) were binned into a higher group, and the rest were binned into the lower group. VgHrsWeekBinned was used as the between-subjects factor (Lower, Higher), and Intervention, as measured by Q-Score (preQ, postQ), TA Score (preTA, postTA), or self-esteem (preEsteem, postEsteem), as the within-subjects factor.
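The binning rule described above amounts to a median split, sketched below. The function name and the example hours are hypothetical; only the rule itself (strictly greater than the median goes to the higher group, everything else to the lower group) comes from the text.

```python
from statistics import median


def median_split(hours_per_week):
    """Label each participant "Higher" if above the sample median, else "Lower"."""
    cutoff = median(hours_per_week)
    return ["Higher" if hours > cutoff else "Lower" for hours in hours_per_week]


# Usage with made-up weekly hours; the participant at the median lands in "Lower".
groups = median_split([5, 10, 20, 30, 40])
```

Because values equal to the median fall into the lower group, the two bins are only exactly even when no participant sits on the median or ties are balanced.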
All analyses were carried out in SPSS 27 with an alpha level of .05. Exact significance and effect size values are reported for all statistical tests.

Results
Based on video footage and outlier analysis, five participants were excluded: three due to technical issues such as excessive jitter and connectivity issues, and two participants were removed as outliers. Table 3 contains the descriptive statistics of the SEQ response variables. On average, participants reported lower-to-medium embodiment (M = 3.2, SD = 1.7), with medium reports of target sub-scales: agency (M = 3.8, SD = 1.4) and ownership (M = 3.4, SD = 1.5).

Standardized Embodiment Questionnaire (SEQ) Responses
The distribution of these results can be seen in Figure 6. There was a significant negative correlation between the VR experience and the Multi-Sensory sub-scale, r(23) = -0.4, p = 0.048. Welch's t-tests indicated there were no statistically significant differences in the mean scores between Control and Einstein on any of the SEQ outcomes.

Porteus Maze Test (PMT) Tests
Two main indices were used to measure performance on the Porteus Maze Tests: the Test Age (TA) and the Qualitative Score (Q-Score).

Qualitative Score (Q-Score)
It was predicted that participants would have a postQ value similar to or lower than their preQ value, indicating either no change or an increase in test performance. If an increase in test performance was oserved (a lower postQ), there would be an interaction effect between Intervention and Avatar Body.
The interaction effect was nonsignificant, F(1, 23) = 0.03, p = 0.874, ηp² = 0.001, suggesting that any effect Intervention had on Q-Score did not depend on the Avatar Body condition.
It was predicted that participants would have a postTA value greater than their preTA value, indicating an increase in test performance, and that if there was an increase in test performance (a higher postTA), there would be an interaction effect between Intervention and Avatar Body.

Reported Average Video Game Hours per Week (VgHrsWeekBinned)
An exploratory analysis of the possible effect of the amount of time spent playing video games and the change in TA Score was undertaken.
The main effect of Intervention on TA Score was significant, F(1, 23) = 29.73, p < 0.001, ηp² = 0.564, suggesting a decrease in test performance as a result of Intervention.
The interaction effect was significant, F(1, 23) = 5.95, p = 0.023, ηp² = 0.205, suggesting that the effect of Intervention on TA Score depended on whether a participant was in the higher or lower bin of weekly video game hours.
Participants in the higher bin started with higher preTA than those in the lower bin, but experienced a greater decrease in performance, ending with a postTA lower than that of the lower bin, while the lower bin participants had more consistent performance. This relationship can be seen in Figure 8.
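The partial eta squared values reported alongside each F statistic can be checked from F and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The following is a minimal Python sketch, assuming this identity applies to the mixed ANOVA results reported here; the function name is illustrative and is not part of the SPSS analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)

# F values as reported in the text, each with df = (1, 23)
reported = [
    ("Q-Score interaction", 0.03),
    ("TA main effect of Intervention", 29.73),
    ("TA interaction with VgHrsWeekBinned", 5.95),
]
for label, f_value in reported:
    print(label, round(partial_eta_squared(f_value, 1, 23), 3))
```

Because the F values are themselves rounded to two decimal places, the recovered effect sizes can differ from the reported ones in the third decimal place.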

Discussion
This study aimed to determine whether an embodiment test could be constructed and run remotely, entirely within the social VR platform VRChat. The test was successfully built to specification and implemented.
The first result in this study is that participants across both groups experienced a lower-middling sense of embodiment. These results do not indicate the absence of a sense of embodiment in participants; rather, they indicate that the sense of embodiment was not strong. Best practices for strong embodiment were followed, such as providing a virtual mirror, visual movement synchronization exercises, and activities in front of the virtual mirror (Banakou et al., 2018; Matamala-Gomez et al., 2021; Osimo et al., 2015; Seinfeld et al., 2018; Spanlang et al., 2014).
There may be an effect at play specific to the population sampled, since the participants all use VRChat, which contains enthusiast players with higher levels of VR experience than most previous studies in the field (Banakou et al., 2018; Barberia et al., 2018; Maister et al., 2015; Osimo et al., 2015; Salmanowitz, 2018; Seinfeld et al., 2018).
One noted effect was an average decrease in participant performance over the course of the experiment, regardless of the avatar body the participants were using. One possible explanation for this phenomenon is that there was not a long enough break between the testing sessions, so participants were unable to replenish their mental resources before the second session. Four of the participants commented afterwards that they were feeling fatigued after the testing sessions, and some had subtly expressed that the testing session was more taxing than anticipated. Based on the experience of running this experiment, a suggested minimum break period between testing sessions would be 30 minutes to an hour.
Another element to consider is that participants may be coming into the session with different degrees of fatigue starting out, and one factor somewhat unique to the methodology is that participants may have already engaged in a VRChat session of unknown length prior to testing. These factors are difficult to control using the participant recruitment methods used in this study.
Despite this study being similar in design to the one used by Banakou et al. (2018), this study was not able to repeat any of the effects stated in that study. The differences between the studies could explain the diverging findings. It is possible that the PMT does not measure the specific cognitive feature that was influenced by the TOL test, or lacks the sensitivity needed to detect such a change. Another study successfully implementing the TOL test in VRChat would help eliminate that doubt. Banakou et al. (2018) used participants who had little-to-no previous VR experience. The participants in this study had medium-to-high levels of previous VR experience (Mdn = 6, M = 5.4). It is possible that participants with less VR experience may react differently to the various illusions and effects of VR embodiment. One line of thinking is that when a person is inexperienced with VR, they are more receptive to the various illusions that VR presents, and that as a person grows more experienced, they become more accustomed to the differences between the virtual and real worlds and maintain a greater separation between the two. The data from this study offer some support for this view, showing a significant negative correlation between the multisensory embodiment subscale and reported level of previous VR experience.
Recruiting from the VRChat community was relatively easy. People in VRChat are likely to be interested and enthusiastic about VR and keen to learn more about it. This may be due to a novelty factor: most players in VRChat have never participated in a research project before.
One potentially disruptive variable when running experiments on social VR platforms is that participants may have different HMDs that vary in their resolution, refresh rate, FOV, and potentially quality of tracking. For tests such as the PMT, which are dependent on dexterity, issues with connection or hardware limitations could slightly affect the results of the test. Some participants may have enhanced or full-body tracking (e.g. feet trackers), which could contribute to a change in reported embodiment. This study recorded whether participants had enhanced tracking, but since only three participants had this addition, there were not enough data points to draw any conclusions. The effect of consumer full-body tracking on the outcomes of VR testing has been the subject of previous research (Caserman, 2021; Eubanks, 2020), but that research is not comprehensive, nor has it been correlated with the SEQ scale. Research on tracking differences needs to be performed using the SEQ to set a baseline understanding for future studies.
There are a number of novel features in the experiment described in this study, leading to differences between the data measured here and some of the existing literature. Unlike many other studies, which typically involve participants exiting the virtual environment before answering questionnaires or completing tasks, this study was conducted entirely within VR. Apart from the short break midway through testing, participants had an unbroken VR experience from the beginning to the end of the test. Schwind et al. (2019) found that completing questionnaires in VR does not change measured presence in questionnaires, but can increase the consistency of the variance, as well as decreasing the study's duration and reducing disorientation. However, a repeated version of this study with participants answering questions outside of the virtual environment may bring the data more in line with previous studies.

Conclusion
This study did not find any effect or correlation of the virtual avatar body with any of the factors measured, in contrast to the results reported by Banakou et al. (2018). This may be due to the population of VRChat being more experienced with VR, which suggests that the effect of VR experience on virtual embodiment should be the subject of further study.
The work described in this paper remains an innovative attempt to show the effect of avatar representation. Despite not reproducing the effects found in the original research, this experiment further demonstrates the potential of using social VR for research. Running studies in SocialVR, particularly VRChat, can be beneficial to both researchers and participants. For example, using VRChat to recruit participants meant that participants benefited from increased anonymity: their appearance is never directly seen, and usernames are used in place of real names, giving an extra layer of anonymity in addition to regular participant data privacy practice.
When using VRChat, researchers do not need to worry about any potential issues that may occur during software installation, and participants do not need to download and install external software, which may be seen as suspicious and burdensome. With the avatar pedestals and custom model creation and uploading capabilities, VRChat allows easy avatar switching without disruption to the experiment. VRChat provides mirrors as a standard feature of world creation, which offers the benefits of visuomotor synchrony to help induce BOI in participants. This experiment has hopefully demonstrated the benefits of using VRChat (or any other social VR platform) to conduct virtual embodiment research.