Relative Embodiment of Japanese Verbs

Studies examining visual word recognition have revealed that sensorimotor information is associated with the meaning of and influences the processing of words. In this study, we collected ratings of relative embodiment, which reflects how much physical movement is involved in a word meaning, for 219 Japanese transitive verbs. We then investigated how the ratings affect visual word recognition, using three different tasks: a word-naming task, a lexical decision task, and a syntactic classification task. We found that reaction times were faster and correct rates were higher (in the lexical decision task) for words with higher relative embodiment ratings than for those with lower ratings. These findings indicate that relative embodiment affects processing of Japanese verbs as well as of English verbs.


Introduction
Many aspects of words affect how they are processed. According to one view, the processing of words takes the form of parallel distributed processing (e.g., Seidenberg, 2005), which involves the following three elements: orthographic, phonological, and semantic. Research investigating the mechanisms of visual word recognition has collected multiple indices that reflect the characteristics of these elements and on that basis has explored their effects on word recognition. For instance, studies have revealed that longer words take longer to process than shorter words (the word length effect; Baddeley, Thomson, & Buchanan, 1975) and non-homophones are processed more efficiently than homophones (the homophone effect; Pexman, Lupker, & Jared, 2001).
In addition, indices reflecting the semantic aspects of word processing have been collected. According to Connell and Lynott (2015), elements related to semantics have three levels. Level 1 reflects the specific qualities of the semantic content. One example is the imageability effect (Paivio, Yuille, & Madigan, 1968) in which words referring to concepts that are easier to image are processed more efficiently. Level 2 pertains to enumeration of the semantic content of the words. Previous studies showed that concepts that hold more variation in meaning are processed faster (the semantic-richness effect; e.g., Muraki, Sidhu, & Pexman, 2019). Level 3 reflects the number of words associated with a given word. This is known as the effect of the semantic neighborhood (e.g., Andrews, 1989).
As semantic qualities (Level 1) of words, recent literature has examined the effect of sensory and motor information comprising a referent concept in terms of word processing. Siakaluk, Pexman, Aguilera, Owen, and Sears (2008), for example, collected body-object interaction (BOI) data to reflect the ease with which a human body can physically interact with a noun (see also Pexman, Muraki, Sidhu, Siakaluk, & Yap, 2019). They reported facilitatory BOI effects such that responses for words with high BOI ratings were faster and more accurate than for words with low BOI ratings in lexical decision and phonological lexical decision tasks, which required participants to decide whether each item is a real word or not.
Besides BOI, researchers have proposed other concepts reflecting the sensorimotor experience: the sensory experience effect (Juhasz, Yap, Dicke, Taylor, & Gullick, 2011), modality-specific perceptual strength effects (Connell & Lynott, 2012), manipulability (Salmon, McMullen, & Filliter, 2010), and graspability (Amsel, Urbach, & Kutas, 2012). Many of these variables are discussed based on the framework of grounded cognition (Barsalou, 2008), in which cognition, including concept processing, is largely grounded in the information acquired through sensorimotor experience. Indeed, in one study involving the BOI effect, greater activation in the left inferior parietal lobule, involved in perception and planning of goal-oriented interaction of hands and object, was observed in the high-BOI word condition than in the low-BOI word condition, suggesting that the availability of sensorimotor information facilitates word processing (Hargreaves et al., 2012).
All variables described so far are for nouns; in addition, the relationship of the meaning of verbs with sensorimotor information has been examined. Sidhu, Kwan, Pexman, and Siakaluk (2014) have pointed out that few studies have explored the characteristics of verbs in visual word processing. On the basis of the fact that different types of verbs activate the corresponding modality-specific brain regions (e.g., Hauk, Johnsrude, & Pulvermüller, 2004), they proposed the idea of relative embodiment, in which the degree of sensorimotor information held by a word differs from word to word, and assumed that this difference might affect word processing.
Accordingly, Sidhu et al. (2014) asked participants in a survey to evaluate the extent to which human actions, states, and interactions with the environment were related to the meaning of each verb. They then examined the effect of relative embodiment on visual word recognition by conducting a lexical decision task and a syntactic classification (verb-noun categorization) task and by analyzing the latencies of an action-naming task from a database (Szekely et al., 2005). All tasks and analyses showed a significant effect of relative embodiment, where verbs with high relative embodiment ratings had shorter response times than those with lower ratings. In addition, the rate of correct answers in the syntactic classification task was higher for the high relative embodiment condition than for the low relative embodiment condition. In the study of Sidhu et al. (2014), all variables related to word processing were controlled both in the stimulus selection and in the analyses to remove their influences; nevertheless, relative embodiment still affected the processing of words. Furthermore, although Sidhu et al. (2014) did not discuss in detail the mechanism of the effect of relative embodiment on word recognition, they suggested that the rating reflects a dimension that forms the semantics of verbs, meaning that a higher rating value indicates higher semantic richness, resulting in efficient processing. These results imply that we should take relative embodiment into account when examining verb processing, selecting verbs as experimental materials, or constructing stimulus phrases or sentences.
In the framework of grounded language comprehension, many studies conducted for non-Western languages such as Japanese (for a review, see Mochizuki, 2015) use materials with stimuli from Western languages. Meanwhile, some studies show that language characteristics, including syntactic structure or part-of-speech breakdown of words, yield different results for Japanese and English language (e.g., Sato & Bergen, 2013). Language differences might have only a small effect when examining the characteristics of individual verbs; but at the language level, it is nevertheless unclear whether measurements and results obtained in studies using English can apply to languages other than English as well.
Thus, the purpose of this study was to evaluate relative embodiment in Japanese verbs and how relative embodiment rating affects visual word recognition in an experimental setting. Similar to Sidhu et al. (2014), we employed a lexical decision task and a syntactic classification task. If the relative embodiment of Japanese verbs has a similar effect to that of English verbs, we can predict that higher-rated words will be processed more efficiently than lower-rated words in both the lexical decision task and the syntactic classification task. On the other hand, although Sidhu and colleagues (2014) revealed the influence of relative embodiment by analyzing extant action-picture naming latencies from a database, they did not examine the effect of relative embodiment on the processing of the verbs themselves. Accordingly, we conducted a simple word-naming task in which participants were asked to name a visually presented word, in order to consider differences in the task effects more directly. Sensorimotor information is considered to relate to the semantic level of a word (Connell & Lynott, 2015). If the significant embodied effect found in the action-naming task by Sidhu et al. (2014) was due to action-picture processing, which primarily requires accessing meaning from the visual depiction and generating a label of that concept, the impact of the effect on simple word-naming might be weak, because the importance of sensorimotor processing is relatively small for execution of this task.

Verb Selection
For easier control of stimuli, we collected relative embodiment ratings for 3-mora verbs (Note 1). We used the following procedure to choose the verbs to be rated. We only selected transitive verbs, because relative embodiment relates to humans' sensorimotor information, and thus is more important in verbs expressing how humans interact with objects or the environment. We then chose verbs for which data on imageability, familiarity, and frequency (we used log-transformed values) are available in the Japanese lexical norms provided by Amano and Kondo (1999, 2000) and Sakuma et al. (2005). Moreover, we excluded compound verbs and homonyms from the set. This resulted in a final set of 219 verbs.

Procedure
We created a questionnaire asking participants to rate relative embodiment for each verb; on the questionnaire's cover, we presented a Japanese translation of the instructions in Sidhu et al. (2014, p. 38). The verbs were arranged in random order in the questionnaire. Each verb was presented in both kanji (ideographic character) and kana (phonetic character) form. In the rating task, an experimenter read the instructions to participants orally and then asked them to rate the relative embodiment of each word on a 7-point scale, with 1 indicating actions, states, or relations that hardly involve the human body or physical movement and 7 indicating actions, states, or relations that highly involve the human body or physical movement.

Results and Discussion
The mean score of the participants' ratings for each word was taken as the rating value. All ratings for all the verbs are available in the Appendix. Table 1 shows the correlations among character length, orthographic Levenshtein distance (OLD; Yarkoni, Balota, & Yap, 2008), log-frequency, imageability, familiarity, and relative embodiment for these verbs. Although there was a moderate positive correlation between imageability and relative embodiment, no significant correlation was found between relative embodiment and any of the other metrics. We argue that the correlation with imageability is caused by a shared characteristic, namely that the action can be imagined; on the other hand, since the images are not restricted to physical or observable motion, the correlation might be only moderate. Sidhu et al. (2014) also demonstrated no correlation between relative embodiment and frequency (r = .03, n.s.) but a significant correlation between relative embodiment and imageability (r = .70, p < .001). This suggests that the results of the rating task for the Japanese verbs were roughly equivalent to the ratings for the English verbs. In the next section, we report an experiment comparing the effect of relative embodiment on the three different word processing levels.
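The rating value and the correlations described above can be illustrated with a minimal sketch. This is not the authors' script: the verb labels, rater scores, and imageability values below are invented for illustration, and `pearson_r` is a hypothetical helper written out by hand so that the computation is visible.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical 7-point ratings from four raters per verb (illustrative values only).
ratings = {
    "verb_a": [7, 6, 7, 6],
    "verb_b": [2, 1, 1, 2],
    "verb_c": [4, 5, 3, 4],
}
# Hypothetical imageability norms for the same verbs (illustrative values only).
imageability = {"verb_a": 5.8, "verb_b": 3.1, "verb_c": 4.6}

# The rating value of each verb is the mean of its raters' scores.
embodiment = {verb: mean(scores) for verb, scores in ratings.items()}

# Correlate the two measures across verbs, keeping the verb order aligned.
verbs = sorted(ratings)
r = pearson_r([embodiment[v] for v in verbs], [imageability[v] for v in verbs])
```

With real norms, the same two steps (averaging per verb, then correlating across verbs) would be repeated for each pair of variables in the correlation table.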

Participants
Thirty-two undergraduate and graduate students took part in the experiment (18 females; mean age = 20.6 years, range: 18-23). All participants declared themselves right-handed native Japanese speakers with normal or corrected-to-normal vision. This research was reviewed and approved by the research ethics committee of the first author's affiliated institution.

Design
A 2 (relative embodiment: high/low) × 3 (type of task: word-naming/lexical decision/syntactic classification) within-subject design was adopted. Dependent variables were response time and accuracy for each trial.

Stimuli
We selected 20 high relative embodiment verbs (high-RE verbs) and 20 low relative embodiment verbs (low-RE verbs) based on the rating data (see the Appendix). The rating values of the high-RE verbs ranged from 5.00 to 6.52, while those of the low-RE verbs ranged from 1.37 to 2.04. In addition, 40 nouns were selected for filler trials in the naming task and syntactic classification task. Furthermore, we created 40 non-words for the lexical decision task. Each noun and non-word consisted of three characters with a -u vowel as the third character, because the end part of the Japanese verbs in the base form has the -u vowel. All stimuli were presented in 40-pt MS Gothic kana font on a screen.

Procedure
The experiment was conducted individually in a quiet room. Participants sat about 60 cm from a computer screen and completed the three experimental tasks, the order of which was randomized across participants. Each task began with instructions, followed by a 5-trial practice session and then the experimental session. Stimuli used in the practice session were not used in the experimental session.
In the word-naming task, each trial was initiated by presenting a fixation point for 500 ms, followed by a word; participants were asked to read the word aloud as quickly and correctly as possible. The word was presented for 1,500 ms after the participants started naming; then, a blank screen appeared for 800 ms as an interval. If participants did not begin to speak within 2,000 ms, the trial expired. The lexical decision task also began with a fixation point for 500 ms, followed by a character string; participants judged the string as indicating either a word or a non-word by pressing one of two buttons on Chronos (rightmost button for a word and leftmost for a non-word). After the participants' decision, a feedback display was presented for 500 ms if the decision was incorrect or if there was no response within 2,000 ms after the onset of the word. The syntactic classification task was identical to the lexical decision task except that participants were asked to judge a presented word as either a verb (rightmost button) or a noun (leftmost button). The stimuli used in the classification task were the same set used in the word-naming task. The experiment took about 20 min to complete.

Results and Discussion
Preceding the analysis, three experimenters, including one of the authors, listened to the recorded files for the word-naming task and judged whether the response to each trial was correct or not. A trial was coded as an error when two or more of the three experimenters judged it to contain an irregular pronunciation, no pronunciation, or an unnaturally prolonged utterance. We found that the accuracy rates of three participants in the naming task were quite low (< 34%). This might be because there were many trials where participants could not begin the utterance within 2,000 ms and/or the voice key was not triggered by the utterance. We therefore excluded the word-naming data for these participants from the subsequent analysis.
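The two-out-of-three coding rule for the naming responses can be sketched as follows. The judgment tuples are invented for illustration, and `is_error` is a hypothetical helper name, not part of any software used in the study.

```python
def is_error(judgments):
    """Return True when at least two of the three judges flagged the trial as an error."""
    return sum(judgments) >= 2

# Hypothetical codes per trial: True means that judge flagged the trial as an error.
trial_judgments = [
    (False, False, False),  # no judge flagged it -> correct
    (True, True, False),    # two judges flagged it -> error
    (True, False, False),   # one judge flagged it -> correct
    (True, True, True),     # all judges flagged it -> error
]

errors = [is_error(j) for j in trial_judgments]
# Proportion of trials coded as correct for this (hypothetical) participant.
accuracy = errors.count(False) / len(errors)
```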
In the response time analysis, incorrect trials and trials with response time exceeding 2.5 standard deviations from each participant's mean for each task were eliminated as outliers. This led to the exclusion of 7.02% of the data. We then applied the negative inverse or reciprocal transformation (i.e., -1 / RT) to normalize the response time data.
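A minimal sketch of the trimming and transformation steps is given below. Several details are assumptions rather than facts from the study: the cutoff is applied on both sides of the mean, the sample standard deviation is used, the mean and standard deviation are computed over the correct trials of one participant in one task, and response times are kept in milliseconds. The response times themselves are invented.

```python
from statistics import mean, stdev

def trim_and_transform(rts_ms, criterion=2.5):
    """Drop RTs farther than `criterion` SDs from the mean, then return -1/RT for the rest."""
    m, sd = mean(rts_ms), stdev(rts_ms)
    kept = [rt for rt in rts_ms if abs(rt - m) <= criterion * sd]
    # Negative inverse transform: faster responses map to smaller (more negative) values.
    return [-1.0 / rt for rt in kept]

# Hypothetical correct-trial RTs (ms) for one participant in one task.
rts = [520, 540, 560, 580, 600, 530, 550, 570, 590, 610, 1900]
transformed = trim_and_transform(rts)
```

The negative sign keeps the direction of the original scale, so a smaller transformed value still corresponds to a faster response.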
The data were analyzed using a linear mixed-effect model (for response times) and a generalized linear mixed-effect model with a binomial distribution and the logit as the link function (for response accuracy). These analyses were run with R (ver. 3.6.3; R Core Team, 2020) and the lmerTest package (ver. 3.1-0; Kuznetsova, Brockhoff, & Christensen, 2019). Both analyses included relative embodiment, type of task, and their interaction as fixed factors. In addition, log frequency, OLD, familiarity, and imageability of each item were included as control variables. If an interaction was found, we then divided the data by task and analyzed them separately in the subsequent analysis. Since participants were required to give different responses for each task, we predicted a significant effect of task. However, we did not conduct any post-hoc analyses, because discussing those differences was not the purpose of this study. With regard to random-effects structure, Barr, Levy, Scheepers, and Tily (2013) recommended that models include both random-slope terms and random-intercept terms (i.e., a maximal model) to reduce Type I error. On the basis of their recommendation, we basically adopted a maximal model. If the model failed to converge, we then progressively simplified the random-effects structure until it did converge, which it ultimately did in the response time analyses. In contrast, in the response correctness analysis we adopted a random-intercept model which included the two fixed effects and their interaction. In this analysis, the model in which all control variables were included did not converge, so we adopted a model that excluded the imageability, which did converge. In a separate analysis, maximal models were adopted in the naming task condition and the classification task condition although the random-intercept model was adopted in the lexical decision task condition. 
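For orientation, the two model families named above can be written schematically as follows. This is only a sketch under stated assumptions: the notation is introduced here for illustration (i indexes participants, j items, and k tasks), the task factor is represented by a vector of contrast codes, the control variables are collapsed into one vector, random intercepts for both participants and items are assumed, and random slopes are omitted even though some of the reported models included them.

```latex
% Response times: linear mixed-effects model on the transformed RT
-\frac{1}{RT_{ijk}} = \beta_0 + \beta_1\,RE_j + \boldsymbol{\beta}_2^{\top}\mathbf{T}_k
  + \boldsymbol{\beta}_3^{\top}\left(RE_j \cdot \mathbf{T}_k\right)
  + \boldsymbol{\gamma}^{\top}\mathbf{c}_j + u_i + w_j + \varepsilon_{ijk}

% Accuracy: generalized linear mixed-effects model, binomial with logit link
\log\frac{p_{ijk}}{1 - p_{ijk}} = \beta_0 + \beta_1\,RE_j + \boldsymbol{\beta}_2^{\top}\mathbf{T}_k
  + \boldsymbol{\beta}_3^{\top}\left(RE_j \cdot \mathbf{T}_k\right)
  + \boldsymbol{\gamma}^{\top}\mathbf{c}_j + u_i + w_j
```

Here RE_j is the relative embodiment condition of item j, T_k codes the task, c_j holds the item-level control variables (without imageability in the accuracy model, as described above), u_i and w_j are the assumed participant and item intercepts, and p_{ijk} is the probability of a correct response.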
Table 3 summarizes the mean values and standard errors of the measurements as a function of relative embodiment and type of task. A linear mixed-model analysis for response time data showed a significant main effect of relative embodiment (Table 4). The interaction between relative embodiment and type of task was not significant. This means that the words in the high-RE condition were processed faster than those in the low-RE condition regardless of the task. A generalized mixed-model analysis for response correctness indicated that the interaction between relative embodiment and type of task was significant (Table 5). In separate analyses by task, a significant main effect of relative embodiment was found for the lexical decision task but not for the naming task or the classification task. Accuracy was higher in the high-RE condition than in the low-RE condition in the lexical decision task; however, no differences were found for the other two tasks.
The results thus showed that relative embodiment affects word processing. Although the overall tendency of measurements was similar across the tasks, there was no significant difference in the correctness of the syntactic classification task, which showed a significant effect in Sidhu et al. (2014) (Note 2). We do not have a decisive explanation for these results, but they might be explained by a speed-accuracy trade-off. We argue that the syntactic classification task might have been relatively difficult for the participants. Normally, Japanese nouns can end in several vowels other than -u. In the experiment, however, not only the verb stimuli but also the noun stimuli had the -u vowel in the third character, so it was difficult to classify a word as a verb or a noun by looking at its ending. For this reason, participants had to pay attention to, and learn from, the correctness of their judgments, resulting in significant differences in their response times but not in the correctness.
Note. SE: standard error; RE: relative embodiment; OLD: orthographic Levenshtein distance (Yarkoni et al., 2008).
* In the analysis of simple effects, log frequency, OLD, and familiarity were included as control variables, but the model did not converge in the analysis of the word-naming task. Therefore, we excluded the log frequency variable from the model.

General Discussion
We collected data on relative embodiment, which reflects the extent to which the human body and proprioceptive state are involved in word recognition, for 219 Japanese transitive verbs. In the following experiment, we examined the effect of verb embodiment on word recognition through three different tasks.
From the rating task, we obtained evaluations of words for relative embodiment, from high to low. Verbs related mainly to the use of the hands, such as naguru (なぐる [beat], M = 6.78) and tataku (たたく [hit], M = 6.52), were assessed as highly relevant to the body; in contrast, verbs involving a psychological process, such as thinking, preference, and decision, were evaluated as low relative embodiment words, including for instance netamu (ねたむ [envy], M = 1.37) and omou (おもう [think], M = 1.44). These results suggest that the relative embodiment ratings of Japanese verbs capture bodily sensation and experience. Furthermore, the embodiment rating showed a positive correlation with imageability but no correlation with the other variables. This result was consistent with Sidhu et al. (2014), indicating that relative embodiment partly shares an aspect captured by imageability but also captures aspects that the other variables included in the present study do not.
In the experiment, we found significant effects of relative embodiment on response time and correctness. This result indicates that, as with English verbs, verbs that include content highly related to the human body are processed more efficiently than verbs that include content less related to the human body. Sidhu et al. (2014) found that high relative embodiment verbs were processed faster than low relative embodiment verbs in an action-picture naming task, but they did not explore the embodiment effect in a word-naming task. In the current study, we found a main effect of relative embodiment on response time, while an interaction between relative embodiment and type of task was not observed. This suggests that embodied information influences performance even in a task that requires relatively little semantic information. We might be able to interpret the results in terms of a semantic feedback effect, in which processing at the semantic level affects processing at the orthographic and phonological levels (e.g., Pexman, Lupker, & Hino, 2002). Although relative embodiment is considered to be a variable that reflects information related to semantics (Connell & Lynott, 2015), this result suggests that it might affect word processing that seems to mainly require other (e.g., phonological) components rather than the semantic component. The precise role the embodied information plays in verb processing in each task remains to be elucidated, however.
ijps.ccsenet.org International Journal of Psychological Studies Vol. 12, No. 3; 2020
Some considerations related to stimulus selection should be mentioned. First, in the current study, we presented the stimuli in kana form in order to control for variables such as the length of the stimuli. However, most Japanese speakers/readers are used to reading verbs in kanji form in their daily lives. Therefore, the impact of relative embodiment might be slightly different when processing verbs in kanji form. Second, we collected ratings for 219 3-mora verbs. This was for ease of controlling the length of the words included in the experiment, but resulted in a relatively small word list compared to those typically used in studies conducted in English. In addition, we chose to collect ratings only for transitive verbs. This could have influenced the distribution of the ratings, as the raters' evaluations might differ for intransitive verbs. Hence, we would need to collect ratings from a larger set of items (including both intransitive and transitive verbs) to ensure the generalizability of the effect. Moreover, we used frequency, OLD (Yarkoni, Balota, & Yap, 2008), imageability, and familiarity as control variables in the selection of target stimuli in the experiment. It is known, however, that other variables (e.g., concreteness, age of acquisition) also affect word recognition (cf. Cortese & Balota, 2012). Few such variables have been studied for Japanese verbs; thus, we ought to take these circumstances into consideration when choosing more appropriate stimuli. Of course, we need to develop other norms as well, in order to fully understand the lexical characteristics of Japanese words.
We should also take stimuli that fall in the middle of the rating range into consideration. Pollock (2017) pointed out that the mean rating values for semantic psycholinguistic variables do not reflect people's actual judgments: words in the middle of the scale tend to vary more (have larger standard deviations) than theoretically expected, because each mean value reflects both participants' higher ratings (e.g., more concrete, more imageable, and so on) and their lower ratings (e.g., less concrete, less imageable). This trend was also seen in the ratings of relative embodiment. In our experiment, we avoided this problem by selecting only stimuli whose standard deviation was lower than 2 in both experimental conditions. If words that fall in the middle of the rating scale are to be included in future studies, it will be necessary to consider not only the mean values but also the standard deviations.
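The selection rule described above (keep only items whose rating standard deviation is below 2, then contrast high and low means) can be sketched as follows. The ratings are invented, `select_extremes` is a hypothetical helper, and taking the n lowest and n highest eligible means is an assumption made for illustration; the exact ranking procedure is not specified in the text.

```python
from statistics import mean, stdev

def select_extremes(ratings, n_per_group, max_sd=2.0):
    """Keep verbs whose rating SD is below max_sd, then return the lowest and highest means."""
    stats = {verb: (mean(r), stdev(r)) for verb, r in ratings.items()}
    # Eligible verbs, ordered from lowest to highest mean rating.
    eligible = sorted(
        (verb for verb, (_, sd) in stats.items() if sd < max_sd),
        key=lambda verb: stats[verb][0],
    )
    return eligible[:n_per_group], eligible[-n_per_group:]

# Hypothetical 7-point ratings from four raters per verb (illustrative values only).
ratings = {
    "verb_a": [7, 6, 7, 6],  # high mean, small SD
    "verb_b": [1, 2, 1, 2],  # low mean, small SD
    "verb_c": [1, 7, 1, 7],  # mid-scale mean produced by disagreement, large SD
    "verb_d": [4, 4, 5, 3],  # mid-scale mean, small SD
}
low_re, high_re = select_extremes(ratings, n_per_group=1)
```

In this toy data, `verb_c` and `verb_d` have the same mean but very different standard deviations, which is the pattern Pollock (2017) warns about for mid-scale items.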
In conclusion, we collected ratings of relative embodiment, reflecting the extent to which word meaning is associated with the human body or physical movement, for 219 Japanese transitive verbs. Furthermore, consistent with previous studies conducted in English, we confirmed that words with higher ratings were processed more efficiently than those with lower ratings. The results suggest that sensorimotor information related to word meaning affects the processing of verbs in Japanese.

Notes
Note 1. Mora is a unit that determines syllable weight. The moraic system, rather than the syllabic system, is the basis of the phonetic system in Japanese. One mora generally corresponds to one kana character, so that three-mora words are expressed by three kana characters.

Copyrights
Copyright for this article is retained by the author(s), with first publication rights granted to the journal.
This is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).