Education Influential Factors of University Attendance

Absenteeism among university students is a widespread issue today. Absenteeism is known to negatively affect students' academic performance and may contribute to many social problems. This study was carried out to investigate and highlight students' perceptions of the factors affecting university attendance for online and in-person classes. The study surveyed students from a variety of disciplines at the University of Toronto Mississauga. The results of the survey indicated that the statistically significant factors affecting university attendance include students' current university CGPA (cumulative GPA)


Introduction
The term absenteeism has been defined as the conscious and deliberate act of being away from the physical space of the university classroom, conditioned on factors that influence the search for alternative uses of time (Crespo et al., 2012). Within education assessment, absenteeism is a key variable that can affect students' academic performance (Hakami, 2021). In the middle of the spring semester of 2020, when most countries went into lockdown, most universities began to distinguish, and continue to distinguish, between regular online classes (formats that predated the lockdown; mostly asynchronous) and remote classes (which aimed to be as close to in-person as possible; mostly synchronous) in order to shield students' academic life from the COVID-19 pandemic. As a result, students now have the liberty of completing their education at home instead of attending in-person classes. However, class attendance, whether in-person or online, is one of the most crucial factors for academic success. Attendance is a key component in students' retention, progression, achievement, and employability (Fayombo et al., 2012). The current study aims to identify the social, biological, and environmental factors that play a role in absenteeism. Moreover, this study seeks to determine possible remedies that university officials may implement to increase class attendance. Some of the factors studied are the gender of the student, the lecture time, the lecturer's teaching method, and whether the class mandates participation through educational software. The findings of this study will inform teaching professors, teaching assistants, and other university staff about the difficulties that students face and how these struggles may be alleviated to improve students' class attendance and performance.

Literature Review
The correlation between undergraduate attendance and grades has been studied several times, and the studies have consistently reached the same conclusion: attendance has a positive and significant effect on students' grades (Cecilia et al., 2019). Romer (1993), in his seminal work, found that one-third of the class was absent during a typical class meeting and noted some key points. First, attendance seems to be higher for core courses and lower for electives. Second, attendance is higher at universities with smaller class sizes than at universities with larger class sizes.
Romer's results are consistent with those of Schmidt (1983), Park and Kerr (1990), and Marburger (2001), who all controlled for various factors and reached similar conclusions to varying degrees. Various other studies have obtained similar results, but several authors mention minor inconsistencies that could be attributed to location and demographic information (Kirby & McElroy, 2003).
The conclusions made by Lotz & Lee (1999) are in line with the hypotheses from Marburger (2001) and Romer (1993). They find that students cite a negative self-image and low self-esteem as reasons for their absence from lectures. Interestingly, a view shared by Lotz & Lee (1999), Williams (1999), and Wadesango (2011) is that compulsory attendance might contribute to elevating the absentee problem.
Studies indicate that absenteeism is also caused by several other factors, such as a desire for hedonistic activities with peers, lack of personal interest in studies, lack of confidence in a professor, the teaching style of the lecturer, and commute time to the university (Mayer & Mitchell, 1996; Weller, 1996; Williams, 2000; Marburger, 2001). Interestingly, Triado-Ivern et al. (2020) corroborated these findings in Barcelona and found that these factors vary by year of study. Students in earlier years noted that their main reason for not attending class was that attendance was not mandatory. Those in later years were more likely to attend classes depending on their time management.
While most of the research points to lack of interest and inadequate relations between students and the lecturer as the primary causes of absenteeism, Majeed et al. (2019) found that the main causes of absenteeism were physical: students either had health issues, were generally too tired to attend class, or habitually went to bed late, which made it harder for them to attend early morning classes. Approximately 84% of the participants agreed that tiredness was the cause of their absenteeism. Mohamed et al. (2018) conducted a study to determine the factors influencing student absenteeism in universities. Approximately 140 students from the Universiti Teknologi MARA, Pulau Pinang branch, were surveyed, of whom approximately 82.6% were male. A student's attitude toward learning, class activities, and family matters were found to be the significant factors causing absenteeism. A student's attitude (negative academic perception, lack of interest in course material, and lack of motivation) was the dominant factor influencing students to miss class.
Finally, we also look at the role student gender plays in absenteeism and CGPA (cumulative GPA). In the study by Hakami (2021), gender was considered a confounding variable, and absenteeism was found to be a negative predictor for males but not for females studying medical sciences. These results were corroborated by Valli, who measured academic performance using CGPA and showed that student grades are affected by gender, extracurricular activities, and age.

Research Methodology
This study was conducted in the undergraduate sections of the University of Toronto Mississauga (UTM), Canada. A quantitative methodology was used to study the factors affecting university attendance (CGPA, commute time to the university, living on-campus/off-campus, the program of study, lecture style, and mandatory activities during lecture). Essentially, the focus is to determine whether there is any correlation between the factors mentioned above and student attendance and, if so, how strong it is. To answer this, the following research questions were addressed in the study:
• RQ1. Will a student's academic profile (program of study, lecture style, mandatory activities, etc.) influence absenteeism?
• RQ2. Will a student's gender influence absenteeism?
Based on practical rationale and past literature, the following hypotheses were tested:
• H1. A student's academic profile (program of study, lecture style, mandatory activities, etc.) influences absenteeism.
• H2. There will be no significant effect of gender on a student's absenteeism.
Even though the above hypotheses are based on past literature, the results of this study could differ because of the pandemic. Asynchronous classes weaken the effect of student absenteeism on grades, since students can still watch the lecture recordings.
The study aimed to understand the various items affecting students' attendance through observational and numerical data collection from a large sample, which helps to explain why individuals (e.g., undergraduate students) make certain decisions (Cairs & Sears, 2012). For instance, a survey design may provide quantitative descriptions of an individual's perceptions (Creswell, 2012). These aspects are compared for both in-person and online classes.

Population
Population data for this study are from the University of Toronto Mississauga, Ontario, Canada. The population size is roughly 15,200 students: 44.6% male and 55.4% female; 76.1% domestic and 23.9% international (University of Toronto Mississauga, 2021).

Data Collection
Absenteeism data were collected through a survey administered to all registered students near the end of the fall semester of the 2020-2021 academic year. The Office of the Registrar emailed the survey to all registered students, who could participate voluntarily by filling it out. A total of 1130 students completed it, amounting to a 7.53% return rate. The number of individuals who participated in the study is considered an appropriate sample size to represent the population (Krejcie & Morgan, 1970). All survey questions had categorical responses; the questions are listed in Section 3.3 and Figure 1 below.
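The sample-size adequacy claim can be checked against the formula published by Krejcie and Morgan (1970). The sketch below is illustrative only: the paper does not state which constants were used, so the conventional defaults of that formula (chi-square value 3.841 for 95% confidence, proportion 0.5, margin of error 0.05) are assumed, and the function name is hypothetical.

```python
import math

def krejcie_morgan_sample_size(population, chi_sq=3.841, p=0.5, d=0.05):
    """Minimum sample size from the Krejcie & Morgan (1970) formula.

    chi_sq: chi-square critical value for 1 degree of freedom at 95% confidence
    p:      assumed population proportion (0.5 gives the largest requirement)
    d:      margin of error expressed as a proportion
    All three defaults are assumptions, not values reported in the paper.
    """
    numerator = chi_sq * population * p * (1 - p)
    denominator = d ** 2 * (population - 1) + chi_sq * p * (1 - p)
    return math.ceil(numerator / denominator)

population = 15200   # approximate enrolment reported in the Population section
respondents = 1130   # completed surveys reported in the Data Collection section

required = krejcie_morgan_sample_size(population)
print("required sample size:", required)
print("respondents meet the requirement:", respondents >= required)
```

Under these assumptions the required sample is a few hundred students, so 1130 respondents exceed it. The formula addresses sample size only and says nothing about self-selection among voluntary respondents.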

Survey Instrument
A quantitative survey was carried out among all students to analyze various factors affecting university attendance. A survey with 19 questions was designed, where each question refers to a factor that may contribute to one's attendance. The questions were designed to be as concise as possible to avoid any ambiguity. The Office of the Registrar then emailed the survey to all registered students, and the survey was completed anonymously. The final questionnaire consisted of the following items:
• Demographic information: Year of study, Degree, Gender, Student status (domestic or international), and Accessibility requirements.
• Academic questions: CGPA, Online/In-person classes, Lecture style, Mandatory participation during lecture (class discussions, quizzes using software such as Quercus, TopHat, etc.), and Impact of lecture attendance.
• Reasons for skipping in-person classes: Commute time to the university, living on-campus versus off-campus, Type of commute to school (walking, bus, train, car, etc.), and Class start time.
• Reason for skipping online classes: Time of class as per the student's time zone, To study (e.g., prepare for a test or assignment), To work, Dislike of a lecturer's teaching method, Accessibility needs, Poor sleep habits, Mental health as a result of COVID-19 (stress, etc.), Technical issues (connection, equipment, etc.), General sickness (headaches, cold, fever, etc.), Easy to use external resources if needed (Internet, friends, etc.), and Others.
Before analyzing the survey's results, we checked the consistency, reliability, and inter-relation of the survey questions with respect to the research topic. To determine the reliability and consistency of the test items in the survey, Cronbach's alpha was computed on the entire sample of 1130 students. Cronbach's alpha measures the internal consistency of a scale, with preferred values between 0.7 and 1 (Reynaldo & Santos, 1999). The obtained value of Cronbach's alpha for the questionnaire is 0.72, which indicates that the items are internally consistent and that the questionnaire is acceptably reliable.
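For readers unfamiliar with the statistic, Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the respondents' total scores. The following minimal sketch computes it with the Python standard library. The response matrix is made up for illustration and is not the survey data, and the numeric coding of categorical answers is an assumption.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k numeric item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                       # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_scores = [sum(row) for row in responses]      # per-respondent total
    return (k / (k - 1)) * (1 - item_variance_sum / variance(total_scores))

# Hypothetical answers of five respondents to three items coded 1-5 (not survey data).
toy_responses = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 3))
```

Values near 1 mean the items move together; the 0.7 threshold cited above is a convention rather than a strict cutoff.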

Characteristics of the Participants
This section looks at the survey responses. Figure 1 below shows the demographic information of the respondents and their responses to the survey questions. The gender distribution of the sample (1130 students) was skewed, with 67.9% of the respondents being female. Figure 2 below shows the percentage of the class being skipped, grouped by gender. It follows from this figure that male students are somewhat more likely to skip class (19%) than female students (11.4%). fourth year (15% for females and 20% for males). While participation was similar across years of study, 78.9% of the students were domestic and 57.3% were pursuing an Honours Bachelor of Science degree. Most students (n = 1067; 96.1%) were taking only online classes due to the COVID-19 pandemic. The classes most frequently missed by students (n = 646; 58.2%) were morning classes occurring between 7 am and 12 pm (Eastern Standard Time). Overall, most students (n = 936; 84.3%) indicated that they would not skip a class where attendance in a lecture was mandatory, which included participating in class discussions and quizzes.

Student Perceptions and Replies
Most of the students (66.8%) strongly agreed that attendance affects their overall CGPA. Furthermore, a large majority of the students surveyed (85.9%) indicated that they attended their classes. Approximately 37% of the students indicated that they would rarely (<5%) skip a class. Moreover, the top three reasons students gave for missing classes were studying for a test or assignment during class time (60.1%), mental health issues due to COVID-19 (53.1%), and poor sleeping habits (41.5%). This information is displayed in Figure 1 above.

Analytical Methods
Several statistical models were employed to analyze the relationship between student attendance and the surveyed variables (Agresti, 2018). First, a chi-square test for independence was performed to verify the initial hypotheses made in Section 3. The p-values obtained from the test are summarized in Section 4.2.1.
Second, to verify the results from the chi-squared test, a logistic regression model was built in which the target variable indicates whether the student identifies as someone who skips or attends class. In this model, we used all the variables defined in Figure 1 except one: students' responses on how much they skip classes. This variable was excluded because of its nearly one-to-one relation to the dependent variable. The results of the logistic regression are covered in Section 4.2.2.
Third, we performed ridge logistic regression. This model is generally used in cases of high multicollinearity and to support variable selection. Although our data have low (nearly zero) multicollinearity, we wanted to confirm the variable selection results obtained from the chi-square test for independence and the standard logistic regression model. The results of the ridge logistic regression are given in Section 4.2.3.
These different methods help us identify whether a correlation exists between absenteeism and student grades and find the statistically significant factors that affect absenteeism. Furthermore, they help us test for any discrepancies that may arise between past literature and our paper due to the pandemic and the impact of asynchronous lectures on attendance.

Data Analysis and Results
The hypotheses considered in Section 3 are addressed in this section by studying the association among the variables of interest. The R software is used in the data analysis.

Correlation Analysis
Before diving into our analysis, we look at the correlation between the variables tested in the logistic regression model and the chi-squared test. As shown in Figure 4 below, most variables seem to be weakly correlated, with correlations ranging from -0.3 to 0.3. There seem to be relatively higher correlations among the lecture types. The students were asked about their preferred mode of lecture delivery (traditional, hybrid, live, or recorded). Traditional and hybrid lectures have a correlation of -0.21, traditional and live lectures have a correlation of 0.27, and recorded and hybrid lectures have a correlation of -0.26. This gives us a brief look at the similarity between student preferences.
Furthermore, the correlation matrix disputes some of the initial hypotheses mentioned earlier in Section 3. We expected a moderate to strong negative correlation between student attendance and student CGPA along with a relatively strong positive correlation between students' attendance and traditional lectures as opposed to hybrid lectures. However, there is a weak correlation between the attendance and CGPA, and no correlation between the attendance and lecture types.

Figure 4. Correlation matrix of all variables
While the results of the correlation analysis do not necessarily point to a relation (or lack thereof), they undermine our hypothesis in Section 3. Our initial assertion was that students' academic profile plays a role in attendance, but the correlation appears to be weak.
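As a rough illustration of how pairwise correlations like those in Figure 4 can be obtained from categorical answers, the sketch below computes Pearson correlations between 0/1-coded indicators (for two binary variables this equals the phi coefficient). The paper does not state which coefficient or coding was used, so both are assumptions here, and the indicator names and values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named columns."""
    return {a: {b: pearson_r(columns[a], columns[b]) for b in columns} for a in columns}

# Hypothetical 0/1 indicators for eight respondents (not survey data).
toy_columns = {
    "prefers_traditional": [1, 1, 0, 0, 1, 0, 1, 0],
    "prefers_live":        [1, 0, 0, 0, 1, 1, 1, 0],
    "attends_class":       [1, 1, 1, 0, 1, 1, 0, 1],
}
matrix = correlation_matrix(toy_columns)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

A weak pairwise correlation does not rule out an association that appears once other variables are controlled for, which is one reason the chi-square tests and regression models are still informative.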

Statistical Tests and Models
This section investigates how student attendance is associated with the variables of interest. The Likert-type graph of attendance against the other factors can be found in Figure 1.

Chi-Square Test
To investigate which of the factors in the questionnaire vary by attendance, a chi-squared test for independence was conducted, which in turn helps us answer the research questions posed earlier. The p-values for all the variables are summarized in Table 2 below. Five tests were statistically significant (p-value < 0.05): those testing the relationship between attendance and CGPA, class time skipped, attendance impact on CGPA, gender, and mandatory graded assignments/quizzes during class. There is enough evidence to conclude that these factors vary by attendance; see Table 2. The chi-squared test results in Table 2 therefore support H1; the significant association with gender, however, runs counter to H2 as stated.
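The mechanics of the chi-squared test for independence can be sketched as follows. The contingency table is invented for illustration and does not reproduce Table 2. The p-value helper covers only the one-degree-of-freedom case that arises for 2 x 2 tables, and no continuity correction is applied (R's chisq.test applies Yates' correction to 2 x 2 tables by default, so its output would differ slightly).

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # count expected under independence
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def p_value_df1(statistic):
    """Upper-tail p-value for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical counts: rows = two groups of students, columns = (skips, attends).
toy_table = [[30, 70],
             [20, 80]]
statistic, df = chi_square_independence(toy_table)
p_value = p_value_df1(statistic)
print(round(statistic, 3), df, round(p_value, 3))
```

With these made-up counts the p-value exceeds 0.05, so independence would not be rejected; the significant factors reported above correspond to tables where the statistic is large relative to its degrees of freedom.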

Logistic Regression Model
An alternate method was employed to check the consistency of our results. As mentioned earlier, the data consisted of only categorical variables, and hence the first model we considered was a logistic regression model with a dichotomous dependent variable, "attend or miss classes". This variable indicates whether the student identifies as someone who skips or attends classes. The results in Table 3 show that the feature "Mandatory Graded Assignments/Quizzes During Class" is the most statistically significant factor, with a p-value of 0.0000143. This feature refers to students being graded on assignments or quizzes during class, usually through software such as 'TopHat'. Additionally, "Student's CGPA", "Timing of Missed Classes", "Attendance Impact on CGPA", and "Student's Gender" appear to be significant factors. These results are consistent with the previous chi-square test findings.
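To make the model concrete, here is a small from-scratch sketch of logistic regression with a dichotomous outcome and a single 0/1 predictor, fitted by gradient descent. It is not the authors' R code, the data are invented, and it estimates coefficients only (the p-values in Table 3 would additionally require standard errors). The names fit_logistic, mandatory, and attends are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, learning_rate=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent.

    Returns [intercept, coef_1, ..., coef_p] on the log-odds scale.
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * (p + 1)                  # weights[0] is the intercept
    for _ in range(epochs):
        gradient = [0.0] * (p + 1)
        for features, label in zip(X, y):
            linear = weights[0] + sum(w * f for w, f in zip(weights[1:], features))
            error = sigmoid(linear) - label    # derivative of the log-loss w.r.t. linear
            gradient[0] += error
            for j, f in enumerate(features):
                gradient[j + 1] += error * f
        weights = [w - learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights

# Hypothetical data: mandatory = 1 if the class has graded in-class activities.
# 10 students with mandatory activities (8 attend), 10 without (4 attend).
mandatory = [[1]] * 10 + [[0]] * 10
attends = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6

intercept, slope = fit_logistic(mandatory, attends)
odds_ratio = math.exp(slope)
print(round(intercept, 3), round(slope, 3), round(odds_ratio, 2))
```

With one binary predictor the fitted odds ratio matches the cross-product ratio of the 2 x 2 table of counts, which is a convenient sanity check on the fit.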

Ridge Logistic Regression
In this section we use the ridge logistic regression model for variable selection. Ridge logistic regression applies a continuous shrinkage operation that shrinks the regression coefficients in order to reduce the risk of overfitting (Zou & Hastie, 2005). We used the "glmnet" package in R to fit this model. Figure 5 shows the coefficients of the ridge logistic regression model. The results agree with those of the simple logistic regression model and the chi-squared test.
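For reference, the ridge-penalized logistic regression objective, in the form documented for glmnet with its mixing parameter set to zero, can be written as below. The notation is ours and not the paper's: y_i is the 0/1 attendance indicator, x_i the coded predictors for student i, N the number of respondents, and lambda the penalty strength, whose value the paper does not report.

```latex
\[
(\hat{\beta}_0, \hat{\beta})
  = \arg\min_{\beta_0,\,\beta}\;
    -\frac{1}{N}\sum_{i=1}^{N}
      \Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
        - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
    + \frac{\lambda}{2}\,\lVert \beta \rVert_2^{2}
\]
```

Because the squared penalty shrinks coefficients toward zero without setting any of them exactly to zero, variable selection with ridge in practice means comparing the relative sizes of the shrunken coefficients, as in Figure 5.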

Discussion and Proposed Solution Strategies
Throughout the paper, we have looked at the results from the survey and performed several tests. The results from our initial analysis, along with the chi-squared test and the logistic regression, seem to agree not only with one another but also with the past literature. Data analysis suggests that the statistically significant factors associated with university attendance are gender, degree type, current CGPA (cumulative GPA), and lectures where mandatory participation is required (class discussions, quizzes, etc.). When considering the categorical effect of gender, we identified that male students are significantly more absent than female students, and these results are corroborated by past literature in Nja et al. (2019) and Khan (2018).
This paper also identifies the CGPA of students and the class type to be influential grouping factors for absenteeism, and the results are consistent with those found in the past literature. Researchers have emphasized the importance of lecture attendance for one's academic success (or CGPA). Specifically, the studies completed by Aden et al. (2013), Kassarnig et al. (2017), and Nja et al. (2019) reflect these sentiments. Finally, the most significant factor affecting attendance was observed to be the lecture delivery. Lectures where mandatory participation was required had the highest attendance.
When looking at potential solutions to the absenteeism problem, the model indicates that there is no simple solution due to multicollinearity between variables. One approach may be to encourage absent students to attend classes by providing counseling sessions that highlight the importance of attending classes and its impact on their subsequent performance. Given our findings, mandatory participation in classes may be the key to resolving the problem of absenteeism. This may be implemented in a variety of ways. Instructors may increase the number of in-class quizzes and participation while reducing the weight given to term tests and final exams. Instructors may also utilize technology in class such as Top Hat, iClicker, and Kahoot, which will improve students' attendance while also enhancing students' subject understanding (Al-Labadi & Sant, 2021).
This paper also looks at students' reasons for absenteeism during lecture time. The students mentioned several reasons, but the most common reason given for skipping online classes was to study and prepare for other courses during the lecture time (60.1% of respondents). This includes preparing for assignments, quizzes, or tests for other courses that they were taking during the term. Similar studies, like the one conducted by Kottasz (2005), support these results, indicating this factor as one of the key reasons leading to student absenteeism. To remediate this, the university may organize work/time management workshops and provide one-on-one advice sessions to students. Alternatively, the hybrid lecture style, which enables students to watch the recorded lectures or attend live lecture sessions, may remediate the absenteeism of students who could not come to class due to commute issues.
The second reason selected by students as to why they may miss an online class was their current mental health situation (53.1% of respondents). Sometimes students may have to face personal or family-related issues, and absence in class is unavoidable following these circumstances. The stress and pressure that arise from these situations tend to have a major impact on a student's overall well-being. When students face conflicts from school or home, this may result in a reduction of focus during class which in turn affects their academic performance as well. To ensure students have access to any support they may need during this time, it is recommended to encourage students to be aware of the resources provided by the university.
Another reason indicated by students for skipping class was poor sleeping habits. According to the findings, 58.2% of the students responded that they are more likely to miss classes in the morning than at other times of the day. This suggests a negative relationship between quality of sleep and the absenteeism rate among students. A likely cause for this behavior is students staying up late at night, which may result in their failure to attend early morning classes at the scheduled time. It is theorized that students who lack sleep are more likely to be absent because of fatigue associated with staying up late to work or study, in addition to the long commute time between school and home. Some approaches to increasing students' attendance may be to provide more class times in the afternoon or evening and to increase bus routes for students who need to travel long distances to attend in-person classes. An alternative approach could be to offer the same courses over multiple sessions in the same academic year.

Limitations and Conclusion
We have considered several models in our analysis, but it is important to note the limitations they present and any cases we may not have accounted for. There might be interaction effects between variables, but these were not included in the model because of the complexity of interpreting the predictors and the need to keep the results comparable with the chi-square independence test.
jedp.ccsenet.org Journal of Educational and Developmental Psychology Vol. 13, No. 1
The current study was conducted during the COVID-19 pandemic; therefore, most of the responses were reflective of students currently attending classes online. It would be interesting to investigate whether the same factors would be selected as the reason for absenteeism when in-person classes resume. This in turn points to the differences with past literature.
Secondly, this survey was limited to students from one university; therefore, the variety of degrees represented in the sample was limited, as this university currently offers only 4 types of degrees (Honours Bachelor of Arts, Honours Bachelor of Science, Bachelor of Commerce, and Bachelor of Business Administration). Including students from other campuses may be considered for future work, which would incorporate students from other degree types and give us more in-depth information about the factors causing absenteeism among students. Furthermore, including students from other campuses might help to tap into a new demographic of students whose commute times and class types differ.
However, we do have to consider the feasibility of including more institutions as well. Different campuses or universities may differ in course structure and delivery, because of which the survey might have to be revised.
Finally, the design of the questionnaire could have also hindered the results. It is possible that other important factors were missed due to lack of open-ended question/answer responses. The curation and selection of the factors subsequently resulted in a more controlled study.
Past literature has already shown the importance of absenteeism for academic performance, and this paper further strengthens those results while also showing the significant factors that affect students' absenteeism in online classes. We hope that this paper will lead to new research in this area that accounts for these limitations and that it will encourage universities to work with students to help reduce absenteeism. It is not a problem that students face alone, and we have seen that the primary reasons for absenteeism are a busy schedule and mental health, both of which could ideally be eased with some help.

Ethical Statement
This study complied with the ethics standards of the University of Toronto Mississauga. It was authorized under approval number 2019-036 by the ethics review committee of the University of Toronto.