Effects of Cognitive Training upon Working Memory in Individuals with ADHD: An Overview of the Literature

Commercial cognitive training programs have been proposed as a non-pharmacological treatment of ADHD-related outcomes, such as learning difficulties and academic achievement. Most of these programs focus on working memory, an essential cognitive ability sustaining nearly every conscious mental activity. In this article, we present and summarize the main studies assessing the effectiveness of such training programs on working memory. The reported studies have failed to show a positive far-transfer and long-term effect of cognitive training in both typically developing individuals and children with ADHD. Finally, we present emerging alternative approaches to the use of cognitive training to improve working memory functioning in children with ADHD.


Introduction
Working memory is the cognitive system responsible for the simultaneous processing and storage of information in order to accomplish ongoing tasks. It has access to long-term memory and is responsible for the retrieval of long-term knowledge to allow us to interpret reality and solve daily problems. As such, it is an executive function essential to goal-directed behaviour, learning, recall of information, and conscious control of mental activities. Working memory is a limited storage system in terms of capacity (i.e., the amount of information stored) and time (i.e., how long information is stored), and attention is a powerful mechanism to boost its operations: through attention, it is possible to protect information from temporal decay, to select and prioritize items, to filter disruptive stimuli from the environment, and to control the treatment, manipulation, and recovery of the information therein (Atkinson et al., 2018; Awh et al., 2006; Camos & Barrouillet, 2014; Gazzaley & Nobre, 2012; Thigpen et al., 2019; Vergauwe & Cowan, 2015).
There are individual differences in working memory capacity in the general population, and the literature shows that children with attention-deficit/hyperactivity disorder (ADHD) are especially impaired in this cognitive ability (Holmes et al., 2014; Kasper et al., 2012). They exhibit difficulties in the short-term storage of information, executive control, planning, sustained attention, and inhibition (Barkley, 1997; Douglas, 1972; Sonuga-Barke et al., 2008; Tucha et al., 2017). When the verbal and visuospatial components of working memory are considered separately, ADHD-related deficits are found in all domains (verbal, visuospatial, and the central executive component), with more pronounced impairments in the visuospatial domain and tasks involving the central executive (Alderson et al., 2013; Gu et al., 2018; Martinussen, 2005; Rapport et al., 2008). Children with ADHD also benefit less than their typically developing peers from the simultaneous presentation of stimuli in different modalities (such as oral and visual) in working memory tasks (e.g., the word "cat" written on the screen and played through audio speakers). This smaller benefit of the multimodal bounded presentation in children with ADHD suggests impairments in the episodic buffer component of working memory (Alderson et al., 2015; but see also Kofler et al., 2018). All these working memory deficits affect the functioning of children with ADHD in schools (Fried et al., 2017), with negative impacts on their reading comprehension (Miller et al., 2013), mathematical ability (Kuhn et al., 2016; Tosto et al., 2015), and academic achievement (Zendarski et al., 2017).
Jungle Memory, Multiflex Braingame Brian, Cognifit, Happy Neuron); accordingly, many studies over the past decade have been dedicated to assessing their effectiveness in improving working memory functioning. This article aims to present and summarize evidence on the impact of cognitive training of working memory as a non-pharmacological treatment of ADHD.

What Is Cognitive Training?
The aim of cognitive training is to enhance working memory capacities through repetition or explicit instruction of mnemonic strategies (Corbin & Camos, 2013). Training through repetition aims to enhance domain-general abilities through the repetition of a given activity (e.g., repeating the same activity every day), based on the possibility of generalization of skills to other cognitive domains such as fluid intelligence, reasoning, and executive control. Training based on instructions, on the other hand, focuses on teaching specific coding or maintenance strategies (e.g., phonological rehearsal, chunking).
Studies about the effects of cognitive training usually evaluate the short-term and long-term transfer of knowledge in a non-trained battery of tasks to assess generalization effects. If the battery of non-trained tasks is identical or very similar to the practised task, we consider that there is near-transfer of skills; if the task comprises identical elements but different characteristics (e.g., both tasks require memorizing word sequences, but one involves immediate recall and the other text comprehension), we consider that there is far-transfer of skills. For instance, after the training of visuospatial working memory in preschoolers using classical tasks such as the Corsi blocks and the dots matrix, near-transfer effects could be measured on these same tasks (or very similar ones), whereas far-transfer effects could be measured on numeracy skills, because these skills are well-known for relying on visuospatial working memory (e.g., coding numbers into spatial representations in a mental number line). The short-term transfer is assessed immediately after the training session, whereas the long-term transfer is assessed weeks or months after the training protocol. This procedure is applied both for near and far-transfer measures. The most important variable when appraising the effects of cognitive training is the far long-term transfer, as it reveals the generalization and the temporal permanence of the training effect.

What do Studies on Cognitive Training Reveal?
Studies in the past two decades have suggested that cognitive training can improve the working memory of children with ADHD (Holmes et al., 2009; Klingberg et al., 2005; Klingberg et al., 2002; van der Donk et al., 2015; van der Donk et al., 2017; see Klingberg, 2010 for a review). In general, those studies have found near-transfer effects by using computerized training batteries and/or subtests of neuropsychological batteries. For instance, Klingberg et al. (2002) applied a computerized battery of classical working memory tasks (the backwards digit span, a visuospatial span task, the letter span test, and a choice reaction time task) to seven children with ADHD (ages 7-15 years) and four typically developing adults, for 25 minutes a day for 24 days. The authors argue that the training program significantly improved the working memory capacity of both children with ADHD and adults, by increasing the number of items they can store. The tasks used in this training program were equivalent to the ones used in the test-retest sessions (e.g., a Corsi blocks task similar to the computerized visuospatial span task). In a different study, Wiest et al. (2020) compared pre- and post-test measures of working memory (using the visual and auditory subtests of the WRAML2 battery) of nine children who performed a computerized cognitive training program and eight children in a control condition (mean age = 11.35 years). All children were enrolled in a private school for students with learning needs, and the authors did not report their diagnoses. Children in the training condition completed 20 hours (distributed over 4 weeks, daily during school hours) of the Captain's Log training program, whereas the control group did silent reading for the same period outside the classroom. The authors found statistically significant training effects for auditory (p < 0.05), but not for visual working memory. The effect size (Note 1) of the training condition upon auditory working memory scores was η² = 0.22.
However promising those results are, the studies cited above present methodological limitations, such as small sample sizes and the absence of a matched control group without ADHD or specific learning needs in their intervention program. This casts doubt on the near-transfer effects of the training on ADHD-specific deficits in working memory. Moreover, these studies found no evidence of far-transfer effects on broader constructs such as IQ, academic performance, behaviour in the classroom, and quality of life (Holmes et al., 2009; van der Donk et al., 2015; van der Donk et al., 2017).
The generalization of trained working memory skills in children with ADHD is limited even when the post-test measures are intimately correlated with the target of the training program. In one study (M.R. Jones et al., 2020), children with ADHD (aged between 7 and 14) were trained in a spatial n-back task, and the short and long-term (3-month delay) near- and far-transfer measures were compared to those of participants in an active control group. The spatial n-back task consisted of increasingly long series of spatial locations appearing on the screen, and participants had to recognize the nth location before the last one presented (e.g., in a 1-back trial, participants should answer whether the probed location corresponded to the penultimate item of the series). The active control group took part in a computer game of "questions and answers" on vocabulary and general knowledge. The training program had 20 daily sessions of about 15 minutes. The authors assessed the near-transfer effect with an object n-back task, in which participants memorized series of objects instead of the locations used in the trained task. The far-transfer effect was assessed by a composite score of verbal working memory tasks and a continuous performance test (CPT), which taps sustained attention and inhibitory control. Bayesian analysis revealed no evidence of a short-term far-transfer effect upon the verbal working memory score (BF = 0.61), and moderate evidence of a short-term far-transfer upon the CPT score (BF = 3.73). Regarding long-term transfer effects, the evidence was null or anecdotal for all the measures taken three months after the intervention. The only strong evidence of training effects was in short-term near-transfer (BF = 29.50), as measured by the object n-back task.
We would like to highlight the fact that the trained spatial n-back and the object n-back in post-test were extremely similar in terms of task requirements (i.e., both requiring the updating of mental representations and the inhibition of competing items whilst responding, and both tasks tapping the visuospatial component of working memory). Even so, the near-transfer effects as measured by the object n-back did not last in the delayed post-test (BF = 1.74).
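To make the similarity between the trained and the post-test task concrete, the following is a minimal sketch of the n-back rule as described above: after a series is shown, the probe is a target only if it matches the item presented n positions before the last one. The function names, the grid coordinates, and the object labels are illustrative assumptions, not materials from the study.

```python
from typing import Hashable, Sequence


def nback_target(series: Sequence[Hashable], n: int) -> Hashable:
    """Return the item presented n positions before the last item."""
    if n < 1 or n >= len(series):
        raise ValueError("n must satisfy 1 <= n < len(series)")
    return series[-1 - n]


def response_is_correct(series: Sequence[Hashable], n: int,
                        probe: Hashable, answered_yes: bool) -> bool:
    """A 'yes' is correct only when the probe is the n-back item."""
    return (probe == nback_target(series, n)) == answered_yes


# Hypothetical spatial series: (row, column) cells of a grid.
spatial_series = [(0, 0), (1, 2), (2, 1)]
# Hypothetical object series: names of pictured objects.
object_series = ["cup", "key", "shoe", "ball"]

# 1-back on locations: the target is the penultimate location.
print(nback_target(spatial_series, 1))   # (1, 2)
# 2-back on objects: same rule, different stimuli.
print(nback_target(object_series, 2))    # key
```

In this sketch, the spatial and the object versions share the same decision rule and differ only in the stimuli, which is the sense in which the two tasks are described as extremely similar.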
Some authors suggest cognitive training as an addition to treatment with psychostimulant medication (e.g., Muris et al., 2018). However, when combined with psychostimulant medication, cognitive training did not improve behavioural (based on parents' and teachers' ratings of inattentive symptoms) or neurocognitive (working memory, sustained attention, inhibitory control) outcomes in a randomized trial conducted by Oliveira Rosa et al. (2021). These authors compared two groups of children diagnosed with ADHD (aged between 6 and 13) under pharmacological treatment and presenting residual symptoms. Participants were assigned either to a cognitive training condition (4 hours/week for 12 weeks) or a non-active control condition. In this study, there was no evidence of improvements in the cognitive training condition compared to the psychostimulant-only condition in any measured outcome.
The absence of far-transfer effects after cognitive training of working memory is not exclusive to the population of children with ADHD. J. S. Jones et al. (2020) conducted a well-controlled randomized trial with a sample of 95 typically developing children (aged 9 to 14) to compare the effectiveness of a commercial program of cognitive training (Cogmed), a combination of cognitive training and metacognitive strategy training (MetaCogmed), and a control condition with no intervention. In the MetaCogmed group, participants received a workbook with reflection exercises on planning, monitoring, evaluation, self-motivation and refocus. They were prompted to remember how to plan, monitor and evaluate while doing the Cogmed training, but no task-specific strategy instructions were given. To control for expectancy effects and motivation effects driven by the adaptive nature of Cogmed (i.e., the tasks getting increasingly harder during the training) and the content of the metacognition workbook, the control group took part in sessions of a visual search training program and received a placebo workbook. The authors found evidence (p < 0.05, no effect size reported) of a near-transfer effect to working memory scores in a neuropsychological battery (the Automated Working Memory Test Battery, AWMA) in both the Cogmed-only and MetaCogmed groups compared to the controls. Regarding far-transfer effects, they found evidence of transfer to mathematical reasoning in both the Cogmed-only and MetaCogmed groups compared to the controls (all p < 0.05), but this effect did not last three months after the intervention. No far-transfer of skills to reading comprehension occurred in any training group.
Meta-analyses of studies with the general population (all ages included) showed evidence of near-transfer for both visuospatial and verbal domains of working memory, but inconsistent measures of long-term far-transfer irrespective of the domain (Melby-Lervåg & Hulme, 2013; Melby-Lervåg et al., 2016; Schwaighofer et al., 2015). Also, they found no evidence of generalization of working memory training upon reading and mathematical tasks and, critically, they revealed that long-term far-transfer does not occur in the general population (Melby-Lervåg & Hulme, 2013; Melby-Lervåg et al., 2016; Schwaighofer et al., 2015). The effect sizes of far-transfer range from g (Note 2) = 0.8 to g = 0.16 and are strongly modulated by specificities of the training protocol (e.g., the duration of training sessions and supervision during the training; Schwaighofer et al., 2015), and by confounding effects in the control group (e.g., an unexpected decrease in performance between pre- and post-test measures for the control group; Melby-Lervåg et al., 2016).
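For readers less familiar with the g statistic, the sketch below shows how a standardized mean difference is commonly computed, assuming that g here denotes Hedges' g with the usual small-sample correction. The function name and the group statistics are hypothetical and are not taken from any of the cited studies.

```python
import math


def hedges_g(mean_t: float, mean_c: float,
             sd_t: float, sd_c: float,
             n_t: int, n_c: int) -> float:
    """Standardized mean difference with small-sample correction."""
    df = n_t + n_c - 2
    # Pooled standard deviation of the training and control groups.
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df
    )
    d = (mean_t - mean_c) / pooled_sd
    # Common approximation of the correction factor J.
    correction = 1 - 3 / (4 * df - 1)
    return d * correction


# Hypothetical post-test scores: training group vs. control group.
g = hedges_g(mean_t=12.0, mean_c=10.0, sd_t=4.0, sd_c=4.0,
             n_t=20, n_c=20)
print(round(g, 2))  # 0.49
```

With these made-up numbers, a two-point advantage on a scale with a standard deviation of four corresponds to roughly half a standard deviation, slightly reduced by the correction because the groups are small.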
These results from the general population were replicated by a meta-analysis targeting studies about the effects of cognitive training upon academic skills (e.g., mathematics, literacy, fluid intelligence) in typically developing children and adolescents (3 to 16 years) (Sala & Gobet, 2017). The authors found effect sizes of g = 0.46 for near-transfer in tasks strictly involving working memory; these transfer effects were modulated by age in short-term measures and persisted in the long term, which may also reflect maturational effects during development. Nonetheless, the effect size for far-transfer in broader academic skills was g = 0.12 and this result was strongly influenced by the quality of the design of the studies included in the analysis. The authors conclude that working memory training is ineffective in enhancing typically developing children's academic skills and that far-transfer rarely occurs; when it does, its effects are minimal.
Scionti et al. (2020) present more optimistic results in a meta-analysis of 32 studies on cognitive training of executive functions (including, but not exclusive to, working memory) in preschoolers. Their meta-analysis gathered more than 2,000 children aged from 3 to 6, and the studies included typically developing children and children at developmental risk. The authors report results irrespective of the executive function targeted by the intervention, but they are nonetheless worthwhile for this article, because working memory is closely correlated with other executive functions (Diamond, 2013; McCabe et al., 2010; Nweze & Nwani, 2020). First, they found significant overall effect sizes ranging from g = 0.34 to g = 0.05, with significant heterogeneity among the studies. Second, both near- and far-transfer (Note 3) were significant, with no significant difference between the magnitude of the effect sizes (g = 0.35 and 0.31 for near- and far-transfer, respectively). Third, they found no evidence of transfer effects to broader abilities relying on executive functions (e.g., early numeracy and literacy) and behavioural outcomes (e.g., parents' and teachers' ratings of social skills, inattention, hyperactivity). Finally, two variables related to the type of training modulated the outcome measures: effects of group training (g = 0.44) were twice as large as those of individual training sessions (g = 0.21) and effects of non-computerized training (g = 0.37) were greater than those of computerized programs (g = 0.28). The authors did not present, however, data about short-term and long-term transfer effects. In sum, the results suggest that cognitive training of executive functions can be beneficial to preschoolers with regard to transferring trained skills to simple tasks requiring executive functions, but not to broader abilities; and those preschoolers especially benefit from off-screen, group training programs of executive functions.
For children with atypical development, Peijnenborgh and colleagues (2016) present more promising results on the effects of cognitive training in children with learning disabilities (LDs). They carried out a meta-analysis of 13 studies, most of which sampled children with ADHD (10 out of 13 studies included exclusively children with ADHD; the remaining studies included both children with ADHD and children with non-specified LDs). In near-transfer measures, the authors reported significant moderate effect sizes for verbal and visuospatial working memory both in short-term and long-term post-test conditions (g = 0.64 for verbal, short-term; g = 0.63 for visuospatial, short-term; g = 0.54 for verbal, long-term; g = 0.39 for visuospatial, long-term). Age was a moderator variable in the verbal domain, with older children (above 10 years) benefitting more from cognitive training. In the visuospatial domain, the effect sizes were particularly sensitive to the type of cognitive training program. The immediate effects of cognitive training in near-transfer measures persisted in the long term, including the moderator effect of age, again reflecting maturational effects in working memory development. For the far-transfer measures, the authors reported significant effects of training only upon reading, and this beneficial effect persisted in the long term (g = 0.48 for word decoding measures and g = 1.47 for the broader category "verbal ability", which clusters performance in verbal IQ tests from different neuropsychological test batteries).
However, as optimistic as the results above may seem, they were not replicated by a more complete meta-analysis including 15 studies exclusively with individuals with ADHD (Cortese et al., 2015). The authors controlled for reports about the level of ADHD symptoms according to the type of informant (proximal or blinded rater) as well as the type of measurements of executive functions (parent ratings or laboratory tests). Proximal raters were people closely related to the study participants (e.g., parents, teachers) who followed the training sessions. Blinded raters, in turn, were experimenters who had no previous relationship with the participants and did not follow the training sessions. The only significant differences between control and training groups were reported by proximal raters in the total scores of ADHD symptoms (pooled standardized mean differences, SMD = 0.37) and in inattention scores (SMD = 0.47). Nonetheless, these differences decreased substantially when the outcomes were measured by raters blinded to the training situation, with an SMD = 0.20 for total ADHD symptoms and SMD = 0.32 for inattentive symptoms. These training effects upon symptoms did not persist in the long term. As for the near-transfer effects of training upon working memory, the authors found significant differences ranging from SMD = 0.47 to SMD = 0.58 in the verbal and visuospatial domains, which reproduces the patterns found in the general population and in typically developing children. Regarding the far-transfer effects, the authors did not find any statistically significant differences between control and training groups, irrespective of the outcome measured: overall, cognitive training did not yield any benefit in measures of inattention, inhibition, reading, or arithmetic. These results are aligned with recent meta-analyses in which the effects of far-transfer were near-zero or null (Sala et al., 2019).
To conclude, a recent meta-analytical review of non-pharmacological interventions in ADHD has shown that cognitive training is the least effective intervention in terms of improving executive functions, including working memory (Lambez et al., 2020). The review included 19 studies evaluating the effectiveness of cognitive behavioural therapy (CBT), cognitive training, neurofeedback, and physical exercise upon measures of working memory, attention, inhibition, flexibility, and higher executive functions. Physical exercise was the most effective intervention, with a mean effect size of d = 0.93 (which is a large effect), followed by CBT (d = 0.7, medium effect), neurofeedback (d = 0.61, medium effect), and finally cognitive training (d = 0.45, small effect). Among the executive functions evaluated, working memory (d = 0.4, small effect) and attention (d = 0.41, small effect) were the least affected by the interventions; for inhibition (d = 0.69) and flexibility (d = 0.6), effects were moderate.
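The verbal labels attached to these d values are consistent with Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The short sketch below applies those benchmarks to the intervention effects quoted above; that the review used exactly these cut-offs is an assumption, and the function name is illustrative.

```python
def cohen_label(d: float) -> str:
    """Label an effect size using Cohen's conventional benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Mean effect sizes per intervention, as quoted in the text.
interventions = {
    "physical exercise": 0.93,
    "CBT": 0.70,
    "neurofeedback": 0.61,
    "cognitive training": 0.45,
}

for name, d in interventions.items():
    print(f"{name}: d = {d:.2f} ({cohen_label(d)})")
```

Under these cut-offs, cognitive training is the only intervention in the list whose mean effect falls below the medium range.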

Conclusion and Recommendations
As presented above, long-term far-transfer measures are the ultimate variables attesting to the effects of a cognitive training program. Studies failed to provide evidence supporting long-lasting and generalized effects of cognitive training upon working memory abilities, both in typically developing and ADHD populations. More critically, Lambez et al. (2020) showed in a meta-analytical review that working memory and attention are the executive functions least affected by non-pharmacological interventions in children with ADHD, and that cognitive training is the least effective intervention to improve executive functions in general. In view of these results, new approaches emerge as alternatives to the use of cognitive training programs for children with ADHD, such as the use of metacognition strategies (Capodieci et al., 2019; Partanen et al., 2015) and coaching sessions (Nelwan et al., 2018). These approaches point towards the use of more flexible teaching strategies that can be personalized to suit the needs of children with ADHD (e.g., adjustments of support materials, task goals, and planning agenda), and promote their abilities to plan, monitor, self-assess and self-motivate whilst performing tasks.
Specifically on the relation between metacognition and working memory, Forsberg et al. (2021) showed that younger children estimate their working memory performance poorly, although they become increasingly better at it with age. In this study, the authors asked typically developing children of different ages (9 to 13) and adults to judge their own accuracy in a working memory recognition test (i.e., memorize an array of objects and recognize a probe item as part of the array). Interestingly, the accuracy of one's meta-judgement was associated with better working memory performance in older age groups, and only adults were capable of refining their meta-judgements in the course of the task. These results suggest that meta-working memory is a skill that develops during childhood, following the developmental trend of working memory itself, and is not prone to rapid tuning until adulthood. We believe this is an interesting avenue for future research on metacognitive training and working memory.
Albeit not directly targeting working memory or children with ADHD, recent studies have reported positive results of metacognition training programs to improve mathematical skills in school-aged children with dyscalculia (Lucangeli et al., 2021) and in typically developing children (Fyfe et al., 2021). As for children with other special education needs, one study suggests that training a child's metacognition via dialogue groups in the classroom can improve their performance in working memory tasks to a greater extent than simply practising the tasks via a computerized training battery (Partanen et al., 2015). Results have proved promising so far, yet more research is needed.
Because metacognitive strategies can be adapted to a plethora of class activities and academic skills, we consider that teachers' pre- and in-service training could emphasize the use of these strategies with pupils instead of specific working memory training programs or software. Actions such as explaining the goal of a task and its steps and components before task completion allow children to anticipate their strategies. After task completion, promoting group reflections on the use of strategies ("What strategy did you use?", "Which one was the most useful?"), allowing for self-assessment ("How well do you think you did it?"), and giving feedback on performance and strategy choice can enrich a child's capacity to self-monitor their performance (Capodieci et al., 2019). Training school-age children (aged 9-10) with ADHD to use mind maps has also proved useful for improving their ability to inhibit distractors during mental calculations in a standardized test (the Loud Subtraction 7 test), compared to participants who were trained in sketch noting and controls (Kajka & Kulic, 2021). Because mind maps require a child to elaborate concepts and create hierarchical mental structures before putting their ideas on paper, they recruit much more top-down control (which is also involved in inhibition) than the free sketch noting strategy.
Perhaps most teachers know and use these strategies intuitively in the classroom, without knowing the psychological underpinnings behind them. Educating teachers about the executive functions that drive child development and sustain academic skills can be greatly beneficial, as teachers would gain explicit knowledge of these functions, which would help them design metacognitive activities tailored to their students' learning needs. These teaching strategies are essential to prepare students for the societal changes propelled by the fourth industrial revolution (automatization, data science, artificial intelligence): more than ever, children must be educated to be active learners, and formal education increasingly focuses on transferable skills rather than academic content.
To conclude, considering the scarcity of evidence of long-term and far-transfer of working memory abilities after cognitive training programs, formally introducing them as part of schools' curriculum is not recommended either for typically developing children or children with specific educational needs such as ADHD. More than three decades ago, at a time when training programs were not yet computerized, Abikoff (1985, 1991) argued that cognitive training should not be considered as an alternative to psychostimulants to treat ADHD. While summarizing the evidence produced during the decades that preceded his publications (i.e., since the '70s), he already pointed to the failure in promoting self-regulation skills that could transfer to broader domains. The creation of computerized programs led to a change in the format of the training, using computers instead of paper-and-pencil tasks. However, this did not change the outcomes. As presented in this article, the conclusion on the effects of training remains rather similar. In fact, we now have even more evidence that cognitive training (at least of working memory and other executive functions) does not promote lasting transferable benefits. This urges the scientific community to redirect their research programs towards novel strategies for improving ADHD-related educational deficits. Future research should focus on better understanding the cognitive functioning of children with ADHD to offer them specific education, which is certainly a challenge.