Children’s Early Acquisition of Syntactic Category: A Corpus-Based Analysis of English Determiner-Noun Combinational Flexibility

While the generativist account posits that an abstract specification of syntactic categories is innate and children show adult-like performance from an early stage, the constructivist account postulates that children’s early acquisition of grammatical categories is item-based and reflects limited rules later. The present study tests these assumptions in a specific category, the English determiners. More specifically, we took the controlled measures of overlap (e.g., the use of definite article the and indefinite articles a / an before the same noun type) in 16 children and their mothers’ spontaneous speech as an indicator of determiner-noun combinational flexibility. A series of three studies were conducted, in which we strictly controlled the impact of differences between children and adults in lexical knowledge. In Study 1 and Study 2, we find that children’s use of determiners shows a significant difference from adults but this difference disappeared later. Furthermore, Study 3 investigates the influence of external environment with birth order and family’s social class as factors and emphasizes that the input factor is worthy of further investigation in future studies. These findings are consistent with one of the constructivist claims, namely that children’s early acquisition of determiners is not category-based and their flexibility in using determiners gradually approximates that of adults with development.


Children's Early Grammar Development
The development of children's syntactic knowledge has long been controversial, and one of the core debates concerns the nature of the early development of grammatical categories. In order to generate adult-like grammar, children not only need to acquire the syntactic categories that adult speakers are claimed to possess, but also need to figure out the relationships between categories to arrive at the correct word orders for grammatical sentences (Abbot-Smith, Lieven, & Tomasello, 2003). Generativist researchers assume that at least a certain level of linguistic categories and principles of utterance formation is innate, and some even claim that children possess adult-like grammatical categories from early on. Constructivist researchers argue that, even though children may have the potential to acquire language at birth, their grammar is not innate but is gradually constructed on the basis of exposure to input. One of the keys to distinguishing these contrasting theoretical approaches is the productivity and flexibility with which children use syntactic knowledge at an early stage of development: generativist approaches predict no variation in the productivity of the words that children combine at this stage, whereas constructivist models propose that, as children's grammatical categories are limited, their less productive early knowledge will approximate adult grammar through data-driven learning. of grammar development and the mastery of discourse functions (Bassano, 2015, p. 25). The emergence of using determiners with nouns rather than bare nouns is regarded as a crucial developmental step in children's speech production, indicating the development of grammatical properties of the noun category (Chomsky, 1965; Berwick & Chomsky, 2016, p. 9).
There is evidence that children master the basic semantic and morphological distinctions in their determiner system at the age of three or four (Bassano et al., 2008; Bassano et al., 2013; Montrul, 2011). Moreover, some scholars believe that the ability to combine determiners with nouns in context is also a hallmark of language evolution, as children's use of a finite inventory of linguistic units to form an unbounded range of meaningful expressions seems to show that they have already acquired a combinational grammar (Goldin-Meadow & Yang, 2017; Hurford, 2012). Nevertheless, there are observations that young children display an optional omission of determiners in obligatory contexts (e.g., saying want cat instead of I want a/the cat), and have limited and formulaic combinatorial flexibility in their early determiner-noun combinations (Tomasello, 2003, pp

Syntactic Categories in Child Language Acquisition
A much-debated issue is whether children's acquisition of syntactic categories is innate. Opposing views have been raised mainly by two rival accounts, namely the constructivist account and the generativist account. The generativist position is that children are born with innate knowledge of linguistic categories such as verbs and nouns and that they learn the details of how such categories apply in their target language (cf. Chomsky, 1957; Valian, 2014). The constructivist account, on the other hand, posits that children start by observing details (cf. Braine, 1963) and gradually create abstract structures (e.g., Tomasello, 2003; Abbot-Smith & Tomasello, 2006). Despite their disagreement about the innateness of children's syntactic categories, both accounts postulate a certain level of abstractness at the end-state of adult grammar. While previous studies on child syntactic development mainly fall into the two accounts mentioned above, their explanations of the formation of child syntactic categories can also be divided into the following three possibilities. These vary in their assumptions about whether children eventually arrive at adult-like grammar as well as in their explanations of how children obtain this knowledge.

Innate Syntactic Category
The first possibility is that children are born with innate syntactic categories and they need to learn how those abstract categories behave in their target language, which falls under the category of nativism (Valian, 2009, p. 744). Nativists posit that syntactic categories in adults' knowledge can be best represented by basic syntactic categories (e.g., NOUN, VERB). Although this claim has been challenged by opponents, who point out that languages have great diversity in terms of their syntactic categories (e.g., Haspelmath, 2007), nativists put forth the idea that those categories mainly serve as a "toolkit" (Jackendoff, 2002, p. 263) and language users may select "tools" from it. To address the concern of how children acquire the knowledge of such adult-like syntactic categories, three main hypotheses under the nativist account were developed.
The semantic bootstrapping hypothesis (Pinker, 1982, 1984, 1987) assumes that, in addition to syntactic categories such as NOUN and VERB, children also have innate knowledge of the mapping between semantic bases and syntactic types and functions (see example 1). Admittedly, semantic categories such as action or person are observable in the real world and could therefore plausibly be acquired by children. However, the main issue with this hypothesis is that semantic types do not always link to a unique syntactic category (Benedict, 1979) and vice versa. For instance, actions can be denoted not only by verbs (e.g., rise, fall) but also by prepositions (e.g., up, down).
b. Elsa suspected the fact….

Previous Studies on Child Determiner Acquisition
A fundamental crux of understanding children's language representations and how they may change through development is their early language productivity and abstraction. This issue has been controversial, and researchers adopting generativist models (e.g., Valian et al., 2013) and constructivist models (e.g., Pine et al., 2013) have focused on whether adult-like syntactic categories exist in young children's determiner system. While some researchers oppose constructivist models by arguing that the distribution of English determiners in young children supports the generativist claim that children have an adult-like determiner category from early on (Valian et al., 2009; Yang, 2010; Yang, 2013), constructivist researchers point out that these empirical arguments are invalid because of inappropriate analyses and interpretation of the data (Pine, 2013; Lieven, 2014). The main debate has been over the methodology of how to reliably measure the use of determiners with different nouns in child speakers. Previous findings that children can use a specific category in conformity with adult grammar are regarded as evidence supporting an adult-like category pattern at an early stage (Valian, 1986; Ihns & Leonard, 1988). However, this kind of production criterion is too lax, in that it fundamentally fails to rule out the possibility that children lack adult-like categories: they may have acquired knowledge of how to use different instances of the grammatical category only separately, in a less abstract and more limited way. On the surface, this knowledge enables children to behave as though they possess adult-like abstract determiner categories, yet they could be completely unaware that the definite and indefinite articles share the same syntactic category. As an alternative, Pine and Martindale (1996, pp. 370−380) advocate measuring the extent to which children exhibit overlap in the nouns and predicates that they combine with different determiners. Instead of measuring children's syntactic knowledge directly, this relative criterion measures the flexibility with which children use various instances of the same putative category and tests whether they are significantly less productive than an adult control (Pine et al., 2013). The data of Pine and Martindale (1996) seem to indicate that children's early determiner use is more in line with an explanation of limited scope, as children are less productive than would be expected if they had acquired adult-like determiner categories. Their first study found that the overlap comparison is highly sensitive to vocabulary range. Once sample size and vocabulary range were controlled in their following studies, the less flexible performance of children compared with adults was consistent with the assumption that young children's determiner system is less abstract, indicating that children's lexical specificity is not a Zipfian artefact and that their knowledge of the determiner category becomes increasingly abstract over time. Although both Valian et al.'s and Yang's analyses confirm the importance of considering sampling factors when assessing the level of overlap in a child corpus, neither showed that the relatively low level of overlap in young children's speech can be explained purely by sampling issues, and Pine therefore noted that they did not provide any solid evidence to support the claim that children have an adult-like determiner category. The consensus among these studies is that Zipf's law must be taken into account when calculating overlap scores, because controlling for the frequency distribution improves the measurement of language productivity.
Specifically, the lexical items should be used by both the child and the caretaker so that it is possible to randomly reduce the adult vocabulary range to the same size as that of the child (Pine et al., 2013; Lieven, 2014; Rajewski et al., 2012). Once these relevant factors are controlled, young children should demonstrate systematically less productivity in generating determiner and noun combinations than their caretakers.

The logic of this overlap measurement is accepted by various scholars, yet the conclusion has received various
In view of the findings from Pine's analysis, the present study takes their implications into consideration to further assess children's syntactic category development in different theoretical frameworks, with even younger participants and other potential input factors. Firstly, in terms of the lexical specificity analysis in child-caregiver corpus comparison, data on children's early speech production should be interpreted with reference to an index of how lexically specific the speech would be if the child had adult-like knowledge. The sampling problems are that many words have low frequency in both child and adult speech, since the distribution of words in naturalistic speech obeys Zipf's law, and that the preferred syntactic contexts in which words are grammatical reduce their appearance in other contexts. Thus, in order to derive an index of expected overlap, one approach is to apply the same restricted sampling standards to the corpora being compared, which makes the analyses less dependent on the distributional properties of the language corpora in early development. The comparison can be a synchronic child-mother pair or a diachronic comparison of the speech of the same child at different points of development. As Pine et al. (2013) suggest, this is a promising way to test the theoretical model, with benefits that the computational modeling approach does not have. The present study adopts this method for the overlap index and extends their research from child-maternal speech comparison alone to diachronic comparison across several syntactic developmental stages. Secondly, the measurement of lexical specificity scores needs to be controlled for sample size as well as for the identity of the lexical items being compared. The present study samples instances of lexical items from data that incorporate the same item effects to ensure that our analyses are meaningful comparisons.
By sampling an equal number of instances of each item from a fixed control sample, the sampling issues mentioned above, including the item effect whereby some nouns differ in how likely they are to combine with different determiners in adult language, should be resolved. Thirdly, the nature of the observed phenomenon has several possible interpretations that require further investigation. Under a lexically specific inspection of the controlled speech, the higher-frequency items that dominate children's early production are less productive than those of their caregivers, and children's flexibility and productivity grow as the frequency of the noun item increases. Since Pine's observation is open to various theoretical accounts, explanations of this finding remain to be explored. The present research evaluates the theoretical explanations of whether syntactic categories are innate, induced, or illusory, and offers insights into the mechanism of syntactic development by additionally examining input factors.

The Present Study
The present study aims to examine children's early syntactic category development by probing into their flexibility in using English determiners a/an/the, as compared with adults (in our case, the mothers). Their flexibility in using determiners is measured by the overlap score, which is calculated by dividing the number of determiner-noun pairs showing overlap (e.g., the noun tokens are modified both by a/an and by the) by the total number of determiner-noun pairs in the selected corpora. After having strictly controlled the sample, three sets of analyses were conducted on the selected corpora.
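The overlap measure described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the procedure actually run on the corpora: it assumes that the score is computed over noun types (the formulation given under Analysis Control), and the function name `overlap_score` and the toy data are hypothetical.

```python
from collections import defaultdict

INDEFINITE = {"a", "an"}
DEFINITE = {"the"}

def overlap_score(pairs):
    """Share of noun types that occur with both a/an and the.

    `pairs` is an iterable of (determiner, noun) tokens. Assumption: the
    score is computed per noun type, not per determiner-noun token.
    """
    seen = defaultdict(set)
    for det, noun in pairs:
        det = det.lower()
        if det in INDEFINITE:
            seen[noun.lower()].add("indefinite")
        elif det in DEFINITE:
            seen[noun.lower()].add("definite")
    if not seen:
        return 0.0
    overlapping = sum(1 for kinds in seen.values() if len(kinds) == 2)
    return overlapping / len(seen)

# Invented toy data: only "cat" occurs with both article types.
sample = [("a", "cat"), ("the", "cat"), ("a", "man"), ("a", "man"), ("the", "door")]
print(overlap_score(sample))  # 1 of 3 noun types shows overlap
```

Treating a and an as one indefinite class follows the definition of overlap used in this study, in which a noun counts as overlapping only when it appears with both an indefinite and the definite article.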

In Study 1, we investigate whether children's categorization (if any) of determiners is adult-like in early developmental stages. To that end, we compare children's and mothers' overlap measures across development and test whether the difference between child and mother (if any) varies as child MLU increases. In Study 2, rather than looking at children's overlap measures according to their language ability as represented by MLU, we take time as the indicator and look for differences among the same group of children and mothers at two test points. Furthermore, a follow-up analysis in Study 3 examines whether other input-related variables (e.g., birth order, social class) could have an impact on children's overlap measure.
In these studies, predictions vary under the three possible theoretical explanations of child syntactic categories introduced above. Assuming that children are equipped with innate categories from the outset, we would expect their overlap measures to be as high as their mothers', and this should remain the case throughout their development or, in our case, at two different test points. Considering that generativists place more emphasis on the innateness of the category, input quality and quantity should not make much difference. If, on the other hand, children's generalization of determiners is induced gradually on the basis of distributionally defined clusters, they are predicted to show less flexibility in using determiners than their mothers, but their flexibility should increase gradually over time. We may also observe that, in early stages, children use only a certain article rather than others (e.g., a/an instead of the or vice versa), since they store exemplars through rote learning. A third possibility is that children may not form abstract syntactic categories at all, but analogically compare their input with stored exemplars, and such exemplars may persist for a long time even when learners grow up. In that case, we expect to find that, as in the previous prediction, children show less overlap than their mothers in using determiners, since they are still storing exemplars, and that their performance improves at a later time when they can make judgments based on their stored exemplars. Another prediction following from this hypothesis is that, regardless of whether learners eventually form abstract syntactic categories, they may still show better performance or higher sensitivity when encountering exemplars that they stored at the initial stages of language acquisition.
ijel.ccsenet.org International Journal of English Linguistics Vol. 12, No. 6; 2022
As for the influence of input, both the induced and the illusory perspectives expect it to have an impact, since item-based and exemplar-based language acquisition require children to store information from language exposure. Although such evidence is important for distinguishing whether syntactic categories are induced (usage-based) or illusory (exemplar-based), finding it would require more data from studies on adults or longitudinal studies, and the question can therefore be only partially addressed in the current study because of its limited scope.

General Method
All the studies adhere to the same basic method for corpus analysis using the CLAN program (MacWhinney, 2000). This involves searching transcripts in CHAT format for determiner plus noun sequences in the corpora. Determiner plus noun pairs can be identified through the %mor line. Instances of a/an and the followed by nouns, either directly or with intervening words such as adjectives, are extracted for analysis.
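The extraction step can be pictured with the following Python sketch. It does not reproduce CLAN or the exact layout of the morphological tier: it assumes a simplified, hypothetical token format of `pos|word` with the tags `det`, `adj`, and `n`, and the limit of two intervening adjectives (`max_gap`) is likewise an assumption made for illustration.

```python
ARTICLES = {"a", "an", "the"}

def extract_det_noun(mor_tokens, max_gap=2):
    """Collect (article, noun) pairs from one tagged utterance.

    `mor_tokens` is a list of "pos|word" strings in a simplified,
    hypothetical tag format. Up to `max_gap` adjectives may stand
    between the article and the noun.
    """
    pairs = []
    i = 0
    while i < len(mor_tokens):
        pos, _, word = mor_tokens[i].partition("|")
        if pos == "det" and word in ARTICLES:
            j = i + 1
            # Skip a limited number of intervening adjectives.
            while (j < len(mor_tokens) and j - i - 1 < max_gap
                   and mor_tokens[j].startswith("adj|")):
                j += 1
            if j < len(mor_tokens) and mor_tokens[j].startswith("n|"):
                pairs.append((word, mor_tokens[j].split("|", 1)[1]))
                i = j  # continue scanning after the noun
        i += 1
    return pairs

# Invented utterance: "I want a big cat"
print(extract_det_noun(["pro|I", "v|want", "det|a", "adj|big", "n|cat"]))
```

An article that is not followed by a noun within the allowed window is simply skipped, so bare determiners do not contribute pairs.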

All analyses were based on the Howe Corpus (Howe, 1981), downloaded from the Child Language Data Exchange System (CHILDES) (MacWhinney, 2000). The corpus includes transcripts from 16 Scottish child-mother pairs (seven girls and nine boys) recorded during sessions in which they played with toys at home. Participants lived in Glasgow, Scotland, and were randomly selected from a small university and nearby villages. Social class was divided into the "middle class" (with fathers of 13 children having professional or managerial occupations) and the "working class" (with fathers of 11 children having skilled or semiskilled manual occupations). Data of 40 minutes' duration were collected at two time points. More specifically, children were aged 1:6 to 1:8 (mean 1:7) at the first test and 1:11 to 2:1 (mean 2:0) at the second test. In each videotaped session, children had 20 minutes to play with their own toys and 20 minutes to play with a special set of toys presented in a specific order.

Analysis Control
In order to control not only the sample size but also the identity of nouns that reflects the frequency of nouns combined with either one of the articles, samples were controlled through three criteria in each individual corpus. All criteria must be met before assessing the existence of overlap in the controlled samples. Overlap scores were then calculated by dividing the number of nouns showing overlap by the total number of nouns that met the controlling criteria.
Note that, in all of the analyses, overlap scores were calculated under strict control of sample size. This was achieved by randomly sampling (with replacement) determiner + noun tokens from the mother's corpus so that, for each noun type (e.g., the noun man), the mother's sample consisted of the same number of tokens as the child's. For instance, if a child's corpus contains only two tokens of the noun man modified by a determiner, we would randomly select two determiner + man tokens from the relevant mother's corpus regardless of how many determiner + man tokens exist in the mother's corpus. We took random sampling to be a necessary step because, under a Zipfian distribution, nouns occurring with higher token frequency would theoretically have a higher probability of being modified by both indefinite and definite articles, and would therefore show overlap more easily (Pine et al., 2013). The final overlap score for each mother was then calculated by averaging her overlap scores across 100 rounds of random sampling.
Criterion 2: Nouns must appear at least twice in both the child's and the mother's speech. On the assumption that noun distribution obeys Zipf's law (Chao & Zipf, 1949), the proportion of nouns occurring with low frequency is likely to be higher in adults' corpora than in children's corpora because adults have a larger noun vocabulary. To avoid underestimating the overlap scores of adults on nouns that their children produce and to exclude the impact of differences in lexical knowledge, it is important to control the identity of nouns within each child-mother pair.
Criterion 3: Nouns must appear at least twice with either a/an or the in both the child's and the mother's speech. As Valian et al. (2009) pointed out, overlap is, by definition, impossible for a noun that occurs only once with a determiner. Nouns modified by determiners should appear at least twice so that overlap can possibly occur.
These three criteria enable the research to compare measures of noun overlap after controlling the sample size and identity of relevant nouns as well as the frequency of these nouns combined with either a/an or the. The location of these nouns in the Zipfian frequency distribution is a crucial factor in determining the size of overlap scores because lexical items in naturalistic speech interact with differences in vocabulary range in a way that those nouns with low frequency will have a higher probability of overlap in adults than in children with a smaller noun vocabulary range. Therefore, it is likely to mask differences in the overlap between young children and their mothers without proper control of the noun identity. The current research addressed the issue by directly comparing overlap scores between child and adult pairs on an equivalent number of instances of a shared set of nouns that contained the same fixed set of nouns and the same number of a/an/the + noun tokens for each noun in the pool.
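The token-matched comparison can be sketched in Python as follows. This is a minimal illustration, not the script used in the study: the function names, the toy data, and the fixed random seed are hypothetical, and the input is assumed to contain only a/an/the + noun tokens, so that requiring two such tokens per noun in both corpora (Criterion 3) also satisfies Criterion 2.

```python
import random
from collections import Counter, defaultdict

def det_class(det):
    return "indefinite" if det in {"a", "an"} else "definite"

def eligible_nouns(child_pairs, mother_pairs, min_tokens=2):
    """Nouns with at least `min_tokens` article + noun tokens in both corpora."""
    child_counts = Counter(noun for _, noun in child_pairs)
    mother_counts = Counter(noun for _, noun in mother_pairs)
    return {n for n in child_counts
            if child_counts[n] >= min_tokens and mother_counts[n] >= min_tokens}

def overlap(pairs):
    """Proportion of noun types seen with both article classes."""
    kinds = defaultdict(set)
    for det, noun in pairs:
        kinds[noun].add(det_class(det))
    return sum(len(k) == 2 for k in kinds.values()) / len(kinds) if kinds else 0.0

def matched_scores(child_pairs, mother_pairs, runs=100, seed=0):
    """Child overlap and mean mother overlap on token-matched samples."""
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    nouns = eligible_nouns(child_pairs, mother_pairs)
    child = [(d, n) for d, n in child_pairs if n in nouns]
    child_counts = Counter(n for _, n in child)
    mother_by_noun = defaultdict(list)
    for d, n in mother_pairs:
        if n in nouns:
            mother_by_noun[n].append(d)
    total = 0.0
    for _ in range(runs):
        # Draw, with replacement, as many mother tokens per noun as the child has.
        sample = [(d, n) for n, k in child_counts.items()
                  for d in rng.choices(mother_by_noun[n], k=k)]
        total += overlap(sample)
    return overlap(child), total / runs

# Invented toy data.
child_pairs = [("a", "cat"), ("a", "cat"), ("a", "man"), ("the", "man"), ("a", "dog")]
mother_pairs = [("a", "cat"), ("a", "cat"), ("a", "cat"),
                ("the", "man"), ("the", "man"), ("a", "dog")]
print(matched_scores(child_pairs, mother_pairs))
```

Because the mother's tokens are resampled to the child's per-noun counts, both scores are computed over the same noun types and the same number of tokens per type, which is the point of the control.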

Study 1: Comparison of Overlap Scores in Children and Their Mothers
Study 1 aims to investigate whether children have an adult-like determiner category in early development by comparing child overlap measures with adult overlap measures. If children do have an adult-like determiner category, they should show noticeable overlap to an extent that is close to their mother's performance and it shall remain the same regardless of children's MLU values. Otherwise, differences should be noticed in children's flexibility in using determiners compared with their mothers, which will later decrease as children's MLU increases.

Method
Analyses in Study 1 began by comparing children's overlap measures with their mothers'. Because Study 1 asks how children's performance varies with their language ability (represented by their MLU) regardless of the time point at which they were tested, recordings from the same child at the two test points were treated as two distinct samples. Eighteen of the 32 samples were excluded for failing to meet the controlling criteria (see Appendix A for controlling results). Overlap scores after random sampling, which was repeated 100 times, were then calculated for each child and mother. In addition to examining the difference between the child's and the mother's overlap scores, we also probed whether such differences (if any) vary as children's language ability (represented by their MLU) varies.

Results
Table 1 shows child and mother overlap scores for the 14 controlled samples. Also presented are the child's MLU and the average number of tokens per noun type for each child-mother pair. Note that, because sample size was controlled, the numbers of noun types and noun tokens were the same within each child-mother pair, resulting in the same ratio of tokens per noun type. Before investigating the difference between mother and child overlap scores, a correlation test was conducted to find out whether the sampling control was effective. This was achieved by correlating both children's and mothers' overlap scores with the average number of tokens per noun type. The analysis indicated a marginal correlation for mothers' overlap measures (r = .47, df = 12, p = .09 < .10), suggesting that mothers' overlap scores are positively correlated with tokens per noun type to a marginal extent. In line with Pine et al.'s (2013) results, in which a marginal to significant correlation was found for mothers, our result also indicates that controlling the determiner + noun tokens in the mother's corpus was a necessary measure, without which mothers' overlap scores would in principle be higher than children's if mothers produce more determiner + noun tokens than children do. However, no significant correlation was found for children (r = -.05, df = 12, p = .87). Although this non-significance differs from the marginal to significant correlation found in Pine et al., our result is reasonable considering the possibility that children may not have a determiner system but rather stick with a certain combination (e.g., a cat, a man, a tiger), in which case children's use of determiners follows the same pattern regardless of how many determiner + noun tokens they produce (e.g., in Barry's corpus at test 1, he used a rather than the even though he repeated the noun man three times, tiger five times, and cat five times).
With the effectiveness of the sampling consideration confirmed, further analyses were carried out to probe the difference between children and mothers. The descriptive data in Table 1 show variation between child and mother overlap, since the difference between them is not zero in most cases. This difference was tested with a paired-samples t-test, which did not find a significant difference between children and mothers (t = -.15, df = 13, p = .88). Although the overall result of the t-test suggests no significant difference between mother and child in the flexibility of determiner use, it does not rule out the possibility that differences between the two groups exist at first but diminish later as children's MLU increases, thus yielding no significant difference overall.
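For readers who wish to see the arithmetic behind a paired-samples t-test of this kind, a minimal Python sketch follows. The scores are invented for illustration and are not the values in Table 1; the function name `paired_t` is hypothetical, and taking each difference as mother minus child is an assumption that follows the direction stated for the regression analysis.

```python
import math
import statistics

def paired_t(mother_scores, child_scores):
    """Return (t, df) for a paired-samples t-test on mother minus child scores."""
    diffs = [m - c for m, c in zip(mother_scores, child_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Invented overlap scores for three child-mother pairs.
mother_scores = [0.2, 0.4, 0.6]
child_scores = [0.1, 0.5, 0.3]
t_value, df = paired_t(mother_scores, child_scores)
print(round(t_value, 3), df)
```

With 14 child-mother samples the same computation has 13 degrees of freedom, which matches the df reported above. The p value would then be read from the t distribution with that many degrees of freedom.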
To further examine the possibility that the difference between groups diminishes as children's language ability (indexed by MLU) increases, a linear regression analysis was run to test the relationship between children's MLU and the overlap difference between each child and mother (calculated by subtracting the child overlap score from the mother overlap score), with tokens per noun type also included as a main factor. As shown in Figure 1 and Table 2, we found a marginal relationship between children's MLU and overlap difference (p = .095 < .10), whereas the fixed effect of tokens per noun type showed neither a significant nor a marginal relationship with overlap difference (p = .226). This indicates that, as children's language ability grows, the difference between them and their mothers decreases.

Study 3
As childre developme quantity an language d realize tha influencing mothers ad (Huttenloc influence variation in

Results
The analys class (mid significant overlap sco = .081). A children fr non-first-b environme overlap sc order diffe .org

Discussion
Present an namely th (exemplardeterminer stages, we across two occurring w Analyses determiner language a did not va MLU incr children's However, would als dimension scores from a five-mon both child flexibility and Study and that th flexibility. In Study 3 of birth or between th measures involving t in particul responsibl caregiver-c communic language child-direc the structu class and Additional .org   , 2005). Nevertheless, owing to variation in the analysis of the different relevant indicators, the results are inconsistent and thus the impact of such a factor is still unclear. There is a need for further research with more sophisticated approaches to examine the relationships between these factors in children's acquisition of determiners and syntactic development. Since input plays different roles in generativist and constructivist accounts of syntactic development, the discussion of relevant factors also helps to reveal the mechanism of syntactic development. Those who advocate nativist accounts of language acquisition emphasize children's innate ability for language acquisition and their maturational constraints, and this innate schedule of language learning leads them to regard input quality and quantity as less important (Chomsky, 1981, 1995; Valian, 2014). Despite the claim from nativist researchers that children can identify instances in the language input and assign them to innate categories, their explanations, including bootstrapping and distributional learning (Lleó, 2001; Mintz, 2003; Ruhlig & Bittner, 2013), still involve certain input influences, as children need to observe input in order to allocate and deduce syntactic categories successfully, and yet the input factor receives less attention in their discussion.
Instead, some generativists even propose that children's evolved biological capacity for language learning generates combinatorial productivity without external linguistic input (Goldin-Meadow & Yang, 2017). They argue that their statistical results provide evidence that, instead of memorizing specific word combinations from caregiver speech, children have a productive grammar that follows abstract rules, including the determiner-noun combination, under which they combine words independently (Yang, 2013).
In contrast, input plays an essential role for children to eventually arrive at the abstractness of syntactic categories in constructivist theories. The traditional constructivist account assumes that children have an initial item-based learning mechanism in their beginning stages, where they memorize chunks available in the input as holophrases, and then break up memorized chunks into less lexically specific slot-and-frame schemas.  (Yang, 2013). They claim that the simulation of the developmentally motivated item-based learning model improved with exposure to more linguistic input, and it successfully captures the actual determiner-noun combinations in the dense child corpus. In summary, although the current result on input factors may not be informative, it can be regarded as a probe into revealing the environmental factor in children's early grammatical development, which appeals for more research to provide evidence for the theoretical debates on the mechanism of syntactic development. constructivist theory assuming that children's less flexible productivity progressively approximates adult linguistic performance in later developmental stages. As the analyses in this research indicate that the low overlap score in the early syntactic development is not a Zipfian artefact, we interpret this catching-up phenomenon as an indicator that, instead of having adult-like syntactic knowledge at the beginning as nativist theories assert, children's syntactic category is progressively becoming abstract through data-driven learning rather than innateness. Note that this discussion focuses more on the result of whether children have an adult-like syntactic category at early stages, thus being insufficient to unravel the detailed process of how they arrived at such abstractions and generalizations (if any).

Conclusion
In order to further understand how the acquisition mechanism works, and to consider the third possibility raised by exemplar-based accounts, namely that syntactic categories are illusory rather than induced, this research additionally evaluated the corpus data and found combinational patterns in children who already exhibited the ability to make determiner-noun combinations. In addition to determiner omission errors in obligatory contexts, such as "Back the car." and "It tiger.", there were ungrammatical combinations such as "a back" in inappropriate contexts in several children's corpora. On closer examination, all of these "a back" utterances were initiated by the child in the child-mother pairs rather than acquired from the mother's speech. Under the exemplar-based proposal that learners simply store the original sentence strings with detailed probabilistic information, it seems unlikely that the indefinite article would be followed by the noun "back" referring to a space. Because this account proposes that the surface form of young children's language productions is generalized across their stored multiword linguistic exemplars, which carry fine-grained distributional properties including frequency information, a rare combination of this kind should not easily emerge by itself. Moreover, the optionality of determiner use also poses a challenge to the exemplar-based account. In the child-mother corpus, the adults never use inappropriate bare nouns, yet the children variably omit determiners. The key difference between the exemplar-based account and a traditional constructivist account ( The present results yield implications for the investigation of child syntactic development. To begin with, they verify the effectiveness of the sampling considerations proposed in Pine et al. (2013). 
That is, the frequency and identity of nouns are controlled within each child-mother pair, and the average overlap score is calculated from random sampling (with replacement) repeated 100 times. This control is necessary because, theoretically, mothers are more likely to show overlap if they produce more tokens of a specific noun than their children do. Controlling both the noun type and the number of determiner + noun tokens therefore eliminates this potential bias and gives the child and the mother an equal probability of showing overlap. The results suggest that, as found in Pine et al. (2013), the mother's overlap score increases as the number of tokens per noun type increases. However, unlike in Pine et al. (2013), where the child overlap score was also positively correlated with tokens per noun type, neither a significant nor a marginal correlation was found between them in the current study. This finding is plausible: if children do not have an adult-like syntactic system but instead build their understanding of determiners on schemas or exemplars (e.g., if they only say "a cat" rather than "the cat", or "the door" rather than "a door"), the number of noun tokens would not affect their likelihood of overlap in the first place. The implication is that controlling noun identity and noun frequency is necessary so that the theoretical possibility of overlap is the same for the child and the mother, allowing the actual overlap to be observed without the bias caused by higher noun frequency in either the mother or the child corpus.
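The controlled overlap measure described above can be illustrated with a minimal Python sketch. It is not the analysis script used in this study: the function names, the representation of tokens as (determiner, noun) pairs, the minimum of two tokens per noun, and the choice to match token counts to the smaller of the two speakers' counts are all assumptions made for illustration.

```python
import random
from collections import defaultdict

DEFINITE = {"the"}
INDEFINITE = {"a", "an"}


def group_by_noun(tokens):
    """Map each noun to the list of articles it occurred with."""
    grouped = defaultdict(list)
    for det, noun in tokens:
        if det in DEFINITE or det in INDEFINITE:
            grouped[noun].append(det)
    return grouped


def shows_overlap(dets):
    """True if a noun was sampled with both a definite and an indefinite article."""
    return any(d in DEFINITE for d in dets) and any(d in INDEFINITE for d in dets)


def controlled_overlap(child_tokens, mother_tokens, n_reps=100, min_tokens=2, seed=0):
    """Return (child_overlap, mother_overlap), or (None, None) when no noun qualifies.

    Assumed controls (hypothetical, for illustration only):
    - noun identity: only nouns used with an article by BOTH speakers count;
    - noun frequency: each speaker contributes the same number of tokens per
      noun, namely the smaller of the two speakers' counts.
    """
    rng = random.Random(seed)
    child = group_by_noun(child_tokens)
    mother = group_by_noun(mother_tokens)

    shared = {}
    for noun in sorted(child.keys() & mother.keys()):
        n = min(len(child[noun]), len(mother[noun]))
        if n >= min_tokens:  # overlap needs at least two tokens of a noun
            shared[noun] = n
    if not shared:
        return None, None  # corresponds to an NA overlap score

    def mean_overlap(grouped):
        total = 0.0
        for _ in range(n_reps):
            # Sample with replacement, then count nouns showing both article types.
            hits = sum(
                shows_overlap(rng.choices(grouped[noun], k=n))
                for noun, n in shared.items()
            )
            total += hits / len(shared)
        return total / n_reps

    return mean_overlap(child), mean_overlap(mother)
```

Under these assumptions, a child who only ever says "a cat" and "the door" receives an overlap score of zero however many tokens are sampled, whereas a mother who uses both articles with the same nouns receives a score above zero. This is the pattern that the frequency control is designed to expose without a token-count advantage for either speaker.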
A second implication is that the population involved in this study is younger (age range 1:6 to 1:8 in Test 1 and 1:11 to 2:1 in Test 2) than in previous studies on children's early syntactic categories. For instance, the children in Pine et al. (2013) were aged from 1:8.22 to 2:0.25 at the start of the one-year data collection; significant differences were found between children and caregivers in Phase 1 but not in Phase 2, suggesting that the children had caught up with their caregivers over time. In addition to confirming the similar trend that children's flexibility approximates that of adults, our findings suggest that this catching up may occur even earlier (before children reach the age range 1:11 to 2:1) than previous studies indicated.

Beyond the new evidence on when children catch up with adults along the dimension of time (Test 1 and Test 2, which were five months apart), this study also yields results from another perspective, namely language development (represented by child MLU). Note that although Pine , 1973), it was unclear whether the difference varies for children across different developmental stages. The present study addresses this gap with the finding that the difference between mother overlap and child overlap gradually decreases as the child's language ability develops (as measured by the child's MLU).
There are, however, some limitations in the current study. Due to the strict controlling criteria, the amount of valid data entering the analyses was insufficient. As a result, the overlap scores of some children and mothers were represented as NA, which is less informative. Additionally, those NAs were replaced by zero in the calculation in Study 2, which may leave the result open to several possible interpretations. Recall that NA in the overlap score column does not necessarily mean that the individual has zero flexibility in using determiners. Rather, the individual might have used a D + N combination several times in a way that shows overlap, but these data were not included in the analysis because the noun did not appear in both the mother and the child corpora. It should be noted that the replacement of NA with zero may decrease the absolute value of the average overlap scores for both the mother and her child. Owing to the controlling criteria, NAs occur for both members of a child-mother pair at the same time, so the difference within any pair whose overlap scores are marked NA is taken as zero. In this way, the overlap difference between groups in Test 2 may be lower than the actual situation, considering that children and mothers may show a certain level of overlap. However, the replacement of NA with zero should not heavily bias the data, since NA occurred in only two of the eleven child-mother pairs. The overall trend that children approximate adults should remain similar, whereas the exact timepoint at which, and the extent to which, children catch up with adults require future investigation with a larger sample and more frequent data collection (e.g., more tests evenly spaced across the age range of 1:6 to 2:1).
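The effect of replacing NA with zero on the mean mother-child overlap difference can be made concrete with a short Python sketch. The function name is hypothetical and the scores below are invented purely for illustration; they are not values from this study. Only the counts (eleven pairs, two of them NA) follow the text above.

```python
def mean_overlap_difference(pairs, na_as_zero):
    """Mean of (mother overlap - child overlap) across child-mother pairs.

    `pairs` holds (child_score, mother_score) tuples, with None standing for NA.
    If na_as_zero is True, a pair with NA scores contributes a difference of 0;
    otherwise the pair is dropped from the mean.
    """
    diffs = []
    for child, mother in pairs:
        if child is None or mother is None:
            if na_as_zero:
                diffs.append(0.0)
            continue
        diffs.append(mother - child)
    return sum(diffs) / len(diffs) if diffs else None


# Invented illustration: nine valid pairs with a difference of 0.2 each,
# plus two pairs whose overlap scores are NA.
example_pairs = [(0.3, 0.5)] * 9 + [(None, None)] * 2

dropped = mean_overlap_difference(example_pairs, na_as_zero=False)  # about 0.2
zeroed = mean_overlap_difference(example_pairs, na_as_zero=True)    # about 1.8 / 11
```

With these invented numbers, treating the two NA pairs as zero lowers the mean difference from about 0.20 to about 0.16, which shows the direction of the bias discussed above: the replacement pulls the estimated mother-child difference toward zero.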