1 Introduction

A core feature of human language grammars is that they allow discontinuous dependencies: In (1a) the filler phrase which waffles can be linked to the gap position (denoted by ___ ) after the verb like, allowing the filler to be interpreted as its direct object, analogous to (1b).

(1) a. Which waffles_i did the skiers like ____i ?
  b. The skiers liked which waffles?

Most languages also permit long-distance filler-gap dependencies: A filler like which waffles can be linked to a gap position across a (potentially unbounded) number of clauses, as in (2).

(2) a. Which waffles_i did Svanhild say [that the skiers like ____i ]?
  b. Which waffles_i did Svanhild think [that Tor said [that the skiers like ____i ]] ?

In many languages where long-distance filler-gap dependencies into embedded declarative clauses are acceptable, dependencies into relative clauses (RCs) and embedded questions (EQs) are unacceptable (see 3 and 4).

(3) Relative Clause Island
  a. *Which waffles_i do you know the skiers [that like ____i ]?
  b. *I made the waffles_i that you know the skiers [that like ___i ].
(4) Embedded Questions (Wh-Island)
  a. *Which waffles_i do you know [who likes ____i ]?
  b. *I made the waffles_i that you know [who likes ___i ].

Constituents that block filler-gap dependencies are islands (Ross 1967). Islands pose a puzzle for language acquisition: According to one line of thinking (Chomsky 1973, 1980), English-learning children face a poverty of the stimulus problem (Chomsky 1965; see also Pullum & Scholz 2002; Lasnik & Lidz 2017; and Pearl submitted for extended discussion): the distribution of acceptable filler-gap dependencies in the English input is compatible both with narrow hypotheses that correctly restrict long-distance filler-gap dependencies to embedded declarative clauses and with less restrictive hypotheses that incorrectly generalize the possibility of long-distance filler-gap dependencies to all embedded clauses, declaratives, EQs, and RCs alike. Nevertheless, English learners consistently settle on the more restrictive option. Research shows that learners of English treat EQs as islands at least by the time they are 4;0 years old (de Villiers, Roeper & Vainikka 1990) and RCs as islands at 4;7 (de Villiers & Roeper 1995, see also Otsu 1981).

Islands point to some kind of inductive bias in how children generalize from the finite set of filler-gap dependencies that they observe to the abstract set of acceptable filler-gap dependencies in their language (Pearl & Goldwater 2016). How best to characterize the bias is a point of theoretical debate. Some accounts cash out the bias in terms of innate, formal linguistic constraints on the space of possible hypotheses (Ross 1967; Chomsky 1973, 1977, 1986). Other accounts eschew explicit representational constraints and encode the bias into data-driven discovery procedures for learning acceptable filler-gap dependencies that favor limited or conservative generalization (see, e.g., Pearl & Sprouse 2013; Bates & Pearl submitted for a proposal within the Generative tradition, but also Dąbrowska 2004, 2008; Verhagen 2006; Maratsos, Kuczaj, Fox, & Chalkley 1979).

Cases of cross-linguistic variation in island-sensitivity are particularly useful for sharpening our understanding of the acquisition of filler-gap dependencies and the inductive biases that guide the process. To this end this paper focuses on the case of Norwegian. Mainland Scandinavian languages like Norwegian allow filler-gap dependencies into EQs (e.g., Maling & Zaenen 1982; Kush & Dahl 2020) and (some) RCs (Allwood 1982; Christensen 1982; Engdahl 1982, 1997; Erteschik-Shir 1973; Jensen 2002; Lindahl 2014, 2017; Taraldsen 1982 a.o.). Examples taken from naturalistic speech are given below:

(5) Embedded Question
    Var det den_i (som) vi ikke visste [ hvor_k vi kunne finne    ___i ___k ]?
    was it  that  REL1  we NEG  know     where  we could find.INF
    ‘Was it that one that we didn’t know where we could find?’ (Ragnhild Eik, p.c.)
(6) Relative Clause
    Det_i er det flere [som_k ____k holder på med  ____i ].
    that  is it  many   REL         hold   on with
    ‘There are many (people) who are doing that.’ (NRK’s Ekko Podcast; Episode: Manetslim kan fange mikroplast i havet; 31.07.2019)

The fact that Norwegians draw different conclusions about which embedded clauses are islands implies that the Norwegian input differs in a fundamental way from input to English children. If it did not differ from English input, we would expect Norwegian children to treat EQs and RCs as islands, like their English-learning counterparts. At present it is not known, however, how the Norwegian input differs. The current study is a preliminary step in filling this gap in knowledge.

The simplest possibility is that the input to Norwegian children contains direct evidence of filler-gap dependencies into EQs and RCs. However, the existence of direct evidence is not guaranteed. It could be absent from children’s input. In such a case, learning biases would have to be capable of inferring the non-island nature of Norwegian EQs and RCs from indirect evidence. The empirical goal of our paper is therefore to investigate whether island violations exist in the input so that learning the non-island status of EQs and RCs could be based on direct evidence.

Even if direct evidence does exist it may not be frequent enough to guarantee reliable acquisition (Legate & Yang 2002; Yang 2002, 2011). Evidence must be frequent (and consistent) enough to be distinguished from ‘noise’ (that is, errors that should be ignored by the learner) and, under some models, to drive probabilistic hypothesis testing. In order to assess whether a direct learning route is plausible, we must therefore characterize the frequency of direct evidence.

Unfortunately, there is no consensus on a rigid quantitative threshold for what constitutes sufficient direct evidence. In the absence of such a threshold, we compare the relative frequency of island violations to the frequency of other uncontroversially grammatical long-distance dependencies in Norwegian. The acceptability of long-distance movement from declarative complement clauses is presumably learned via positive evidence.2 Though it has been shown that such evidence is relatively infrequent in the input to children (in languages like English; Yang 2002; Pearl & Sprouse 2013), children nevertheless learn to accept such dependencies. We can therefore use the frequencies of long-distance non-island dependencies as a relative frequency benchmark. Thus, we ask: does direct evidence for island violations occur in the input to children at greater, lesser, or comparable frequency to direct evidence for regular long-distance dependencies?
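The benchmark comparison described above can be illustrated with a minimal sketch. The counts, the helper names, and the factor-of-two band used to label two frequencies as "comparable" are all hypothetical choices made for illustration; none of them comes from the corpus or from any proposal in the literature.

```python
from dataclasses import dataclass


@dataclass
class DependencyCount:
    """Raw count of one dependency type in a corpus sample."""
    label: str
    count: int


def rate_per_thousand(count: int, total_utterances: int) -> float:
    """Occurrences per 1,000 utterances."""
    if total_utterances <= 0:
        raise ValueError("total_utterances must be positive")
    return 1000 * count / total_utterances


def compare_to_benchmark(island: DependencyCount,
                         benchmark: DependencyCount) -> str:
    """Classify island-violation frequency relative to the benchmark.

    The 'comparable' band (within a factor of two) is an arbitrary
    illustrative assumption, not a threshold proposed in the text.
    """
    if benchmark.count <= 0:
        raise ValueError("benchmark count must be positive")
    ratio = island.count / benchmark.count
    if ratio > 2:
        return "greater"
    if ratio < 0.5:
        return "lesser"
    return "comparable"


# Hypothetical placeholder counts, NOT corpus results.
TOTAL_UTTERANCES = 100_000
eq_violations = DependencyCount("dependency into EQ", 30)
declarative_ld = DependencyCount("dependency into declarative complement", 40)

print(rate_per_thousand(eq_violations.count, TOTAL_UTTERANCES))
print(compare_to_benchmark(eq_violations, declarative_ld))
```

Because both counts are drawn from the same sample, the classification depends only on their ratio; the per-thousand rate is included only to make the raw frequencies easier to read.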

In addition to frequency, we also consider the distributional characteristics of the input in order to address a second question regarding the granularity of the generalizations that Norwegian children must learn. Early generative approaches abstract over fine-grained differences in dependency type and (certain) syntactic features when defining islands (Chomsky 1977). Island constraints impose general restrictions on A’-movement, of which wh-movement, relativization, and topicalization are all instances. Insofar as all three are A’-dependencies, they are expected, all else equal, to exhibit comparable island sensitivity. Similarly, island domains are usually defined in rather coarse structural terms: all (finite) EQs are treated as islands, irrespective of most of their internal syntax; RCs are likewise treated as islands across the board.

There is evidence that finer-grained distinctions need to be made (at some level). First, the acceptability of island violations varies as a function of the type of the filler: dependencies with argument fillers are acceptable, while those with adjunct fillers are not. Second, acceptability may vary as a function of dependency type: adult participants tend to judge topicalization dependencies into some islands as acceptable more reliably than wh-dependencies (Kush, Lohndal, & Sprouse 2018, 2019). Finally, as we discuss below, island-violating dependencies seem more acceptable or common with a restricted subset of RCs or EQs. If finer-grained distinctions need to be made, these distinctions should either follow from universal principles, or the primary linguistic input should offer cues to the appropriate subset generalization. As previous studies did not conduct an exhaustive overview of the full range of acceptable island violations, we do not know whether the input is actually restricted to the types of examples that are most frequently reported. We therefore present a finer-grained description of the distribution of observed island violations.

Finally, we discuss how various models of filler-gap acquisition, each with different inductive biases, would fare in generalizing from the Norwegian data. We consider usage-based models (MacWhinney 1975, 1982; Tomasello 2000, 2003; Goldberg 2006; Dąbrowska 2004, 2008; Verhagen 2006) and two types of models rooted in the generative tradition, which focus on learning purely syntactic generalizations: a data-driven statistical learning model (Pearl & Sprouse 2013) and parameter-setting models (Wexler & Manzini 1987; Gibson & Wexler 1994; Yang 2002; Sakas & Fodor 2001, 2012; Pearl & Lidz 2013; Gould 2017). We conclude that usage-based models are liable to overfit the input distribution. The generative learning models learn generalizations that go beyond the forms observed in the input distribution (and fine-grained restrictions observed in the target language) because they do not represent certain (semantic or discursive) features that appear to modulate acceptability.

2 Characterizing the Target Grammar

We begin with an overview of acceptable long-distance dependencies in the Norwegian target grammar, which can be compared against our corpus sample to assess whether children’s input provides direct evidence for the full scope of adult generalizations. We discuss acceptable long-distance dependencies from non-island complement clauses first and then move to dependencies into RCs and EQs. We consider factors that have been argued to play a role in determining the distribution of acceptable island violations and conclude that though some dependencies into RCs and EQs are judged unacceptable, the unacceptability in these cases is likely extra-syntactic in origin.

Norwegian allows long-distance filler-gap dependencies into declarative complement clauses, which are not islands cross-linguistically. This can be seen with the following three dependency types: wh-movement (7), relativization (8), and topicalization (9).3

(7) Hva_i sa   du  [ at   Andrew ville  lage     ___i ]?
    what  said you   that Andrew wanted make.INF
    ‘What did you say that Andrew wanted to make?’
(8) Mat-en_i [ som du  sa   [ at   Andrew ville  lage     ___i ]] …
    food-DEF   REL you said   that Andrew wanted make.INF
    ‘The food that you said Andrew wanted to make…’
(9) Det_i sa   du  [ at   Andrew ville  lage     ___i ].
    that  said you   that Andrew wanted make.INF
    ‘That, you said Andrew wanted to make.’

If Norwegian EQs and RCs are not islands, we would expect, all else equal, that filler-gap dependencies into EQs and RCs would be as free as dependencies into complement clauses.

2.1 Embedded questions

Past research suggests that all three A’-dependency types, wh-movement, relativization, and topicalization, can cross into EQs in Norwegian. Maling & Zaenen (1982) report (10) as an acceptable example of wh-movement. Moreover, in a series of acceptability judgment studies, Kush et al. (2018) found that Norwegian participants frequently accept (argument) wh-movement dependencies from embedded polar questions like (11).

(10) Wh-movement from an embedded subject question
     Hvilke bøker_i spurte Jon [ hvem_k som ___k hadde skrevet ____i ]?
     which  books   asked  Jon   who    c4       had   written
     ‘Which books did Jon ask who had written?’ (Maling & Zaenen 1982: 232)
(11) Wh-movement from an embedded polar question
     Hva/Hvilke kaker_i lurer  gjest-en  på [ om         Hanne bakte ____i ]?
     what/which cakes   wonder guest-DEF on   if/whether Hanne baked
     ‘Which cakes did the guest wonder whether Hanne baked?’ (Kush et al. 2018)

Although Kush et al. found that participants rejected wh-movement from an embedded question slightly more often with bare wh-arguments (hva) than with complex wh-phrases (hvilke kaker), the authors concluded that the overall high probability of acceptance entails that wh-movement from EQs is syntactically possible in Norwegian.

In two large-scale judgment studies, Kush & Dahl (2020) found that relativization from EQs was judged, on average, to be as acceptable as relativization from declarative complement clauses. Kush & Dahl specifically investigated relativization of the subject from object EQs like (12) or adjunct EQs. Other gap sites and EQ types (e.g. polar, subject, etc.) were not investigated, but the presumption is that the acceptability of (12) implies the general possibility of argument movement from other EQ-internal positions.

(12) Relativization from an embedded question
     Sjømenn-ene    så  signal-et_i som de   visste [ hva_k ____i betydde ___k].
     sailors-DEF.PL saw signal-DEF  REL they knew     what        meant
     ‘The sailors saw the signal that they knew what meant.’

Kush et al. (2019) find that topicalization from embedded polar questions like (13) is as acceptable as long-distance topicalization from declarative complements (see Bondevik, Kush, & Lohndal 2020 for a replication).

(13) Topicalization from an embedded question
     Kak-en_i lurer   han på [ om         Hanne bakte ____i ].
     cake-DEF wonders he  on   if/whether Hanne baked
     ‘The cake he wonders whether Hanne baked.’
     ~ ‘He wonders whether Hanne baked the cake.’ (Kush et al. 2019: 6)

The studies mentioned above only investigated the acceptability of argument dependencies into EQs. We know of no studies that have tested the acceptability of adjunct dependencies into Norwegian EQs. Norwegian seems to follow the general cross-linguistic pattern that adjuncts cannot be extracted from EQs5 (Cinque 1990; Rizzi 1990; Szabolcsi & Zwarts 1993, a.o.). Informal judgments support this assumption:

(14) *Hvordan_i spurte han [ hvem som hadde oppført seg  ___i]?
      how       asked  he    who  c   had   behaved self
      ‘How did he ask [who had behaved ___ ]?’

2.2 Relative clauses

Describing the distribution of acceptable A’-movement from Norwegian RCs is slightly more complex than with EQs because the extant literature provides conflicting reports on which dependencies are acceptable. Researchers agree that at least some RCs permit some A’-movement dependencies and that some movement from some RCs results in unacceptability. We discuss the uncontroversial cases first, before proceeding to more controversial cases.

RC-island violating dependencies most often feature topicalization from two types of RCs (e.g., Engdahl 1997; Lindahl 2017 for Swedish). The first type are presentational/existential subject RCs like (15), which primarily function to introduce new referents to a discourse. The second type are it-clefts like (16), which place focus on the head of the RC (Hedberg 2000; Prince 1978; see Gundel 2002 and Johansson 2001 for cross-linguistic comparison of English and Scandinavian it-clefts).6

(15) a. Det er mange_i [ som ___i snakker det  språk-et].
        it  is many      REL      speak   that language-DEF
        ‘There are many (people) who speak that language.’
        ~ ‘Many people speak that language.’
     b. Det  språk-et_i   er det mange_k [ som ___k snakker ___i ].
        that language-DEF is it  many      REL      speak
        ‘That language, there are many who speak.’
(16) a. Det er bare Andrew [ som ___i snakker det  språk-et].
        it  is only Andrew   REL      speaks  that language-DEF
        ‘It’s only Andrew who speaks that language.’
     b. Det  språk-et_i   er det bare Andrew_k [ som ___k snakker ___i ].
        that language-DEF is it  only Andrew     REL      speaks
        ‘That language, it’s only Andrew that speaks.’

RC-island violations are, however, not limited to presentational or cleft constructions. Engdahl (1997) observes that dependencies into predicate nominals like (17) are acceptable.

(17) Topicalization from Predicate-nominal RC
     Lakris_i er Odd den eneste (person-en)_k [ som ___k ikke liker ___i ].
     licorice is Odd the only    person-DEF     REL      NEG  likes
     ‘Licorice, Odd is the only person who doesn’t like.’

Multiple authors have argued that other embedding environments are possible. Naturalistic examples of topicalization from RCs attached to nominals in object position are attested (Erteschik-Shir 1973; Allwood 1982; Taraldsen 1982; Engdahl 1982, 1997; Lindahl 2014, 2017; Löwenadler 2015). Examples like (18a) with embedding predicates such as kjenne (‘to know/be acquainted with’) are relatively common and documented in the literature. Examples with other predicates such as (18b) and (18c) are rarer, though they are judged acceptable.

(18) Topicalization from RC
     a. Context:
        Ein må   ikkje vere meir redd   for å  gjere feil     på nynorsk enn  på bokmål.
        one must NEG   be   more afraid for to make  mistakes on Nynorsk than on Bokmål
        ‘You shouldn’t be more afraid of making mistakes in Nynorsk than in Bokmål.’
        Det_i kjenner eg mange eigentleg-nynorskbrukarar_k [ som ____k er  ____i ].
        that  know    I  many  actual-Nynorsk.users          REL       are
        ‘That, I know many actual Nynorsk users who are.’
        ~ ‘I know many actual Nynorsk users who are that (= afraid to make mistakes).’
     b. Rødsprit_i slipper vi ingen_k inn [ som ___k har drukket ____i ].
        red.spirit let     we nobody  in    REL      has drunk
        ‘We don’t let anybody in who has drunk grain alcohol.’ (Taraldsen 1982: 206)
     c. Det  hus-et_i  misunner jeg folk-a      [ som ____k bor  i  ___i ].
        that house-DEF envy     I   folk-DEF.PL   REL       live in
        ‘That house I envy the people that live in.’ (based on Löwenadler 2015: 44, ex. 22)

It is often observed, however, that filler-gap dependencies into structurally identical RCs, like (19), are likely to be judged unacceptable – particularly if presented in isolation.

(19) ?*Det_i klemte jeg ingen_k [ som ____k gjorde ____i ].
       that  hugged I   nobody    REL       did
       ‘I hugged nobody who did that.’

The differences between (18) and (19) indicate that the acceptability of topicalization from RCs is at least partially sensitive to the embedding predicate. Importantly, however, many examples like (19) that informants initially reject become more acceptable given a sufficiently rich context (Engdahl 1997; Lindahl 2017). Given that examples like (19) can improve in context, and assuming that there is no principled structural distinction between the RCs under the predicates in (18) and (19) (contra Kush, Omaki, & Hornstein 2011), it would seem that the embedding-predicate effects should be given an extra-syntactic explanation.

All the RCs above are subject RCs. Some researchers (e.g. Platzack 1999) have maintained that only subject RCs allow filler-gap dependencies, but others (Engdahl 1997; Lindahl 2017) have challenged this claim with naturalistic examples of filler-gap dependencies into non-subject RCs. The Norwegian (20), based on one of Engdahl’s examples, involves topicalization from an object RC. Our informants judge (20) acceptable, thereby arguing against a rigid subject-only restriction.

(20) Topicalization from a non-subject RC
     Matte_i var det bare pappa_k [(som) jeg kunne be ___k om    å  hjelpe meg med  ___i ].
     math    was it  only Dad       REL  I   could ask     about to help   me  with
     ‘Math, it was only Dad that I could ask to help me with.’
     ~ ‘Dad was the only one that I could ask to help me with math.’

Taken together, the data suggest that Norwegian children must learn that topicalization from RCs is, in principle, acceptable in different embedding contexts (existential and presentational constructions, clefts, and RCs attached to NPs in object position) and with different RC types (subject, object, etc.). They must also learn the (as yet) poorly understood additional factors that lead some RC-island violating dependencies to be judged unacceptable, such as the fact that the embedding predicate appears to affect the acceptability of dependencies into RCs. We return to this issue in the general discussion.

The last issue to address is the acceptability of different filler-gap dependencies into RCs. In the prior literature, most, if not all, attested examples involve topicalization. Kush et al. (2018) found that Norwegians generally reject wh-movement from non-presentational RCs presented out of context. We do not know of any work that has formally investigated the acceptability of relativization from RCs. In light of the relative abundance of examples with topicalization, the dearth of wh-movement and relativization examples is noteworthy. As we see it, there are at least three ways to interpret the asymmetry: First, we could assume that movement dependencies are uniformly blocked from all RCs and that the apparent cases of topicalization from RCs are not true cases of movement. Second, we could assume that specifically wh-movement, relativization, or both are blocked from RCs. Third, we could assume that wh-movement and relativization from RCs are possible in principle, but individual examples are ruled out by supplemental restrictions, similar to the issue of embedding verb choice.

We reject the first possibility: Topicalization from RCs exhibits characteristics of a standard movement dependency (see Engdahl 1997; Lindahl 2015, 2017): for example, topicalization from RCs exhibits case-connectivity effects (21a), PPs (21b) and certain adjuncts (21c) can be topicalized, and gaps cannot be replaced with resumptive pronouns (21d). Gaps of topicalization also license parasitic gaps (21e; see also Lindahl 2017).

(21) a. Meg_i/*jeg_i er det mange_k [ som ___k vil  snakke    med  ___i ].
        me/I         is it  many      REL      want speak.INF with
        ‘There are many people who want to speak with me.’
     b. Med  meg_i er det mange_k [ som ____k vil  snakke ___i ].
        with me    is it  many      REL       want speak.INF
        ‘There are many people who want to speak with me.’
     c. Så sent_i kjenner jeg ingen_k [ som ____k hadde turt  å  ringe ___i ].
        so late   know    I   no.one    REL       had   dared to call.INF
        ‘I don’t know anyone who would dare to call that late.’
     d. Det  språk-et_i   er det mange_k [ som ____k snakker ___i /*det_i].
        that language-DEF is it  many      REL       speak         *it
        ‘That language there are many people who speak (*it).’
     e. Det_i er det ingen_k [ som ___k har prøvd å  fikse ___i uten    å  ødelegge ___i].
        that  is it  no.one    REL      has tried to fix.INF    without to destroy.INF
        ‘That, there is no one who has tried to fix without destroying.’

We also dismiss the possibility that either wh-movement or relativization from RCs is banned outright. It is relatively easy to show that relativization from RCs is possible: Naturally-occurring examples can, for instance, be found online:7

(22) Relativization from RC
     a. en hvitvin_i  [ som det ikke er mange_k [ som ____k kan måle    seg  med  ____i ]].
        a  white.wine   REL it  not  is many      REL       can measure self with
        ‘… a white wine that there weren’t many wines that could compete with.’
        ~ ‘… a white wine that not many other wines could compete with.’
        (https://no.tripadvisor.com/LocationPhotoDirectLink-g1189136-d2408913-i234544745-Restaurante_Palm_Garden-Patalavaca_Gran_Canaria_Canary_Islands.html)
     b. en løype_i [ som det ikke er mange_k [ som ____k ferdes i  ____i ]].
        a  trail     REL it  not  is many      REL       fare   in
        ‘… a trail that there aren’t many people that use.’
        ~ ‘… a trail that not many people use.’
        (https://frisomfuglen.wordpress.com/tag/midtre-kytetjern/)

Naturally-occurring examples of wh-movement from RCs are scant. However, it appears possible to construct acceptable examples. For example, we constructed (23), along with similar examples, and consulted 10 different native speaker informants. All accepted the examples.

(23) Wh-movement from an RC
     Hvilken bok_i var det mange/Ronja_k [ som ____k likte ____i ]?
     which   book  was it  many/Ronja      REL       liked
     ‘Which book were there many/was it Ronja who liked?’

The existence of acceptable cases of relativization and wh-movement from RCs argues against a wholesale ban on these movement dependencies from RC. Nevertheless, it seems that the frequency of examples differs starkly by dependency type: Topicalization from RC is well attested, relativization significantly less frequent, and wh-movement is extremely rarely observed in the wild: For example, Lindahl (2017) found that a collection of 270 naturally-occurring Swedish examples included 93% topicalization, 7% relativization, and no instances of wh-movement from RC (p. 150). The fact that some examples of relativization or wh-movement are acceptable, but other structurally similar sentences are unacceptable, suggests that the distribution of these dependencies is likely governed by supplemental contextual factors above and beyond simple syntactic well-formedness (Allwood 1982; Engdahl 1997; Lindahl 2017; Kush et al. 2018, 2019). Such extra-syntactic factors might also be able to explain the cross-dependency differences in frequency. We return to this idea in the general discussion.
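The shares reported for Lindahl's collection can be converted back into approximate raw counts with simple arithmetic. The sketch below does so; because the percentages are presumably rounded, the recovered counts are estimates rather than figures taken from Lindahl (2017), and the function name is our own.

```python
def approximate_counts(total: int, shares: dict[str, float]) -> dict[str, int]:
    """Convert percentage shares of a sample into approximate raw counts."""
    return {label: round(total * pct / 100) for label, pct in shares.items()}


# Percentages as reported in the text for 270 naturally-occurring examples.
reported_shares = {
    "topicalization": 93,
    "relativization": 7,
    "wh-movement": 0,
}

estimated = approximate_counts(270, reported_shares)
for label, n in estimated.items():
    print(f"{label}: about {n} of 270 examples")
```

On this estimate, relativization from RCs corresponds to fewer than twenty examples in the whole collection, which makes the contrast with topicalization easier to appreciate than the percentages alone.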

2.3 Summary of Target Generalizations

Based on formal and informal judgments, we conclude that all six combinations of filler-gap dependency type and island-type (EQ, RC) are, in principle, allowed in Norwegian, though certain combinations appear less frequently and their acceptability seems to be subject to fine-grained contextual factors. Given that the target grammar appears to allow all three types of dependencies from both island types, we would expect the input to children to contain positive examples of the six different dependency-island combinations if the target distribution must be learned via direct positive evidence.
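The six dependency-island combinations can be laid out explicitly as a tally grid, which is one way to check whether a sample contains direct evidence for each cell. The sketch below is illustrative only: the observations are hypothetical placeholders, not corpus results, and the function names are our own.

```python
from itertools import product

DEPENDENCY_TYPES = ("wh-movement", "relativization", "topicalization")
ISLAND_TYPES = ("EQ", "RC")


def tally_combinations(observations):
    """Count observed (dependency type, island type) pairs.

    Every one of the six combinations starts at zero, so combinations
    that never occur in the sample remain visible in the result.
    """
    counts = {combo: 0 for combo in product(DEPENDENCY_TYPES, ISLAND_TYPES)}
    for combo in observations:
        if combo not in counts:
            raise ValueError(f"unknown combination: {combo!r}")
        counts[combo] += 1
    return counts


def unattested(counts):
    """Return the combinations with no observed examples."""
    return [combo for combo, n in counts.items() if n == 0]


# Hypothetical hand-coded observations, NOT corpus results.
sample = [
    ("topicalization", "RC"),
    ("topicalization", "EQ"),
    ("relativization", "EQ"),
    ("topicalization", "RC"),
]

counts = tally_combinations(sample)
for combo in unattested(counts):
    print("no direct evidence in sample for:", combo)
```

Initializing every cell to zero, rather than counting only what is observed, keeps the unattested combinations in view, which is the case of interest for a learner relying on direct positive evidence.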

3 Corpus Analysis

We sought to determine if there was direct evidence for island-violating filler-gap dependencies in children’s input. Estimates derived from child-directed speech (CDS) corpora would characterize real input best, but there is unfortunately not a large-scale, searchable corpus of Norwegian CDS comparable to the resources used for recent investigations of filler-gap input in English (Pearl & Sprouse 2013; Bates & Pearl submitted). Only two corpora containing Norwegian CDS are publicly available through CHILDES (MacWhinney 2000): Ringstad (Ringstad 2014) and Garmann (Garmann, Hansen, Simonsen, & Kristoffersen 2019). These corpora are useful in their own right, but are suboptimal for our purposes for a few reasons. First, both corpora are relatively small. The Ringstad corpus contains roughly 21000 adult utterances longer than a single word, whereas the Garmann corpus contains about 6000. Second, neither corpus is tagged or parsed, making fine-grained searches for syntactic constructions difficult. In the case of the Ringstad corpus, automatic tagging and parsing of the corpus in its current form is not possible, since utterances were transcribed in dialect and not in an orthographically standardized form that off-the-shelf taggers recognize.

Instead, we conducted a search through a corpus of child- and young-adult-directed texts, the NorGramBank children’s fiction in Norwegian bokmål treebank (Dyvik, Meurer, Rosén, De Smedt, Haugereid, Losnegaard, Lyse, & Thunes 2016). We chose to use the child fiction corpus rather than extant adult speech corpora (e.g., NoTa: Norwegian Spoken Language Corpus; Johannessen & Hagen 2008) because (i) child-directed texts might be more representative of certain properties of CDS than adult-to-adult speech, and (ii) NoTa (and other corpora) are not parsed, making searching for potential island violations by structural features practically challenging.

There are, of course, important caveats regarding the use of a written corpus to reason about the distribution of evidence in children’s input. Reading material comprises a small portion of a child’s input8 compared to CDS, and the distribution of various constructions in text can differ markedly from the distributions in CDS: Written text tends to be more syntactically complex than speech for both adults and children (see, e.g., Roland, Dick, & Elman 2007; Montag & MacDonald 2015), and complex syntactic constructions, such as relative clauses, are significantly more common in children’s books than in CDS (Cameron-Faulkner & Noble 2013; Noble, Cameron-Faulkner & Lieven 2018; Montag 2019). Insofar as RCs and other complex structures are over-represented in written corpora, our estimates of the frequency of long-distance filler-gap dependencies from complex syntactic structures like embedded clauses, and islands like RCs may be inflated relative to their actual occurrence in Norwegian CDS. On the other hand, main wh-questions (sentences with wh-movement to sentence-initial position) are significantly less common in children’s book text than in CDS (see Cameron-Faulkner & Noble 2013). Thus counts of main questions from the corpus are likely to underestimate the frequency of such structures in Norwegian CDS.

It is also possible that the frequency of island violations may be lower in edited material, if island violations are considered ‘informal’ or characteristic of spoken, rather than written, language. With that in mind, however, if island violations are found in written text, that can be used as suggestive evidence that such constructions are not perceived as especially marked or objectionable.

3.1 Corpus Information and Method

The NorGramBank children’s fiction corpus contains text from 155 children’s books taken from bokhylla.no, managed by the National Library of Norway. The corpus was created as part of the Norwegian Infrastructure for the Exploration of Syntax and Semantics (INESS) project (Rosén, De Smedt, Meurer, & Dyvik 2012; accessible at http://clarino.uib.no/iness). The corpus comprises 4111212 words and 389556 sentences automatically parsed in the LFG formalism with XLE (Xerox Linguistic Environment; Maxwell & Kaplan 1993; Kaplan et al. 2002) and the LFG Parsebanker tool (Rosén, Meurer, & De Smedt 2009). Sentences are annotated with constituent-structure and functional-structure (henceforth f-structure) features which can be queried using the INESS-search interface.

Although a portion of the corpus was disambiguated by annotators at INESS after automatic parsing, most sentences in the corpus are associated with multiple candidate parses. INESS-search queries all parses associated with each sentence, which can lead to a large number of false positives for complicated queries. To counteract the effect of false positives on our frequency estimates, we manually checked all results for queries attempting to identify island violations. For very broad searches (e.g. estimating the total number of wh-, relative clause, or topicalization dependencies), we did not manually check if all examples were correctly identified.

Previous corpus work on the distribution of long-distance dependencies in children’s input has focused on wh-question dependencies (e.g. Pearl & Sprouse 2013), but we collected frequencies for three different dependency types: wh-questions, relative clauses, and topicalization. Queries were conducted using both constituent-structure features (e.g. phrasal categories, dominance relations) and f-structure features, such as clause type, dependency type, and functional role annotation (e.g. subject, object).

The corpus contains a diverse array of reading material intended for different age groups, from picture books to young adult novels. Since text characteristics may differ by target age group and aggregate frequencies for the whole corpus may not accurately reflect the relative probabilities across age groups, we separated books into four rough age groups. The age group for each book was taken from its designated reading level (målgruppe) at the Oslo Public Library website (deichman.no). If a reading level was not available from the library, we consulted a variety of online booksellers. Finally, in the rare event that these searches did not provide a classification, acquaintances with a background in child education were consulted. The four age groups used were: ages 3–5, 6–8, 9–11, 12–18.

The term ‘child-directed’ is typically reserved for input to language learners under the age of 9. This reflects the common assumption that children complete the acquisition of major syntactic constructions in their native language at a relatively young age. The 9–11 and the 12–18 age groups fall outside the standard range for ‘child-directed’ language. We nevertheless opted to include frequencies from these age groups as a point of comparison against the ‘child-directed’ text in the two lower age groups to get a better sense of whether and how relevant frequencies varied by age. We base our conclusions, however, on the results from the two lower age groups.

Table 1 gives the total number of sentences in each age-separated sub-corpus, as well as the total number of embedded declarative and interrogative clauses. The subcorpus of texts for 3–5 year olds has the smallest number of sentences (partly reflecting the fact that books in this subcorpus were significantly shorter than the rest). Embedded declarative clauses and embedded questions are relatively infrequent, as has been observed in previous corpus work in Norwegian (Westergaard & Bentzen 2007; Ringstad 2019).9

Table 1

Descriptive counts for sentences and embedded clause type by age group. Percentages reflect the number of example sentences out of the total set of sentences for each age group.

| Age Group | Number of Sentences (Total) | Embedded Declarative Clauses | Embedded Questions | Relative Clauses |
|---|---|---|---|---|
| 3–5 | 4588 | 291 (6.3%) | 57 (1.2%) | 304 (6.6%) |
| 6–8 | 48622 | 4543 (9.3%) | 956 (2.0%) | 4521 (9.3%) |
| 9–11 | 196134 | 19409 (9.9%) | 4416 (2.3%) | 20025 (10.2%) |
| 12–18 | 140212 | 11196 (8.0%) | 2942 (2.1%) | 11208 (8.0%) |
| Total | 389556 | 35439 (9.1%) | 8371 (2.1%) | 36058 (9.3%) |
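The percentages in Table 1 are each clause count divided by the number of sentences in the same age group. As a minimal sketch (the dictionary layout and the function name are illustrative assumptions, not part of the corpus tooling), the per-group figures can be recomputed as follows:

```python
# Counts copied from Table 1: (sentences, embedded declaratives,
# embedded questions, relative clauses) per age group.
TABLE_1 = {
    "3-5": (4588, 291, 57, 304),
    "6-8": (48622, 4543, 956, 4521),
    "9-11": (196134, 19409, 4416, 20025),
    "12-18": (140212, 11196, 2942, 11208),
}


def percent_of_sentences(count, sentences):
    """Return count as a percentage of sentences, rounded to one decimal."""
    return round(100 * count / sentences, 1)


for group, (sentences, declaratives, questions, relatives) in TABLE_1.items():
    print(
        group,
        percent_of_sentences(declaratives, sentences),
        percent_of_sentences(questions, sentences),
        percent_of_sentences(relatives, sentences),
    )
```

The same helper applies to the dependency counts reported in the later tables, since those percentages also use the per-group sentence totals as the denominator.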

3.2 Base rates of dependency types

We began by calculating the base frequencies of the three movement dependencies, so that we could subsequently report the relative frequencies of island violations by dependency type. We restricted our search for wh-questions to sentences where a wh-word was followed by a verb (see Pearl & Sprouse 2013). This was done to exclude fragment questions (e.g. ‘Who/What/Where/For what?’), which do not constitute overt evidence of a wh-dependency. Our search terms for topicalization specified sentences in which the constituent marked as the topic was not the matrix subject, as, again, such sentences do not provide overt evidence for movement. The counts for each dependency type split up by age group are found in Table 2.

Table 2

All movement dependencies in the NorGramBank Child Fiction Corpus. Percentages reflect the number of example sentences out of the total set of sentences for each age group.

| Age Group | Sentences Total (see Table 1 for breakdown) | Sentences with Wh-Movement (excl. Embedded Questions) | Sentences with Relativization (same as Table 1) | Sentences with Topicalization |
|---|---|---|---|---|
| 3–5 | 4588 | 84 (1.8%) | 304 (6.6%) | 1235 (26.9%) |
| 6–8 | 48622 | 1337 (2.7%) | 4521 (9.3%) | 12849 (26.4%) |
| 9–11 | 196134 | 4905 (2.5%) | 20025 (10.2%) | 39519 (20.1%) |
| 12–18 | 140212 | 3751 (2.7%) | 11208 (8.0%) | 19908 (14.2%) |
| Total | 389556 | 18448 (4.7%) | 36058 (9.3%) | 73511 (18.9%) |

The overall frequency of main wh-questions ranges from 1.8% to 2.7% across age groups in the corpus. Westergaard (2005) reports that main questions make up roughly 8% of Norwegian CDS, similar to reports that main questions comprise between 8%–16% of English CDS (Cameron-Faulkner & Noble 2013). We can thus estimate that main wh-questions are 4–8 times less likely in Norwegian book text than in CDS. This estimate aligns with findings from Cameron-Faulkner & Noble (2013), who found that main questions were roughly six times more common in English CDS than in a sample of picture books for 2-year-old children.

Relative clauses were observed more frequently than main questions. Following previous findings, we assume that relativization structures are more common in text than they are in CDS, perhaps by as much as an order of magnitude (see Montag 2019).

The overall frequency of topicalization structures is high due to Norwegian’s propensity to front non-subjects to sentence-initial position. The rate of topicalization observed in our corpus is roughly the same as rates of topicalization reported to occur in Norwegian CDS (23.4% according to Westergaard 2005). This rate includes topicalization of all phrase types (e.g. both arguments and adjuncts).

The numbers in Table 2 aggregate over local movement dependencies and long-distance movement dependencies (i.e. where the filler occupies a position outside the clause containing its gap). Island violations are long-distance dependencies, so establishing that long-distance movement is possible is a precondition for entertaining the possibility of island violations. Evidence of basic long-distance movement has been shown to be relatively infrequent in the input to children in languages like English (Pearl & Sprouse 2013; Bates & Pearl submitted). We wished to confirm whether the relative rarity of long-distance movement was also characteristic of Norwegian child-directed text.

To obtain the approximate frequency of standard long-distance movement dependencies, we searched for sentences with one or more complement clauses that also contained a filler-gap dependency. We then manually identified sentences in the results that contained examples such that a filler was associated with a gap across at least one embedded declarative complement clause. Results are in Table 3. The amount of text differs across the sub-corpora, but long-distance examples of each dependency type are found in each subcorpus.

Table 3

Long-distance movement dependencies from non-island embedded declarative clauses in the NorGramBank Child Fiction Corpus. Percentages reflect the number of example sentences out of the total set of sentences for each age group.

| Age Group | Sentences Total (see Table 1) | Long-Distance Relativization | Long-Distance Wh-Movement | Long-Distance Topicalization |
|---|---|---|---|---|
| 3–5 | 4588 | 2 (0.04%) | 2 (0.04%) | 1 (0.02%) |
| 6–8 | 48622 | 9 (0.02%) | 9 (0.02%) | 16 (0.03%) |
| 9–11 | 196134 | 79 (0.04%) | 73 (0.04%) | 79 (0.04%) |
| 12–18 | 140212 | 59 (0.04%) | 49 (0.03%) | 33 (0.02%) |
| Total | 389556 | 149 | 133 | 129 |

We give examples of long-distance filler-gap dependencies into embedded declaratives from the corpus. We provide the book title and sentence number for all examples.

(24) Relativization from an embedded declarative clause
  Plutselig fant  han eni [som han syntes [han ble    fin  med ___i ]]
  suddenly  found he  one  REL he  felt    he  became fine with
  ‘Suddenly he found one that he thought he looked good in.’ (Jonas får briller, #132)
(25) Wh-movement from an embedded declarative clause
  hvai tror    dere   [at   jeg fant ___i]?
  what believe you.pl  that I   found
  ‘What do you think I found?’ (Pippi er tingleter og havner i slagsmål, #53)
(26) Topicalization from an embedded declarative clause
  dette forslag-eti    føler jeg [ ___i er dumt ]
  this  suggestion-DEF feel  I          is dumb
  ‘That suggestion, I feel is dumb.’ (Gjemmestedet, #893)

To summarize the basic results: When collapsing across local and long-distance dependencies, topicalization is most common, followed by relativization. Wh-movement dependencies are less frequent than in regular CDS, in line with previous findings regarding the differences between written and spoken corpora, but wh-movement dependencies still occur at non-trivial rates in the corpus. Long-distance dependencies from declarative complements are, on the whole, infrequent, but the absolute frequency of long wh-, relativization, and topicalization dependencies is comparable.

3.3 Direct Evidence of Island Violations

Having established the distribution of simple filler-gap dependencies, we turned to island violations. We conducted distinct searches for wh-, relativization, and topicalization dependencies into RCs and EQs. We used relatively broad search criteria to avoid missing potential hits and then manually sifted the results (see Appendix A for search queries). In the entire corpus we found 63 examples of dependencies into RCs and 42 examples into EQs. Table 4 presents the counts of movement dependencies from RCs and EQs split by age group, dependency and embedded clause type. It also includes the counts of dependencies from simple declarative clauses and the overall count of each clause type without a long-distance dependency to adjust for base rate differences across construction type. Finally, in each cell corresponding to a different movement-island pairing, we provide the expected count10 E[…] of tokens that should have been observed if movement from that island was comparable in frequency to movement from an embedded declarative.

Table 4

Counts of long-distance movement dependencies split by age group, dependency and embedded clause-type. Expected counts (E[…]) reflect the number of tokens that would be expected under the assumption that filler-gap dependencies into Relative Clauses and Embedded Questions are as frequent as corresponding filler-gap dependencies into simple declarative complement clauses.

| Age Group | Dependency | Simple Declarative (see Table 3) | Relative Clause | Embedded Q |
|---|---|---|---|---|
| 3–5 | Relativization | 2 | 0 E[2] | 0 E[0] |
| 3–5 | Topicalization | 1 | 2 E[1] | 0 E[0] |
| 3–5 | Wh-Movement | 2 | 0 E[2] | 0 E[0] |
| 3–5 | No Dependency | 286 | 302 | 57 |
| 6–8 | Relativization | 9 | 0 E[9] | 2 E[2] |
| 6–8 | Topicalization | 16 | 10 E[16] | 2 E[3] |
| 6–8 | Wh-Movement | 9 | 0 E[9] | 0 E[2] |
| 6–8 | No Dependency | 4509 | 4511 | 950 |
| 9–11 | Relativization | 79 | 0 E[79] | 19 E[16] |
| 9–11 | Topicalization | 79 | 42 E[79] | 10 E[18] |
| 9–11 | Wh-Movement | 73 | 0 E[73] | 0 E[17] |
| 9–11 | No Dependency | 19178 | 19983 | 4385 |
| 12–18 | Relativization | 59 | 0 E[59] | 7 E[16] |
| 12–18 | Topicalization | 33 | 9 E[33] | 2 E[8] |
| 12–18 | Wh-Movement | 49 | 0 E[49] | 0 E[13] |
| 12–18 | No Dependency | 11055 | 11199 | 2931 |
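The exact definition of the expected counts is given in the accompanying footnote. One plausible reading, offered here only as an assumption, is that the number of dependencies from embedded declaratives is scaled by how frequent the island clause type is relative to embedded declaratives (using the clause totals in Table 1). The sketch below implements that reading; the function name is hypothetical, and the result need not agree with every cell of Table 4.

```python
def expected_count(declarative_dependencies, island_clauses, declarative_clauses):
    """Scale a declarative dependency count by the relative base rate of an
    island clause type (an assumed reading of E[...], not the authors' formula)."""
    return round(declarative_dependencies * island_clauses / declarative_clauses)


# Ages 6-8: 9 relativizations from declaratives (Table 3);
# 4543 embedded declaratives, 4521 RCs, 956 EQs (Table 1).
print(expected_count(9, 4521, 4543))  # relativization from RCs
print(expected_count(9, 956, 4543))   # relativization from EQs
```

Under this reading, an island type that is about as frequent as embedded declaratives (RCs) inherits nearly the full declarative count, whereas a rarer clause type (EQs) receives a proportionally smaller expectation.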

As seen in Table 4, direct evidence for long-distance dependencies into RCs and EQs occurs in the input to children across age groups. The frequency of evidence differs both by embedded clause-type and dependency type.

For each age group we conducted two 2x4 Fisher Exact tests in R (R Core Team 2018), comparing the counts in the embedded declarative column to the corresponding RC and EQ column, respectively. For the youngest age group, neither comparison was significant (p = .229 and p = 1, respectively), presumably reflecting low power. For the 6–8 age group, the counts in the RC column were significantly lower than in the declarative column (p < .001), but no significant difference was found between the declarative and EQ columns (p = .664). For the 9–11 age group, RC and EQ counts were lower than declarative counts (ps < .001). The same held for the 12–18 age group (ps < .001).
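The tests reported above were run in R. Purely as an illustration of what a Fisher exact test computes on a table with several rows and two columns, the following Python sketch enumerates every table that shares the observed margins and sums the probabilities of those that are no more likely than the observed table. The function names are hypothetical, the code is a stand-in rather than the analysis script used for the paper, and the enumeration is only practical when all rows but the last have small totals.

```python
from itertools import product
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_rx2(table):
    """Two-sided exact p-value for a table given as rows of (col1, col2) counts."""
    row_totals = [a + b for a, b in table]
    col1_total = sum(a for a, _ in table)
    log_denominator = log_comb(sum(row_totals), col1_total)

    def log_prob(first_column):
        # Multivariate hypergeometric probability of a table with fixed margins.
        return sum(log_comb(r, a) for r, a in zip(row_totals, first_column)) - log_denominator

    observed = log_prob([a for a, _ in table])
    p_value = 0.0
    # Enumerate the first-column cells of all rows but the last;
    # the last cell is then fixed by the column margin.
    for cells in product(*(range(r + 1) for r in row_totals[:-1])):
        last = col1_total - sum(cells)
        if 0 <= last <= row_totals[-1]:
            candidate = log_prob(list(cells) + [last])
            if candidate <= observed + 1e-7:  # small tolerance for ties
                p_value += exp(candidate)
    return min(p_value, 1.0)


# Ages 3-5, declarative vs. RC columns of Table 4
# (relativization, topicalization, wh-movement, no dependency).
print(fisher_exact_rx2([(2, 0), (1, 2), (2, 0), (286, 302)]))
```

Placing the large "No Dependency" row last keeps the enumeration small, because only the sparse dependency rows are iterated over.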

Qualitatively, evidence for island-violating movement appears unevenly distributed across the dependency-island cells. For some dependency-clause-type combinations, direct evidence seems roughly as frequent as would be expected if movement out of an island was as probable as movement out of a simple declarative complement clause. For example, the number of attested examples of topicalization and relativization from EQs is rather close to the expected counts for most age groups. For other combinations, examples are entirely absent: Wh-movement from either RCs or EQs is unattested in the sample. In the younger age groups, long-distance wh-movement and EQs are infrequent enough that the absence of their conjunction is not surprising. However, in older age groups, the absence is conspicuous. Similarly, the fact that relativization from RCs is never observed in the older groups, paired with the disparity from the expected counts, suggests that the gaps in the distribution are real.

4 Characteristics of Observed Island Violations

We now look more closely at the fine-grained distribution of attested examples to see if the data are sufficiently rich to provide direct evidence for the full range of acceptable dependencies in the target grammar.

4.1 Embedded Questions

We inspected filler-gap dependencies into embedded questions, breaking them down by the type of embedded question (e.g. subject, object, polar question11) and the base position of the long-distance moved element. Table 5 provides the counts for each combination.

Table 5

Observed filler-gap dependencies into embedded questions broken down by location of their gap within the embedded question (columns), type of embedded question (rows) and dependency type.

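| Dependency | Embedded Question Type | Subject Gap (within EQ) | Object Gap (within EQ) | Complement of P Gap (within EQ) |
|---|---|---|---|---|
| Relativization | Polar Embedded Q | 2 | 3 | |
| Relativization | Subject Embedded Q | | | |
| Relativization | Object Embedded Q | 1 | | |
| Relativization | Adjunct Embedded Q | 3 | | |
| Relativization | Be-Comp Embedded Q | 17 | | |
| Relativization | P-Comp Embedded Q | 1 | | 1 |
| Relativization | Subtotal | 24 | 3 | 1 |
| Topicalization | Polar Embedded Q | | | |
| Topicalization | Subject Embedded Q | | | |
| Topicalization | Object Embedded Q | 4 | 2 | |
| Topicalization | Adjunct Embedded Q | | 2 | |
| Topicalization | Be-Comp Embedded Q | 6 | | |
| Topicalization | P-Comp Embedded Q | | | |
| Topicalization | Subtotal | 10 | 4 | 0 |
| Total | | 34 | 7 | 1 |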

Collapsing across relativization and topicalization, long-distance movement of the highest subject out of an embedded question is most common (24/28 cases of relativization, 10/14 cases of topicalization). Long-distance subject movement is most common from copular clauses where the copula’s complement is questioned. (27) provides examples:

(27) a. Relativization of Subject from Embedded Object Question
  Det var deti (som) jeg ikke skjønte    [hvak ___i var ___k].
  it  was that REL   I   NEG  understood  what      was
  ‘That was the thing that I didn’t understand what ___ was.’
  ~ ‘That’s the thing that I didn’t understand what it was.’ (Ompadorasedet, #1809)
  b. Topicalization of Subject from Embedded Object Question
  Men deti vil  jeg ikke si      [hvak ___i er ___k].
  but that want I   NEG  say.INF  what      is
  ‘But that I won’t say what is.’
  ~ ‘But I won’t say what that is.’ (Thea og Jens på pensjonat Forglemmegei, #4024)

This specific configuration is most frequent, but there is variation. Subject phrases are moved out of a variety of embedded question types: polar (28a), object (28b), and adjunct (28c).

(28) a. Relativization of Subject from an Embedded Polar Question
  Han var en sånn ungei [som du  ikke skjønner   [om      ___i er lei seg  eller ___i ler]].
  he  was a  such child  REL you NEG  understand  whether      is sad self or         laughs
  ~ ‘He was the kind of child that you don’t know whether [he] is sad or is laughing.’ (Blåveispiken, #3844)
  b. Topicalization of Subject from an Embedded Object Question
  Han ene typeni vet  vi jo  ikke engang [hvak ___i heter ___k].
  he  one guy    know we prt NEG  even    what      is.called
  ‘That one guy, we don’t even know what is called.’
  ~ ‘That one guy, we don’t even know the name of.’ (Døden på Oslo S, #1319)

Moreover, the corpus contains evidence that non-subjects may also be moved long-distance out of EQs, though such evidence is markedly less frequent.

(29) a. Topicalization of Object from an Embedded Question
  Dennei vet  jeg [hvordan jeg skal  bruke ___i].
  this   know I    how     I   shall use
  ‘This I know how I will/should use.’ (Sirkusponnien, #602)
  b. Relativization of Direct Object from an Embedded Question
  Jeg har  en litt   spesiell jobbi, [som jeg lurte    på [om du  kunne ta ___i på deg]].
  I   have a  little special  job     REL I   wondered on  if you could take    on you
  ‘I have a bit of a special job that I wonder if you could take on ___.’ (Lille miss Stoneybrook, #530)

Given the relative diversity of embedded question types that allow long-distance movement, and that phrases with different grammatical roles can be extracted, there may be sufficient evidence that all embedded questions are non-islands for the relativization and topicalization of arguments. The absence of attested examples of wh-movement from EQs, however, means that the corpus lacks examples of at least some sentence types that are acceptable in the target grammar.

4.2 Relative Clauses

Consistent with trends noted in Engdahl (1997) and Lindahl (2017), all 63 examples of RC-island violations occurred either in a presentational or it-cleft configuration. Because both constructions use the expletive pronoun det, followed by the copula, some examples were potentially superficially ambiguous between existential and cleft constructions. We categorized each token as either a presentational or an it-cleft based on properties of the RC head and whether the sentence could be paraphrased using an alternate existential construction (Søfteland 2014). Like English existential constructions (Milsark 1977), presentational constructions allow weak determiners, but not strong determiners or proper names post-verbally (30). Like English clefts, Norwegian clefts generally disallow weak determiners in post-verbal position.

(30) a. There was a man/no man/*the man/*Ronja in the room.
  b. It was the man/Ronja/*no one/few/many people who was in the room.

The copula in existential constructions can be replaced with the existential verb å finnes (Søfteland 2014), but the same replacement is not possible with clefts (31):

(31) a. Existential Construction, finnes can replace er
  Det er/finnes ei dame som er i  rommet.
  It  is/exists a  lady REL is in room.DEF
  ‘There is a lady who is in the room.’
  b. Cleft Construction, finnes cannot replace er
  Det er/*finnes Ronja som er i  rommet.
  It  is/exists  Ronja REL is in room.DEF
  ‘It is Ronja who is in the room.’

After categorization, we found that presentational and it-clefts were both well represented among our examples (2838 and 3832 examples, respectively; see Table B1 in Appendix B). There are two possible explanations for why RC-island violations were only observed with presentational and cleft constructions. The over-representation could indicate that the constructions have one or more properties that make their RCs particularly amenable to extraction compared to other RCs. Such a finding would potentially provide insight into the conditions governing acceptable extraction. Alternatively, their over-representation could simply reflect a difference in base rate: perhaps presentational and cleft constructions are simply the most frequent types of RC in the input to children. Under this line of reasoning, the frequency of extraction from RCs would be consistent across constructions containing RCs, but non-presentational, non-cleft RC constructions are so infrequent that the probability of observing an extraction from them is just too low.

We compared the overall frequency of presentational and it-cleft constructions to other RCs to see if simple base frequency could account for the absence of island violations from other RCs. We conservatively excluded RCs attached to subject NPs because subjects may be islands for independent reasons (e.g. Huang 1982). Presentational and it-clefts made up roughly 20% of the remaining RCs (6732 of 29522 eligible RCs, see Table B1 in Appendix B). As presentational and cleft RCs do not constitute an overwhelming majority of all RCs, it is unlikely that base frequency alone can explain their overrepresentation in island violations.
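The base-rate argument can be made concrete with a back-of-the-envelope calculation. Assume, purely for illustration, that each of the 63 attested RC-island violations independently lands in a presentational or cleft RC with probability equal to those constructions' share of eligible RCs. The independence assumption and the variable names belong to this sketch only; they are not part of the corpus analysis.

```python
eligible_rcs = 29522            # RCs not attached to subject NPs
presentational_or_cleft = 6732  # of which presentational or it-cleft
violations = 63                 # attested RC-island violations

share = presentational_or_cleft / eligible_rcs
# Probability that every violation falls in these constructions by base rate alone.
p_all = share ** violations

print(f"share = {share:.3f}")
print(f"P(all {violations} in presentational/cleft RCs) = {p_all:.1e}")
```

Under these assumptions the probability is vanishingly small, which is the intuition behind rejecting a pure base-rate explanation.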

Movement from RCs was only observed from subject RCs (a tendency described in earlier work, Engdahl 1997; Lindahl 2017; Christensen 1982; Nordgård 1991). Closer inspection showed that subject RCs were roughly twice as likely as non-subject RCs in both presentational and it-cleft constructions, but non-subject RCs were still relatively common (see Table B2 in Appendix B). All else equal, it seems unlikely, then, that baseline frequency differences are to blame for the over-representation of island violations among subject RCs.12

Finally, we report the phrasal types and roles of the constituents that were topicalized from presentational and clefted RCs. Table 6 shows that the corpus contains examples of a range of different phrase types with different roles being extracted from RCs. Direct object NPs are the phrase type most commonly extracted from RCs. Nominal arguments are the most commonly topicalized elements (56/62 examples); however, topicalization of (declarative and interrogative) CP complements is also attested, as is at least one instance of a locative PP.

Table 6

Counts of different phrases moved out of RCs, split up by function/type.

Phrase Types Moved From RCs

| Subject | Direct Object (NP) | Predicate Comp. (Adj) | Comp. of P (NP) | PP (Locative) | CP Complement |
|---|---|---|---|---|---|
| 0 | 47 | 2 | 9 | 1 | 3 |

Examples from the corpus that illustrate the variety of topicalizations from RC are below:

(32) a. Topicalization of Direct Object from RC
  Deti er det ingenk [som ____k vet   ____i ].
  that is it  no.one  REL       knows
  ‘That, there’s no one who knows.’
  ~ ‘No one knows that.’ (Jernmannen, #5)
  b. Topicalization of Predicate Complement from RC
  Tristi var det bare Ronjak [som ____k var ___i ].
  sad    was it  only Ronja   REL       was
  ‘Sad it was only Ronja who was.’
  ~ ‘Only Ronja was sad.’ (Ronja Røverdatter, #10123)
  c. Topicalization of a Complement of P from RC
  Megi   er det mangek [som ____k ringer til ___i ].
  me.acc is it  many    REL       ring   to
  ‘Me, there are many who call ___.’
  ~ ‘Many people call me up.’ (Morderen ringer, #1491)
  d. Topicalization of PP from RC
  På Janes-øyai   var det bare fuglerk [som ___k kunne bo   ____i ].
  on Janes-island was it  only birds    REL      could live
  ‘On Janes Island it was only birds that could live.’
  ~ ‘Only birds can live on Janes Island.’ (Den lange veien hjem, #12127)
  e. Topicalization of +wh Complement CP from RC
  [Hva  han gjør når  det er ruskvær]i   er det ingenk [som ___k vet ___i ].
   what he  does when it  is bad-weather is it  no-one  REL      knows
  ‘What he does when there’s bad weather, there’s no one who knows.’
  ~ ‘No one knows what he does when there’s bad weather.’ (Å plukke en smørblomst, #277)

5 Target Generalizations and Evaluation of Learning Models

We concluded that the target Norwegian grammar allows wh-movement, relativization, and topicalization from both RCs and EQs. We observed that certain dependency-island combinations were more commonly attested in the literature, but provided evidence that the range of acceptable dependencies extended beyond the most common example types.

There are many ways in which the island violations in our sample fail to provide direct evidence for the full range of acceptable island violations in the target grammar. First, we saw no wh-movement from EQs, even though such dependencies are reportedly acceptable. If our results are representative of children’s input, then Norwegian children do not receive evidence of wh-movement from EQs. However, as mentioned above, our corpus sample may not be representative of children’s input, particularly with respect to wh-movement dependencies, because questions are less frequent in written corpora than in CDS. It is possible that wh-questions from EQs do, in fact, occur in CDS. We encourage future investigation of this possibility.

The distribution of RC-island violations in the corpus underdetermines the target adult generalizations to an even greater extent. The observed distribution was rife with conspicuous gaps. The input is compatible with a rather narrow generalization: topicalization (and only topicalization) is only permitted from subject RCs in presentational or cleft constructions. But children should reject this narrow conclusion to reach the target grammar, which seems to allow filler-gap dependencies into non-presentational/cleft RCs and non-subject RCs (though under poorly understood, contextually sensitive conditions).

We now consider how well different learning models might recover the appropriate generalization if presented with the input that we observe. We compare Constructivist/usage-based models (MacWhinney 1975, 1982; Tomasello 2000, 2003; Goldberg 2006; Dąbrowska 2004, 2008; Verhagen 2006) to two models within the Generative tradition: the computational learner of Pearl & Sprouse (2013), and general parameter-based learning. We conclude that Constructivist models are liable to overfit the observed distribution: learners are predicted to end up with constructions that account for observed island violations, but which do not allow for generalization to unseen cases that adults accept. On the other hand, the Generative models are predicted to learn the less restrictive generalization that A’-movement dependencies are allowed from all EQs and RCs. We discuss how supplemental, independently motivated constraints can explain the unacceptability of the residual subset of unacceptable dependencies into RCs and EQs under these accounts.

5.1 Constructivist Learning Strategies

Usage-based or item-based learning models treat filler-gap acquisition as learning a set of bespoke constructions or templates for well-formed dependencies (Tomasello 2003; Dąbrowska 2004, 2008, a.o.). Templates are generated as follows: learners first construct highly specific formulae/frames from single-items and subsequently collapse multiple formulae into more general abstract templates or constructions. The original formulae can be seen as the conjunction of lexical, morphological, syntactic, semantic, and pragmatic features of a single item at varying granularities (e.g., Dąbrowska & Lieven 2005). Generalization involves collapsing over equivalence classes of features across formulae, though the exact mechanism of generalization is not well understood. Generalization yields a complex well-formedness condition that abstracts over variable features, but maintains highly specific features along dimensions that were invariant in the input.
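As a toy illustration of this kind of generalization (the mechanism itself is not well understood, so everything here is an assumption for exposition), the sketch below treats each observed item as a bundle of feature values and collapses a set of items into a template that records, for each feature, the values attested in the input. Features that never varied stay fixed, while features that varied are abstracted over. The feature labels and function names are hypothetical.

```python
def generalize(formulae):
    """Collapse item-specific formulae into one template: for each feature,
    record the set of values attested in the input."""
    template = {}
    for formula in formulae:
        for feature, value in formula.items():
            template.setdefault(feature, set()).add(value)
    return template


def licenses(template, candidate):
    """A candidate is licensed only if every one of its feature values was attested."""
    return all(candidate.get(f) in values for f, values in template.items())


# Hypothetical observed items, coded with made-up feature labels.
observed = [
    {"dependency": "topicalization", "frame": "cleft", "rc_type": "subject", "filler": "DP"},
    {"dependency": "topicalization", "frame": "presentational", "rc_type": "subject", "filler": "DP"},
    {"dependency": "topicalization", "frame": "cleft", "rc_type": "subject", "filler": "PP"},
]
template = generalize(observed)

unseen = {"dependency": "relativization", "frame": "cleft", "rc_type": "subject", "filler": "DP"}
print(licenses(template, observed[0]))  # an attested item
print(licenses(template, unseen))       # a dependency type never attested
```

Because the dependency type and RC type never vary in this toy input, the resulting template stays specific along those dimensions and does not extend to the unseen item, while it does abstract over the frame and filler features that varied.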

If examples of wh-movement from EQs are not attested in children’s input, then a constructivist learning strategy, which generalizes from direct evidence, would arguably have a difficult time learning that the filler-gap dependencies are possible. In principle, a constructivist strategy could be engineered to generalize that wh-movement out of EQs is acceptable having observed relativization and topicalization from the same domains. However, such generalization would require collapsing across dependency-types with different semantic features and discursive functions.13 Alternatively, if examples of wh-movement from EQs do infrequently occur in CDS, children could potentially learn their distribution. If there are stricter felicity conditions on wh-movement from EQs than other dependency types, which there seem to be, a usage-based learner might be in a position to learn the narrower restrictions from the smaller number of attested examples, as there would be significantly less variation in the fine-grained structural and semantic features shared across the observed dependencies.

Usage-based learning models would have more difficulty learning the appropriate distribution of acceptable filler-gap dependencies into RCs. We illustrate this point focusing on what superficial lexical and syntactic features a usage-based learning model could extract from the input as necessary conditions on filler-gap dependencies into RCs. Assuming that it was minimally sensitive to features such as (i) dependency type, (ii) embedded clause type, (iii) phrase type, and (iv) the lexical content of the embedding clauses (Dąbrowska 2004, 2008; Verhagen 2006; Löwenadler 2015), a usage-based learner might end up with a construction/template like (33) for RC-island violations. (Constructions are expected to be even more rarefied including additional semantic and pragmatic conditions on application, see e.g., Löwenadler 2015).

(33) ‘Extraction from RC’ Template/Construction:
  Topicalize {DP, AP, PP} from RC, R, iff:
  a. R is a subject RC, and
  b. R is embedded under a copula
  c. R is dominated by a predicate with an expletive subject
  d.

The construction/template in (33) closely tracks the fine-grained features of RC-island violations in the input. As a result, it overfits the input distribution, excluding dependencies that adults accept (e.g. dependencies into non-cleft/presentational and non-subject RCs).
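The overfitting problem can be made concrete with a minimal sketch that treats conditions (a)–(c) of (33) as a conjunction of Boolean features. The feature names and the two example dependencies below are hypothetical illustrations, not an implementation of any published usage-based model, and the unlisted condition in (33d) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """Hypothetical feature bundle for one filler-gap dependency into an RC."""
    dependency_type: str     # "topicalization", "relativization", or "wh"
    filler_category: str     # "DP", "AP", "PP", ...
    subject_rc: bool         # (33a): the RC is a subject RC
    under_copula: bool       # (33b): the RC is embedded under a copula
    expletive_subject: bool  # (33c): the dominating predicate has an expletive subject


def template_33_licenses(d: Dependency) -> bool:
    """Return True only if every listed condition of the template in (33) holds."""
    return (
        d.dependency_type == "topicalization"
        and d.filler_category in {"DP", "AP", "PP"}
        and d.subject_rc
        and d.under_copula
        and d.expletive_subject
    )


# A dependency of the kind attested in the corpus (presentational/cleft RC).
attested = Dependency("topicalization", "DP", True, True, True)
# A dependency into a non-presentational RC, which adults can accept
# but which never matches the template.
unattested = Dependency("topicalization", "DP", True, False, False)

print(template_33_licenses(attested), template_33_licenses(unattested))
```

Because the template is a conjunction of every feature that happened to be invariant in the input, any dependency that differs on a single feature is excluded, which is the sense in which the learner overfits.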

5.2 Generative Syntactic Learners

Usage-based models maintain that knowledge of the distribution of acceptable filler-gap dependencies consists of bespoke, construction-specific conditions that incorporate all manner of features across grain-size and grammatical type. Other learning models restrict their attention to a subset of syntactic features in the input. We consider how two different approaches to syntactic learning would fare on the Norwegian data. Both models learn syntactic generalizations that, in isolation, predict a wider range of acceptable island violations than are observed in the corpus and, to some extent, than are found in the target distribution of acceptable filler-gap dependencies.

5.2.1 Pearl & Sprouse’s (2013) Learner

Pearl & Sprouse (2013) propose a computational model for learning the set of acceptable filler-gap dependencies (in English) from child-directed input. The learner tracks the probability of sequences of container nodes in a tree between a filler and gap, where container nodes are the major phrasal categories (NP, VP, IP, CP), potentially annotated with extra lexical information. The learner stores probabilities of container node trigrams, where a trigram is a sequence of three contiguous container nodes. For example, the trigram CP-IP-VP corresponds to a CP, followed by an IP, followed by a VP container node. For any given sentence with a filler-gap dependency, F, the model treats the probability of F’s container node sequence as a proxy for the acceptability of F. The lower the probability of the container node sequence, the less acceptable the model predicts F to be. We illustrate how the model works with two English examples.

An acceptable long-distance filler-gap dependency like (34a) would correspond to the container node sequence Start-IP-VP-CPthat-IP-VP-End, which would in turn be broken down into the trigrams in (34b). The probability of the container node sequence Start-IP-VP-CPthat-IP-VP-End could then be calculated by computing the product of the probabilities of the individual trigrams, as in (34c).

(34) a. What did [IP Tor [VP think [CPthat that [IP Siri [VP made ___ ?]]]]]
  b. (Start-IP-VP), (IP-VP-CPthat), (VP-CPthat-IP),(CPthat-IP-VP), (IP-VP-End)
  c. P(Start-IP-VP-CPthat-IP-VP-End) =
    p(Start-IP-VP)*p(IP-VP-CPthat)*p(VP-CPthat-IP)*p(CPthat-IP-VP)*p(IP-VP-End)

For each of the trigrams in an acceptable container node sequence, children presumably observe multiple sentences containing that trigram in their input. Thus the probability of each of the individual trigrams in such a sequence is significantly above zero, as is their product.

In contrast to acceptable dependencies, probabilities calculated for island-violating dependencies would be roughly zero. To see why, take the RC-island violation in (35a), which corresponds to the container node sequence Start-IP-VP-NP-CPRC-Who-IP-VP-End, which is in turn split up into the trigrams in (35b).

(35) a. *What did [IP Tor [VP know [NP someone [CPRC-Who who [IP [VP made ___ ?]]]]]]
  b. (Start-IP-VP), (IP-VP-NP), (VP-NP-CPRC-Who), (NP-CPRC-Who-IP), (CPRC-Who-IP-VP), (IP-VP-End)
  c. P(Start-IP-VP-NP-CPRC-Who-IP-VP-End) =
    p(Start-IP-VP)*p(IP-VP-NP)*p(VP-NP-CPRC-Who)*p(NP-CPRC-Who-IP)*p(CPRC-Who-IP-VP)*p(IP-VP-End) ~ 0

Simplifying slightly, because children never see filler-gap dependencies crossing the container node CPRC-Who, the probability associated with any trigram containing CPRC-Who is zero.14 Thus, when the trigrams in (35b) are multiplied, the unattested trigrams ‘zero-out’ the product. The dependencies are judged extremely improbable and therefore unacceptable.
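The calculation in (34) and (35) can be sketched in a few lines of code. The sketch below is a simplification under stated assumptions: trigram probabilities are estimated as unsmoothed relative frequencies over all observed trigrams, and the toy input counts are invented for illustration. It is not a reimplementation of Pearl & Sprouse's (2013) learner.

```python
from collections import Counter
from math import prod


def trigrams(containers):
    """Split a container node path into contiguous trigrams, padded with Start/End."""
    path = ["Start", *containers, "End"]
    return [tuple(path[i:i + 3]) for i in range(len(path) - 2)]


def train(paths):
    """Estimate each trigram's probability as its relative frequency (assumption: no smoothing)."""
    counts = Counter(t for p in paths for t in trigrams(p))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def path_probability(model, containers):
    """Multiply the trigram probabilities; any unattested trigram contributes zero."""
    return prod(model.get(t, 0.0) for t in trigrams(containers))


# Hypothetical toy input: eight short dependencies, two long-distance ones as in (34a).
toy_input = [["IP", "VP"]] * 8 + [["IP", "VP", "CPthat", "IP", "VP"]] * 2
model = train(toy_input)

p_ok = path_probability(model, ["IP", "VP", "CPthat", "IP", "VP"])             # as in (34c)
p_island = path_probability(model, ["IP", "VP", "NP", "CPRC-Who", "IP", "VP"])  # as in (35c)
print(p_ok > 0, p_island)
```

With unsmoothed estimates the island-violating path receives exactly zero probability; a learner that smooths its estimates would instead assign such a path a very small, non-zero value.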

Suppose the model were applied to the Norwegian data. Pearl and Sprouse’s model learned the distribution of wh-dependencies. As we did not find any examples of wh-dependencies into islands in our corpus, it is likely that the model would reach the same conclusion in Norwegian as in English if we restricted the input to wh-dependencies. The same model can, however, be applied to different dependency types. We illustrate how below.

Following assumptions in Pearl & Sprouse, a case of topicalization out of an RC like (32a) would be represented as in (36). Here we follow the assumption that the CP container node for the RC will be CPRC-SOM, as the CP is annotated for clause type and the presence of an overt complementizer.

(36) Detk [IP  erv  det [VP tv [DP  ingen   [CPRC-SOM  somj [IP ___j [VP  vet    ___k.]]]]]]
     that      is   it              no.one             REL                knows
     ‘That, there is no one who knows.’

Observing (36) would lead the learner to assign non-negligible probability to container node trigrams that include CPRC-SOM, such as (VP – DP – CPRC-SOM), (DP – CPRC-SOM – IP), and (CPRC-SOM – IP – VP). Thus the learner would learn the generalization that RCs with an overt relative complementizer som are non-islands. The learner would not learn that all RCs are non-islands from such sentences, because RCs without overt som would have a different CP container node (e.g. CPRC-∅). Insofar as the presence or absence of som does not (by hypothesis) determine the islandhood of a non-subject RC, the model overfits the data. In all other regards, however, the model predicts a wider range of acceptable island violations than are observed in the input distribution: The model does not learn to restrict RC-island violations to subject-RCs or to clefts and presentational RCs because the fine-grained syntactic information needed to distinguish the relevant subset is not encoded in the container node inventory.15
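The effect of the container node labels can be illustrated with a small sketch that checks only whether each trigram of a path has been attested. It assumes a single observed sentence with the container path of (36) and ignores probabilities altogether; the helper names are hypothetical.

```python
def trigram_set(containers):
    """Return the set of contiguous container node trigrams, padded with Start/End."""
    path = ["Start", *containers, "End"]
    return {tuple(path[i:i + 3]) for i in range(len(path) - 2)}


# Container path for the topicalization in (36), following the bracketing given there.
attested = trigram_set(["IP", "VP", "DP", "CPRC-SOM", "IP", "VP"])


def unattested_trigrams(containers):
    """Trigrams of a new path that the learner has never observed."""
    return trigram_set(containers) - attested


# Another dependency into a som-RC: every trigram has been seen.
with_som = ["IP", "VP", "DP", "CPRC-SOM", "IP", "VP"]
# The same configuration with a null complementizer: the CP label differs.
without_som = ["IP", "VP", "DP", "CPRC-∅", "IP", "VP"]

print(unattested_trigrams(with_som))
print(sorted(unattested_trigrams(without_som)))
```

The path through CPRC-∅ contains three unseen trigrams even though it differs from the attested path only in the complementizer annotation, whereas nothing in the path encodes whether the RC is a subject RC or sits in a cleft or presentational construction.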

As for cross-dependency differences, the model could learn different distributions for each dependency type if it tracked container node sequences and probabilities separately by dependency type. If, on the other hand, we assume that container node trigrams are not relativized to dependency type, then the model does not predict any differences by dependency.

5.2.2 Parameter-setting Models

Other learning models cast syntactic acquisition as inference over a parametrically-defined hypothesis space (Wexler & Manzini 1987; Gibson & Wexler 1994; Yang 2002; Sakas & Fodor 2001, 2012; Pearl & Lidz 2013; Gould 2017). We do not consider how a learner might navigate the myriad conceivable parameter spaces that could ‘solve’ the problem. Instead we discuss a single theoretically-motivated parametric approach.

Parametric accounts of island-sensitivity presuppose the existence of abstract, general universal constraints on syntactic movement (e.g. Subjacency, Chomsky 1977; Phases, Chomsky 2000). A domain, D, is an island under analysis A1 if movement from D runs afoul of a constraint, C. Parametric theories explain variation in island sensitivity by positing a parameter that provides an alternate analysis for D, A2, that allows a movement to comply with C. Parametric approaches to variation in wh-island-sensitivity are well-known and empirically motivated (e.g. Rizzi 1980, 1982; Reinhart 1981).

Movement from EQs and RCs is often assumed to be blocked by locality constraints (e.g. Subjacency, Phase Impenetrability, etc.) in languages like English because successive cyclic movement out of the EQ/RC is blocked by a phrase in Spec,CP of the embedded clause. One way of handling cross-linguistic variation in island sensitivity is to posit that island-insensitive languages allow an extra specifier in the CP domain through which a moved phrase can transit (for EQs see Reinhart 1981; for RCs see Lindahl 2014; Nyvad, Christensen, & Vikner 2017; Vikner, Christensen, & Nyvad 2017; Kush et al. 2019). If the existence of such an extra specifier is a parametric option, then exposure to filler-gap dependencies into EQs and RCs would constitute evidence for that parameter setting. Once Norwegian children observe (sufficient) dependencies into EQs and RCs, they would choose an extra-specifier grammar that allows such dependencies. Importantly, the analysis reduces the problem of acquiring dependencies into EQs and RCs to learning a common syntactic feature. Children need not learn two distinct analyses, one for each island violation. Moreover, because any example of movement from either an EQ or an RC constitutes evidence for the single parameter setting, the frequency of triggering evidence is increased.16 Finally, if the hypothesis space is structured such that children are learning about abstract A’-movement (of which wh-dependencies, relativization, and topicalization are just examples), then observing any one of the three dependency types cross an island boundary would constitute evidence that all other types of dependencies could do so.

6 General Discussion

We asked whether Norwegian children receive direct evidence of island violations in their input. We searched the NorGramBank child fiction corpus for filler-gap dependencies into RCs and EQs, two types of island violations that are reportedly acceptable in Norwegian (and other Mainland Scandinavian languages). Overall, we found that child-directed fiction texts contain island violations, though examples were significantly less frequent than long-distance dependencies into simple declarative complement clauses.

Island violations were found in the input to relatively young children: There were examples of topicalization dependencies into RCs in texts intended for children 3–5 and 6–9 years of age. Examples of topicalization and relativization into EQs were not observed in the small sample of books for 3–5 year olds, but they were relatively well attested in books for children 6 years of age and older. Books for older children (9–11) and teens (12–18) had similar distributions of island violations.

Importantly, island violations were unevenly distributed across the space of possible dependencies and more restricted than the distribution of non-island filler-gap dependencies. Wh-movement into EQs and RCs was completely unattested in the corpus. Topicalization and relativization dependencies were observed with EQs. Only topicalization was observed out of RCs. The distribution of attested RC-island violations was conspicuously skewed: filler-gap dependencies were only observed from presentational and cleft RC constructions where the highest subject had been relativized. The highly constrained distribution of RC-island violations is consistent with previous reports (Engdahl 1997; Lindahl 2017) and with recent experimental findings that suggest that Norwegians reject wh-movement from RCs more consistently than topicalization from the same structures (Kush et al. 2018, 2019).

6.1 Representativeness of the data

We inspected how well the fine-grained characteristics of island violations in the input represented the target distribution of acceptable dependencies in the Norwegian adult grammar. The input distribution was compatible with very narrow generalizations that would not extend to the full range of forms that adults judge to be acceptable. We concluded that learners generalize beyond what is seen in their input.

Our conclusion that children’s input does not provide sufficient evidence for the full range of acceptable dependencies in the target grammar is only justified if the absence of particular forms in the written corpus is representative of their absence (or extreme infrequency) in CDS, children’s primary linguistic input. Island violations may be less frequent in written text than in speech (if, for example, they are associated with a more informal register). The occurrence of violations in our sample argues against a categorical prohibition on their use in writing, and preliminary research suggests that island violations are also found in formal text like newspaper articles (Sant, Strætkvern, & Kush 2019). Nevertheless, their overall occurrence could still be less frequent, or particular types of examples could be under-represented in text relative to speech.

We discussed ways in which the statistics of written corpora deviate from CDS: studies in English show that main questions are less frequent in written texts than in CDS, while complex constructions such as RCs are more frequent (Cameron-Faulkner & Noble 2013; Noble, Cameron-Faulkner, & Lieven 2018; Montag 2019). The divergences between text and speech characteristics cut both ways.

The absence of island-violating wh-movement in the corpus could reflect, in part, the reduced frequency of wh-movement in text. It is possible that examples of island-violating wh-movement are to be found in CDS, given the higher base rate of wh-movement. Thus, we cannot definitively conclude that children’s total input lacks evidence of island-violating wh-movement. If examples are present in CDS, the question remains whether they are frequent enough to drive reliable acquisition (e.g., distinguishable from errors or noise; Legate & Yang 2002).

On the other hand, the fact that RCs are more common in text than in CDS could be taken as evidence that our results overestimate the frequency of island violations that feature RCs. Our corpus findings might therefore overestimate both the frequency of direct evidence of relativization from EQs and the frequency of RC-island violations.

When it comes to RC-island violations, we feel relatively confident in extrapolating from the corpus to the conclusion that children’s input does not provide sufficient direct evidence of the full range of acceptable RC-island violations. In order to have direct evidence of the full range of acceptable RC-island violations, children would need to observe filler-gap dependencies into non-presentational/cleft RCs. But such RCs are even less likely in CDS than in text: Diessel & Tomasello (2000) note that presentational RCs make up close to 70% of the RCs in (German) CDS, compared to the 20% in our sample. Based on these numbers, we tentatively conclude that direct evidence of non-presentational RC-island violations is unlikely to be sufficiently frequent in CDS to drive direct acquisition of the knowledge that non-presentational RCs also allow extraction.

6.2 Evaluation of Learning Models

We considered how learning models with different inductive biases could recover the target generalizations based on our input. We reasoned that a simple usage-based learning model would overfit the input distribution because it would be too reliant on features of attested examples and would therefore preclude generalization to a number of the construction types that adults accept. Syntactic learners, such as the computational learner of Pearl and Sprouse (2013) or parameter-setting models (Yang 2002; Sakas & Fodor 2001, 2012; Gould 2015), learn generalizations that go beyond the empirical distribution in the input.

The fact that the syntactic learning accounts generalize broadly allows them to account for dependencies that are acceptable in the target grammar but unattested in the corpus. We consider this a welcome result: the models arguably learn the correct basic syntactic generalization that movement is allowed out of both RCs and EQs. However, the basic syntactic generalizations alone are insufficient to explain the finer-grained distribution of acceptable and unacceptable dependencies into EQs and RCs. In subsection 6.4 we sketch how the overall distribution of acceptable sentences might be modeled by incorporating semantic (e.g. Szabolcsi & Zwarts 1993; Abrusán 2014) and discursive/functional constraints (Erteschik-Shir 1973, 1982; Erteschik-Shir & Lappin 1979; Kuno 1987; Van Valin 1995, 1998; Ambridge & Goldberg 2008; Goldberg 2006, 2013) as supplemental filters on the output of the syntax (as suggested by Lindahl 2017 and Kush et al. 2018).17

Although we have expressed support for Generative learning accounts, we acknowledge the possibility that a usage-based model that attends to a different, select subset of features might not be prone to the same overfitting problem. It is also possible that filler-gap acquisition involves a stage of feature-reduction or selection wherein previously-posited highly specific features are abandoned in favor of more general features. At present, we know of no usage-based models that have a fleshed-out account of such a feature-selection process in filler-gap acquisition, but we encourage proposals in this direction. Insofar as we do not have longitudinal data on the abstractness of Norwegian children’s generalizations, we cannot rule out that they initially pursue fine-grained generalizations that overfit the data, only to adopt broader generalizations at a later age (Boyd & Goldberg 2011; Tomasello 2003).

6.3 Frequency and Sufficiency

In the introduction we asked what role the frequency of island violations might play in the acquisition of such structures. We found that children encounter island violations in written text. Do our results indicate that children see such violations frequently enough to learn island insensitivity from them alone? We cannot offer a definitive answer to this question, but we discuss the relevant challenges.

One reason frequency is presumed to be important is that it can help learners distinguish signal (e.g., target sentences) from noise (e.g. errors) in the input. Target constructions should occur more reliably than speech errors that should be ignored (e.g. Legate & Yang 2002). In edited text, noise is assumed to be either non-existent or very low, so low frequency constructions can be taken as potentially ‘sufficient’ evidence evaluated relative to the corpus. However, we do not know how corpus frequencies compare to frequencies in CDS, so we are not licensed to conclude that children observe sufficient direct evidence on the whole to distinguish island violations from noise.

Even if children are able to distinguish target sentences from noise, the frequency of a construction can still affect its learnability. As we discuss below, however, what constitutes ‘sufficient’ evidence for acquisition varies by model (and model parameters).

Under usage-based models frequency determines the degree to which generalizations are abstract – and possibly whether generalization occurs at all (Boyd & Goldberg 2011). If frequency drives abstractness of generalization (perhaps because feature-selection processes require larger amounts of data), then the relative infrequency of island violations would lead to predictions of highly specific constructions. The relative infrequency of island violations should also entail relatively late onset of generalization/acquisition, though a later age of acquisition is not a unique prediction of usage-based models.

For Pearl & Sprouse’s learner, islands are structures whose container node sequences contain unattested trigrams of zero (or nearly zero) probability. Since the distinction between island and non-island essentially amounts to ‘seen v. unseen’ trigrams, the exact frequency of direct evidence is less important (as long as it is frequent enough to be reliably observed). Of course, the more frequent island violations are in the input, the higher the probability of the relevant trigrams will be, which will thereby increase the predicted acceptability of similar island violations. Thus, frequency is important for tuning the acceptability predictions of the model. Finding the required frequencies to best model human judgment patterns is a project we leave for future work.

Multiple factors influence estimation of sufficiency under parameter-setting models including: the ambiguity of the data (Fodor 1998), the number of hypotheses under consideration, the model’s learning rate (e.g. Yang 2002), and the prior probabilities or weights assigned to each hypothesis. Island violations are relatively unambiguous evidence against the large class of conceivable grammars that disallow filler-gap dependencies into islands. We speculate that the positive evidence of island violations would be ‘enough’ to learn the right syntactic generalizations in Norwegian if learners must only set a single parameter, such as whether the complementizer domain of the language provides an extra escape hatch for successive cyclic movement. We have not, however, conducted the simulations to argue for this conclusion strongly.

Different models’ learning rates would affect how quickly children learned island insensitivity. Triggering models (e.g., Gibson & Wexler 1994) could successfully set the parameter with little data, since parameter settings can, in principle, be changed in response to exposure to individual sentences (or to a small number of sentences).18 Models where the learning rate can vary in relation to the strength of the evidence (e.g. Bayesian models like Pearl & Lidz 2009 and Perfors, Tenenbaum, & Regier 2011) could also learn the generalization quickly. Models with smaller, fixed learning rates (e.g. Yang 2002, 2004) require more data to set the parameter, resulting in a protracted acquisition period. This is particularly true if the prior on a parameter setting (like having an extra specifier) is especially low. A general preference for simpler grammatical analyses might bias children against the parameter options that allow for island-insensitivity, unless the input forced the analysis. The stronger the bias for a simpler model, the more data would be required to overcome that bias. Future work should explore interactions of frequency and parameter settings in implemented models.
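The interaction of learning rate and prior can be illustrated with a minimal sketch of a single binary parameter whose weight is nudged toward the extra-specifier setting each time an unambiguous island violation is encountered. The linear update rule, the 0.95 threshold, and all numerical values are assumptions chosen for illustration, loosely in the spirit of fixed-rate reward-based learners; the sketch ignores ambiguous input and is not an implementation of any of the cited models.

```python
def examples_needed(prior, rate, threshold=0.95, max_steps=1_000_000):
    """Count unambiguous island-violating inputs needed before the weight of the
    extra-specifier setting reaches the threshold (hypothetical stopping criterion)."""
    weight, n = prior, 0
    while weight < threshold and n < max_steps:
        weight += rate * (1.0 - weight)  # linear update toward the supported setting
        n += 1
    return n


# A rate of 1.0 mimics a triggering learner: one example flips the parameter.
triggering = examples_needed(prior=0.5, rate=1.0)
# Smaller fixed rates require more evidence.
fast = examples_needed(prior=0.5, rate=0.1)
slow = examples_needed(prior=0.5, rate=0.01)
# A lower prior on the extra-specifier setting also increases the evidence required.
biased = examples_needed(prior=0.1, rate=0.1)

print(triggering, fast, slow, biased)
```

Under these assumptions the number of required examples grows as the learning rate shrinks and as the prior on the extra-specifier setting drops, which is the qualitative pattern described above.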

6.4 Accounting for finer-grained restrictions

If we assume that generative models treat wh-dependencies, relativization, and topicalization as underlying instances of A’-movement, the models above learn the arguably correct syntactic generalization that phrases of any type can be wh-moved, relativized, or topicalized out of any EQ or RC. We saw, however, that some dependencies that the model could generate were consistently judged unacceptable and others were extremely rare, if not unattested. How can we explain the distribution of observed dependencies from Norwegian EQs and RCs if we adopt a generative model that, on its own, allows for unrestricted movement from EQs and RCs?

One possibility is to assume that the responsibility for explaining the residual restrictions on island violations in Norwegian should not fall to the syntax. Instead, unacceptable forms that the syntax generates should be ‘filtered out’ by independent semantic and pragmatic conditions. We sketch below how different distributional restrictions can be seen as arising from semantic and pragmatic conditions.

6.4.1 The argument/adjunct asymmetry

We saw in Section 2 that Norwegian appears to follow the well-known argument/adjunct asymmetry: moving arguments from EQs and RCs is often acceptable, but moving adjuncts is unacceptable (see 14). Norwegian EQs and RCs are therefore weak islands (Szabolcsi & Lohndal 2017) in that they selectively allow movement determined by phrase type. This fact is not predicted under the simple generative models above, which allow free movement of phrases from EQs and RCs irrespective of type.

Many researchers have argued that the argument/adjunct asymmetry and other weak island effects are most parsimoniously understood as reflecting semantic or pragmatic conditions. For example, Szabolcsi & Zwarts (1993) argue that the effects follow from the definitions of basic compositional operations within a Boolean algebra: The operations required to interpret individual-denoting traces (e.g. corresponding to arguments) are defined within the scope of an operator like a wh-phrase, but the operations required to interpret non-individual-denoting traces (e.g. corresponding to adjuncts) are not defined in the same domain. An alternative idea, pursued by Abrusán (2014), is that the prohibition on (wh-)moving (many) adjuncts from EQs (and other weak island environments) results from a failure to meet basic presuppositions associated with the felicitous use of questions. Adjunct questions into EQs are ill-formed because either (i) they presuppose a contradictory set of propositions, or (ii) it is impossible to provide a maximally informative true answer (Dayal 1996, 2016; Fox & Hackl 2006) to them. As such they are judged to be unacceptable (see Abrusán 2014 for extensive case-based illustration).

The appeal of the two accounts above is that they manage to derive the argument/adjunct distinction ‘for free’ from independently-motivated operations or principles, without stipulating additional operations or machinery in the syntax.19 Simply knowing the inventory of possible compositional operations (arguably universal), or the basic felicity conditions on question formation, guarantees that the learner will make the appropriate distinction.

6.4.2 Differences across dependency type

The second fact that merits discussion is why island violations are most frequently observed with topicalization and relativization, but less often (or not at all) with wh-movement even though all three dependency types are, in principle, allowed into EQs and RCs. As we see it, the restricted distribution of wh-movement dependencies likely reflects the interaction of semantic and pragmatic conditions similar to those discussed above. We offer some speculation.

If the ability to move a phrase out of a weak island tracks the ability to interpret the trace of that phrase as individual-denoting as Szabolcsi & Zwarts (1993) and others have argued, it may simply be easier to interpret the traces of relativization and topicalization as individual-denoting than wh-traces.20 Alternatively, the differences could reflect that it is easier to accommodate the presuppositions associated with island-violating relativization or topicalization than wh-movement. This last idea is similar to a proposal from Abeillé, Hemforth, Winckel, & Gibson (2020), who argued that cross-dependency differences follow from the fact that dependencies are subject to distinct felicity conditions. We acknowledge that these remarks are largely speculative and encourage future research that attempts to formalize these intuitions more explicitly.

6.4.3 The preference for RCs in cleft and presentational constructions

Finally, we only observed examples of movement from presentational and cleft RCs in the corpus even though movement is permitted from other RC types (cf. examples in 18). We saw in Section 2 how most movement from other RC types was judged unacceptable without significant contextual support. Why should it be easier to extract from presentational and cleft RCs?

We speculate that when dependencies into non-presentational, non-cleft RCs are rejected in Norwegian, they are rejected for pragmatic reasons. Many functionalist approaches incorporate the idea that discourse-pragmatic well-formedness conditions influence the distribution of acceptable filler-gap dependencies. The conditions often tie the acceptability of a dependency to the information status of the constituent containing the gap (Erteschik-Shir 1979, 1982; Kuno 1987; Van Valin 1995, 1998; Ambridge & Goldberg 2008; Goldberg 2006). Most of the proposals share the general intuition that filler-gap dependencies are allowed into domains that make a discourse-relevant predication in some way or another. Constituents that do not convey such information block filler-gap dependency formation. The various proposals define their constraints in slightly different ways and it is not our goal to evaluate the different formulations. We show below, however, that the constraint blocking movement from some RCs cannot be stated in terms of presupposition.

Under one influential proposal (e.g., Goldberg 2006), filler-gap dependencies are only allowed to cross into constituents that convey non-presupposed information (i.e. those that are not backgrounded in Goldberg’s terminology).

Goldberg (2006) uses the standard negation test as a way to identify presupposed/backgrounded constituents (Langendoen & Savin 1971). A constituent C is presupposed if its content is still entailed when C is in the scope of negation. For example, the man that liked waffles in (37a) and (37b) is identified as presupposed because both the affirmative and the negative sentence entail that there existed a man that liked waffles.

(37) a. John saw the man [that liked waffles].
  b. John didn’t see the man [that liked waffles].

An account based in presupposition can explain why movement from presentational or existential RCs is possible. The existence of the head noun and RC is asserted, not presupposed, in such sentences. (38b) explicitly negates the existence of waffle-likers.

(38) a. Det  er  mange  som  liker  vafler.
        It   is  many   REL  like   waffles
        ‘There are many (people) that like waffles.’
  b. Det  er  ikke  mange  som  liker  vafler.
     It   is  not   many   REL  like   waffles
     ‘There are not many (people) that like waffles.’

However, a presupposition-based account wrongly predicts that movement from cleft RCs should be blocked, as discussed in Lindahl (2017). Norwegian cleft RCs are presupposed, as the negation test in (39a) shows (see also Prince 1978; Delin 1992; Abbott 2000). As such, cleft RCs would be predicted to block movement, contrary to fact (cf. 32b reprinted as 39b).

(39) a. Det  var  ikke  Ronja  som  var  trist.
        it   was  not   Ronja  REL  was  sad
        ‘It was not Ronja that was sad.’ → Someone was sad.
  b. Tristi  var  det  bare  Ronja  som  var  ___i.
     sad     was  it   only  Ronja  REL  was
     ‘It was only Ronja that was sad.’

A presupposition-based account makes the same incorrect prediction about extraction from predicate nominals such as in (17). For the sake of space, we leave this to the reader to confirm.

The above suggests that the relevant feature of presentational, cleft (and predicate nominal) RCs that allow movement is not the absence of presupposition. Some other information-structural feature(s) must be found.

One potentially promising alternative is to ground the constraint in the distinction between new vs. old information: The RCs that allow movement are those that contribute wholly or partially new information to the discourse (or at least information that need not be known to the hearer). Though presupposition and ‘old information’ are at times conflated, there is reason to keep them separate: clefts can be used to contribute new information even though they carry presuppositions (see, e.g., Prince 1978, 1981; Delin 1992; Abbott 2000). One way of characterizing what is ‘new’ information is to identify a sentence’s ‘main point of utterance’ (MPU; see Abbott 2000; Simons 2007). Simons operationally defined the MPU as follows: “[T]he main point of an utterance U given in answer to a question is that part of the content of U which constitutes the proffered answer to the question.” (Simons 2007: 1035) Typically, the MPU of an utterance is contained in a main clause, but in some cases, the answer can be proffered in an embedded clause. RCs convey the MPU in Norwegian existential, presentational, and cleft constructions, where the semantic contribution of the matrix predicate is essentially null (Prince 1978).

An MPU-based account may also explain why movement out of some RCs attached to object nominals is possible, but subject to fuzzier contextual constraints (see 18 and 20). Whether an embedded clause is the MPU is, at least partially, contextually determined. Simons argues that an embedded clause can be the MPU if the matrix clause is interpreted parenthetically, e.g. serving a discourse function such as indicating evidentiality, or the speaker’s (emotional) orientation towards the content of the embedded clause. This tracks well with the observation that the most frequent embedding verbs in sentences where movement has occurred from RCs attached to object nominals are verbs like å kjenne (to know/be acquainted with) and perception verbs like å se (to see), as observed in (18a,b). Examples like (18c) are roughly consistent with the observation that acceptable embedding verbs can also indicate the speaker’s emotional orientation towards the content of the RC, which is the MPU. We take these observations as suggestive evidence in favor of considering a new/MPU-based account, but acknowledge that there are many details that need to be worked out and which need to be motivated more rigorously and precisely. We leave this to future work.

There is the question of where such pragmatic well-formedness conditions come from. Some researchers (e.g., Van Valin 1995; Goldberg 2006, a.o.) assume that felicity conditions can be learned via exposure to examples of felicitous filler-gap dependencies, though the exact mechanism by which this occurs remains unclear and the origins of notions like ‘presupposition’, ‘new information’ or ‘MPU’ need to be spelled out. We point out, however, that these concepts and conditions – if learned – are most likely learned independently of islands.

Finally, though we have adopted the view that pragmatic conditions impact the distribution of acceptable filler-gap dependencies, we disagree with previous functionalist researchers that pragmatic conditions should entirely supplant syntactic locality constraints. We advocate a position where independent pragmatic conditions supplement syntactic restrictions. Such a view allows for the syntax to ‘over-generate’ ultimately unacceptable dependencies that are filtered out on pragmatic grounds, but it also allows us to explain why some dependencies that might meet pragmatic felicity conditions are still judged unacceptable (in some languages). We take cross-linguistic variation in island sensitivity as an instance of the latter case. If filler-gap dependencies into cleft, presentational, and contextually-supported RCs are acceptable in Norwegian, but not in English, then there must be a non-pragmatic explanation for this residual unacceptability (assuming that notions of MPU/ ‘new’ information do not vary cross-linguistically). The residual differences can, we suggest, be linked to syntactic differences between the languages: Norwegian provides syntactic escape from these domains, where English does not.

7 Conclusion

Norwegian permits wh-movement, relativization, and topicalization dependencies into RCs and EQs, domains which are traditionally thought to be islands for filler-gap dependency formation. We investigated whether Norwegian children receive direct evidence in their input that such dependencies are acceptable. A search through the Norwegian Child Fiction Corpus revealed that filler-gap dependencies into both RCs and EQs are attested. Attested examples do not, however, provide a representative range of the full set of acceptable dependencies in the target grammar. We therefore reasoned that learning about island-insensitivity in their native language requires Norwegian children to generalize beyond the fine-grained distributional characteristics of the input. We argued that usage-based learning models would have difficulty learning the appropriate generalizations because they would overfit the input distributions. Generative syntactic learning models would predict a wider range of acceptable dependencies than what is in the input because they would only learn coarse syntactic generalizations. We speculated that the appropriate distribution could arise via the interaction of coarse syntactic generalizations and independent supplemental semantic or discursive felicity conditions.

Appendix A

The search queries used to identify potential island violations are listed below. Queries are separated according to embedded clause type. False positives were removed manually.

  • Find Sentences with Questions:

    #x_ >FOCUS-INT #y_

    Paraphrase: Find sentences containing an f-structure node #x such that #x dominates another node #y along an edge labeled ‘FOCUS-INT’

  • Find Sentences with RCs:

    #x_ >TOPIC-REL #y_

    Paraphrase: Find sentences containing an f-structure node #x such that #x dominates another node #y along an edge labeled ‘TOPIC-REL’

  • Find Wh-movement into RC:

    #x_ >FOCUS-INT #t_ & #x_ >* #y_ >CLAUSE-TYPE ‘rel’ & #y_ >* #t_

    Paraphrase: Find sentences containing an f-structure node #x such that #x dominates another node #t along an edge labeled ‘FOCUS-INT’ and a node #y such that #y is marked CLAUSE-TYPE ‘rel’ and #y dominates #t

  • Find Relativization into RC:

    #x_ >TOPIC-REL #t_ & #x_ >* #y_ >CLAUSE-TYPE ‘rel’ & #y_ >* #t_

  • Find Topicalization into RC:

    #x_ >TOPIC #t_ & #x_ >* #y_ >CLAUSE-TYPE ‘rel’ & #y_ >* #t_

  • Find Wh-movement into EQ:

    #x_ >FOCUS-INT #t_ & #x_ >* #y_ >CLAUSE-TYPE ‘wh-int’ & #y_ >* #t_

    Paraphrase: Find sentences containing an f-structure node #x such that #x dominates another node #t along an edge labeled ‘FOCUS-INT’ and a node #y such that #y is marked CLAUSE-TYPE ‘wh-int’ and #y dominates #t

  • Find Relativization into EQ:

    #x_ >TOPIC-REL #t_ & #x_ >* #y_ >CLAUSE-TYPE ‘wh-int’ & #y_ >* #t_

  • Find Topicalization into EQ:

    #x_ >TOPIC #t_ & #x_ >* #y_ >CLAUSE-TYPE ‘wh-int’ & #y_ >* #t_

  • Find Dependencies into Embedded Polar Questions:

    Identical to queries for Embedded Wh-Questions, with CLAUSE-TYPE ‘wh-int’ replaced by CLAUSE-TYPE ‘pol-int’
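
The logic of the island-violation queries above can be illustrated with a minimal Python sketch. This is not the treebank search tool that was used for the corpus study; it is a toy matcher over a hand-built graph. The class `FStructure`, the function `find_dependencies`, the node numbers, and the edge labels `OBJ` and `ADJUNCT` are hypothetical and chosen only for illustration. The labels `FOCUS-INT` and `CLAUSE-TYPE`, and the value ‘rel’, are taken from the queries. The sketch also assumes that `>*` means dominance along one or more edges.

```python
class FStructure:
    """Toy f-structure graph: labeled edges between nodes, plus atomic features."""

    def __init__(self):
        self.edges = {}   # node -> list of (edge label, child node)
        self.atoms = {}   # node -> {attribute: atomic value}

    def add_edge(self, parent, label, child):
        self.edges.setdefault(parent, []).append((label, child))
        self.edges.setdefault(child, [])

    def set_atom(self, node, attribute, value):
        self.atoms.setdefault(node, {})[attribute] = value
        self.edges.setdefault(node, [])

    def descendants(self, node):
        """All nodes reachable from `node` along one or more edges."""
        seen, stack = set(), [node]
        while stack:
            for _, child in self.edges.get(stack.pop(), []):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


def find_dependencies(fs, filler_edge, clause_type):
    """Return sorted (x, y, t) triples where x has a `filler_edge` edge to t,
    x dominates a node y marked CLAUSE-TYPE == clause_type, and y dominates t."""
    hits = []
    for x, out_edges in fs.edges.items():
        below_x = fs.descendants(x)
        for label, t in out_edges:
            if label != filler_edge:
                continue
            for y in below_x:
                marked = fs.atoms.get(y, {}).get("CLAUSE-TYPE") == clause_type
                if marked and t in fs.descendants(y):
                    hits.append((x, y, t))
    return sorted(hits)


# Toy encoding of "Which waffles do you know the skiers that like ___?"
island = FStructure()
island.add_edge(0, "FOCUS-INT", 1)        # 0 = matrix clause, 1 = the filler
island.add_edge(0, "OBJ", 2)              # 2 = "the skiers"
island.add_edge(2, "ADJUNCT", 3)          # 3 = the relative clause
island.set_atom(3, "CLAUSE-TYPE", "rel")
island.add_edge(3, "OBJ", 1)              # gap: filler shared as the RC's object

# Toy encoding of "Which waffles did the skiers like ___?" (no relative clause)
simple = FStructure()
simple.add_edge(0, "FOCUS-INT", 1)
simple.add_edge(0, "OBJ", 1)

print(find_dependencies(island, "FOCUS-INT", "rel"))  # [(0, 3, 1)]
print(find_dependencies(simple, "FOCUS-INT", "rel"))  # []
```

Under the same assumptions, the relativization and topicalization variants correspond to passing `"TOPIC-REL"` or `"TOPIC"` as `filler_edge`, and the embedded-question variants to passing `"wh-int"` or `"pol-int"` as `clause_type`.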

Appendix B

Table B1

Counts of different RC constructions with and without RC-island violations split by syntactic context.

                      Presentational    It-Cleft    RCs attached to Non-subject NPs    RCs attached to Subject NPs
Island Violation      29                33          0                                  0
No Island Violation   2838              3832        22820                              6541

Table B2

Counts of subject- and non-subject-RCs in presentational and it-cleft constructions.

                  Presentational RCs    It-Cleft RCs
Subject RC        1868                  2510
Non-Subject RC    999                   1355
Total             2867                  3865
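
As a reading aid, the following sketch checks that the two tables are mutually consistent and derives the share of RCs containing an island violation in each context. The counts are copied from Tables B1 and B2. The dictionary keys and function names are our own labels, and the percentages are derived here from those counts; they are not figures reported in the text.

```python
# Counts copied from Table B1 (island violations by syntactic context).
table_b1 = {
    "Presentational": {"violation": 29, "no_violation": 2838},
    "It-Cleft": {"violation": 33, "no_violation": 3832},
    "Non-subject NP": {"violation": 0, "no_violation": 22820},
    "Subject NP": {"violation": 0, "no_violation": 6541},
}

# Counts copied from Table B2 (subject vs. non-subject RCs).
table_b2 = {
    "Presentational": {"subject_rc": 1868, "non_subject_rc": 999, "total": 2867},
    "It-Cleft": {"subject_rc": 2510, "non_subject_rc": 1355, "total": 3865},
}


def column_total(column):
    """Number of RCs in one syntactic context, with and without violations."""
    return column["violation"] + column["no_violation"]


def violation_rate(column):
    """Proportion of RCs in one context that contain an island violation."""
    return column["violation"] / column_total(column)


for context, row in table_b2.items():
    # The B2 totals should equal both the B2 row sums and the B1 column sums.
    assert row["subject_rc"] + row["non_subject_rc"] == row["total"]
    assert column_total(table_b1[context]) == row["total"]

for context, column in table_b1.items():
    print(f"{context}: {100 * violation_rate(column):.2f}% of {column_total(column)} RCs")
```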

Abbreviations

DEF = ‘definite’

INF = ‘infinitive’

PL = ‘plural’

REL = ‘relative marker’

Notes

  1. Henceforth, the abbreviation ‘REL’ is used to gloss som, which we analyze as a relative complementizer (Åfarli & Eide 2003).
  2. We assume, for the sake of argument, the conservative position that children do not assume that long-distance movement is possible without direct evidence. We do so for at least two reasons. First, the possibility of local movement does not uniformly entail the possibility of long-distance movement crosslinguistically. Second, a number of acquisition models, both Generative and Constructivist, assume (either implicitly or explicitly) that such direct evidence is required (e.g. Stromswold 1995; Dąbrowska 2004, 2008).
  3. The base order of the finite matrix verb and the matrix subject is inverted when a phrase other than the matrix main clause subject is wh-moved or topicalized in the main clause because Norwegian is a V2 language (Holmberg & Platzack 1995).
  4. We have glossed the som that is obligatory in embedded questions where the highest subject has been moved as C, simply to avoid the possibility that readers interpret such questions as RCs.
  5. Christensen, Kizach, & Nyvad (2013) found no difference in the average acceptability of moving wh-adjuncts and wh-arguments from EQs in Danish, a Mainland Scandinavian language that patterns with Norwegian on other island judgments. However, the absence of a reliable difference between argument and adjunct extraction in their experiments may reflect a floor effect, given that extraction from EQs was generally rated low.
  6. Following previous work (Hedberg 2000; Fiedler 2014), we assume a syntactic analysis of it-clefts that treats the constituent following the head (mange, Andrew in the examples above) as having the internal syntax of a standard RC, such that it is created by movement of an operator to the specifier of an embedded complementizer phrase (CP). We remain agnostic as to whether the head occupies the specifier of a functional projection (FP) whose head selects the CP or whether the RC merges with the head N(P) directly.
  7. We thank an anonymous reviewer for supplying these and other examples.
  8. Children’s books, especially those directed toward younger children, tend to be read out loud by a parent or caretaker, so they may be considered a portion of CDS. Interestingly, children’s books are often repeatedly read aloud over a stretch of time. This may result in reliable repetition of complex structures or dependencies. We thank Gillian Ramchand (p.c.) for bringing up this point to us.
  9. Embedded declarative clauses were slightly less frequent in our corpus sample than in the CDS sample analyzed by Westergaard (2005) and Westergaard & Bentzen (2007) (compare our 8–10% to their 14%). Embedded questions appear at roughly equal frequency in our sample and theirs (~2%). It should be noted, however, that the CDS estimates are based on a very small sample (579 sentences produced by one adult over the course of an hour), so their counts may not be representative of Norwegian CDS on the whole.
  10. The expected count of dependency type D for age group G and embedded clause type C was calculated as follows: We first computed PDSG, the probability of D from simple declarative clauses for group G, by dividing the observed count in column 1 by the total number of sentences in the corpus with embedded declarative clauses. We then multiplied PDSG by the number of sentences containing clause type C in age group G to get the expected count. For example, the expected count of relativization from RCs in the 3–5 age group was computed as: (2/(286+1+2)) * (302+2) = 2.10 ~ E[2].
  11. Embedded polar questions can either be headed by the complementizer om or the complementizer hvis. All examples of movement out of a polar question in our data contained om.
  12. An anonymous reviewer points out that the absence of dependencies into non-subject RCs might also partially reflect a genre/register effect. Extraction from non-subject RCs may be considered more marked and thus excluded from edited text. The reviewer points out that Lindahl (2017) only found examples of dependencies into non-subject RCs in spoken language and un-edited text (e.g. blog posts).
  13. An anonymous reviewer points out that Constructivist accounts that require each construction have a shared discourse-functional component (e.g. Tomasello 2003; Goldberg 2006) would reject the possibility of collapsing across discursive function to create an abstract ‘purely syntactic’ template for filler-gap dependencies.
  14. This is a simplification. Following standard practice, Pearl & Sprouse replace zero probabilities associated with unattested trigrams with smoothed trigram probabilities slightly above zero. As a result, container node sequences including unattested trigrams never receive a probability of 0, as they would if the probability of one of the constituent trigrams were zero. However, since the probabilities assigned to unattested trigrams are significantly lower than those assigned to attested trigrams, unattested container node sequences have total probabilities that are many orders of magnitude lower than attested sequences after their trigram probabilities are multiplied.
  15. The argument only goes through if clefts and presentational RCs have the same underlying structure as other RCs. If clefts and presentational RCs were analyzed as structurally distinct from regular RCs, the difference would be reflected in the container node sequences. We know of no defensible syntactic analysis that would treat the correct subset of RCs as fundamentally syntactically different (contra Kush, Omaki, & Hornstein 2013, see Christensen & Nyvad 2014). As discussed earlier, some analyses assume that the head of a cleft and the RC are the specifier and complement of a covert functional phrase, FP. If this is the right analysis, then dependencies into RCs in cleft constructions cross FP. This extra structure would only have an impact on the learner if FP were included in the set of container nodes used to calculate the probability of a dependency.
  16. The fact that EQ and RC-island violations share a common ingredient does not necessarily mean that there are no further syntactic features governing differences between extraction from EQs and RCs. For example, Sichel (2018) advocates an analysis where RCs that permit extraction are analogized to EQs, but argues that such a reduction is only possible if the RC is analyzed as a raising and not a matching RC.
  17. As discussed below, arguing for incorporating semantic or discourse-pragmatic/informational structural constraints is not tantamount to arguing that they supplant syntactic constraints.
  18. Although, as an anonymous reviewer correctly points out, it is not guaranteed that a triggering model would automatically settle on the appropriate parameter setting in response to individual sentences. The speed and efficiency with which the correct parameter setting is arrived at varies as a function of the number of parameters that need to be set and how those parameters interact (see, e.g., Fodor 1998).
  19. This contrasts with syntax-based approaches to weak island effects (e.g. Cinque 1990; Cresti 1995; Rizzi 1990).
  20. An anonymous reviewer notes that this suggestion may be compatible with a proposal in Lasnik & Stowell (1991), which argued that the traces of topicalization and (non-restrictive) relativization are not ‘true variables’, unlike traces of wh-movement. We leave investigating this possibility to future research.
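
The expected-count calculation described in note 10 can be restated as a short sketch. The function and parameter names are our own labels, and the input numbers are the ones given in the note's worked example (relativization from RCs in the 3–5 age group). The sketch only reproduces that arithmetic.

```python
def expected_count(dep_count_declarative, declarative_sentences, target_sentences):
    """Expected number of dependencies into the target clause type.

    The probability of the dependency type is estimated from sentences with
    embedded declarative clauses and then scaled by the number of sentences
    that contain the target clause type.
    """
    probability = dep_count_declarative / declarative_sentences
    return probability * target_sentences


# Values from the worked example in note 10.
declarative_sentences = 286 + 1 + 2   # denominator given in the note
target_sentences = 302 + 2            # sentences containing RCs, as given in the note
expected = expected_count(2, declarative_sentences, target_sentences)

print(round(expected, 2))  # 2.1
```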

Acknowledgements

We would like to thank Malin Bakke Frøystadvåg for her help with sorting and categorizing a portion of the search results reported in this manuscript. Thanks to Terje Lohndal, Brian Dillon, and three anonymous reviewers for helpful feedback on the manuscript. We also thank members of the Øyelab at NTNU, members of the Linguistics communities at UMASS, UC Santa Cruz and UiT, and the audience at MoNS18 for discussion of earlier versions of the work. All errors are our own.

Competing Interests

The authors have no competing interests to declare.

References

Abbott, Barbara. 2000. Presuppositions as non-assertions. Journal of Pragmatics 32. 1419–1437. DOI:  http://doi.org/10.1016/S0378-2166(99)00108-3

Abeillé, Anne & Hemforth, Barbara & Winckel, Elodie & Gibson, Edward. 2020. Extraction from subjects: Differences in acceptability depend on the discourse function of the construction. Cognition 204. 104293. DOI:  http://doi.org/10.1016/j.cognition.2020.104293

Abrusán, Márta. 2014. Weak Island Semantics. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199639380.001.0001

Åfarli, Tor Anders & Eide, Kristin Melum. 2003. Norsk generativ syntaks. Oslo: Novus.

Allwood, Jens. 1982. The complex NP constraint in Swedish. In Engdahl, Elisabet & Ejerhed, Eva (eds.), Readings on Unbounded Dependencies in Scandinavian Languages, 15–32. Stockholm: Almqvist & Wiksell.

Ambridge, Ben & Goldberg, Adele E. 2008. The island status of clausal complements: Evidence in favor of an information structure explanation. Cognitive Linguistics 19(3). 357–389. DOI:  http://doi.org/10.1515/COGL.2008.014

Bates, Alandi & Pearl, Lisa. submitted. When socioeconomic status differences don’t affect input quality: Learning complex syntactic knowledge. Developmental Science.

Bondevik, Ingrid & Kush, Dave & Lohndal, Terje. 2020. Variation in adjunct islands: The case of Norwegian. Nordic Journal of Linguistics. DOI:  http://doi.org/10.1017/S0332586520000207

Boyd, Jeremy K. & Goldberg, Adele E. 2011. Learning what NOT to say: The role of statistical preemption and categorization in A-adjective production. Language 87. 55–83. DOI:  http://doi.org/10.1353/lan.2011.0012

Cameron-Faulkner, Thea & Noble, Claire. 2013. A comparison of book text and child directed speech. First Language 33(3). 268–279. DOI:  http://doi.org/10.1177/0142723713487613

Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press. DOI:  http://doi.org/10.21236/AD0616323

Chomsky, Noam. 1973. Conditions on transformations. In Anderson, Stephen R. & Kiparsky, Paul (eds.), A festschrift for Morris Halle, 232–286. New York, NY: Holt, Rinehart and Winston.

Chomsky, Noam. 1977. On wh-movement. In Culicover, Peter W. & Wasow, Thomas & Akmajian, Adrian (eds.), Formal syntax, 71–132. New York, NY: Academic Press.

Chomsky, Noam. 1980. Rules and Representations. Oxford: Basil Blackwell. DOI:  http://doi.org/10.1017/S0140525X00001515

Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.

Chomsky, Noam. 2000. Minimalist inquiries: The framework. In Martin, Roger & Michaels, David & Uriagereka, Juan & Keyser, Samuel Jay (eds.), Step by step, 89–155. Cambridge, MA: MIT Press.

Christensen, Ken Ramshøj & Kizach, Johannes & Nyvad, Anne Mette. 2013. Escape from the island: Grammaticality and (reduced) acceptability of wh-island violations in Danish. Journal of Psycholinguistic Research 42. 51–70. DOI:  http://doi.org/10.1007/s10936-012-9210-x

Christensen, Ken Ramshøj & Nyvad, Anne Mette. 2014. On the nature of escapable relative islands. Nordic Journal of Linguistics 37. 29–45. DOI:  http://doi.org/10.1017/S0332586514000055

Christensen, Kirsti Koch. 1982. On multiple filler-gap constructions in Norwegian. In Engdahl, Elisabet & Ejerhed, Eva (eds.), Readings on Unbounded Dependencies in Scandinavian Languages, 77–98. Stockholm: Almqvist & Wiksell.

Cinque, Guglielmo. 1990. Types of Ā-dependencies. Cambridge, MA: MIT Press.

Cresti, Diana. 1995. Extraction and reconstruction. Natural Language Semantics 3(1). 79–122. DOI:  http://doi.org/10.1007/BF01252885

Dąbrowska, Ewa. 2004. Language, Mind and Brain. Washington, DC: Georgetown University Press.

Dąbrowska, Ewa. 2008. Questions with long-distance dependencies: A usage-based perspective. Cognitive Linguistics 19(3). 391–425. DOI:  http://doi.org/10.1515/COGL.2008.015

Dąbrowska, Ewa & Lieven, Elena. 2005. Towards a lexically specific grammar of children’s question constructions. Cognitive Linguistics 16. 437–474. DOI:  http://doi.org/10.1515/cogl.2005.16.3.437

Dayal, Veneeta. 1996. Locality in Wh Quantification. Dordrecht: Kluwer. DOI:  http://doi.org/10.1007/978-94-011-4808-5

Dayal, Veneeta. 2016. Questions. Oxford: Oxford University Press.

Delin, Judy. 1992. Properties of It-Cleft Presupposition. Journal of Semantics 9(4). 289–306. DOI:  http://doi.org/10.1093/jos/9.4.289

De Villiers, Jill & Roeper, Thomas. 1995. Relative clauses are barriers to wh-movement for young children. Journal of Child Language 22. 389–404. DOI:  http://doi.org/10.1017/S0305000900009843

De Villiers, Jill & Roeper, Thomas & Vainikka, Anne. 1990. The acquisition of long-distance rules. In Frazier, Lyn & de Villiers, Jill (eds.), Language Processing and Language Acquisition, 257–297. Dordrecht: Springer. DOI:  http://doi.org/10.1007/978-94-011-3808-6_10

Diessel, Holger & Tomasello, Michael. 2000. The development of relative clauses in spontaneous child speech. Cognitive Linguistics 11(1–2). 131–151. DOI:  http://doi.org/10.1515/cogl.2001.006

Dyvik, Helge & Meurer, Paul & Rosén, Victoria & De Smedt, Koenraad & Haugereid, Petter & Losnegaard, Gyri Smørdal & Lyse, Gunn Inger & Thunes, Martha. 2016. NorGramBank: A ‘Deep’ Treebank for Norwegian. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), Portorož, Slovenia.

Engdahl, Elisabet. 1982. Restrictions on unbounded dependencies in Swedish. In Engdahl, Elisabet & Ejerhed, Eva (eds.), Readings on Unbounded Dependencies in Scandinavian Languages, 151–74. Stockholm: Almqvist & Wiksell International.

Engdahl, Elisabet. 1997. Relative clause extractions in context. Working Papers in Scandinavian Syntax 60. 51–79.

Erteschik-Shir, Nomi. 1973. On the nature of island constraints (Doctoral dissertation). Massachusetts Institute of Technology.

Erteschik-Shir, Nomi. 1982. Extractability in Danish. In Engdahl, Elisabet & Ejerhed, Eva (eds.), Readings on Unbounded Dependencies in Scandinavian Languages, 175–191. Stockholm: Almqvist & Wiksell.

Erteschik-Shir, Nomi & Lappin, Shalom. 1979. Dominance and the functional explanation of island phenomena. Theoretical Linguistics 6(1–3). 41–85. DOI:  http://doi.org/10.1515/thli.1979.6.1-3.41

Fiedler, Judith. 2014. Germanic It-Clefts: Structural Variation and Semantic Uniformity (Doctoral dissertation). University of California, Santa Cruz.

Fodor, Janet Dean. 1998. Unambiguous triggers. Linguistic Inquiry 29(1). 1–36. DOI:  http://doi.org/10.1162/002438998553644

Fox, Danny & Hackl, Martin. 2006. The universal density of measurement. Linguistics and Philosophy 29. 537–586. DOI:  http://doi.org/10.1007/s10988-006-9004-4

Garmann, Nina Gram & Hansen, Pernille & Simonsen, Hanne Gram & Kristoffersen, Kristian Emil. 2019. The phonology of children’s early words: Trends, individual variation, and parents’ accommodation in child-directed speech. Frontiers in Communication 4. 10. DOI:  http://doi.org/10.3389/fcomm.2019.00010

Gibson, Edward & Wexler, Kenneth. 1994. Triggers. Linguistic Inquiry 25(3). 355–407.

Goldberg, Adele. 2006. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.

Gould, Isaac. 2015. Syntactic Learning from Ambiguous Evidence: Errors and End-States (Doctoral dissertation). Massachusetts Institute of Technology.

Gould, Isaac. 2017. Choosing a Grammar: Learning paths and ambiguous evidence in the acquisition of syntax. Amsterdam: John Benjamins Publishing Company. DOI:  http://doi.org/10.1075/la.238

Gundel, Jeanette K. 2002. Information structure and the use of cleft sentences in English and Norwegian. In Hasselgård, Hilde & Johansson, Stig & Behrens, Bergljot & Fabricius-Hansen, Cathrine (eds.), Information structure in a cross-linguistic perspective, 113–128. Brill Rodopi.

Hedberg, Nancy. 2000. The referential status of clefts. Language 76(4). 891–920. DOI:  http://doi.org/10.2307/417203

Holmberg, Anders & Platzack, Christer. 1995. The role of inflection in Scandinavian syntax. Oxford: Oxford University Press.

Huang, Cheng-Teh James. 1982. Logical relations in Chinese and the theory of grammar (Doctoral dissertation). Massachusetts Institute of Technology.

Jensen, Anne. 2002. Sætningsknuder i dansk. NyS Nydanske Studier & Almen kommunikationsteori 29. 105–124. DOI:  http://doi.org/10.7146/nys.v29i29.13427

Johannessen, Janne Bondi & Hagen, Kristin. 2008. Om NoTa korpuset og artiklene i denne boka. In Johannessen, Janne Bondi & Hagen, Kristin (eds.), Språk i Oslo. Ny forskning omkring talespråk. Oslo: Novus forlag.

Johansson, Mats. 2001. Clefts in contrast: a contrastive study of it clefts and wh-clefts in English and Swedish texts and translations. Linguistics 39(3). 547–582. DOI:  http://doi.org/10.1515/ling.2001.023

Kaplan, Ronald M. & Riezler, Stefan & King, Tracy H. & Crouch, Richard S. & Maxwell, John T. & Johnson, Mark. 2002. Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL). Philadelphia, PA. 271–278.

Kuno, Susumu. 1987. Functional syntax: Anaphora, discourse and empathy. Chicago: University of Chicago Press.

Kush, Dave & Dahl, Anne. 2020. L2 Transfer of L1 island insensitivity: The case of Norwegian. Second Language Research. DOI:  http://doi.org/10.1177/0267658320956704

Kush, Dave & Lohndal, Terje & Sprouse, Jon. 2018. Investigating variation in island effects: A case study of Norwegian wh-extraction. Natural Language and Linguistic Theory 36(3). 743–779. DOI:  http://doi.org/10.1007/s11049-017-9390-z

Kush, Dave & Lohndal, Terje & Sprouse, Jon. 2019. On the island sensitivity of topicalization in Norwegian: An experimental investigation. Language 95(3). 393–420. DOI:  http://doi.org/10.1353/lan.0.0237

Kush, Dave & Omaki, Akira & Hornstein, Norbert. 2013. Microvariation in islands? In Sprouse, Jon & Hornstein, Norbert (eds.), Experimental syntax and island effects, 239–264. Cambridge, UK: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781139035309.013

Langendoen, Donald Terence & Savin, Harris. 1971. The projection problem for presuppositions. In Fillmore, Charles J. & Langendoen, Donald Terence (eds.), Studies in Linguistic Semantics, 54–60. New York, NY: Holt, Rinehart and Winston.

Lasnik, Howard & Lidz, Jeffrey L. 2017. The argument from the poverty of the stimulus. In Roberts, Ian (ed.), The Oxford Handbook of Universal Grammar, 221–248. Oxford, UK: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199573776.013.10

Legate, Julie Anne & Yang, Charles D. 2002. Empirical re-assessment of stimulus poverty arguments. The Linguistic Review 19(1). 151–162. DOI:  http://doi.org/10.1515/tlir.19.1-2.151

Lindahl, Filippa. 2014. Relative clauses are not always strong islands. Working Papers in Scandinavian Syntax 93. 1–25.

Lindahl, Filippa. 2017. Extraction from relative clauses in Swedish (Doctoral dissertation). University of Gothenburg.

Löwenadler, John. 2015. Relative clause extraction: Pragmatic dominance, processing complexity and the nature of crosslinguistic variation. Nordic Journal of Linguistics 38(1). 37–65. DOI:  http://doi.org/10.1017/S0332586515000050

MacWhinney, Brian. 1975. Rules, rote, and analogy in morphological formations by Hungarian children. Journal of Child Language 2(1). 65–77. DOI:  http://doi.org/10.1017/S0305000900000891

MacWhinney, Brian. 1982. Basic syntactic processes. In Kuczaj, Stan (ed.), Language Acquisition: Volume 1, 73–136. Lawrence Erlbaum.

MacWhinney, Brian. 2000. The CHILDES project: Tools for analyzing talk (3rd ed.). Mahwah, NJ: Lawrence Erlbaum.

Maling, Joan & Zaenen, Annie. 1982. A phrase-structure account of Scandinavian extraction phenomena. In Jacobson, Pauline & Pullum, Geoffrey K. (eds.), The nature of syntactic representation, 229–82. Dordrecht: Reidel. DOI:  http://doi.org/10.1007/978-94-009-7707-5_7

Maratsos, Michael P. & Kuczaj, Stan A. & Fox, D. E. & Chalkley, Mary Anne. 1979. Some empirical studies in the acquisition of transformational relations. In Collins, W. Andrew (ed.), The Minnesota Symposia on Child Psychology, 1–45. Hillsdale, NJ: Erlbaum.

Maxwell, John T. & Kaplan, Ronald M. 1993. The interface between phrasal and functional constraints. Computational Linguistics 19(4). 571–590.

Milsark, Gary L. 1977. Toward an explanation of certain peculiarities of the existential construction in English. Linguistic Analysis 3. 1–29.

Montag, Jessica L. 2019. Differences in sentence complexity in the text of children’s picture books and child-directed speech. First Language 39. 527–546. DOI:  http://doi.org/10.1177/0142723719849996

Montag, Jessica L. & MacDonald, Maryellen C. 2015. Text exposure predicts spoken production of complex sentences in 8-and 12-year-old children and adults. Journal of Experimental Psychology: General 144. 447–468. DOI:  http://doi.org/10.1037/xge0000054

Noble, Claire H. & Cameron-Faulkner, Thea & Lieven, Elena. 2018. Keeping it simple: The grammatical properties of shared book reading. Journal of Child Language 45. 753–766. DOI:  http://doi.org/10.1017/S0305000917000447

Nordgård, Torbjørn. 1991. Determinisme og syntaktisk flertydighet [Determinism and syntactic ambiguity]. Proceedings of the Scandinavian Conference of Computational Linguistics (p. 17).

Nyvad, Anne Mette & Christensen, Ken Ramshøj & Vikner, Sten. 2017. CP-recursion in Danish: A cP/CP-analysis. The Linguistic Review 34(3). 449–77. DOI:  http://doi.org/10.1515/tlr-2017-0008

Otsu, Yukio. 1981. Universal grammar and syntactic development in children: Toward a theory of syntactic development (Doctoral dissertation). Massachusetts Institute of Technology.

Pearl, Lisa. submitted. Poverty of the Stimulus Without Tears. Language Learning and Development.

Pearl, Lisa & Goldwater, Sharon. 2016. Statistical Learning, Inductive Bias, and Bayesian Inference in Language Acquisition. In Lidz, Jeffrey & Snyder, William & Pater, Joe (eds.), The Oxford Handbook of Developmental Linguistics, 664–695. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199601264.013.28

Pearl, Lisa & Lidz, Jeffrey. 2009. When domain-general learning fails and when it succeeds: Identifying the contribution of domain-specificity. Language Learning and Development 5(4). 235–265. DOI:  http://doi.org/10.1080/15475440902979907

Pearl, Lisa & Sprouse, Jon. 2013. Syntactic islands and learning biases: Combining experimental syntax and computational modeling to investigate the language acquisition problem. Language Acquisition 20. 23–68. DOI:  http://doi.org/10.1080/10489223.2012.738742

Perfors, Amy & Tenenbaum, Joshua & Regier, Terry. 2011. The learnability of abstract syntactic principles. Cognition 118(3). 306–338. DOI:  http://doi.org/10.1016/j.cognition.2010.11.001

Platzack, Christer. 1999. Satsfläta med Relativsats. In Haskå, I. & Sandqvist, C. (eds.), Alla tiders språk: En vänskrift till Gertrud Pettersson, 189–199.

Prince, Ellen F. 1978. A comparison of wh-clefts and it-clefts in discourse. Language 54(4). 883–906. DOI:  http://doi.org/10.2307/413238

Prince, Ellen F. 1981. Towards a taxonomy of given-new information. In Cole, Peter (ed.), Radical pragmatics, 223–255. San Diego, CA: Academic Press.

Pullum, Geoffrey K. & Scholz, Barbara C. 2002. Empirical assessment of stimulus poverty arguments. The Linguistic Review 19. 9–50. DOI:  http://doi.org/10.1515/tlir.19.1-2.9

R Core Team. 2018. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Online: https://www.r-project.org/.

Reinhart, Tanya. 1981. A second COMP position. In Belletti, Adriana (ed.), Theory of markedness in generative grammar: Proceedings of the 1979 GLOW conference, 518–557. Pisa: Scuola Normale Superiore.

Ringstad, Tina L. 2014. Byggeklossar i barnespråk [Building blocks in child language] (Master’s thesis). Norwegian University of Science and Technology. Trondheim: NTNU.

Ringstad, Tina L. 2019. Distribution and function of embedded V–Neg in Norwegian: A corpus study. Nordic Journal of Linguistics 42(3). 329–363. DOI:  http://doi.org/10.1017/S0332586519000210

Rizzi, Luigi. 1980. Violations of the Wh-island constraint in Italian and the Subjacency condition. Journal of Italian Linguistics 5(1). 157–191. DOI:  http://doi.org/10.1515/9783110883718.49

Rizzi, Luigi. 1982. Issues in Italian syntax. Dordrecht: Foris. DOI:  http://doi.org/10.1515/9783110883718

Rizzi, Luigi. 1990. Relativized minimality. Cambridge, MA: MIT Press.

Roland, Douglas & Dick, Frederic & Elman, Jeffrey L. 2007. Frequency of basic English grammatical structures: A corpus analysis. Journal of Memory and Language 57. 348–379. DOI:  http://doi.org/10.1016/j.jml.2007.03.002

Rosén, Victoria & De Smedt, Koenraad & Meurer, Paul & Dyvik, Helge. 2012. An open infrastructure for advanced treebanking. In Hajič, Jan & De Smedt, Koenraad & Tadić, Marko & Branco, Antonio (eds.), META-RESEARCH Workshop on Advanced Treebanking at LREC2012, 22–29.

Rosén, Victoria & Meurer, Paul & De Smedt, Koenraad. 2009. LFG Parsebanker: A toolkit for building and searching a treebank as a parsed corpus. In Van Eynde, Frank & Frank, Annette & De Smedt, Koenraad & van Noord, Gertjan (eds.), Proceedings of the seventh international workshop on treebanks and linguistic theories (TLT7), 127–133.

Ross, John Robert. 1967. Constraints on variables in syntax (Doctoral dissertation). Massachusetts Institute of Technology. [Published as Infinite syntax!, Norwood, NJ: Ablex, 1986.]

Sakas, William G. & Fodor, Janet D. 2001. The Structural Triggers Learner. In Bertolo, Stefano (ed.), Language Acquisition and Learnability, 172–233. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511554360.006

Sakas, William G. & Fodor, Janet D. 2012. Disambiguating Syntactic Triggers. Language Acquisition 19(2). 83–143. DOI:  http://doi.org/10.1080/10489223.2012.660553

Sant, Charlotte & Strætkvern, Sunniva B. & Kush, Dave. 2019. En konstruksjon vi ikke ennå forstår hva er: en korpusstudie av syntaktiske øybrudd i norsk. [A construction we don’t yet understand ‘what is’: a corpus study of syntactic island violations in Norwegian]. Presentation at Møte om norsk språk (MONS18), 27–29.

Sichel, Ivy. 2018. Anatomy of a counterexample: Extraction from relative clauses. Linguistic Inquiry 49(2). 335–378. DOI:  http://doi.org/10.1162/LING_a_00275

Simons, Mandy. 2007. Observations on embedding verbs, evidentiality, and presupposition. Lingua 117(6). 1034–1056. DOI:  http://doi.org/10.1016/j.lingua.2006.05.006

Søfteland, Åshild. 2014. Utbrytingskonstruksjonen i norsk spontantale [The cleft construction in Norwegian spontaneous speech] (Doctoral dissertation). University of Oslo.

Szabolcsi, Anna & Lohndal, Terje. 2017. Strong vs. Weak Islands. In Everaert, Martin & Van Riemsdijk, Henk C. (eds.), The Blackwell Companion to Syntax (2nd ed.). New York, NY: John Wiley & Sons. DOI:  http://doi.org/10.1002/9781118358733.wbsyncom008

Szabolcsi, Anna & Zwarts, Frans. 1993. Weak islands and an algebraic semantics for scope taking. Natural Language Semantics 1. 235–284. DOI:  http://doi.org/10.1007/BF00263545

Taraldsen, Knut Tarald. 1982. Extraction from relative clauses in Norwegian. In Engdahl, Elisabet & Ejerhed, Eva (eds.), Readings on unbounded dependencies in Scandinavian, 205–221. Stockholm: Almqvist & Wiksell International.

Tomasello, Michael. 2000. Do young children have adult syntactic competence? Cognition 74(3). 209–253. DOI:  http://doi.org/10.1016/S0010-0277(99)00069-4

Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.

Van Valin, Robert. 1995. Toward a functionalist account of so-called ‘extraction constraints’. In Devriendt, Betty & Goossens, Louis & van der Auwera, Johan (eds.), Complex Structures: A Functionalist Perspective, 29–60. Berlin: Mouton de Gruyter.

Van Valin, Robert. 1998. The acquisition of wh-questions and the mechanisms of language acquisition. In Tomasello, Michael (ed.), The new psychology of language: Cognitive and functional approaches to language structure. Hillsdale, NJ: Lawrence Erlbaum Associates.

Verhagen, Arie. 2006. On subjectivity and “long distance wh-movement”. In Athanasiadou, Angeliki & Canakis, Costas & Cornillie, Bert (eds.), Subjectification: Various paths to subjectivity, 323–346. New York: Mouton de Gruyter.

Vikner, Sten & Christensen, Ken Ramshøj & Nyvad, Anne Mette. 2017. V2 and cP/CP. In Bailey, Laura R. & Sheehan, Michelle (eds.), Order and structure in syntax I: Word order and syntactic structure, 313–324. Berlin: Language Science Press.

Westergaard, Marit. 2005. Optional word order in wh-questions in two Norwegian dialects: a diachronic analysis of synchronic variation. Nordic Journal of Linguistics 28. 269–296. DOI:  http://doi.org/10.1017/S0332586505001459

Westergaard, Marit & Bentzen, Kristine. 2007. The (non-)effect of input frequency on the acquisition of word order in Norwegian embedded clauses. In Gülzow, Insa & Gagarina, Natalia (eds.), Frequency Effects in Language Acquisition: Defining the Limits of Frequency as an Explanatory Concept, 271–306. Berlin: Mouton de Gruyter. DOI:  http://doi.org/10.1515/9783110977905.271

Wexler, Kenneth & Manzini, M. Rita. 1987. Parameters and Learnability in Binding Theory. In Roeper, Thomas & Williams, Edwin (eds.), Parameter Setting, 41–76. Dordrecht: Reidel. DOI:  http://doi.org/10.1007/978-94-009-3727-7_3

Yang, Charles. 2002. Knowledge and Learning in Natural Language. Oxford: Oxford University Press.

Yang, Charles. 2004. Universal Grammar, statistics, or both? Trends in Cognitive Sciences 8(10). 451–456. DOI:  http://doi.org/10.1016/j.tics.2004.08.006

Yang, Charles. 2011. Computational models of syntactic acquisition. WIREs Cognitive Science. DOI:  http://doi.org/10.1002/wcs.1154