1 Introduction

1.1 The status – and functions – of nuclear and prenuclear pitch accents

Most studies on the relation between prosody and meaning restrict themselves to form and function of nuclear accents, commonly defined as the last pitch accent in an intonation unit and as the structural head of this unit. As such, the nucleus has a special status in the prosodic hierarchy (see, e.g., Shattuck-Hufnagel & Turk 1996), being the only obligatory accent within its domain (generally the intermediate phrase). Semantic-pragmatically, the nuclear accent is important since its position determines the interpretation of an utterance’s information structure, as in the famous example (1): the sentence Dogs must be carried (written on a sign in the London underground) has two crucially different meanings depending on whether the final accent falls on carried (as in 1a) or on dogs (as in 1b).1

(1) Dogs must be carried.
  a. Dogs must be CARried.  
  b. DOGS must be carried. (adapted from Halliday 1967)

The meaning of (1a) can be paraphrased as ‘If you have a dog, you have to carry it’, whereas (1b) has to be interpreted as ‘Everybody has to carry a dog’.

By contrast, little attention has so far been paid to the investigation of prenuclear accents, defined as pitch accents that occur before the nucleus within the same intonation unit. Both their structural and functional status is unclear, since previous studies obtain inconsistent results. To start with, it has been shown that prenuclear accents yield a lower inter-transcriber agreement than nuclear accents (see studies using the ToBI model for English and German, such as Pitrelli et al. 1994; Syrdal & McGory 2000; Grice et al. 1996), and also that there is a relatively low listener sensitivity to prenuclear accents which may surface as longer reaction times in an accent recognition task (Jagdfeld & Baumann 2011). In the latter experiment, which used cross-splicing, acoustically identical weak accents were perceived much more often as accents if they stood in nuclear position and less so if they appeared in prenuclear position instead. This low stability is in line with Büring’s (2007) claim that prenuclear accents are only optional, or ornamental, especially on prefocal elements, i.e. on elements that are not F(ocus)-marked. Example (2) is adopted from Büring (2007) and suggests that prenuclear accents (indicated by small capitals) on the subject Gus as well as the verb voted are possible but not obligatory.

(2) Who did Gus vote for? GUS VOTED [for a friend of his neighbors from LITtleville]F

Along similar lines, it has been suggested, for example by Calhoun (2010), that prenuclear accents are used due to a general principle of rhythmic organisation and that they do not reliably mark information structural distinctions. Especially nouns (and presumably all content words) that precede the nuclear accent within a long phrase are likely to obtain a pitch accent to preserve the expected rhythmicity of an utterance.

Actually, following Gussenhoven (2015), the finding that the word category is relevant for the placement of pitch accents can be used as an argument in favour of morphosyntactic reasons for pitch accent distribution (in English) and against a rhythmic or, more generally, phonological approach. In his view, morphosyntax – next to the speaker’s intention to mark focus – is the most important driving force for accent placement in an English sentence.

Such an approach does not seem to be highly compatible with the idea that prenuclear accents are merely optional. In fact, there are some production studies which showed that prenuclear accents are placed consistently, irrespective of differences in information structure. For example, this has been found for textually given information in narrow focus contexts (Baumann et al. 2007; Féry & Kügler 2008), as well as for topics in topic-comment structures (cf. Braun 2006). Most interestingly, however, in these studies prenuclear accents displayed subtle changes in peak scaling or peak alignment, which expressed meaning differences.

Féry & Kügler (2008) investigated the prosody of given and new referents in pre- and postnuclear position in a reading study on German, using utterances like ‘Weil der Hummer dem Löwen den Rammler vorgestellt hat’ (Because the lobster introduced the buck to the lion). The authors show that both given and new referents consistently carry prenuclear accents, and that given items (the nucleus being on new information and in narrow focus) are realised with slightly lower accents compared to prenuclear accents realised on items that were contextually new. That is, information status had an influence on peak height and pitch range. Similarly, Braun (2006) found a slight prosodic difference between contrastive and non-contrastive topics in German. In both realisations of a target sentence such as In Armenien schreibt man Lateinisch (‘In Armenia the Latin alphabet is used’) participants produced a prenuclear accent on the sentence topic Armenien (‘Armenia’). Nevertheless, in the contrastive condition, the accent was produced with a later and higher F0 peak as opposed to an earlier and lower F0 peak when realised in the non-contrastive condition.

A recent eye-tracking study on contrastive topics by Braun & Biezma (2019) provided further evidence for the information structural relevance of prenuclear accents in German. The investigation confirmed that a prenuclear L*+H accent (i.e. a rising accent with a low tonal target in the accented syllable, following the annotation system GToBI; Grice, Baumann & Benzmüller 2005), leads to the activation of focus alternatives (see next section). That is, participants fixated more on a contrastive alternative when the subject of a target sentence was produced with an L*+H accent (in comparison with another accent type), with the same effect size and timing as reported for focus constituents. This result shows, the authors claim, that this type of prenuclear accent functions like a nuclear focus accent, which in turn serves as evidence against the claim that prenuclear accents are just ornamental.

1.2 Information structure, informativeness and prosodic prominence

Which information structural categories are we dealing with in the present paper? We follow Krifka’s (2008) notion of (sentence) topic as an aboutness topic (along the lines of, e.g., Reinhart 1981), which may be contrastive or not. If it contrasts with another topic, it contains an element marked as focus, which is defined – following Rooth’s (1985; 1992) Alternative Semantics – as that part of an utterance that evokes alternatives that are relevant for the interpretation of linguistic expressions (Krifka 2008: 247). Thus, contrast is to be understood (also in the present paper) as the availability of (explicit) alternatives, which can either occur as a contrastive focus or as a contrastive topic, depending on the position and role of an argument in an utterance. The latter can be interpreted as a “focus within a topic” (a notion used by Braun & Biezma 2019), indicating that both concepts are independent, unlike e.g. Büring’s (2016) notion of contrastive topic, which is an information structural concept in its own right (see the discussion in Braun & Biezma 2019). We refer to the information status of nominal expressions when aspects related to their givenness or novelty in discourse are concerned (see the overview in Baumann & Riester 2012).

We would like to introduce informativeness as a useful notion that relates to both the information status of a referring expression and its role as part of a specific focus domain. It is thus meant to account for the (fairly “objective”) level of newness of an item in the discourse as well as its pragmatic role in a proposition (Lambrecht 1994: 323), i.e. the level of newsworthiness a speaker assigns to the item. It is assumed that a referential expression is getting more informative the “newer” it is in discourse and the more specific its focus domain is, i.e. the fewer alternatives are available. This is tantamount to saying that an item is getting more informative from broad to narrow focus and from narrow to contrastive focus (or topic), since in the latter case the alternatives are made explicit – and thus reduced in number. Background elements can be considered to be least informative since they are already available in the discourse (generally being conflated with givenness at the level of information status). There is no (other) information structural concept which directly links these two levels of information status and focus which we claim to be a relevant semantic-pragmatic correlate of prosodic prominence (see also the usage of informativeness in Baumann et al. 2019 and Baumann et al. 2020). In the present paper, we are using informativeness to describe the effects of information status and contrast at a single level of analysis.

At least for German, but to some extent for other West Germanic languages as well (e.g. Ayers 1996 for American English), there is evidence that aspects of informativeness are mediated by corresponding levels of prosodic prominence. For instance, it has been shown that smaller or more specific focus domains are probabilistically marked by more prominent accent types in German (Mücke & Grice 2014), as illustrated in Figure 1b. There is independent evidence for an increase in perceptual prominence from no accents through falling pitch accents (GToBI type H+!H*) and high accents (H*) to rising pitch accents (L+H*) in German (see Baumann & Röhr 2015). Figure 1b shows that the highest percentage of rising accents could be found for contrastively focused items, getting lower in narrow focus and even lower in broad focus (the elicitation setup with question-answer pairs is displayed in Figure 1a). Backgrounded items are deaccented and thus prosodically least prominent.

Figure 1

(a) Speech material of the production experiment and (b) Distribution of GToBI accent types including schematic pitch contours in the vicinity of the (shaded) accented syllable (see Mücke & Grice 2014: 52–53).

Not only focus but also information status (or level of givenness/newness) is expressed by differences in accent type which in turn differ as to their level of prominence (again based on the results of the perception study by Baumann & Röhr 2015). Both a corpus study of read German (Baumann & Riester 2013) and a perception study (Röhr & Baumann 2011) in which participants had to judge an item’s degree of givenness (without presenting them the context in which the item was uttered) revealed a correspondence of newness with more prominent accents (here: high accent types), semi-givenness (e.g. bridged referential expressions) with less prominent accents (here: low and falling accent types) and givenness with prenuclear accents or deaccentuation. The finding that a prenuclear accent makes a referent sound relatively “given” in turn suggests that prenuclear accents are generally perceived as less prominent than nuclear ones (all five accent types tested in Röhr & Baumann (2011) occurred in nuclear position). This effect may also be supported by the listeners’ top-down knowledge that (at least in West Germanic but also in many other languages) cognitively accessible referents are often placed early in an utterance (see, e.g., Bock & Warren 1985 and, for a broader overview, Wagner 2016).

Additional evidence for the interplay of prosodic prominence and position in an utterance comes from another independent study in which naïve listeners had to decide for all words in a set of spoken excerpts whether they perceived a word as prominent or not (Baumann & Winter 2018; using the Rapid Prosody Transcription method, see Cole & Shattuck-Hufnagel 2016). Figure 2 indicates that prenuclear accents were judged as prominent to a much lower extent than nuclear accents. The study further confirms the prominence ranking of accent types found in Baumann & Röhr (2015) mentioned above.

Figure 2

Results of accent position and accent type in a prominence judgment task on German (Baumann & Winter 2018); ‘ip’ stands for ‘intermediate phrase’, i.e. a smaller intonation unit, and ‘IP’ stands for ‘intonation phrase’, i.e. a larger intonation unit (see, e.g., Shattuck-Hufnagel & Turk 1996).

However, the categorical criteria ACCENT TYPE and ACCENT POSITION are not the only relevant factors for prominence perception. There are a couple of continuous-valued phonetic parameters which have been found in previous studies to lend prominence, such as DURATION of syllables and words (e.g. Turk & Sawusch 1996), INTENSITY (e.g. Kochanski, Grabe, Coleman & Rosner 2005) and local pitch movement, i.e. tonal RANGE and SLOPE (e.g. Rietveld & Gussenhoven 1985). A more recent prominence measure is the Tonal Center of Gravity (TCoG; Barnes, Veilleux, Brugos & Shattuck-Hufnagel 2012), which is a holistic parameter that incorporates the shape of the contour and the alignment or scaling of turning points. These aspects have been found to be relevant in the studies by Féry & Kügler and Braun mentioned above (and further discussed below), which marked slight but potentially meaningful differences between prenuclear accents in German.

1.3 Hypothesis of the present study

In the light of the results of the (few) previous studies on the relation between form and function of prenuclear accents in German, the present production study examines whether differences in the information structure of a sentence-initial argument (topic) influence its prosodic realisation. Concretely, we expect a direct relation between informativeness (comprising information status and contrast) and the prosodic prominence of sentence-initial (prenuclear) target words, i.e. we can formulate the following hypothesis:

(3) Hypothesis: The more informative a referent is, the more prominent is its prosodic marking.

In terms of discrete categories, we expect the following probabilistic mapping between levels of informativeness and accent types (based on the studies discussed above):

(4) given → no accent
  accessible → low accent
  new → high accent
  contrastive → rising accent

As to continuous parameters, we expect to find longer durations, higher intensities and F0 ranges, and steeper slopes from given through accessible and new to contrastive target words.

In what follows, we first introduce our methodological procedure in sections 2.1–2.3. In section 2.4 we present our results for the categorical and continuous parameters followed by a discussion and conclusion of our findings in section 3.

2 Experiment

2.1 Material

In this experiment, we used 20 target words (nouns) with a trochaic syllable structure and containing mostly sonorous material (cf. Table 1). Each of these words constituted the first argument in the sentence, i.e. it was both the grammatical subject and the sentence topic. Target sentences were integrated into a mini story, consisting of three sentences (see example in (5)).

Table 1

Disyllabic target words with lexical stress on the first syllable.

Bauer ‘farmer’
Bruder ‘brother’
Dame ‘lady’
Diener ‘servant’
Dingo ‘dingo’
Dogge ‘Great Dane’
Hase ‘hare’
Heldin ‘heroine’
Junge ‘boy’
Laie ‘amateur’
Lehrer ‘teacher’
Lehrling ‘apprentice’
Maler ‘painter’
Mama ‘mom’
Meise ‘titmouse’
Nanny ‘nanny’
Nonne ‘nun’
Rabbi ‘rabbi’
Räuber ‘robber’
Sammler ‘collector’

(5) Nach dem langen Winter freuten sich alle auf ein paar sonnige Stunden im Freien.
  Im Klostergarten blühten die ersten Pflanzen.
  Die Nonne hat einen Mandelbaum gegossen.
  ‘After the long winter everybody was looking forward to a couple of sunny hours in the open. The first plants bloomed in the cloister garden. The nun watered an almond tree.’

The first context sentence within a story set was held constant introducing the setting for each story version. This first sentence was then combined with one out of four second context sentences. By varying the second context sentence, four conditions were designed rendering the subject in the target sentence either given, accessible, new or contrastive.2 In example (5) above, the referent is accessible due to the mention of ‘cloister garden’ which sets the scenario for the occurrence of a ‘nun’. Table 2 gives an example of all four mini stories for the target word Nonne (‘nun’), where referring expressions that are assumed to have an immediate influence on the informativeness of the target word (i.e. in the given, accessible and contrastive contexts) are printed in bold face.

Table 2

Mini stories for the target word Nonne (‘nun’).

Context 1: Nach dem langen Winter freuten sich alle auf ein paar sonnige Stunden im Freien.
  ‘After the long winter everybody was looking forward to a couple of sunny hours in the open.’
Context 2a (given): Die Nonne kümmerte sich um den Klostergarten.
  ‘The nun was looking after the cloister garden.’
Context 2b (accessible): Im Klostergarten blühten die ersten Pflanzen.
  ‘The first plants bloomed in the cloister garden.’
Context 2c (new): Die Sonne schien schon den ganzen Tag und der Schnee war endlich geschmolzen.
  ‘The sun had been shining all day and the snow had finally melted.’
Context 2d (contrastive): Der Mönch hat einen Brombeerstrauch gegossen.
  ‘The monk watered a blackberry bush.’
Target: Die Nonne hat einen Mandelbaum gegossen.
  ‘The nun watered an almond tree.’

These manipulations resulted in a total of 80 mini stories (20 target words * 4 conditions).

2.2 Participants and procedure

Twenty-nine native speakers of German (21 female, 8 male), aged between 19 and 30 years, participated in this reading experiment. Twenty-three of them (79.3%) originated from North Rhine-Westphalia, speaking a variety from the West of Germany.3 All participants gave written informed consent, and the experiment was performed in accordance with the Declaration of Helsinki.

Each participant was presented with 20 different mini stories on a computer screen, using PsychoPy2 (Peirce & MacAskill 2018, version 1.82.01). The order of the stories was randomised by generating different stimulus lists in R (R Core Team 2017, version 3.4.1) by means of a randomisation script. For each participant an individual list was loaded into PsychoPy. Within PsychoPy, participants passed through the instructions and the stimulus presentation by themselves by pressing the right arrow button. Participants were instructed to first read each story for themselves before reading it out loud. In order to reduce the workload for each participant and since they should not familiarise themselves too much with the content of the specific stories, each participant read only one mini story (i.e. one condition) per target word, resulting in five realisations of each condition per speaker.
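The counterbalancing described above (one condition per target word, five items per condition) can be sketched as follows. The authors generated their lists with an R script that is not reproduced in the paper, so this Python version is only a minimal illustration: the function name `make_stimulus_list`, the seeding and the shuffling strategy are assumptions, and the paper does not say whether word-condition pairings were balanced across participants.

```python
import random
from collections import Counter

TARGET_WORDS = ["Bauer", "Bruder", "Dame", "Diener", "Dingo", "Dogge", "Hase",
                "Heldin", "Junge", "Laie", "Lehrer", "Lehrling", "Maler", "Mama",
                "Meise", "Nanny", "Nonne", "Rabbi", "Räuber", "Sammler"]
CONDITIONS = ["given", "accessible", "new", "contrastive"]


def make_stimulus_list(seed):
    """Return a shuffled list of (word, condition) pairs for one participant.

    Hypothetical helper: every target word gets exactly one condition and
    every condition is used equally often (20 / 4 = 5 times).
    """
    rng = random.Random(seed)
    # Five copies of each condition, in random order.
    conditions = CONDITIONS * (len(TARGET_WORDS) // len(CONDITIONS))
    rng.shuffle(conditions)
    # Pair each word with one condition, then randomise presentation order.
    trials = list(zip(TARGET_WORDS, conditions))
    rng.shuffle(trials)
    return trials


trials = make_stimulus_list(seed=1)
counts = Counter(condition for _, condition in trials)
```

Each call with a different seed yields a different individual list, which mirrors the idea of loading one list per participant.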

By advising the speakers to “tell the story to a friend”, we aimed at triggering a natural but swift speech rate in the participants’ productions. The speech rate was further primed by two preceding training items: for these items, two additional context sentences had been pre-recorded by one of the authors and were then presented to participants both auditorily and visually. The target sentence that participants had to read out loud was presented visually. Training items were used to ensure a similar speech rate across speakers, in that participants adapt to the speech rate they perceived in the pre-recorded sentences.

After each story, participants had to answer a content question on the second context sentence, to ensure that they paid attention to what they read and hence parsed the underlying information structure correctly. The questions were designed as in the following example: War im Klostergarten der Winter eingebrochen? (‘Had winter begun in the cloister garden?’). Answers were given by pressing the keys ‘y’ for ‘yes’ or ‘n’ for ‘no’.

The recording took place in a sound-proof booth using a head-mounted microphone (AKG C 544 L) with a mouth-microphone distance of approximately five centimetres that was held constant across speakers. Recordings were performed at a sampling rate of 44,100 Hz and 16-bit resolution. Completion of the task took 30–45 minutes, and all subjects were paid for participation.

2.3 Data analysis

In total 580 utterances were recorded (29 participants * 20 target words). However, due to wrong answers to the content question, hesitations or phrase breaks after target words, which turn prenuclear accents into nuclear accents, 87 utterances (i.e. 15% of the data)4 were excluded. Hence, 493 utterances entered the analysis.

All utterances were annotated at the word level, adding the beginning and end of the stressed syllable of the target word, prosodic boundaries and ACCENT TYPE in Praat (Boersma & Weenink 2018). The classification of accent types (and phrase breaks) followed GToBI and was based on a consensus judgment of two trained phoneticians (see Figure 3).

Figure 3

Praat screen shot of the annotated target sentence Die Nonne hat einen Mandelbaum gegossen (‘The nun watered an almond tree’). From top to bottom the screenshot displays the oscillogram, the spectrogram and the F0 contour, as well as annotation levels for all words, the stressed syllable of the target word, low and high turning points in the target word (indicated by ‘1’ and ‘2’), GToBI accent types and – if applicable – phrase breaks (cf. Grice et al. 2005).

Furthermore, several continuous-valued phonetic parameters were measured, including the duration of target words (WORD DURATION), the duration of the stressed syllable (SYLLABLE DURATION), the RMS amplitude (i.e. the INTENSITY) of the stressed syllable, as well as the TONAL RANGE in semitones (st) and the F0 SLOPE (in st per milliseconds) in the vicinity of the stressed syllable. For calculating RANGE and SLOPE, we extracted the minimal and maximal F0 values within and surrounding the region of the stressed syllable (labelled as ‘1’ and ‘2’), excluding perturbations due to micro-prosody.
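To make the two tonal measures concrete, the sketch below uses the standard conversion of a frequency ratio to semitones, 12 · log2(f_high / f_low). Dividing the range by the time between the low and high turning points to obtain the slope is an assumption about the procedure, and the function names and F0 values are hypothetical rather than taken from the data.

```python
import math


def tonal_range_st(f0_low_hz, f0_high_hz):
    """Distance between two F0 values in semitones (12 st = one octave)."""
    return 12.0 * math.log2(f0_high_hz / f0_low_hz)


def slope_st_per_ms(f0_low_hz, f0_high_hz, t_low_ms, t_high_ms):
    """Tonal range divided by the time between the turning points (assumed definition)."""
    return tonal_range_st(f0_low_hz, f0_high_hz) / (t_high_ms - t_low_ms)


# Hypothetical turning points '1' (low) and '2' (high) of one target word.
range_st = tonal_range_st(180.0, 240.0)
slope = slope_st_per_ms(180.0, 240.0, t_low_ms=100.0, t_high_ms=300.0)
```

With these invented values the rise spans roughly 5 st over 200 ms, i.e. about 0.025 st/ms.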

In addition, the Tonal Center of Gravity (TCoG) was measured, following Barnes, Veilleux, Brugos & Shattuck-Hufnagel (2012). The TCoG constitutes a holistic measure that incorporates contour shape and ALIGNMENT or SCALING of turning points representing the balancing point of the area under the curve (cf. Figure 4a & b). Thus, an accent with a convex contour shape exhibits an earlier alignment point as opposed to an accent which is characterised by a rather concave shape (illustrated by the coloured vertical dashed lines in Figure 4a & b). Similarly, the center of gravity in the scaling dimension is shifted upwards, since the area density is more concentrated in the higher pitch region under the curve, as is the case in convex contours. Opposed to that, a concave contour would lead to a lower TCoG scaling point as the area density is higher in the lower pitch region (illustrated by the coloured horizontal dashed lines in Figure 4a & b).

Figure 4

TCoG alignment (x-axis) and TCoG scaling (y-axis) for (a) a convex F0 contour coded in red and (b) a concave F0 contour coded in blue, both in relation to a symmetrical rising-falling contour (in black; illustrations adapted from Barnes 2017). The x’s indicate the position of the respective center of gravity.
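The "balancing point of the area under the curve" can be illustrated as the geometric centroid of a sampled F0 contour. This is a sketch, not the procedure of Barnes et al. (2012): it assumes F0 is expressed as height above a baseline (so all values are non-negative) and uses a simple trapezoidal rule; the function name and the two toy contours are hypothetical.

```python
import math


def tonal_center_of_gravity(times, f0):
    """Return (alignment, scaling) of the centroid of the area under an F0 curve."""
    area = moment_t = moment_f = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        # Trapezoidal rule for the area and for the two first moments.
        area += 0.5 * (f0[i] + f0[i + 1]) * dt
        moment_t += 0.5 * (times[i] * f0[i] + times[i + 1] * f0[i + 1]) * dt
        moment_f += 0.5 * (0.5 * f0[i] ** 2 + 0.5 * f0[i + 1] ** 2) * dt
    return moment_t / area, moment_f / area


# Normalised time axis and two toy rises from 0 to 1 (arbitrary units).
times = [i / 1000 for i in range(1001)]
convex = [math.sqrt(t) for t in times]   # rises quickly, then flattens
concave = [t ** 2 for t in times]        # starts flat, rises late

convex_align, convex_scale = tonal_center_of_gravity(times, convex)
concave_align, concave_scale = tonal_center_of_gravity(times, concave)
```

For these toy contours the convex shape yields the earlier alignment point and the higher scaling point, in line with the description above.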

All inferential statistics were performed in R (R Core Team 2020, version 4.0.2). For the discrete dependent variable ACCENT TYPE, we tested whether the presence or absence of a specific accent type was affected by informativeness using logistic regression, employing the glmer function of the lme4 package (Bates, Maechler, Bolker & Walker 2015). The continuous dependent variables were analysed with linear mixed-effects models, also employing the lme4 package.

In the process of statistical modelling (both logistic and linear regression), we followed a top-down approach, beginning with the most complex model, which we will call the full model. The full model comprises a fixed effect for informativeness (i.e. the respective condition, with given as the baseline level) as well as random effects: in order to account for individual behaviour of both participants and target words, we included random intercepts for gender as well as random slopes for both informativeness per speaker and informativeness per target word. In case a full model did not converge, we stripped down our models, reducing the number of random slopes and using random intercepts instead. This led to a model with only one random slope for informativeness per speaker and a random intercept for target word, since we are more interested in the individual behaviour of participants than of target words. In case this single-slope model still did not converge, we built even less complex models, employing only random intercepts. This process was repeated until the model converged. We evaluated the models’ goodness-of-fit by performing likelihood ratio tests with a null model (i.e. comparison with a model that does not contain the fixed effect), which computed a χ2-value and a p-value. We will explicitly report on the model that was found to be most likely to explain the data for each dependent variable individually in the results section.
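The top-down procedure amounts to trying progressively simpler random-effects structures until one converges. The sketch below shows only that control flow in Python. The formula strings are written in lme4-style notation as a guess at what the models may have looked like, the column names (`y`, `speaker`, `word`, `gender`) are hypothetical, and `stub_fit` is a stand-in for a real mixed-model fitter, not part of any library.

```python
# Candidate models from most to least complex (assumed formulas).
CANDIDATE_FORMULAS = [
    # Full model: by-speaker and by-word random slopes.
    "y ~ informativeness + (1 | gender) "
    "+ (1 + informativeness | speaker) + (1 + informativeness | word)",
    # Single-slope model: slope per speaker, intercept per word.
    "y ~ informativeness + (1 | gender) "
    "+ (1 + informativeness | speaker) + (1 | word)",
    # Simplest model: random intercepts only.
    "y ~ informativeness + (1 | gender) + (1 | speaker) + (1 | word)",
]


def first_converging_model(formulas, fit):
    """Try formulas in order and return the first (formula, model) that converges.

    `fit` is any callable that returns None when a model fails to converge.
    """
    for formula in formulas:
        model = fit(formula)
        if model is not None:
            return formula, model
    raise RuntimeError("no candidate model converged")


def stub_fit(formula):
    """Stand-in fitter: pretends that by-word random slopes never converge."""
    if "informativeness | word" in formula:
        return None
    return {"formula": formula}


chosen_formula, chosen_model = first_converging_model(CANDIDATE_FORMULAS, stub_fit)
```

With the stub, the full model is rejected and the single-slope model is selected; a real analysis would replace `stub_fit` with a call to the actual fitting routine.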

Lastly, we followed two different approaches to evaluate the actual effects, depending on the type of regression. For the categorical data, we calculated odds ratios, employing the standard error, to evaluate effect sizes. The odds ratio indicates if and how the odds for a specific response change when the input is systematically varied. For the continuous data, we performed post-hoc Tukey tests5 to assess which differences between the levels of informativeness were significant. One major advantage of this test is that it automatically adjusts the p-value for multiple comparisons.
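As a brief illustration of how an odds ratio relates to a logistic-regression coefficient: the coefficient is on the log-odds scale, so exponentiating it gives the odds ratio, and exponentiating the coefficient plus or minus about 1.96 standard errors gives an approximate 95% Wald interval. The coefficient and standard error below are invented for the example, and the function name is hypothetical.

```python
import math


def odds_ratio(coefficient, standard_error, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a log-odds coefficient."""
    return (
        math.exp(coefficient),
        math.exp(coefficient - z * standard_error),
        math.exp(coefficient + z * standard_error),
    )


# Hypothetical values on the log-odds scale (not taken from the study).
estimate, lower, upper = odds_ratio(-1.0, 0.5)
```

An odds ratio below 1 means that the odds of the response decrease relative to the baseline level, and a value above 1 means that they increase.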

2.4 Results

2.4.1 Categorical parameters

As to the distribution of accent types, we hypothesize that given information will be deaccented, accessible information will be marked by a low accent, new information by a high accent, and contrastive referents by a rising accent (see (4) above).

The plot in Figure 5 summarizes the contours of all target sentences arranged by experimental condition, including average pitch contours. The plot indicates that the target sentences were produced very similarly across conditions, although the rise in contrastive items appears to be shallower than the rise in the other three conditions.

Figure 5

Spaghetti plot showing the time-normalized intonation contours of the target sentences in each of the four experimental conditions with the items superimposed on each other and with average contours (in red). Note that the F0 is computed relative to the mean of each utterance in semitones, levelling the differences between male and female speakers.
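The levelling effect of expressing F0 relative to the utterance mean can be sketched as follows. Whether the mean was taken over raw Hz values, as assumed here, is not stated in the caption, and the function name and F0 values are hypothetical.

```python
import math


def semitones_re_mean(f0_hz):
    """Convert F0 samples (Hz) to semitones relative to their arithmetic mean."""
    mean_hz = sum(f0_hz) / len(f0_hz)
    return [12.0 * math.log2(f / mean_hz) for f in f0_hz]


# Two invented contours with the same shape, one an octave above the other.
lower_voice = semitones_re_mean([100.0, 120.0, 110.0])
higher_voice = semitones_re_mean([200.0, 240.0, 220.0])
```

Because the conversion depends only on frequency ratios, both contours map onto identical semitone values, which is what allows male and female speakers to be plotted on one scale.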

Looking at the GToBI labels, we do not observe major effects of informativeness, however. While there are hardly any cases of deaccentuation, 92% of the data is produced with a prenuclear rise, comprising L*+H in 74.2% and L+H* in 17.8% of all cases (see Figure 6 and the average contours in Figure 5). The logistic regression confirms that informativeness does not impact the presence of deaccentuation (χ2(2) = 0, p = 1),6 L* (χ2(3) = 0, p = 1),7 H* (χ2(3) = 3.8595, p = .28),8 or L*+H (χ2(3) = 2.0993, p = .55). However, we observe a trend that the presence of L+H* is impacted by the level of informativeness, decreasing the odds of an L+H*-accent in the contrastive condition (compared to given) by 0.36 to 1. Still, this effect falls short of reaching significance, since model comparison returns χ2(3) = 7.4352 with p = .059.

Figure 6

Distribution of GToBI accent types across test conditions.

Similarly, when concentrating on the direction of tonal movements, subsuming L*+H and L+H* under the category ‘rise’, we only observe a trend for informativeness affecting ACCENT TYPE (see Figure 7). The odds for a rising accent in accessible items increase by 1.65 to 1 (compared to given items) and the odds for a rising accent in contrastive items decrease by 0.65 to 1 (compared to given items). In the case of new items, the odds for a rising accent increase by even 2.46 to 1 (compared to given items) due to a lack of deaccentuation and low accents, and fewer instances of high accents. However, we can merely speak of a tendency, since model comparison,9 again, returned a non-significant value, with χ2(3) = 6.8034 and p = .078.

Figure 7

Distribution of accent types comprised as tonal movement directions across test conditions.

We still report this pattern as a tendency, since a post-hoc power analysis, carried out with G*Power (Version 3.1.9.6), revealed low power for the categorical variables, yielding a value of 0.53. Accordingly, it is possible that our results are due to a type 2 error, i.e. we might find an effect of informativeness on accent type if we re-ran the experiment with a larger sample (with 54 subjects, for instance, we would reach a power of 0.80).

2.4.2 Continuous parameters

In contrast to the categorical parameters, we find a main effect of informativeness on many, though not all, continuous parameters. For WORD DURATION, we find that the full model (including random slopes for both informativeness per speaker and per target word)10 is suited to explain the data, with χ2(3) = 10.033, p = .018. Our results indicate that WORD DURATION is affected by informativeness, with increasing duration as the informativeness of referents increases (see Figure 8).
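As a sanity check on the reported likelihood ratio tests, the p-value can be recovered from the χ2 statistic. The sketch below uses the closed-form upper-tail probability of a chi-square distribution with three degrees of freedom, so it applies only to tests reported as χ2(3); the function name is hypothetical.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Statistic reported for WORD DURATION.
p_word_duration = chi2_sf_df3(10.033)
```

For χ2(3) = 10.033 this gives a value of about .018, matching the p-value reported for WORD DURATION.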

Figure 8

Average word duration (in seconds) per condition.

However, only the difference between the conditions contrastive and given was found to be significant (p = .004) in a post-hoc Tukey test, with contrastive referents being produced about 12 ms longer than their given counterparts (see Table 3).

Table 3

Coefficients, standard error, z-values and p-values for WORD DURATION.

Comparison                  Estimate (s)  Standard Error  z-value  p-value
accessible – given                 0.008           0.004    1.989    0.188
new – given                        0.008           0.004    2.063    0.162
contrastive – given                0.012           0.004    3.372    0.004*
new – accessible                   0.001           0.005    0.108    0.999
contrastive – accessible           0.004           0.004    1.107    0.681
contrastive – new                  0.004           0.005    0.794    0.855

Also for SYLLABLE DURATION, the full model11 was found to explain the data best, with χ2(3) = 11.613 and p = .009. Broadly speaking, we find that informativeness affects SYLLABLE DURATION, showing a trend for more informative referents having longer stressed syllables (see Figure 9).

Figure 9

Average syllable duration (in seconds) per condition.

However, the post-hoc Tukey test reveals that this effect is only significant if we compare contrastive to given items, with the former being on average 10 ms longer than the latter (p < .001). The results of the Tukey test are given in Table 4.

Table 4

Coefficients, standard error, z-values and p-values for SYLLABLE DURATION.

Comparison                  Estimate (s)  Standard Error  z-value  p-value
accessible – given                 0.006           0.003    2.293    0.098
new – given                        0.003           0.003    1.318    0.549
contrastive – given                0.010           0.003    3.906   <0.001*
new – accessible                  –0.003           0.003   –0.934    0.785
contrastive – accessible           0.004           0.002    1.511    0.428
contrastive – new                  0.007           0.003    2.227    0.114

Regarding the INTENSITY of target words, we observe a trend that contrastive items were produced more softly than given, accessible and new items (see Figure 10). However, this trend failed to reach significance as the full model12 returned a value of χ2(3) = 7.0464, p = .07.13

Figure 10

Average intensity (RMS, in dB) per condition.

With regard to the TONAL RANGE on the target words, we observe an effect of informativeness: the range slightly increases as a function of information status, but decreases when the focus domain gets more specific (i.e. from broad focus to contrast; cf. Figure 11). The full model14 confirms the effect of informativeness, returning χ2(3) = 17.697 and p = .0005.

Figure 11

Average tonal range (in st) per condition.

However, the post-hoc Tukey test reveals that merely the difference in focus/topic type reaches significance, with contrastive topics showing a narrower tonal range than accessible and new items which are in broad focus (–1.4 st, p < .001). The difference between contrastive and given items falls short of reaching significance (–0.8 st, p = .0502). The results of the Tukey test are displayed in Table 5.

Table 5

Coefficients, standard error, z-values and p-values for TONAL RANGE.

Comparison                  Estimate   Standard Error   z-value   p-value
accessible – given          0.629      0.369            1.705     0.315
new – given                 0.589      0.311            1.892     0.227
contrastive – given         –0.824     0.322            –2.559    0.050
new – accessible            –0.04      0.265            –0.150    0.999
contrastive – accessible    –1.453     0.309            –4.707    <0.001*
contrastive – new           –1.413     0.302            –4.680    <0.001*

As to the SLOPE within target words, neither the full model nor the single-slope model for target words converged. However, we observe a similar effect of informativeness on SLOPE as on TONAL RANGE: with increasing newness the slope gets slightly steeper, but with the more specific focus domain (contrastive) the slope gets shallower (see Figure 12; the slightly smaller range and shallower slope are also visible in the contours in Figure 5). The full model (see note 15) confirms this observation, returning χ²(3) = 13.623 and p = .003.

Figure 12

Average slope (in st/ms) per condition.

However, this effect is only significant if we compare referents of the contrastive condition against accessible and new referents, as indicated by the post-hoc Tukey test reported in Table 6. All other differences between conditions do not reach significance.

Table 6

Coefficients, standard error, z-values and p-values for SLOPE.

Comparison                  Estimate   Standard Error   z-value   p-value
accessible – given          0.001      0.002            0.361     0.983
new – given                 0.001      0.002            0.661     0.908
contrastive – given         –0.005     0.002            –2.391    0.075
new – accessible            0.0004     0.001            0.339     0.986
contrastive – accessible    –0.006     0.002            –3.564    0.002*
contrastive – new           –0.006     0.002            –4.091    <0.001*
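
Tonal range and slope are reported above in semitones (st) and semitones per millisecond (st/ms). As a reminder of what these units mean, the following Python sketch converts a pair of F0 values in Hz into a semitone interval and divides it by the elapsed time. It assumes that range is taken between a low and a high F0 target and slope as range over the time between them; this section does not state how the targets were located, and the function names and example values are hypothetical.

```python
import math


def semitone_range(f_low_hz: float, f_high_hz: float) -> float:
    """Interval between two F0 values in semitones: 12 * log2(high / low)."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def slope_st_per_ms(f_low_hz: float, f_high_hz: float,
                    t_low_ms: float, t_high_ms: float) -> float:
    """Rise slope in st/ms: tonal range divided by the time between the targets."""
    duration_ms = t_high_ms - t_low_ms
    if duration_ms <= 0:
        raise ValueError("the high target must follow the low target")
    return semitone_range(f_low_hz, f_high_hz) / duration_ms


# Hypothetical rise from 180 Hz to 240 Hz over 150 ms (not taken from the data).
rng = semitone_range(180.0, 240.0)
slp = slope_st_per_ms(180.0, 240.0, 0.0, 150.0)
print(f"range = {rng:.2f} st, slope = {slp:.4f} st/ms")
```

Because the semitone scale is logarithmic, the same interval in st corresponds to different Hz differences for speakers with different pitch registers, which makes it more comparable across speakers than raw Hz.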

With respect to the ALIGNMENT of the TCoG, we do not observe an effect of informativeness (cf. Figure 13). The full model (see note 16) confirms this observation, returning χ²(3) = 1.4876 and p = .69.

Figure 13

Average TCoG alignment (in ms) per condition.

The SCALING of TCoG, however, is affected by informativeness, as indicated by the full model (see note 17), which returns χ²(3) = 9.2115, with p = .027. Yet no clear pattern is observable: target words in the accessible and the contrastive condition display lower TCoG scaling values than target words in the given condition, whereas items in the new condition display only marginally higher TCoG scaling values than items in the given condition (see Figure 14 and Table 7).

Figure 14

Average TCoG scaling (in Hz) per condition.

Table 7

Coefficients, standard error, z-values and p-values for TCoG SCALING.

Comparison                  Estimate   Standard Error   z-value   p-value
accessible – given          –3.973     4.383            –0.906    0.801
new – given                 0.319      4.627            0.069     0.999
contrastive – given         –11.736    4.422            –2.654    0.04*
new – accessible            4.292      4.902            0.876     0.817
contrastive – accessible    –7.763     4.322            –1.796    0.274
contrastive – new           –12.055    4.318            –2.792    0.027*

When inspecting the differences between conditions, the post-hoc Tukey test reveals that only the differences between contrastive and given (p = .04) and contrastive and new (p = .03) are statistically significant (cf. Table 7).
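
To illustrate why a centre-of-gravity measure is sensitive to contour shape, the following Python sketch computes an F0-weighted mean time in the spirit of the Tonal Center of Gravity (Barnes et al. 2012). It is an assumption-laden illustration, not the procedure used in this study: the analysis window, sampling and any further weighting are not specified in this section, and the function name `tcog_time` and both contours are hypothetical.

```python
def tcog_time(times_ms, f0_hz):
    """F0-weighted mean time of a contour: sum(f_i * t_i) / sum(f_i)."""
    if len(times_ms) != len(f0_hz) or len(times_ms) == 0:
        raise ValueError("times and F0 samples must be non-empty and of equal length")
    total_f0 = sum(f0_hz)
    if total_f0 <= 0:
        raise ValueError("F0 samples must sum to a positive value")
    return sum(f * t for f, t in zip(f0_hz, times_ms)) / total_f0


# Two hypothetical rises from 100 Hz to 200 Hz, sampled every 50 ms.
times = [0, 50, 100, 150, 200]
scooped = [100, 105, 120, 150, 200]  # concave: F0 stays low, then rises late
domed = [100, 150, 180, 195, 200]    # convex: F0 rises early, then levels off

print(f"scooped rise: {tcog_time(times, scooped):.1f} ms")
print(f"domed rise:   {tcog_time(times, domed):.1f} ms")
```

Both contours share the same start and end points, yet the scooped rise concentrates its F0 mass later in time, so its weighted mean time is later than that of the domed rise. A measure of this kind therefore captures shape differences that peak position alone would miss.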

3 Discussion and conclusion

In this investigation we hypothesized a direct relation between informativeness and prosodic prominence. Thus, with increasing informativeness a referent is expected to be realised with longer word and syllable durations, higher intensity, wider range and steeper slope. Furthermore, we expected the position and type of an accent to vary with informativeness, such that given topics would tend to be deaccented and more informative topics would receive more prominent accent types.

Our hypothesis could only be confirmed to a very limited degree: with respect to both accent position and accent type, there is no effect of informativeness in that sentence topics are almost always produced with an accent and that they are predominantly marked by rises. In other words: sentence topics are consistently marked by rising prenuclear accents and, surprisingly, not even given items are deaccented.

The latter observation might be caused by the so-called repeated-name penalty effect (Gordon et al. 1993). This effect describes the finding that repeated, non-pronominalised subjects, either proper names or, in our case, definite noun phrases, exhibit longer reading times than pronouns. It is assumed that such repeated subjects impede local coherence, in contrast to pronouns, which increase coherence. Thus, in the case of given topics, readers might need longer to process a noun that was mentioned immediately before, and the unexpected repetition might lead them to accent this noun rather than deaccent it. Probably the only study to date that has related the repeated-name penalty effect to prosody is Wagner (2016), who investigated the realization of contextually given proper names and pronouns in coordinated as well as single, i.e. uncoordinated, structures (in English). His production data are comparable to our data in that about two thirds of the uncoordinated proper names received an accent – which is even more surprising, however, since they occurred as objects in sentence-final position and were thus expected to be deaccented to a larger extent than our sentence-initial subjects. As an explanation, Wagner suggests that speakers might have chosen to use an accent since the presence of the proper name (instead of a pronoun) led them to assume that a contrast to some other referent was intended in the experimental setting. The same effect of processing alternatives might have been the reason for the additional processing time found in the self-paced reading studies when encountering a repeated full name. It is debatable whether this explanation also holds for the given condition in our experiment, since one of the other conditions explicitly sets up a contrast between elements.

The consistent placement of prenuclear accents in the present study can also be regarded as being rhythmically induced. In German and English there are two positions in an utterance that are predestined for prosodic prominence: one near the beginning and one near the end of the utterance. This meets Bolinger’s (1989) idea of a tendency towards placing two major accents in an utterance. The last one is the accent which reveals the focal information, the so-called accent of interest, i.e. the nuclear accent. By contrast, an accent placed at the beginning of the same utterance, most probably on a content word, would be referred to as an accent of power. Thus, both accents appear to contribute to the ‘rhythmical frame’ of a well-formed utterance (at least in German or English). Such a structure is provided by our stimuli, but we have to be aware that we are conflating prenuclear accents with sentence-initial accents here, since our target words always represent the first accentable item in an utterance. Thus, a follow-up experiment will have to show to what extent the rhythmical frame is restricted to accenting initial constituents (even when they are given) or whether all accentable prenuclear constituents actually receive an accent. A possible setup would be to compare the prosodic realization of the sentences in (6a) and (6b). Intuitively, die Nonne ‘the nun’ in (6b) is less likely to be accented than in (6a), due to its medial sentence position (instead, the sentence-initial gestern ‘yesterday’ will probably be marked by an accent), which supports the idea of a rhythmical frame with the strongest positions at the beginning and at the end.

(6) Context: Die Nonne kümmerte sich um den Klostergarten.
  (‘The nun was looking after the cloister garden.’)
  a. Die Nonne hat einen Mandelbaum gegossen.
    (‘The nun watered an almond tree.’)
  b. Gestern hat die Nonne einen Mandelbaum gegossen.
    (‘Yesterday the nun watered an almond tree.’)

Inherent in such a metrical approach (which is compatible with Calhoun (2010) mentioned above) is the idea that prosodic prominence is relative in nature and that it may only partly mirror an utterance’s information structure. As Wagner (2005) points out, an early accent on a constituent – be it given, as Nonne in (6a), or not – will already be perceived as less prominent than a subsequent accent, especially if the subsequent accent is the final one in the phrase. Thus, the initial accent does not need to be prosodically reduced, since it is secondarily prominent in any case.

Although the consistent placement of prenuclear accents seems to be due to rhythmic reasons, we still find a few – more or less subtle – effects of informativeness on initial arguments when looking at continuous phonetic parameters. Above all, both word and syllable DURATION show a significant increase (compared to given referents) when the referent is assumed to be most informative, namely a contrastive topic, which confirms the hypothesized direct relation at least in part. Other parameters, especially tonal RANGE and SLOPE, show only slight (and statistically non-significant) differences among the three levels of information status in the expected direction: new items seem to be produced with a wider range and steeper rise than accessible and, in turn, given items (cf. Figures 11 and 12). Unexpectedly, however, RANGE and SLOPE also display significantly higher values for referents in broad focus (accessible and new), and marginally higher values for referents in the background (given), than for contrastive topics (cf. Tables 5 and 6).

Differences in pitch RANGE have also been found in the above-mentioned production study by Féry & Kügler (2008), who tested the influence of information structure (notably givenness, newness and narrow focus) on pitch accent scaling in German. Similar to our study, the authors kept the syntactic structure of their stimuli constant but varied the number of constituents as well as their word order. Importantly, Féry & Kügler’s data also showed exhaustive usage of rising pitch accents on arguments in prenuclear position (and not just in initial position), even if they were contextually given. This outcome corroborates our own results. Nevertheless, Féry & Kügler also found that the high tones of the prenuclear accents were produced considerably lower when given, a result we could not (statistically) confirm in our data.

The observation that referents in the contrastive condition are marked with reduced prosodic prominence is a rather surprising but nevertheless stable finding. It holds not only for tonal RANGE and SLOPE but also for the related parameter TCoG SCALING: contrastive items display a significantly lower TCoG than given and new target words. Moreover, contrastive topics showed a stronger tendency than given, accessible and new items – although not reaching significance – to be produced with an L*+H pitch accent (as well as H*) rather than L+H*, which is the perceptually most prominent accent type in German according to Baumann & Röhr 2015 (cf. Figure 6). This tendency is also reflected in the lower TCoG SCALING values for contrastive topics, which point towards rather concave F0 contours as opposed to more convex contours in the other conditions. Taken together, all continuous parameters (except for DURATION) and the tendency towards less steeply rising accent types contribute to a less prominent realisation of contrastive topics.

These results are only to some extent compatible with the production study by Braun (2006), who found a general tendency for contrastive topics to be more prominent than non-contrastive topics (which were composed of contextually accessible information). Nevertheless, as in our study, all sentence topics, whether contrastive or non-contrastive, carried a rising prenuclear accent – although in Braun (2006) with an equal distribution of L*+H and L+H* accents. That is, in terms of accent type, no difference in prosodic prominence between contrastive and non-contrastive topics could be found. As to the continuous parameters, however, there was a clear relation between an increase in prominence and contrastivity: contrastive topics were not only marked by increased duration (as in our data) but also by later and higher F0 peaks as well as wider pitch range.

Interestingly, the eye-tracking study by Braun & Biezma (2019) suggests that contrastivity is not primarily related to prosodic prominence (as triggered by continuous parameters) but to the type of accent alone: L*+H leads to the activation of alternatives (i.e. functions as a contrastive topic), whereas L+H* does not, even when the level of prominence of the two accents (in terms of pitch range) is kept identical (see note 18). Although our data support this view to some extent, since we found more L*+H pitch accents in the contrastive condition compared to the other ones, we also found a vast majority of this accent type across all conditions. In this respect, our results are in line with Truckenbrodt (2002), who identified L*+H as the “neutral” prenuclear accent type in German.

Looking at the structure of the contrastive test stimuli in our data and the intonation contour of the target sentences as a whole, we may find a possible explanation for the surprising pattern: all contrastive items involve a double contrast (contrastive topic plus contrastive focus), in that not only the initial argument but also the final one contrasts with the previously mentioned referents (cf. Table 2, where ‘nun’ contrasts with ‘monk’ and ‘almond tree’ contrasts with ‘blackberry bush’ in context sentence 2d). This double contrast results in a flat hat pattern, i.e. a prenuclear rise followed by a plateau and a nuclear fall, in which both the prenuclear and the nuclear accents are marked as prosodically rather non-prominent. The reason for this relative lack of prosodic prominence might be that the contrast is already expressed by the parallel syntactic (and semantic-pragmatic) structure. That is, speakers do not have to put a particularly strong accent on the contrasted item (as suggested e.g. by Mücke & Grice 2014, see Figure 1), since another component of the linguistic system (i.e. syntax) fulfils the same function. Thus, the contrastive meaning is “robust” in that it is expressed by multiple cues (see Winter 2014). Note that the contrastive stimuli in Braun (2006) displayed only a parallel semantic-pragmatic structure but no parallel syntactic structure, i.e. there was a double contrast as well but it was not expressed by using the same syntactic constituents (e.g. ‘The Georgians even have their own writing system’ vs. ‘In Armenia the Latin alphabet is used’). This difference in the setup might explain the slightly different marking of contrast in the two studies. Since our study was not designed to investigate various levels of contrast marking, however, this hypothesis has to be tested in another study.

To conclude, the present investigation has shown no effects of informativeness on the position and type of pitch accents as markers of sentence topics in German, as prenuclear rising accents were placed consistently. Similarly, continuous phonetic parameters which are known to be important cues for prosodic prominence – and, in turn, for marking information structure – such as tonal range and slope, were shown to be only slightly influenced by a referent’s level of informativeness. In fact, the type of topic (contrastive vs. non-contrastive) and its interaction with syntactic structure seem to play a stronger role than differences in information status, at least in the data set under investigation. In general, the results can be argued to support the view of a somewhat stable “rhythmical frame” within an utterance on whose left-hand side prosodic adjustments of the packaging of a referent’s informativeness only play a minor role.

Supplementary material

The complete stimuli set, the processed data (.csv-tables) as well as the R scripts are available via osf.io/4nxdp/.

Notes

  1. Nuclear accents are indicated by full capitals.
  2. Note that our four conditions of informativeness comprise both information status (given, accessible, new) and type of focus or topic (contrastive topic as opposed to background (context 2a) and broad focus (contexts 2b and 2c)).
  3. Only two speakers came from the North (6.9%) and four speakers from the South West of Germany (13.8%).
  4. Out of these 87 utterances, 75 were excluded due to phrase breaks, with 17 cases in the given, 23 in the accessible, 20 in the new and 15 in the contrastive condition. Thus, the presence of phrase boundaries was not restricted to a specific condition and can hence not be regarded as a strategy for marking a particular level of informativeness across speakers.
  5. Since we used Tukey tests, we report z-values instead of t-values which are usually returned by linear modelling.
  6. Deaccentuation ~ condition + (1|target word) + (1|speaker).
  7. L*-accent ~ condition + (1|target word) + (1+condition|speaker) + (1|gender).
  8. H*-accent ~ condition + (1|target word) + (1|speaker) + (1|gender).
  9. Rising accents ~ condition + (1|target word) + (1|speaker).
  10. Word duration ~ condition + (1+condition|target word) + (1+condition|speaker) + (1|gender).
  11. Syllable duration ~ condition + (1+condition|target word) + (1+condition|speaker) + (1|gender).
  12. Intensity ~ condition + (1+condition|target word) + (1+condition|speaker) + (1|gender).
  13. Similar to the categorical data reported above, a post-hoc power analysis with an assumed medium effect size revealed a power of only 0.65 for our continuous data, thus increasing the likelihood of a type II error.
  14. Range ~ condition + (1+condition|target word) + (1+condition|speaker) + (1|gender).
  15. Slope ~ condition + (1+condition|target word) + (1+condition|speaker) + (1|gender).
  16. TCoG alignment ~ condition + (1+condition|target word) + (1+condition|speaker) + (1|gender).
  17. TCoG scaling ~ condition + (1+condition|target word) + (1+condition|speaker) + (1|gender).
  18. Note that the rating study by Baumann & Röhr (2015) did find a difference in prominence between the two accent types (although they also displayed identical pitch range values), with L+H* being judged as slightly more prominent than L*+H (however in nuclear position). Thus, the finding by Braun & Biezma is against the intuition that the most prominent accent type should be used for marking contrast.

Acknowledgements

We would like to thank Simon Rössig for creating the spaghetti plots and the participants of the experiment for their help and patience.

Funding information

This work was funded by the German Research Foundation (DFG), grant BA 4734/1-2.

Competing interests

The authors have no competing interests to declare.

References

Ayers, Gayle Marie. 1996. Nuclear accent types and prominence: some psycholinguistic experiments. Columbus, OH: Ohio State University dissertation.

Barnes, Jonathan. 2017. Integrating pitch and time in intonational phonetics and phonology. Keynote talk at PaPE (Phonetics and Phonology in Europe) 2017. Cologne, Germany: University of Cologne.

Barnes, Jonathan & Veilleux, Nanette & Brugos, Alejna & Shattuck-Hufnagel, Stefanie. 2012. Tonal Center of Gravity: A global approach to tonal implementation in a level-based intonational phonology. Laboratory Phonology 3(2). 337–383. DOI:  http://doi.org/10.1515/lp-2012-0017

Bates, Douglas & Maechler, Martin & Bolker, Ben & Walker, Steve. 2015. lme4: Linear mixed-effects models using Eigen and S4. R package version 1.1-8. DOI:  http://doi.org/10.18637/jss.v067.i01

Baumann, Stefan & Riester, Arndt. 2012. Referential and Lexical Givenness: Semantic, Prosodic and Cognitive Aspects. In Elordieta, Gorka & Prieto, Pilar (eds.), Prosody and Meaning (Interface Explorations 25), 119–162. Berlin, New York: Mouton De Gruyter.

Baumann, Stefan & Riester, Arndt. 2013. Coreference, Lexical Givenness and Prosody in German. Lingua 136. 16–37. DOI:  http://doi.org/10.1016/j.lingua.2013.07.012

Baumann, Stefan & Winter, Bodo. 2018. What makes a word prominent? Predicting untrained German listeners’ perceptual judgments. Journal of Phonetics 70. 20–38. DOI:  http://doi.org/10.1016/j.wocn.2018.05.004

Baumann, Stefan & Röhr, Christine. 2015. The perceptual prominence of pitch accent types in German. Proceedings 18th ICPhS, Glasgow, UK: University of Glasgow.

Baumann, Stefan & Mertens, Jane & Kalbertodt, Janina. 2019. Informativeness and speaking style affect the realization of nuclear and prenuclear accents in German. Proceedings 19th ICPhS, Melbourne, Australia, 1580–1584.

Baumann, Stefan & Kalbertodt, Janina & Mertens, Jane. 2020. The appropriateness of prenuclear accent types – Evidence for information structural effects. Proceedings Speech Prosody 2020, Tokyo, Japan, 161–165. DOI:  http://doi.org/10.21437/SpeechProsody.2020-33

Baumann, Stefan & Becker, Johannes & Grice, Martine & Mücke, Doris. 2007. Tonal and Articulatory Marking of Focus in German. Proceedings 16th ICPhS, Saarbrücken, 1029–1032.

Bock, J. Kathryn & Warren, Richard K. 1985. Conceptual accessibility and syntactic structure in sentence formulation. Cognition 21. 47–67. DOI:  http://doi.org/10.1016/0010-0277(85)90023-X

Boersma, Paul & Weenink, David. 2018. Praat: doing phonetics by computer [Computer program]. Version 6.0.43, retrieved September 2018 from http://www.praat.org/.

Bolinger, Dwight. 1989. Intonation and Its Uses. Palo Alto: Stanford University Press.

Braun, Bettina. 2006. Phonetics and phonology of thematic contrast in German. Language and Speech 49(4). 451–493. DOI:  http://doi.org/10.1177/00238309060490040201

Braun, Bettina & Biezma, María. 2019. Prenuclear L*+H activates alternatives for the accented word. Frontiers in Psychology 10. 1993. DOI:  http://doi.org/10.3389/fpsyg.2019.01993

Büring, Daniel. 2007. Intonation, Semantics and Information Structure. In Ramchand, Gillian & Reiss, Charles (eds.), The Oxford Handbook of Linguistic Interfaces, 445–474. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199247455.001.0001

Büring, Daniel. 2016. Contrastive Topic. In Féry, Caroline & Ishihara, Shinichiro (eds.), The Handbook of Information Structure, 64–85. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199642670.013.002

Calhoun, Sasha. 2010. The Centrality of Metrical Structure in Signaling Information Structure: A Probabilistic Perspective. Language 86(1). 1–42. DOI:  http://doi.org/10.1353/lan.0.0197

Cole, Jennifer & Shattuck-Hufnagel, Stefanie. 2016. New methods for prosodic transcription: Capturing variability as a source of information. Laboratory Phonology 7. 1–29. DOI:  http://doi.org/10.5334/labphon.29

Féry, Caroline & Kügler, Frank. 2008. Pitch accent scaling on given, new and focused constituents in German. Journal of Phonetics 36(4). 680–703. DOI:  http://doi.org/10.1016/j.wocn.2008.05.001

Gordon, Peter C. & Grosz, Barbara J. & Gilliom, Laura A. 1993. Pronouns, names, and the centering of attention in discourse. Cognitive Science 17(3). 311–347. DOI:  http://doi.org/10.1207/s15516709cog1703_1

Grice, Martine & Reyelt, Matthias & Benzmüller, Ralf & Mayer, Jörg & Batliner, Anton. 1996. Consistency in Transcription and Labelling of German Intonation with GToBI. Proceedings 4th ICSLP, Philadelphia, 1716–1719.

Grice, Martine & Baumann, Stefan & Benzmüller, Ralf. 2005. German intonation in autosegmental-metrical phonology. In Jun, Sun-Ah (ed.), Prosodic Typology: The Phonology of Intonation and Phrasing, 55–83. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199249633.003.0003

Gussenhoven, Carlos. 2015. Does phonological prominence exist? Lingue E Linguaggio 14(1). 7–24. DOI:  http://doi.org/10.1418/80751

Halliday, Michael A. K. 1967. Intonation and Grammar in British English. The Hague: Mouton. DOI:  http://doi.org/10.1515/9783111357447

Jagdfeld, Nils & Baumann, Stefan. 2011. Order Effects on the Perception of Relative Prominence. Proceedings 17th ICPhS, Hong Kong, China, 958–961.

Kochanski, Greg & Grabe, Esther & Coleman, Joyce & Rosner, Bernard A. 2005. Loudness predicts prominence: Fundamental frequency lends little. Journal of the Acoustical Society of America 118. 1038–1054. DOI:  http://doi.org/10.1121/1.1923349

Krifka, Manfred. 2008. Basic notions of information structure. Acta Linguistica Hungarica 55. 243–276. DOI:  http://doi.org/10.1556/ALing.55.2008.3-4.2

Lambrecht, Knud. 1994. Information Structure and Sentence Form. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511620607

Mücke, Doris & Grice, Martine. 2014. The effect of focus marking on supralaryngeal articulation – Is it mediated by accentuation? Journal of Phonetics 44. 47–61. DOI:  http://doi.org/10.1016/j.wocn.2014.02.003

Peirce, Jonathan W. & MacAskill, Michael R. 2018. Building Experiments in PsychoPy. London: Sage.

Pitrelli, John & Beckman, Mary & Hirschberg, Julia. 1994. Evaluation of prosodic transcription labeling reliability in the ToBI framework. Proceedings 3rd ICSLP, Yokohama, Japan, Vol. 2, 123–126.

R Core Team. 2020. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Version 4.0.2.

Reinhart, Tanya. 1981. Pragmatics and linguistics: an analysis of sentence topics. Philosophica 27. 53–94.

Rietveld, Antonius C. M. & Gussenhoven, Carlos. 1985. On the relation between pitch excursion size and prominence. Journal of Phonetics 13. 299–308. DOI:  http://doi.org/10.1016/S0095-4470(19)30761-2

Röhr, Christine & Baumann, Stefan. 2011. Decoding Information Status by Type and Position of Accent in German. Proceedings 17th ICPhS, Hong Kong, China, 1706–1709.

Rooth, Mats. 1985. Association with focus. Amherst, MA: University of Massachusetts dissertation.

Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics 1. 75–116. DOI:  http://doi.org/10.1007/BF02342617

Shattuck-Hufnagel, Stefanie & Turk, Alice E. 1996. A prosody tutorial for investigators of auditory sentence processing. Journal of Psycholinguistic Research 25. 193–247. DOI:  http://doi.org/10.1007/BF01708572

Syrdal, Ann K. & McGory, Julia. 2000. Inter-transcriber reliability of ToBI prosodic labeling. Proceedings 6th ICSLP, Beijing, China, Vol. 3, 235–238.

Truckenbrodt, Hubert. 2002. Upstep and embedded register levels. Phonology 19. 77–120. DOI:  http://doi.org/10.1017/S095267570200427X

Turk, Alice E. & Sawusch, James R. 1996. The processing of duration and intensity cues to prominence. Journal of the Acoustical Society of America 99. 3782–3790. DOI:  http://doi.org/10.1121/1.414995

Wagner, Michael. 2005. Prosody and recursion. Cambridge, MA: MIT dissertation.

Wagner, Michael. 2016. Information Structure and Production Planning. In Féry, Caroline & Ishihara, Shinichiro (eds.), The Handbook of Information Structure, 541–561. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199642670.013.39

Winter, Bodo. 2014. Spoken language achieves robustness and evolvability by exploiting degeneracy and neutrality. BioEssays 36(10). 960–967. DOI:  http://doi.org/10.1002/bies.201400028