The purpose of this dissertation is to analyze discourse anaphoric reference, at both the local and the global level, with respect to Focus Theory (Sidner 1983; Grosz and Sidner 1986) and Centering Theory (Grosz et al. 1986). To this end, 621 discourses from a corpus of a Korean TV drama script were examined. Pronominalization is one way to indicate the entity that is most salient in the current utterance (Un). Korean is a pro-drop language with null subjects and zero topics, whose content is identified locally and globally from the context. In other words, a discourse anaphoric noun phrase in Korean is frequently realized as zero anaphora, an unexpressed argument of the verb, even when it does not refer to a member of Un and its referent may not be transparent to the listener.
The alternation of null and overt pronominal subjects can be explained in terms of centering transitions. However, the hypotheses that many scholars put forward in earlier work were supported by only a few constructed examples. In this dissertation, I report on a corpus study that I conducted to find more solid evidence for those hypotheses. Analyzing real data has the added benefit of allowing me to address how discourse structure and attentional state affect the interpretation of global discourse anaphora, and to provide a more detailed analysis of the CONTINUE transition. The ordering of the Cf list that I use is modified from the one usually postulated for Korean.
The frequency of zero anaphora popping in the hierarchical structure is analyzed and compared with that of definite noun phrases and full nouns (zero anaphora 48.53% > full NP 27.94% > definite NP 23.53%). The result reflects the fact that segment boundaries correlate poorly with the segmentation proposed by Passonneau (1998). She shows that the intention associated with a discourse entity persists across discourse segments and suggests that centering should interact with the global context.
It is claimed that information from a local segment alone is insufficient to resolve global anaphora, so an algorithm is needed for accessing the hierarchically recent Un-1. I present different discourse structures that vary in whether two adjacent utterances are considered linearly recent or hierarchically recent. I propose that the discourse structure captures nonlocal antecedents, and I analyze the centering transitions of zero anaphora on the basis of the hierarchical structure. The results show that zero anaphora occurs mostly in the CONTINUE transition, the most preferred transition type for discourse coherence, and only rarely in the SHIFT transition (CONTINUE 65.15% > RETAIN 21.21% > SHIFT 13.64%). In summary, a shift of centers occurs only when the intended interpretation is well supported by other contextual information, so that the speaker's intention is rarely misinterpreted. A speaker who is concerned that an utterance might be misinterpreted as a consequence of shifting the topic always uses full NPs to express the intended new topic overtly.
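The three-way transition typology used above can be illustrated with a minimal sketch. It follows the standard centering definitions (Cb(Un) is the highest-ranked element of Cf(Un-1) that is realized in Un; Cp(Un) is the first element of Cf(Un)) and is not the procedure used in the dissertation itself. The function names, the `NO_CB` label, and the example entities are hypothetical. The Cf lists are assumed to be already ranked, and the caller decides whether the "previous" utterance is the linearly recent or the hierarchically recent one.

```python
from typing import List, Optional


def backward_center(cf_prev: List[str], cf_curr: List[str]) -> Optional[str]:
    """Return Cb(Un): the highest-ranked entity of Cf(Un-1) realized in Un."""
    realized = set(cf_curr)
    for entity in cf_prev:  # cf_prev is assumed to be ordered by rank
        if entity in realized:
            return entity
    return None  # no entity of Un-1 is realized in Un


def classify_transition(
    cb_prev: Optional[str], cf_prev: List[str], cf_curr: List[str]
) -> str:
    """Classify the transition from Un-1 to Un as CONTINUE, RETAIN, or SHIFT."""
    cb = backward_center(cf_prev, cf_curr)
    if cb is None:
        return "NO_CB"  # hypothetical label: the utterances share no entity
    cp = cf_curr[0]  # preferred center: highest-ranked entity of Un
    if cb_prev is not None and cb != cb_prev:
        return "SHIFT"  # the backward-looking center has changed
    return "CONTINUE" if cb == cp else "RETAIN"


# Hypothetical entities; zero anaphora would be resolved before this step.
prev_cf = ["Mina", "book"]
print(classify_transition("Mina", prev_cf, ["Mina", "letter"]))  # CONTINUE
print(classify_transition("Mina", prev_cf, ["Jun", "Mina"]))     # RETAIN
print(classify_transition("Mina", prev_cf, ["book", "Jun"]))     # SHIFT
```

Passing the hierarchically recent utterance as `cf_prev` instead of the linearly preceding one is the only change needed to classify a transition across a segment boundary under this sketch.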
Integrating the centering algorithm with the Cache Model (Walker 1986, 1998) provides reasonable evidence of explicit global zero anaphora. The use of a full NP is one of a number of potentially redundant cues that the speaker has available for signaling intentional structure, so the choice of a full NP or a pronoun is not determined by the current attentional state; the data examined here, however, show that these cues are not used consistently. Some problems arise from the cache size assumption and from the concreteness of the cache elements in Korean discourse. Further work should investigate what network structure of cache elements constrains the inferential process, and how that structure might be used as part of an algorithm for inferring discourse.