This paper has two aims. The first is to modify my previous model (henceforth Parser-2008) to make it more efficient. The second is to discuss ways of extracting crucial information from a corpus that has been syntactically tagged using the revised model (henceforth Parser-2009). To revise Parser-2008, I introduce a new marker `X` to represent a sentence constituent that has no grammatical function in the sentence. In addition, contrary to Parser-2008, I propose that the number of arguments be recorded in the information of each predicate, rather than abandoning the marking of deleted arguments. I parsed a short novel using Parser-2009 and extracted the following findings. First, the diversity of sentence structure depends on matrix clauses, whereas its complexity depends on embedded clauses. Second, in the horizontal connection between sentences, the first sentence is less complex than the last, and the last sentence of a paragraph is more complex than the first sentence of the following paragraph. Third, in a complex sentence with more than one embedded clause, the matrix clause is more complex than the deepest embedded clause, with the first embedded clause being the most complex and the last the least. Finally, in both the horizontal and vertical connections between sentences, link types such as [ACE], [ABE], and [AE] are the most dominant, and the links of matrix clauses are more diverse than those of embedded clauses.