This study analyzes how faithfully summaries produced by generative summarization models reflect the topic structure of the input data in news trend analysis, and proposes an input-stage topic guidance method to improve that fidelity. Existing news summarization research has focused primarily on evaluating performance with reference-summary similarity metrics such as ROUGE, or on improving model architectures and training techniques. However, these approaches cannot fully explain how well a summary represents the issue structure of an entire period, and in particular whether it is suitable for trend analysis. To address this, the study defined period buckets of 1 day, 1 week, 1 month, and 3 months over Naver economic and financial news data and constructed a set of news articles for each bucket. Topic keywords were first extracted for each period using TF-IDF analysis. Their representativeness was then re-evaluated on the basis of article-level coverage, yielding a hierarchical topic cluster structure of Primary Topic Clusters (PTC), Secondary Topic Clusters (STC), and Tertiary Topic Clusters (TTC). Throughout this process, TF-IDF served solely as a tool for identifying candidate issues emerging during the period, not as a substitute for the internal judgments of the summarization model. The summary generation experiments used a fine-tuned KoBART model under two conditions, Unguided and Guided. The Unguided condition assumed a scenario in which the user does not explicitly supply topic keywords, and used the entire article set as input. The Guided condition instead composed the input only from articles selected by the TF-IDF-based topic keywords. Both conditions used the same summarization model and generation settings, so that changes in the topic representation of the summaries could be compared as a function of input composition.
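The input-stage steps described above (TF-IDF keyword extraction, coverage-based tiering, and Guided filtering) can be illustrated with a minimal sketch. It is not the study's implementation: the smoothed IDF variant, the coverage thresholds `primary` and `secondary`, the function names, and the assumption that articles arrive as pre-tokenized lists are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def top_tfidf_keywords(docs, k=5):
    """Rank terms by TF-IDF summed over one period bucket.

    docs: list of articles, each a list of tokens (tokenization is assumed
    to have been done upstream).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            # Smoothed IDF; the study does not specify its exact variant.
            idf = math.log((1 + n) / (1 + df[term])) + 1
            scores[term] += (count / len(doc)) * idf
    return [term for term, _ in scores.most_common(k)]


def coverage(docs, keyword):
    """Fraction of articles in the bucket that mention the keyword."""
    return sum(keyword in doc for doc in docs) / len(docs)


def assign_tiers(docs, keywords, primary=0.5, secondary=0.25):
    """Split keywords into PTC / STC / TTC by article-level coverage.

    The two thresholds are illustrative assumptions, not values from the study.
    """
    tiers = {"PTC": [], "STC": [], "TTC": []}
    for kw in keywords:
        c = coverage(docs, kw)
        tier = "PTC" if c >= primary else "STC" if c >= secondary else "TTC"
        tiers[tier].append(kw)
    return tiers


def guided_input(docs, keywords):
    """Guided condition: keep only articles containing at least one keyword."""
    wanted = set(keywords)
    return [doc for doc in docs if wanted & set(doc)]


# Toy bucket of four pre-tokenized articles.
docs = [
    ["rate", "hike", "bank"],
    ["rate", "bond", "yield"],
    ["rate", "hike", "export"],
    ["chip", "export"],
]
keywords = top_tfidf_keywords(docs, k=3)
tiers = assign_tiers(docs, keywords)
guided = guided_input(docs, tiers["PTC"])
```

In this sketch the Unguided condition corresponds to passing `docs` to the summarizer unchanged, while the Guided condition passes `guided`; the summarization model itself stays the same in both cases.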
The summaries were evaluated not on linguistic quality or similarity to reference summaries, but on how the topic keywords they contained were distributed across PTC, STC, and TTC. The experimental results showed that TF-IDF-based topic guidance was not equally effective across all time periods. It significantly improved the concentration of core issues in summaries for medium-scale period data such as 1 week. In contrast, the guidance effect was limited for short-term data with high issue density, such as 1-day data. For long-term data such as the 1-month and 3-month periods, the results revealed a structural limitation in representing the entire topic structure of the period with a single summary. Additionally, to explore alternatives to fine-tuning, this study conducted experiments combining the base KoBART model with TF-IDF-based RAG. Although natural sentences were generated under some conditions, performance was not consistent enough overall to reliably replace the fine-tuned model. This indicates that fine-tuning does not determine the topic; rather, it provides the linguistic foundation for reliably summarizing multiple-article inputs and expressing guided information in coherent sentences. In summary, the significance of this study lies not in directly improving the performance of generative summarization models, but in proposing a methodological foundation for a news trend analysis summarization system usable by people without domain knowledge. It does so by separating, in the design, topic guidance at the input stage from the evaluation of topic reflection at the output stage.
This approach is expected to serve as foundational material for constructing input-design-centric generative systems in a range of application domains beyond news summarization, including time-series document analysis and automated report generation.
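The output-stage evaluation described above, which examines how topic keywords in a generated summary are distributed across PTC, STC, and TTC, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `tier_distribution` is hypothetical, the summary is assumed to be pre-tokenized, and counting every keyword occurrence (rather than unique keywords) is an assumed choice that the study may not share.

```python
from collections import Counter


def tier_distribution(summary_tokens, tiers):
    """Share of tier-keyword mentions in a summary that fall in each tier.

    tiers: dict mapping a tier name ("PTC", "STC", "TTC") to its keywords.
    Tokens that belong to no tier are ignored.
    """
    lookup = {kw: tier for tier, kws in tiers.items() for kw in kws}
    counts = Counter()
    for token in summary_tokens:
        if token in lookup:
            counts[lookup[token]] += 1
    total = sum(counts.values())
    # Return 0.0 for every tier when the summary mentions no tier keyword.
    return {tier: (counts[tier] / total if total else 0.0) for tier in tiers}


# Toy example with hypothetical tiers and a hypothetical tokenized summary.
example_tiers = {"PTC": ["rate", "hike"], "STC": ["export"], "TTC": ["chip"]}
example_summary = ["rate", "hike", "rate", "export", "market"]
distribution = tier_distribution(example_summary, example_tiers)
```

Comparing this distribution between Unguided and Guided summaries of the same period bucket gives one way to quantify how far guidance shifts the summary toward primary topics.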