RISS Search - Domestic Journal Article Details

Korean Abstract

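With the advanced Internet environment and XML becoming established as the standard language for data exchange, large volumes of web documents are being produced, and the target of information extraction has naturally shifted to web documents. Accordingly, research is under way on the structure, integration, and retrieval of the rapidly growing body of XML documents. As a foundation for efficient query processing and retrieval over XML documents, this paper proposes a method for clustering XML documents based on frequent structures. First, each XML document is represented as a tree and decomposed, and frequently occurring structures are extracted from the decomposed structures. Second, the frequent structures extracted from each XML document are treated as transaction items, and clustering is performed on them. During clustering, a large-item weight is used that takes into account both the creation of each cluster and the cohesion of the full set of generated clusters. Third, comparative experiments against prior work demonstrate the superiority of the proposed method.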

Multilingual Abstract

As XML, a data exchange language for the advanced Internet, produces a growing number of web documents, web documents have become a target of information retrieval. Accordingly, there is research on the structure, integration, and retrieval of XML documents. This paper proposes a method for clustering XML documents based on frequent structures, as a basis for processing queries and retrieval efficiently. First, the trees representing XML documents are decomposed and frequent structures are extracted from them. Second, treating the frequent structures as transaction items, we perform clustering that considers the weight of large items to adjust cluster creation and cluster cohesion. Third, we show the superiority of our method through experiments that compare it with previous methods.
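The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm. It assumes that "decomposing" a tree means splitting it into root-to-node tag paths, that a frequent structure is a path occurring in at least a `min_support` fraction of documents, and that a cluster's large items are those shared by at least a `min_large` fraction of its members. The abstract does not define the large-item weight, so the Jaccard-style overlap score used here is a stand-in assumption, and all function and parameter names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Decompose one XML document into its set of root-to-node tag paths."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def frequent_structures(docs, min_support):
    """Turn each document into a transaction of its frequent paths."""
    path_sets = [tag_paths(d) for d in docs]
    counts = Counter(p for s in path_sets for p in s)
    threshold = min_support * len(docs)
    frequent = {p for p, c in counts.items() if c >= threshold}
    return [s & frequent for s in path_sets]


def large_items(cluster, transactions, min_large):
    """Items found in at least a min_large fraction of the cluster's members."""
    counts = Counter(i for t in cluster for i in transactions[t])
    return {i for i, c in counts.items() if c >= min_large * len(cluster)}


def cluster_transactions(transactions, min_large=0.5, min_overlap=0.5):
    """Greedy single pass: join the best-overlapping cluster or open a new one.

    The overlap score is an assumed placeholder for the paper's weight.
    """
    clusters = []
    for idx, items in enumerate(transactions):
        best, best_score = None, 0.0
        for cluster in clusters:
            large = large_items(cluster, transactions, min_large)
            union = items | large
            score = len(items & large) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= min_overlap:
            best.append(idx)
        else:
            clusters.append([idx])
    return clusters


docs = [
    "<book><title/><author/></book>",
    "<book><title/><author/><year/></book>",
    "<movie><title/><director/></movie>",
    "<movie><title/><director/><cast/></movie>",
]
transactions = frequent_structures(docs, min_support=0.5)
print(cluster_transactions(transactions))  # [[0, 1], [2, 3]]
```

In this toy input, `book/year` and `movie/cast` each occur in only one of four documents, so they fall below the support threshold and do not influence the grouping.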

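Date	Type	Details
2012-10-01	Evaluation	Journal merged (registration maintained)
2010-01-01	Evaluation	Registered journal status maintained (registration maintained)
2008-01-01	Evaluation	Registered journal status maintained (registration maintained)
2006-01-01	Evaluation	Registered journal status maintained (registration maintained)
2003-01-01	Evaluation	Selected as registered journal (candidate, 2nd round)
2002-01-01	Evaluation	Passed 1st-round candidate evaluation (candidate, 1st round)
2000-07-01	Evaluation	Selected as candidate journal (new evaluation)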

클러스터의 주요항목 가중치 기반 XML 문서 클러스터링 = Clustering XML Documents Considering The Weight of Large Items in Clusters
