RISS Search - Domestic Journal Articles

1
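A Method for Inferring Suspicious Domains for Proactive Response

Byeongho Kang, Jisu Yang, Jaehyun So, Czang Yeob Kim. Korea Institute of Information Security and Cryptology, 2016, Journal of the Korea Institute of Information Security and Cryptology, Vol.26 No.2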
In this paper, we propose a method for inferring suspicious domains so that they can be handled proactively. The method infers suspicious domains from TLD Zone files and WHOIS information and consists of three steps: searching for candidate domains, machine learning, and inferring the suspicious domain group. In the first step, we extract from the TLD Zone files the domains that share the same name server and update time as a seed domain, and these form the candidate domain set. In the second step, we quantify the WHOIS information of the candidate domains and cluster similar domains together. In the final step, we infer the domains in the cluster that contains the seed domain to be the suspicious domain group. In the experiments, we used the .COM and .NET TLD Zone files and 10 known malicious domains, selected by our analysts, as seed domains. The proposed method inferred 55 suspicious domains, of which 52 were true positives. The F1 score was 0.91 and the precision was 0.95. We expect the proposed method to help infer and block malicious domains in advance, and we hope it will contribute to further research on proactive malicious domain blacklisting.
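The three steps in this abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy `ZONE` and `WHOIS` records, the field names, the field-match similarity, the single-linkage grouping, and the `threshold` value. The abstract does not say which features or clustering algorithm the authors used, so this is not their implementation.

```python
from collections import defaultdict

# Hypothetical toy records; real TLD Zone files and WHOIS data are far richer.
ZONE = [
    {"domain": "seed-bad.com", "ns": "ns1.shady.net", "updated": "2016-01-05"},
    {"domain": "bad-two.com", "ns": "ns1.shady.net", "updated": "2016-01-05"},
    {"domain": "bad-three.net", "ns": "ns1.shady.net", "updated": "2016-01-05"},
    {"domain": "bystander.com", "ns": "ns1.shady.net", "updated": "2016-01-05"},
    {"domain": "unrelated.com", "ns": "ns9.other.org", "updated": "2015-11-30"},
]
WHOIS = {
    "seed-bad.com": {"registrar": "R1", "email": "x@mail.example", "country": "ZZ"},
    "bad-two.com": {"registrar": "R1", "email": "x@mail.example", "country": "ZZ"},
    "bad-three.net": {"registrar": "R1", "email": "y@mail.example", "country": "ZZ"},
    "bystander.com": {"registrar": "R7", "email": "a@shop.example", "country": "KR"},
    "unrelated.com": {"registrar": "R3", "email": "b@corp.example", "country": "US"},
}


def find_candidates(seed, zone):
    """Step 1: domains sharing the seed's name server and update time."""
    seed_rec = next((r for r in zone if r["domain"] == seed), None)
    if seed_rec is None:
        raise ValueError(f"seed {seed!r} not found in zone data")
    return [r["domain"] for r in zone
            if r["ns"] == seed_rec["ns"] and r["updated"] == seed_rec["updated"]]


def whois_similarity(a, b):
    """Fraction of WHOIS fields on which two records agree (assumed measure)."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(1 for k in keys if k in a and a.get(k) == b.get(k)) / len(keys)


def cluster_by_whois(domains, whois, threshold):
    """Step 2: single-linkage grouping via union-find (assumed algorithm)."""
    parent = {d: d for d in domains}

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path halving
            d = parent[d]
        return d

    for i, a in enumerate(domains):
        for b in domains[i + 1:]:
            if whois_similarity(whois.get(a, {}), whois.get(b, {})) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for d in domains:
        groups[find(d)].add(d)
    return list(groups.values())


def infer_suspicious(seed, zone, whois, threshold=0.6):
    """Step 3: return the cluster that contains the seed domain."""
    candidates = find_candidates(seed, zone)
    for group in cluster_by_whois(candidates, whois, threshold):
        if seed in group:
            return group
    return {seed}


suspicious = infer_suspicious("seed-bad.com", ZONE, WHOIS)
```

In this toy data, `bystander.com` shares the seed's name server and update time, so it becomes a candidate, but its WHOIS record differs on every field and it falls outside the seed's cluster. That is the role the clustering step plays in the pipeline the abstract describes.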
2
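Unsupervised Calculation of Multi-Category Weights for Documents Using Word Embedding and Word Network Analysis

Jaeyun Jeong, Kyoung Hyun Mo, Seungwan Seo, Czang Yeob Kim, Haedong Kim, Pilsung Kang. Korean Institute of Industrial Engineers, 2018, Journal of the Korean Institute of Industrial Engineers, Vol.44 No.6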
As the volume of online documents grows, so does the demand for text categorization, which assigns documents to predefined categories. Many approaches to this problem rely on supervised machine learning, which cannot be applied to unlabeled data. However, a large number of documents, such as online cell phone reviews, carry no category information, and their key categories are not predefined. To address these problems, we propose an unsupervised document multi-labeling method based on word embedding and word network analysis. After embedding words in a lower-dimensional space using the Word2Vec technique, we generate a weight matrix by calculating the similarities between words. We build a word network from this matrix and extract the key categories from the network. Using the key category-weight matrix and a co-occurrence matrix, we generate a document-category score matrix. To verify the proposed method, we collected 298,206 cell phone reviews from four review websites. We then compared the results of the proposed method with documents labeled according to human judgment.
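The pipeline in this abstract (embeddings, similarity weight matrix, word network, key categories, document-category scores) can be sketched roughly as follows. This is an illustration under several assumptions, not the authors' method: the 2-d `VECTORS` stand in for trained Word2Vec embeddings, key categories are taken to be connected components of a thresholded similarity network, each component is labelled by its highest-degree word, and plain term counts replace the co-occurrence matrix, whose exact construction the abstract does not give.

```python
import math
from collections import Counter

# Hypothetical 2-d vectors standing in for trained Word2Vec embeddings.
VECTORS = {
    "battery": (1.0, 0.1),
    "charge": (0.9, 0.2),
    "screen": (0.1, 1.0),
    "display": (0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def weight_matrix(vectors):
    """Word-by-word similarity weights, stored as nested dicts."""
    return {a: {b: cosine(vectors[a], vectors[b]) for b in vectors}
            for a in vectors}


def key_categories(weights, threshold=0.5):
    """Connected components of the thresholded word network (assumed rule).

    Each component is labelled by its member with the largest weighted
    degree; ties are broken alphabetically.
    """
    neighbours = {a: [b for b in weights if b != a and weights[a][b] >= threshold]
                  for a in weights}
    degree = {a: sum(weights[a][b] for b in neighbours[a]) for a in weights}
    seen, categories = set(), {}
    for start in sorted(weights):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            word = stack.pop()
            if word in component:
                continue
            component.add(word)
            stack.extend(neighbours[word])
        seen |= component
        label = max(sorted(component), key=degree.get)
        categories[label] = component
    return categories


def document_scores(tokens, weights, categories):
    """Score a document per category: count-weighted similarity to the label."""
    counts = Counter(t for t in tokens if t in weights)
    return {label: sum(n * weights[t][label] for t, n in counts.items())
            for label in categories}


W = weight_matrix(VECTORS)
CATEGORIES = key_categories(W)
review = "the battery will charge fast and the battery lasts".split()
scores = document_scores(review, W, CATEGORIES)
```

Because every category receives a score rather than a single winning label, one review can carry weight in several categories at once, which matches the multi-labeling goal stated in the abstract.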
