RISS Academic Research Information Service

      • KCI-indexed

        A Classification Method Using Data Reduction

        Daiho Uhm, Sunghae Jun, Seung-Joo Lee Korean Institute of Intelligent Systems 2012 International Journal of Fuzzy Logic and Intelligent Systems Vol.12 No.1

        Data reduction has been used widely in data mining for convenient analysis. Principal component analysis (PCA) and factor analysis (FA) are popular techniques. PCA and FA reduce the number of variables to avoid the curse of dimensionality, in which computing time grows exponentially with the number of variables, and many methods have accordingly been published for dimension reduction. Data augmentation is another approach to analyzing data efficiently. The support vector machine (SVM) algorithm is a representative technique for dimension augmentation: the SVM maps the original data into a high-dimensional feature space to find the optimal decision plane. Both data reduction and augmentation have been used to solve diverse problems in data analysis. In this paper, we compare the strengths and weaknesses of dimension reduction and augmentation for classification and propose a classification method based on data reduction. We carry out comparative experiments to verify the performance of the proposed method.
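
        As a worked illustration of the reduction-versus-augmentation comparison sketched in this abstract, the following snippet (a minimal sketch assuming scikit-learn and its bundled digits data, not the authors' exact experiment) contrasts a linear SVM on PCA-reduced features with an RBF-kernel SVM that implicitly augments the raw features:

          from sklearn.datasets import load_digits
          from sklearn.decomposition import PCA
          from sklearn.model_selection import cross_val_score
          from sklearn.pipeline import make_pipeline
          from sklearn.svm import SVC

          X, y = load_digits(return_X_y=True)

          # Data reduction: project the 64 pixel variables onto 10 principal components.
          reduced = make_pipeline(PCA(n_components=10), SVC(kernel="linear"))

          # Dimension augmentation: the RBF kernel implicitly maps the raw variables
          # into a high-dimensional feature space to find a separating plane.
          augmented = SVC(kernel="rbf", gamma="scale")

          for name, model in [("PCA + linear SVM", reduced), ("RBF SVM", augmented)]:
              scores = cross_val_score(model, X, y, cv=5)
              print(f"{name}: mean accuracy {scores.mean():.3f}")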

      • SCOPUS / KCI-indexed

        Data Reduction Method in Massive Data Sets

        Namo, Gecynth Torre, Yun, Hong-Won The Korea Institute of Information and Communication 2009 Journal of Information and Communication Convergence Engineering Vol.7 No.1

        Many researchers strive to improve the performance of RFID systems, and many papers have been written to address one of the major drawbacks of this potent technology: data management. As an RFID system captures billions of records, the problems arising from dirty data and sheer data volume have caused an uproar in the RFID community, and researchers are looking for ways to address them. Effective data management is especially important for handling such large volumes of data. This paper presents data reduction techniques that attempt to address these issues and introduces readers to a new data reduction algorithm that may serve as an alternative for reducing data in RFID systems. A process for extracting data from the reduced database is also presented. A performance study is conducted to analyze the new data reduction algorithm, and our analysis shows the utility and feasibility of the categorization reduction algorithms.
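
        The abstract does not spell out the categorization algorithm, but a common first step in RFID data reduction is collapsing duplicate tag reads inside a time window. The sketch below is a toy illustration of that step only; the record format and window size are assumptions, not the paper's method:

          def reduce_reads(reads, window=5.0):
              """Drop RFID reads of the same (tag, reader) pair that repeat
              within `window` seconds. `reads` is an iterable of
              (tag_id, reader_id, timestamp) tuples sorted by timestamp."""
              last_kept = {}   # (tag_id, reader_id) -> timestamp of last kept read
              kept = []
              for tag_id, reader_id, ts in reads:
                  key = (tag_id, reader_id)
                  if key not in last_kept or ts - last_kept[key] >= window:
                      kept.append((tag_id, reader_id, ts))
                      last_kept[key] = ts
              return kept

          reads = [("tagA", "r1", 0.0), ("tagA", "r1", 1.2),
                   ("tagB", "r1", 2.0), ("tagA", "r1", 6.1)]
          print(reduce_reads(reads))   # the tagA read at 1.2 s is dropped as a duplicate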

      • KCI-indexed

        Clickstream Big Data Mining for Demographics-Based Digital Marketing

        박지애(Jiae Park), 조윤호(Yoonho Cho) Korea Intelligent Information Systems Society 2016 Journal of Intelligence and Information Systems Vol.22 No.3

        The demographics of Internet users are the most basic and important sources for target marketing and personalized advertisements on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, time-consuming, and prone to false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere in a webpage, the activity is logged in semi-structured website log files. Such data allows us to see what pages users visited, how long they stayed, how often and when they visited, which sites they prefer, what keywords they used to find a site, whether they made a purchase, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with demographics, including search keywords; frequency and intensity by time, day, and month; variety of websites visited; and text information from visited web pages. The demographic attributes to predict also vary by paper, covering gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision trees, neural networks, logistic regression, and k-nearest neighbors, were used for prediction model building. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated to build the best prediction model. The objective of this study is to choose the clickstream attributes most likely to be correlated with demographics from the results of previous research, and then to identify which data mining method is best suited to predict each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job, and from the results of previous research, 64 clickstream attributes are applied to predict them. The overall process of predictive model building is composed of four steps. In the first step, we create user profiles that include the 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem, using three approaches based on decision trees, PCA, and cluster analysis. We build alternative predictive models for each demographic variable in the third step, using SVM, neural networks, and logistic regression for modeling. The last step evaluates the alternative models in terms of accuracy and selects the best one. For the experiments, we used clickstream data representing 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for the prediction process, and 5-fold cross validation was conducted to enhance the reliability of the experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best with decision tree based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction. We conclude that the online behaviors of Internet users, captured from clickstream data analysis, can be used to predict their demographics and can thereby be utilized for digital marketing.
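
        The four-step pipeline above maps naturally onto a model-selection loop. The following minimal sketch uses synthetic stand-ins for the 64 clickstream attributes and one binary demographic label (the real user profiles and the IBM SPSS Modeler flow are not reproduced) to compare reduced and unreduced candidates under 5-fold cross-validation:

          import numpy as np
          from sklearn.decomposition import PCA
          from sklearn.linear_model import LogisticRegression
          from sklearn.model_selection import cross_val_score
          from sklearn.neural_network import MLPClassifier
          from sklearn.pipeline import make_pipeline
          from sklearn.preprocessing import StandardScaler
          from sklearn.svm import SVC

          rng = np.random.default_rng(0)
          X = rng.normal(size=(1000, 64))              # 64 clickstream attributes
          y = (X[:, :5].sum(axis=1) > 0).astype(int)   # a binary label, e.g. gender

          candidates = {
              "SVM, no reduction": make_pipeline(StandardScaler(), SVC()),
              "PCA + SVM": make_pipeline(StandardScaler(), PCA(10), SVC()),
              "PCA + neural net": make_pipeline(
                  StandardScaler(), PCA(10), MLPClassifier(max_iter=500)),
              "PCA + logistic": make_pipeline(
                  StandardScaler(), PCA(10), LogisticRegression(max_iter=500)),
          }
          for name, model in candidates.items():       # evaluate, then keep the winner
              acc = cross_val_score(model, X, y, cv=5).mean()
              print(f"{name}: accuracy {acc:.3f}")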

      • SCOPUS / KCI-indexed

        Wavelet-Based Dimensionality Reduction for Multiple Sets of Complicated Functional Data

        Young-Seon Jeong Korean Institute of Industrial Engineers 2019 Industrial Engineering & Management Systems Vol.18 No.2

        Multiple sets of complicated functional data with sharp changes appear in many engineering studies, for purposes such as monitoring quality and detecting faults in manufacturing processes. Some of the data curves in these studies exhibit large variations in local regions. This paper presents a wavelet-based data reduction procedure to reduce high-dimensional functional data from manufacturing processes. The proposed method can characterize the variations of multiple curves in certain local regions. In addition, unlike existing methods, which are based on a single curve, it can handle multiple curves together when reducing high-dimensional data with distinct structures. Evaluation on real-life data sets shows that the proposed procedure performs better than several techniques extended from single-curve-based data reduction methods.
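
        A minimal sketch of the underlying wavelet reduction idea, using PyWavelets on a single synthetic curve with a sharp jump: decompose, keep only the largest coefficients, and reconstruct. The multi-curve, local-region treatment of the proposed method is not reproduced, and the wavelet choice and keep ratio are assumptions:

          import numpy as np
          import pywt

          t = np.linspace(0, 1, 1024)
          curve = np.sin(8 * np.pi * t) + 2.0 * (t > 0.5)   # smooth part + sharp jump

          # Decompose, flatten the coefficient arrays, keep the 5% largest.
          coeffs = pywt.wavedec(curve, "db4", level=5)
          flat, slices = pywt.coeffs_to_array(coeffs)
          k = int(0.05 * flat.size)
          threshold = np.sort(np.abs(flat))[-k]
          flat[np.abs(flat) < threshold] = 0.0

          # Reconstruct from the reduced representation and measure the error.
          kept = pywt.array_to_coeffs(flat, slices, output_format="wavedec")
          reduced = pywt.waverec(kept, "db4")[:curve.size]
          print("relative error:", np.linalg.norm(reduced - curve) / np.linalg.norm(curve))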

      • KCI-indexed

        A PCA-Based Data Stream Reduction Scheme for Sensor Networks

        알렉산더페도시브 ( Alexander Fedoseev ), 최영환 ( Young-hwan Choi ), 황인준 ( Eenjun Hwang ) Korean Society for Internet Information 2009 Journal of Internet Computing and Services Vol.10 No.4

        The emerging notion of the data stream has brought many new challenges to the research community as a consequence of its conceptual difference from conventional, static data. One typical example is data stream processing in sensor networks. The range of data processing considerations in a sensor network is very wide, from physical resource restrictions such as bandwidth, energy, and memory to the peculiarities of query processing, including continuous and specific types of queries. In this paper, as one of the physical constraints in data stream processing, we consider the problem of limited memory and propose a new scheme for data stream reduction based on the principal component analysis (PCA) technique. PCA can transform a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables. We adapt PCA to the data stream of a sensor network, assuming the cooperation of a query engine (or application) with a network base station. Our method exploits the spatio-temporal correlation among multiple measurements from different sensors. Finally, we present a new framework for data processing and describe a number of experiments under this framework. We compare our scheme with the wavelet transform, observe the effect of timestamps on the compression ratio, and report some of the results.
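
        A minimal sketch of the PCA reduction step, assuming scikit-learn: principal components are fitted incrementally on windows of correlated multi-sensor readings, and only the low-dimensional scores would be shipped to the base station, which can approximately reconstruct each window. Sensor count, window size, and component count are illustrative assumptions:

          import numpy as np
          from sklearn.decomposition import IncrementalPCA

          n_sensors, window, k = 20, 50, 3
          ipca = IncrementalPCA(n_components=k)
          rng = np.random.default_rng(1)
          base = rng.normal(size=(1, n_sensors))      # correlated sensors share a pattern

          for _ in range(10):                         # ten arriving stream windows
              batch = base * rng.normal(1.0, 0.1, size=(window, 1)) \
                      + 0.05 * rng.normal(size=(window, n_sensors))
              ipca.partial_fit(batch)
              scores = ipca.transform(batch)          # window x k values actually sent
              approx = ipca.inverse_transform(scores) # base-station reconstruction
              err = np.linalg.norm(approx - batch) / np.linalg.norm(batch)
              print(f"kept {k}/{n_sensors} dims, relative error {err:.3f}")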

      • KCI-indexed

        Collective Prediction exploiting Spatio Temporal correlation (CoPeST) for energy efficient wireless sensor networks

        Muruganantham Arunraja, Veluchamy Malathi Korean Society for Internet Information 2015 KSII Transactions on Internet and Information Systems Vol.9 No.7

        Data redundancy has a high impact on a wireless sensor network's (WSN) performance and reliability. Spatial and temporal similarity is an inherent property of sensory data. By reducing this spatio-temporal redundancy, a substantial amount of nodal energy and bandwidth can be conserved. Most data gathering approaches use either temporal correlation or spatial correlation alone to minimize data redundancy. In Collective Prediction exploiting Spatio Temporal correlation (CoPeST), we exploit both the spatial and the temporal correlation between sensory data. In the proposed work, the spatial redundancy of sensor data is reduced by similarity-based sub-clustering, where closely correlated sensor nodes are represented by a single representative node. The temporal redundancy is reduced by a model-based prediction approach, where only a subset of sensor data is transmitted and the rest is predicted. The proposed work eliminates a substantial amount of energy-expensive communication while keeping the data within a user-defined error threshold. Being a distributed approach, it is highly scalable. The work achieves up to 65% data reduction in a periodic data gathering system with an error tolerance of 0.6°C on collected data.
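
        A minimal sketch of the temporal half of this idea: the node and the sink share a simple linear predictor, and the node transmits a reading only when the prediction misses it by more than the error budget (0.6°C above). The predictor and the temperature trace are illustrative assumptions, not the CoPeST protocol itself:

          def collect(readings, eps=0.6):
              """Return the readings actually transmitted and the sink's view."""
              sent, view = [], []
              last, slope = None, 0.0
              for x in readings:
                  pred = x if last is None else last + slope
                  if last is None or abs(pred - x) > eps:
                      sent.append(x)        # prediction broke the budget: transmit
                      slope = 0.0 if last is None else x - last
                      last = x
                  else:
                      last = pred           # both sides advance the model silently
                  view.append(last)
              return sent, view

          temps = [20.0, 20.1, 20.2, 20.3, 21.5, 21.6, 21.7, 23.5]
          sent, view = collect(temps)
          print(f"transmitted {len(sent)}/{len(temps)} readings")   # 4/8 for this trace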

      • KCI-indexed

        A False Data Reduction Strategy for P2P Environments

        김승연(Seung-Yun Kim), 이원주(Won-Joo Lee), 전창호(Chang-Ho Jeon) Korean Society of Computer Information 2011 Journal of the Korea Society of Computer and Information Vol.16 No.5

        In this paper, we propose a false data reduction (FDR) strategy for P2P environments. The key idea of our strategy is to use the FDR algorithm to stop the transmission of false data and to delete it. When a user recognizes false data in downloaded content, the user's peer immediately requests the other peers downloading the same file to stop its transmission. The algorithm also notifies the other peers so that they do not download the false data, and deletes it to keep it from spreading through the P2P environment. This entire procedure is executed by each peer through information exchange alone, without any lookup server, so the strategy can be applied to pure P2P models that have no search server. Through simulation, we show that the strategy is more effective at reducing network traffic than the previous P2P strategy and improves network performance, shortening the mean transmission time of valid data by 9.78~16.84% depending on the false data ratio.
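
        A toy simulation of the notification mechanism: once one peer flags a file as false, it informs the peers it knows are sharing the same file, which cancel their downloads and blacklist the file, all without a lookup server. Class and method names are illustrative, not the paper's protocol messages:

          class Peer:
              def __init__(self, name):
                  self.name = name
                  self.downloading = {}   # file_id -> set of peers in the same swarm
                  self.blacklist = set()

              def start_download(self, file_id, swarm):
                  if file_id in self.blacklist:
                      print(f"{self.name}: refusing blacklisted {file_id}")
                      return
                  self.downloading[file_id] = set(swarm)

              def flag_false(self, file_id):
                  # Local detection: purge the file, then notify the swarm.
                  swarm = self.downloading.pop(file_id, set())
                  self.blacklist.add(file_id)
                  for peer in swarm:
                      peer.on_false_notice(file_id)

              def on_false_notice(self, file_id):
                  self.blacklist.add(file_id)
                  if self.downloading.pop(file_id, None) is not None:
                      print(f"{self.name}: cancelled download of {file_id}")

          a, b, c = Peer("A"), Peer("B"), Peer("C")
          a.start_download("f1", [b, c])
          b.start_download("f1", [a, c])
          a.flag_false("f1")            # A detects false data; B cancels, C blacklists
          b.start_download("f1", [c])   # refused, with no search server involved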

      • KCI-indexed

        Deriving Data Demand through Analysis of Environmental Impact Assessment Consultation Terms: Focusing on the Water Environment Field

        황진후, 김윤지, 전성우, 최유영, 성현찬 Korean Society of Environmental Impact Assessment 2023 Journal of Environmental Impact Assessment Vol.32 No.1

        The need for improvement has been raised due to the limitations of environmental impact assessment (EIA), and the importance of data-based EIA is increasing. In this study, data demand was derived by analyzing the Agreed Terms and Conditions in the water environment field (water quality, hydraulic and hydrologic conditions, and marine environment) of EIA. The Agreed Terms and Conditions were classified and categorized by EIA stage (addition to the status survey, impact prediction and evaluation, establishment of reduction measures, and post-environmental impact survey), and the data demand for each type of consultation opinion was linked. As a result of the categorization, the terms were classified into 18 types for water quality, 15 types for hydraulic and hydrologic conditions, and 17 types for the marine environment. As a result of linking data demand, the total number of data demands was 236 for water quality, 98 for hydraulic and hydrologic conditions, and 73 for the marine environment. The highest numbers of Agreed Terms and Conditions and data demands were found in water quality among the evaluation items and in the establishment of reduction measures, specifically non-point source pollution reduction measures, among the stages. These numbers were judged to be linked to the relative importance of the items and the primary purpose of EIA. Deriving data demand through the analysis of Agreed Terms and Conditions can contribute to more advanced preparation of EIA reports and, by establishing a systematic database, is expected to increase data utilization by various decision-makers.

      • Novel Intent based Dimension Reduction and Visual Features Semi-Supervised Learning for Automatic Visual Media Retrieval

        Kunisetti, Subramanyam, Ravichandran, Suban International Journal of Computer Science and Network Security 2022 Vol.22 No.6

        Sharing online videos via the Internet is an emerging and important concept in applications such as surveillance and mobile video search. There is therefore a need for a personalized web video retrieval system to explore relevant videos, helping people who are searching for video related to specific big data content. To this end, features with reduced dimensionality are computed from videos to explore discriminative aspects of the scene, based on shape, histogram, texture, object annotation, coordination, color, and contour data. Dimensionality reduction mainly depends on feature extraction and feature selection in multi-labeled retrieval from multimedia data. Many researchers have implemented techniques to reduce dimensionality based on the visual features of video data, but each has advantages and disadvantages for advanced-feature video retrieval. In this research, we present a Novel Intent based Dimension Reduction Semi-Supervised Learning Approach (NIDRSLA) that examines dimensionality reduction for exact and fast video retrieval based on different visual features. For dimensionality reduction, NIDRSLA learns the projection matrix by increasing the dependence between the enlarged data and the projected-space features. The proposed approach also addresses video segmentation with frame selection using low-level and high-level features, together with efficient object annotation for video representation. Experiments performed on a synthetic data set demonstrate the efficiency of the proposed approach compared with traditional state-of-the-art video retrieval methodologies.
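
        A minimal sketch of the retrieval side of such a system: high-dimensional visual descriptors are reduced with a learned projection (plain PCA here, standing in for the paper's dependence-maximizing projection), and queries are answered by nearest neighbours in the reduced space. The feature dimensions and data are assumptions:

          import numpy as np
          from sklearn.decomposition import PCA
          from sklearn.neighbors import NearestNeighbors

          rng = np.random.default_rng(2)
          features = rng.normal(size=(500, 512))    # one visual descriptor per video

          pca = PCA(n_components=32).fit(features)  # projection matrix, learned offline
          index = NearestNeighbors(n_neighbors=5).fit(pca.transform(features))

          query = rng.normal(size=(1, 512))         # descriptor of the query clip
          dist, idx = index.kneighbors(pca.transform(query))
          print("top-5 video ids:", idx[0])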
