RISS Academic Research Information Service

      • SCOPUS

        Privacy Disclosure and Preservation in Learning with Multi-Relational Databases

        Guo, Hongyu; Viktor, Herna L.; Paquet, Eric. Korean Institute of Information Scientists and Eng, 2011. Journal of Computing Science and Engineering, Vol.5 No.3

        There has recently been a surge of interest in relational database mining, which aims to discover useful patterns across multiple interlinked database relations. It is crucial for a learning algorithm to explore the multiple interconnected relations so that important attributes are not excluded when mining such relational repositories. However, from a data privacy perspective, a complex database schema makes it difficult to identify all possible relationships between attributes from the different relations. That is, seemingly harmless attributes may be linked to confidential information, leading to data leaks when building a model. Thus, we are at risk of disclosing unwanted knowledge when publishing the results of a data mining exercise. For instance, consider a financial database classification task to determine whether a loan is considered high risk. Suppose that we are aware that the database contains another confidential attribute, such as income level, that should not be divulged. One may thus choose to eliminate, or distort, the income level in the database to prevent potential privacy leakage. However, even after distortion, a learning model built against the modified database may accurately determine the income level values. It follows that the database is still unsafe and may be compromised. This paper demonstrates this potential for privacy leakage in multi-relational classification and illustrates how such potential leaks may be detected. We propose a method to generate a ranked list of subschemas that maintains the predictive performance on the class attribute while limiting the disclosure risk, and the predictive accuracy, of confidential attributes. We illustrate and demonstrate the effectiveness of our method against a financial database and an insurance database.
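        The leakage described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the toy rows, the attribute names, and the majority-vote predictor are all assumptions made for illustration. It only shows how attributes that remain after a confidential column is withheld may still predict that column better than a blind guess.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows: (occupation, district, income_level).
# These values are invented for illustration; they are not data from the paper.
rows = [
    ("engineer", "north", "high"),
    ("engineer", "north", "high"),
    ("engineer", "south", "high"),
    ("clerk", "south", "low"),
    ("clerk", "south", "low"),
    ("clerk", "north", "low"),
    ("teacher", "north", "low"),
    ("teacher", "south", "high"),
]


def baseline_accuracy(rows, secret_idx):
    """Accuracy of always guessing the most common confidential value."""
    counts = Counter(r[secret_idx] for r in rows)
    return max(counts.values()) / len(rows)


def leakage_accuracy(rows, feature_idx, secret_idx):
    """Resubstitution accuracy of a majority-vote predictor that maps the
    remaining (non-confidential) attributes to the confidential one.
    This is an optimistic estimate because it is measured on the same rows."""
    votes = defaultdict(Counter)
    for r in rows:
        key = tuple(r[i] for i in feature_idx)
        votes[key][r[secret_idx]] += 1
    correct = sum(max(c.values()) for c in votes.values())
    return correct / len(rows)


blind = baseline_accuracy(rows, secret_idx=2)
leaked = leakage_accuracy(rows, feature_idx=[0], secret_idx=2)
print(f"blind guess: {blind:.3f}, from occupation alone: {leaked:.3f}")
```

        A large gap between the two numbers suggests that withholding the confidential column alone does not make the published data safe, which is the concern the abstract raises.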

      • SCOPUS, KCI listed

        Contribution to Improve Database Classification Algorithms for Multi-Database Mining

        Miloudi, Salim; Rahal, Sid Ahmed; Khiat, Salim. Korea Information Processing Society, 2018. Journal of Information Processing Systems, Vol.14 No.3

        Database classification is an important preprocessing step for multi-database mining (MDM). When a multi-branch company needs to explore its distributed data for decision making, it is imperative to classify these multiple databases into clusters of similar databases before analyzing the data. To search for the best classification of a set of $n$ databases, existing algorithms generate from 1 to $(n^2-n)/2$ candidate classifications. Although each candidate classification is included in the next one (i.e., clusters in the current classification are subsets of clusters in the next classification), existing algorithms generate each classification independently, that is, without reusing the clusters from the previous classification. Consequently, existing algorithms are time consuming, especially as the number of candidate classifications increases. To overcome this problem, we propose an efficient approach that represents the problem of classifying the multiple databases as a problem of identifying the connected components of an undirected weighted graph. Theoretical analysis and experiments on public databases confirm that our algorithm is more efficient than existing work and that it avoids the growth in execution time.
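        The connected-components idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function name, the pairwise similarity matrix, and the use of a union-find structure are all assumptions. Databases are nodes, each pair is an edge weighted by similarity, and edges are added in decreasing order of weight so that every candidate classification reuses the clusters of the previous one.

```python
from itertools import combinations


def classify_incrementally(n, similarity):
    """Return a list of (similarity_level, clusters), one per distinct level.

    Clusters at each level are the connected components of the graph that
    keeps only edges with weight >= level. A union-find structure carries the
    components forward, so no level is recomputed from scratch.
    """
    parent = list(range(n))

    def find(x):
        # Path-halving find.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # At most (n^2 - n) / 2 edges, hence at most that many distinct levels.
    edges = sorted(
        ((similarity[i][j], i, j) for i, j in combinations(range(n), 2)),
        reverse=True,
    )

    results = []
    k = 0
    while k < len(edges):
        level = edges[k][0]
        # Merge every edge that shares the current similarity level.
        while k < len(edges) and edges[k][0] == level:
            _, i, j = edges[k]
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri
            k += 1
        groups = {}
        for v in range(n):
            groups.setdefault(find(v), []).append(v)
        results.append((level, sorted(groups.values())))
    return results


# Hypothetical similarity matrix for four databases (illustrative values only).
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
levels = classify_incrementally(4, sim)
for level, clusters in levels:
    print(level, clusters)
```

        Because clusters only ever merge as the level drops, each classification is nested in the next one, which is the property the abstract says existing algorithms fail to exploit.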

      • A PROPOSAL OF SEMI-AUTOMATIC INDEXING ALGORITHM FOR MULTI-MEDIA DATABASE WITH USERS' SENSIBILITY

        Takashi Mitsuishi; Jun Sasaki; Yutaka Funyu. Korean Society for Emotion and Sensibility (한국감성과학회), 2000. Spring Conference, Vol.2000 No.-

        We propose a semi-automatic, dynamic indexing algorithm for multimedia databases (e.g., movie files, audio files), for which it is difficult to create indexes that express emotional or abstract content. The algorithm builds indexes according to each user's sensitivity by using the users' database access histories. We first categorize the data simply, then create a vector space of each user's interests (the user model) from the history of which categories the accessed data belong to, and a vector space for each data item (the title model) from the history of which users accessed it. By repeating this procedure, we can create suitable indexes that reflect the emotional content of each data item. In this paper, we define recurrence formulas based on the proposed algorithm and show the effectiveness of the algorithm through simulation results.
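        The alternating construction of user models and title models might look like the sketch below. The abstract does not state the recurrence formulas, so the update rules here (summing and normalizing vectors over the access history) are assumptions, as are the function names and the toy history.

```python
def normalize(vec):
    """Scale a vector so its components sum to 1 (left unchanged if all zero)."""
    total = sum(vec)
    return [x / total for x in vec] if total else list(vec)


def update_models(initial_category, history, num_categories, rounds=1):
    """Alternately rebuild user vectors and title vectors from access history.

    initial_category: dict mapping title -> index of its rough initial category
    history: list of (user, title) access events
    Both update rules are assumed for illustration, not taken from the paper.
    """
    # Each title starts as a one-hot vector on its initial category.
    titles = {
        t: [1.0 if c == cat else 0.0 for c in range(num_categories)]
        for t, cat in initial_category.items()
    }
    users = {}
    for _ in range(rounds):
        # User model: normalized sum of the vectors of the titles accessed.
        sums = {}
        for user, title in history:
            acc = sums.setdefault(user, [0.0] * num_categories)
            for c, w in enumerate(titles[title]):
                acc[c] += w
        users = {u: normalize(v) for u, v in sums.items()}

        # Title model: normalized sum of the vectors of the users who accessed it.
        sums = {}
        for user, title in history:
            acc = sums.setdefault(title, [0.0] * num_categories)
            for c, w in enumerate(users[user]):
                acc[c] += w
        for t, v in sums.items():
            titles[t] = normalize(v)
    return users, titles


# Hypothetical example: two movies in category 0, one audio file in category 1.
initial_category = {"m1": 0, "m2": 0, "a1": 1}
history = [("u1", "m1"), ("u1", "m2"), ("u2", "m2"), ("u2", "a1")]
users, titles = update_models(initial_category, history, num_categories=2)
print(users)
print(titles)
```

        After one round, a title accessed by users with mixed interests no longer sits purely in its initial category, which is one way an index could come to reflect how users actually perceive the item.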
