Feature selection for continuous aggregate response and its application to auto insurance data
Kang, Suyeon, Song, Jongwoo Elsevier 2018 Expert Systems with Applications Vol.93 No.-
Abstract: This paper presents new feature selection algorithms for aggregate data analysis. Data aggregation is commonly used when it is not appropriate to model the relationship between a response and explanatory variables at the individual level. We investigate the substantial challenges in analyzing aggregate data. We then propose a groupwise feature selection method that addresses (i) the change in the dataset depending on the selection of predictor variables, (ii) the potential presence of missing responses, and (iii) the suitability of model selection criteria when comparing models fitted to different datasets. In an application to real auto insurance data, we find a set of important predictors for classifying policyholders into homogeneous risk groups. Our results demonstrate the potential of the proposed feature selection method for aggregate data analysis in terms of flexibility and computational complexity. We expect that the proposed algorithms can be applied to a wide range of decision-making tasks using aggregate data, as they are applicable to any type of data.
Highlights: We study major difficulties in aggregate data analysis and propose a solution for them. The proposed group feature selection algorithm is simple, fast, and flexible. The proposed approach is applied to real data to demonstrate its practical effectiveness. Important rating factors for risk assessment in auto insurance are identified.
Kang, Suyeon, Song, Jongwoo Elsevier 2017 Journal of the Korean Statistical Society Vol.46 No.4
Abstract: In this article, we consider six estimation methods for extreme value modeling and compare their performance, focusing on the generalized Pareto distribution (GPD) in the peaks over threshold (POT) framework. Our goal is to identify the best method under various conditions via a thorough simulation study. To compare the estimators in the POT sense, we suggest suitable strategies for some estimators that were not originally developed under the POT framework. The simulation results show that a nonlinear least squares (NLS) based estimator outperforms the others in parameter estimation, but there is no clear winner in quantile estimation. For quantile estimation, NLS-based methods perform well even when the sample size is small, and the Hill estimator comes to the fore when the underlying distribution has a very heavy tail. Applications of extreme value theory (EVT) cover many different fields, and researchers in each field may have their own experimental conditions or practical restrictions. We believe that our results provide guidance on choosing a proper estimation method in future analyses.
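For readers unfamiliar with the terms in this abstract, the following is a minimal sketch of two of the ingredients it names: the excesses over a threshold used in the POT framework, and the classical Hill estimator of the tail index. It is not the authors' code and does not reproduce their six estimators or their NLS method. The function names, the simulated Pareto sample, and the choice of k are all assumptions made for illustration.

```python
import math
import random


def pot_exceedances(sample, threshold):
    """Return the excesses x - threshold for observations above the threshold."""
    return [x - threshold for x in sample if x > threshold]


def hill_estimator(sample, k):
    """Hill estimate of the tail index from the k largest observations.

    Requires 1 <= k < len(sample) and positive values among the top k + 1.
    """
    ordered = sorted(sample, reverse=True)
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    if ordered[k] <= 0:
        raise ValueError("the top k + 1 observations must be positive")
    # The (k+1)-th largest observation plays the role of the threshold.
    log_threshold = math.log(ordered[k])
    return sum(math.log(x) - log_threshold for x in ordered[:k]) / k


rng = random.Random(0)
# Hypothetical data: a Pareto sample with shape 2, so the true tail index is 1/2.
sample = [rng.paretovariate(2.0) for _ in range(20000)]
gamma_hat = hill_estimator(sample, k=2000)
threshold = sorted(sample, reverse=True)[2000]
excesses = pot_exceedances(sample, threshold)
```

In this sketch `gamma_hat` should land close to 0.5 because the simulated data are exactly Pareto. On real data the estimate depends strongly on k, which is one reason comparisons such as the one in this abstract are carried out by simulation.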
Suyeon Kang, Thi Hao Vu, Jubi Heo, Chaeeun Kim, Hyun S. Lillehoj, Yeong Ho Hong The Korean Society of Veterinary Science 2023 Journal of Veterinary Science Vol.24 No.5
Background: Highly pathogenic avian influenza virus (HPAIV) is considered a global threat to both human health and the poultry industry. MicroRNAs (miRNAs) can modulate the immune system by affecting gene expression patterns in HPAIV-infected chickens. Objectives: To gain further insights into the role of miRNAs in immune responses against H5N1 infection, and to support the development of strategies for breeding disease-resistant chickens, we characterized miRNA expression patterns in tracheal tissues from H5N1-infected Ri chickens. Methods: miRNA expression was analyzed in two H5N1-infected Ri chicken lines using small RNA sequencing. The target genes of differentially expressed (DE) miRNAs were predicted using miRDB. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses were then conducted. Furthermore, using quantitative real-time polymerase chain reaction, we validated the expression levels of DE miRNAs (miR-22-3p, miR-146b-3p, miR-27b-3p, miR-128-3p, miR-2188-5p, miR-451, miR-205a, miR-203a, miR-21-3p, and miR-200a-3p) from all comparisons and their immune-related target genes. Results: A total of 53 miRNAs were significantly expressed in the infection samples of the resistant line compared to the susceptible line. Network analyses between the DE miRNAs and target genes revealed that DE miRNAs may regulate the expression of target genes involved in the transforming growth factor-beta, mitogen-activated protein kinase, and Toll-like receptor signaling pathways, all of which are related to influenza A virus progression. Conclusions: Collectively, our results provide novel insights into the miRNA expression patterns of tracheal tissues from H5N1-infected Ri chickens. More importantly, our findings offer insights into the relationship between miRNAs and immune-related target genes and the role of miRNAs in HPAIV infections in chickens.
Robust gene selection methods using weighting schemes for microarray data analysis
Kang, Suyeon, Song, Jongwoo BioMed Central 2017 BMC Bioinformatics Vol.18 No.1
Background: A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identifying significant genes is essential in analyzing the data. However, the performance of many gene selection techniques is highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. Results: We propose new filter-based gene selection techniques that apply a simple modification to significance analysis of microarrays (SAM). To demonstrate the effectiveness of the proposed methods, we considered a series of synthetic datasets with different noise levels and sample sizes, along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods in all simulation set-ups. In particular, our methods are much better when the given data are noisy and the sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level increased or the sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. Conclusions: The results of the simulation study and real data analysis demonstrate that our proposed methods are effective for detecting significant genes and for classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis. Electronic supplementary material: The online version of this article (10.1186/s12859-017-1810-x) contains supplementary material, which is available to authorized users.
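As background for this abstract, the following is a minimal sketch of the standard SAM relative-difference statistic for a single gene, d = (mean1 - mean2) / (s + s0), where s is the pooled standard error and s0 is a small positive constant that stabilizes the statistic for genes with low variance. It does not implement the weighting schemes the authors propose, which the abstract does not specify. The function name, the default `s0`, and the toy expression values are assumptions made for illustration.

```python
import math


def sam_statistic(group1, group2, s0=0.0):
    """Standard SAM relative difference for one gene across two states."""
    n1, n2 = len(group1), len(group2)
    if n1 == 0 or n2 == 0 or n1 + n2 <= 2:
        raise ValueError("need non-empty groups with more than two values in total")
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    # Pooled within-group sum of squares.
    ss = sum((x - mean1) ** 2 for x in group1) + sum((x - mean2) ** 2 for x in group2)
    # Pooled standard error of the difference in means.
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    if s + s0 == 0:
        raise ValueError("s + s0 is zero; use a positive s0 for constant data")
    return (mean1 - mean2) / (s + s0)


# Hypothetical expression values for one gene under two states.
d_plain = sam_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
d_damped = sam_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], s0=1.0)
```

With `s0=0` the statistic reduces to an ordinary two-sample t-statistic with pooled variance. A positive `s0` shrinks the magnitude of the statistic, which matters most when few replicates make the variance estimate unreliable, the setting this abstract highlights.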
Usability Evaluation of Virtual Keyboards That Support Touch Area Optimization by Learning Input Patterns
Suyeon Kang, Jiyoung Kim, Minhee Park, Hojeong Im, Huhn Kim Ergonomics Society of Korea 2019 Journal of the Ergonomics Society of Korea Vol.38 No.6
Objective: The purpose of this study was to verify the usability of two typical smartphone virtual keyboards that optimize the touch area of each key by learning touch input patterns. Background: Virtual keyboards have many limitations due to their small size and lack of tactile feedback. Therefore, many studies have been conducted to improve the usability of virtual keyboards. Among them, however, there is still insufficient verification of the usefulness of a virtual keyboard that optimizes the user's touch area by learning his or her input pattern. Method: In this study, the participants performed the task of typing presented sentences using three virtual keyboards (Nota, AL, and Smartboard) that provide different levels of touch optimization support. Through the experiment, sentence matching ratio and typing time data were collected, and subjective satisfaction was rated after the typing task was finished. Results: There were significant differences in sentence matching ratio, typing time, and subjective satisfaction between Nota, AL, and Smartboard. The Nota keyboard performed significantly better than the Smartboard in all respects. However, the AL keyboard showed no significant difference in sentence matching ratio or typing time compared to the Smartboard, which has no such optimization function. Moreover, the AL keyboard was rated as less satisfying than the Smartboard. Conclusion: Automatically optimizing the touch area based on users' input patterns was more useful to users than predicting, visually expanding, and highlighting the keys that will be entered next. Application: In the future, smartphone manufacturers or virtual keyboard makers can use this result as a reference when they want to provide a similar function for touch area optimization.