Machine learning methods for microbiome studies
Junghyun Namkung 한국미생물학회 2020 The journal of microbiology Vol.58 No.3
Research on the microbiome has been actively conducted worldwide, and the results show that the human gut bacterial environment significantly affects the immune system, psychological conditions, cancers, obesity, and metabolic diseases. Thanks to advances in sequencing technology, microbiome studies with large numbers of samples are now feasible at an acceptable cost. Large samples allow more sophisticated modeling with machine learning approaches to study relationships between the microbiome and various traits. This article provides an overview of machine learning methods for non-data scientists interested in the association analysis of microbiomes and host phenotypes. Once the genomic features of the microbiome are determined, various analysis methods can be used to explore the relationship between the microbiome and host phenotypes, including penalized regression, support vector machine (SVM), random forest, and artificial neural network (ANN). Deep neural network methods are also touched upon. The analysis procedure, from environment setup to extraction of analysis results, is presented in the Python programming language.
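As a rough illustration of one method named in this abstract, penalized regression, the following is a minimal dependency-free sketch of logistic regression with an L2 (ridge) penalty fitted by gradient descent. It is not the article's code: the function name `fit_ridge_logistic`, the toy abundance values, and the penalty strengths are all assumptions made for illustration, and in practice a library such as scikit-learn would normally be used instead.

```python
import math

def fit_ridge_logistic(X, y, lam=0.1, lr=0.1, n_iter=2000):
    """Fit logistic regression with an L2 penalty by batch gradient descent.

    Minimizes mean log-loss + (lam / 2) * ||w||^2; the intercept b is not penalized.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(p):
            w[j] -= lr * (grad_w[j] / n + lam * w[j])  # loss gradient plus penalty gradient
        b -= lr * grad_b / n
    return w, b

# Hypothetical data: relative abundances of two made-up taxa per sample,
# with a binary host phenotype (1 = trait present).
X = [[0.10, 0.50], [0.20, 0.40], [0.15, 0.60], [0.30, 0.50],
     [0.60, 0.40], [0.70, 0.50], [0.80, 0.60], [0.65, 0.45]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w_weak, b_weak = fit_ridge_logistic(X, y, lam=0.01)
w_strong, b_strong = fit_ridge_logistic(X, y, lam=1.0)
print("weak penalty weights:  ", w_weak)
print("strong penalty weights:", w_strong)
```

A larger `lam` shrinks the coefficients toward zero, which is the basic trade-off penalized regression offers when there are many microbial features relative to the number of samples.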
Namkung, Junghyun,Elston, Robert C.,Yang, Jun-Mo,Park, Taesung Wiley Subscription Services, Inc., A Wiley Company 2009 Genetic epidemiology Vol.33 No.7
Gene-gene interaction is believed to play an important role in understanding complex traits. Multifactor dimensionality reduction (MDR) was proposed by Ritchie et al. [2001. Am J Hum Genet 69:138–147] to identify multiple loci that simultaneously affect disease susceptibility. Although the MDR method has been widely used to detect gene-gene interactions, few studies have been reported on MDR analysis when there are missing data. Currently, there are four approaches available in MDR analysis to handle missing data. The first approach uses only complete observations that have no missing data, which can cause a severe loss of data. The second approach is to treat missing values as an additional genotype category, but interpretation of the results may then be unclear and the conclusions may be misleading. Furthermore, it performs poorly when the missing rates are unbalanced between the case and control groups. The third approach is a simple imputation method that imputes missing genotypes as the most frequent genotype, which may also produce biased results. The fourth approach, Available, uses all data available for the given loci to increase power. In any real data analysis, it is not clear which MDR approach one should use when there are missing data. In this article, we consider a new EM Impute approach to handle missing data more appropriately. Through simulation studies, we compared the performance of the proposed EM Impute approach with the current approaches. Our results showed that the Available and EM Impute approaches perform better than the three other current approaches in terms of power and precision. Genet. Epidemiol. 33:646–656, 2009. © 2009 Wiley-Liss, Inc.
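The first three missing-data approaches described in this abstract can be sketched on a small genotype matrix as follows. This is an illustrative assumption, not the authors' implementation: the function names, the use of `None` as the missing marker, the genotype coding 0/1/2, and the toy data are all hypothetical, and the Available and EM Impute approaches are not shown.

```python
from collections import Counter

MISSING = None  # hypothetical marker for a missing genotype

def complete_cases(rows):
    """Approach 1: keep only individuals with no missing genotype."""
    return [r for r in rows if MISSING not in r]

def missing_as_category(rows, label=3):
    """Approach 2: recode missing values as an additional genotype category."""
    return [[label if g is MISSING else g for g in r] for r in rows]

def impute_most_frequent(rows):
    """Approach 3: replace missing values with the most frequent genotype per locus."""
    n_loci = len(rows[0])
    modes = []
    for j in range(n_loci):
        observed = [r[j] for r in rows if r[j] is not MISSING]
        modes.append(Counter(observed).most_common(1)[0][0])
    return [[modes[j] if g is MISSING else g for j, g in enumerate(r)] for r in rows]

# Toy data: five individuals, two loci, genotypes coded 0/1/2.
genotypes = [
    [0, 1],
    [0, None],
    [None, 2],
    [0, 2],
    [1, 2],
]

print(complete_cases(genotypes))
print(missing_as_category(genotypes))
print(impute_most_frequent(genotypes))
```

Even in this toy case the complete-case approach discards two of five individuals, which illustrates the data loss the abstract mentions.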
Namkung, Junghyun,Kim, Kyunga,Yi, Sungon,Chung, Wonil,Kwon, Min-Seok,Park, Taesung Oxford University Press 2009 Bioinformatics Vol.25 No.3
Gene-gene interactions are important contributors to complex biological traits. Multifactor dimensionality reduction (MDR) is a method to analyze gene-gene interactions and has been applied to many genetics studies of complex diseases. To identify the best interaction model associated with disease susceptibility, MDR classifiers corresponding to interaction models have been constructed and evaluated as predictors of disease status via a measure such as balanced accuracy (BA). It has been shown that the performance of MDR tends to depend on the choice of the evaluation measure.
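For readers unfamiliar with balanced accuracy, it is the mean of sensitivity and specificity. The short sketch below contrasts it with plain accuracy on an unbalanced case-control sample; the function name and all counts are hypothetical and are not taken from the paper.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Balanced accuracy: the mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)  # fraction of cases classified as high risk
    specificity = tn / (tn + fp)  # fraction of controls classified as low risk
    return (sensitivity + specificity) / 2

# Hypothetical classifier output on 100 cases and 300 controls.
tp, fn, tn, fp = 80, 20, 150, 150
ba = balanced_accuracy(tp, fn, tn, fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)
print(round(ba, 3), round(accuracy, 3))
```

Because BA weights cases and controls equally, it is not dominated by the larger group the way plain accuracy is.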
Novel biomarker panel for the early detection of pancreatic cancer using peripheral blood
Jin-Young Jang,Wooil Kwon,Sun-Whe Kim,Do-Youn Oh,Wujin Lee,Joo-Kyoung Park,Jin-Seok Heo,Chang Moo Kang,Song Cheol Kim,Junghyun Namkung,Yongwhan Choi,Youngsoo Kim,Taesung Park 한국간담췌외과학회 2016 한국간담췌외과학회 학술대회지 Vol.2016 No.4
Yoo, Seong-Keun,Lim, Byung Chan,Byeun, Jiyoung,Hwang, Hee,Kim, Ki Joong,Hwang, Yong Seung,Lee, JoonHo,Park, Joong Shin,Lee, Yong-Sun,Namkung, Junghyun,Park, Jungsun,Lee, Seungbok,Shin, Jong-Yeon,Seo, American Association for Clinical Chemistry, Inc. 2015 Clinical chemistry Vol.61 No.6
Background: Noninvasive prenatal diagnosis of monogenic disorders using maternal plasma and targeted massively parallel sequencing is being investigated actively. We previously demonstrated that comprehensive genetic diagnosis of a Duchenne muscular dystrophy (DMD) patient is feasible using a single targeted sequencing platform. Here we demonstrate the applicability of this approach to carrier detection and noninvasive prenatal diagnosis. Methods: Custom solution-based target enrichment was designed to cover the entire dystrophin (DMD) gene region. Targeted massively parallel sequencing was performed using genomic DNA from 4 mother and proband pairs to test whether carrier status could be detected reliably. Maternal plasma DNA at varying gestational weeks was collected from the same families and sequenced using the same targeted platform to predict the inheritance of the DMD mutation by their fetus. Overrepresentation of an inherited allele was determined by comparing the allele fraction of 2 phased haplotypes after examining and correcting for the recombination event. Results: The carrier status of deletion/duplication and point mutations was detected reliably using a single targeted massively parallel sequencing platform. Whether the fetus had inherited the DMD mutation was predicted correctly in all 4 families as early as 6 weeks and 5 days of gestation. In one of these, detection of the recombination event and reconstruction of the phased haplotype produced a correct diagnosis. Conclusions: Noninvasive prenatal diagnosis of DMD is feasible using a single targeted massively parallel sequencing platform with tiling design.
Multi-biomarker panel prediction model for diagnosis of pancreatic cancer
Doo-Ho LEE,Woongchang YOON,Areum LEE,Youngmin HAN,Yoonhyeong BYUN,Jae Seung KANG,Hongbeom KIM,Wooil KWON,Young-Ah SUH,Yonghwan CHOI,Junghyun NAMKUNG,Sangjo HAN,Sung Gon YI,Jin Seok HEO,In Woong HAN,Jo 한국간담췌외과학회 2021 Annals of hepato-biliary-pancreatic surgery Vol.25 No.-
Diagnostic model for pancreatic cancer using a multi-biomarker panel
Yoo Jin Choi,Woongchang Yoon,Areum Lee,Youngmin Han,Yoonhyeong Byun,Jae Seung Kang,Hongbeom Kim,Wooil Kwon,Young-Ah Suh,Yongkang Kim,Seungyeoun Lee,Junghyun Namkung,Sangjo Han,Yonghwan Choi,Jin Seok H 대한외과학회 2021 Annals of Surgical Treatment and Research(ASRT) Vol.100 No.3
Purpose: Diagnostic biomarkers of pancreatic ductal adenocarcinoma (PDAC) have been used for early detection to reduce its dismal survival rate. However, clinically feasible biomarkers are still rare. Therefore, in this study, we developed an automated multi-marker enzyme-linked immunosorbent assay (ELISA) kit using 3 previously discovered biomarkers (leucine-rich alpha-2-glycoprotein [LRG1], transthyretin [TTR], and CA 19-9) and proposed a diagnostic model for PDAC based on this kit for clinical use. Methods: Individual LRG1, TTR, and CA 19-9 panels were combined into a single automated ELISA panel and tested on 728 plasma samples, including PDAC (n = 381) and normal samples (n = 347). The consistency between the individual panels of the 3 biomarkers and the automated multi-panel ELISA kit was assessed by correlation. The diagnostic model was developed using logistic regression on the automated ELISA kit measurements to predict the risk of pancreatic cancer (high-, intermediate-, and low-risk groups). Results: The Pearson correlation coefficient of predicted values between the triple-marker automated ELISA panel and the former individual ELISA was 0.865. The proposed model provided reliable prediction results, with a positive predictive value of 92.05%, negative predictive value of 90.69%, specificity of 90.69%, and sensitivity of 92.05%, all of which exceed the 90% cutoff value. Conclusion: This diagnostic model based on the triple ELISA kit showed better diagnostic performance than previous markers for PDAC. It requires external validation before it can be used in the clinic.
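The four performance measures reported in this abstract are all derived from a 2×2 confusion matrix. The sketch below shows how they are computed; the function name and the counts are hypothetical and do not reproduce the study's data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard diagnostic measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all diseased
        "specificity": tn / (tn + fp),  # true negatives among all healthy
        "ppv": tp / (tp + fp),          # diseased among all positive calls
        "npv": tn / (tn + fn),          # healthy among all negative calls
    }

# Hypothetical counts: 100 PDAC samples and 100 normal samples.
metrics = diagnostic_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity are conditioned on true disease status, whereas the predictive values are conditioned on the test result and therefore also depend on the proportion of cases in the sample.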