RISS Search - Theses and Dissertations

Deep Learning-based Denoising for Image Classification

Rahim Tariq, Kumoh National Institute of Technology, Graduate School, 2020, doctoral dissertation (Korea)

RANK : 231983
Machine learning (ML) is a subfield of artificial intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. ML focuses on developing computer programs that access data and use it to learn for themselves. ML plays a central role in medical image analysis. Several ML-based algorithms have been applied in medical imaging to problems such as segmentation, classification, and detection. With the broad application of deep learning (DL) methods, the performance of medical image analysis has improved significantly. DL methods are a set of ML algorithms that attempt to automatically learn multiple levels of abstraction and representation that help make sense of data.

Wireless capsule endoscopy (WCE) is a procedure in which a patient swallows a pill-shaped device with an embedded camera that moves through the gastrointestinal (GI) tract, capturing images and transmitting them to an external receiver. WCE devices produce over 60,000 images during their passage through the GI tract. These images must be examined by expert physicians, who try to identify those showing inflammation or disease. Reviewing such a large number of frames is tedious for a physician, so computer-aided detection methods are considered an efficient alternative. Numerous anomalies can occur in the human GI tract, the most important of which are tumors, polyps, and ulcers. Moreover, general abnormalities and bleeding inside the GI tract may be symptoms of these diseases. Recently, there has been a great deal of research on WCE and medical image analysis. Segmentation, classification, and detection of the diseases mentioned above have been thoroughly investigated using both conventional image processing and DL approaches.
A detailed survey of WCE image analysis methods for detecting tumors, polyps, and ulcers has remained an open challenge. The motivation behind this thesis is to provide a comprehensive review of the techniques adopted for detecting tumors, polyps, and ulcers, considering only WCE sources. Additionally, generalized anomalies found in WCE images, followed by bleeding/lesion detection, are covered so that the scope of the review is not limited. The focal point is to provide a comparative investigation of current techniques, thereby opening possibilities for future research in this domain. After detailing the recent work, we implement a cascaded DL approach for the joint classification of tumors, polyps, and ulcers. The motivation for the cascaded approach is that the output of a WCE device is a compressed frame, because the limited battery life and storage capacity of the device require compression, which degrades image quality. Furthermore, during transmission, channel noise such as additive white Gaussian noise (AWGN), together with compression artifacts, can degrade the quality of WCE images. The proposed DL approach employs two deep neural networks (DNNs): a denoising convolutional neural network (DnCNN) as the denoiser and a CNN model for the joint classification of tumors, polyps, and ulcers found within the human GI tract.

In addition to WCE, colonoscopy is another way to examine the colon (large intestine). This dissertation proposes DL-based polyp detection in colonoscopy images using a CNN model with fewer hidden layers, which makes the model lighter to process while remaining effective. We introduce a new activation function, Mish, in some hidden layers alongside ReLU to improve the propagation of information.
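The Mish activation named above has a simple closed form, mish(x) = x * tanh(softplus(x)). The following is a minimal plain-Python sketch of that formula next to ReLU. The function names are illustrative, and nothing here is taken from the thesis's own network code, which this record does not show.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    """ReLU activation, shown only for comparison."""
    return max(x, 0.0)


if __name__ == "__main__":
    # Unlike ReLU, Mish is smooth and lets small negative values through.
    for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  mish={mish(x):.4f}")
```

For large positive inputs Mish approaches the identity, and for negative inputs it returns small negative values instead of zero, which is the usual argument for better information flow than ReLU.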
Generalized intersection over union (GIoU) is adopted as a new loss to optimize non-overlapping bounding boxes, taking into account the shape and orientation of the polyp structure. A detailed performance comparison of the proposed deep CNN model is provided by benchmarking against other methods, showing efficient results.

The Internet revolution has given users easier access to fast multimedia data exchange, which makes intellectual property security a necessity. Information is communicated across the Internet constantly, and because the communication channel is not always secure, it is essential to protect data from alteration to ensure its dependability. An efficient watermarking scheme that uses low-frequency components, named modified selective embedding in low frequency (M-SELF) and based on the Fast Fourier Transform (FFT), is proposed. A coordinate calculation computes an optimal radial value for embedding the watermark at the encoder side. The decoder side follows the same process in reverse to extract the watermark. Transmitting sensitive information in the form of a watermarked image is a crucial task. Impairments such as additive white Gaussian noise (AWGN) can corrupt the embedded watermark bits and eventually degrade the perceptual quality of the image. To avoid such situations, a denoiser, DnCNN, is integrated for the first time after the encoder, i.e., the M-SELF technique. The proposed integrated M-SELF watermarking technique is evaluated in terms of its success rate, i.e., the percentage of bits retrieved at each noise level, and perceived visual quality, measured with full-reference image quality assessment (IQA) metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM).
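As a rough illustration of the full-reference evaluation described above, the sketch below adds AWGN to a list of pixel values and computes PSNR against the clean reference. It assumes 8-bit images (peak value 255) stored as flat Python lists. The names `add_awgn` and `psnr`, the noise levels, and the seed are illustrative choices, not the thesis's setup. SSIM is omitted for brevity.

```python
import math
import random


def add_awgn(pixels, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma`,
    then clip each value to the 8-bit range [0, 255]."""
    rng = random.Random(seed)
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    clean = [128.0] * 1000  # a flat gray patch, purely illustrative
    for sigma in (5, 15, 25):
        noisy = add_awgn(clean, sigma)
        print(f"sigma={sigma:2d}  PSNR={psnr(clean, noisy):.2f} dB")
```

PSNR falls as the noise level rises. In an evaluation like the one the abstract describes, the same metric would be computed again after denoising to measure how much quality is recovered.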
Moreover, the performance of the proposed integrated M-SELF watermarking technique is benchmarked against and analyzed alongside another DL approach and conventional filtering techniques.

The recent surge in multimedia services has highlighted the demand for constant monitoring and supervision of multimedia systems based on users' quality of experience (QoE). Fast and robust video quality measurement is essential to maximize QoE and to control resource allocation within networks. Video stream quality can degrade due to compression and transmission over an error-prone communication channel. There are two approaches to estimating the perceptual quality of a video stream: subjective quality assessment (by human observers) and objective video quality metrics (VQM). A typical video transmission system comprises an encoder that encodes the video, a transmission channel, and a decoder that decodes it. According to the network conditions, i.e., bandwidth, the encoder encodes the video stream at a specific bitrate. Encoding generally involves compression, which degrades video quality. Nowadays, most online video websites and broadcasters produce video content at high-definition (HD) resolutions. The success of HD content has led to the development of 4K ultra-high-definition (UHD) content, regarded as the future standard in video applications. Under ideal conditions, UHD is expected to give viewers an improved visual experience through a wide field of view (FOV) in both the vertical and horizontal directions of the screen. Higher frame rates (HFR) are directly related to video quality. Subjective analysis of human-perceived quality is more accurate for HFR and high-resolution video content.
The video encoding industry evolves every year; High Efficiency Video Coding (HEVC/H.265) and the open-source VP9 encoder provide 50% bandwidth savings compared with the previous encoder, Advanced Video Coding (AVC/H.264). A detailed subjective analysis is conducted for UHD videos at frame rates of 30 fps and 60 fps, compressed at five different quantization parameters (QP), to investigate the perceptual quality experienced by users. This thesis also includes preliminary work that uses three encoders, H.264, HEVC/H.265, and VP9, at five QP levels and different frame rates, yielding an extensive subjective analysis expressed as differential mean opinion scores (DMOS). Furthermore, encoding efficiency, measured as the encoding time of each encoder, and quality performance using full-reference (FR) quality metrics are presented. Finally, results on the correlation between the subjective quality assessment model and the FR quality metrics are provided.

Machine learning (ML) is a subfield of artificial intelligence (AI): the study of computer algorithms that improve automatically through experience. It improves computer programs by learning on its own from collected data. ML also plays a central role in medical image analysis. It helps solve problems such as segmentation, classification, and detection, and the broad application of deep learning (DL) has greatly improved the performance of medical image analysis. DL is a set of ML algorithms that attempt to automatically learn multiple levels of abstraction and representation in order to make sense of data. Wireless capsule endoscopy (WCE) is a procedure in which a patient swallows a pill-shaped capsule with an embedded camera that captures images of the digestive tract and transmits them to an external receiver. A WCE device captures more than 60,000 images while operating inside the digestive tract. These images are examined by specialist physicians who can identify those showing inflammation or disease. Checking so many images is hard work for a physician, so computer-aided detection methods are used as an efficient alternative. Abnormalities such as tumors, polyps, and ulcers can occur in the human digestive tract. Moreover, general abnormalities or bleeding in the digestive tract may be symptoms of these diseases. Much research has recently been carried out in WCE and medical image analysis. The segmentation, classification, and detection tasks mentioned above have been thoroughly investigated using conventional image processing and DL approaches. A detailed survey of all WCE image analysis methods for detecting tumors, polyps, and ulcers is an open challenge. This thesis aims to comprehensively review the techniques adopted for detecting tumors, polyps, and ulcers, considering WCE sources. Anomalies found in WCE images and bleeding detection are also covered, so that the scope of the review is not limited.
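The focus of this study is to provide a comparative investigation of current techniques, creating possibilities for future research in this area. After describing recent work, we implement a cascaded DL approach for the joint classification of tumors, polyps, and ulcers. The motivation for the cascaded approach is that, because of the limited battery life and storage capacity of the WCE device, its output consists of compressed frames, which degrades image quality. In addition, during transmission, channel noise such as additive white Gaussian noise (AWGN) and compression artifacts can degrade the quality of WCE images. The proposed DL approach employs two deep neural networks (DNNs): a denoising convolutional neural network (DnCNN) as the denoiser and a CNN model for classifying tumors, polyps, and ulcers found in the human digestive tract. Additionally, for DL-based polyp detection in colonoscopy images, this thesis proposes a CNN model that uses fewer hidden layers, making the model lighter while remaining effective. A new activation function, Mish, is introduced in some hidden layers alongside ReLU to improve the propagation of information. Generalized intersection over union (GIoU) is adopted as a new loss to optimize non-overlapping bounding boxes, taking into account the shape and orientation of the polyp structure. A detailed performance comparison of the proposed deep CNN model is provided by benchmarking against other methods, showing efficient results.

The growth of the Internet has given users easier access to fast multimedia data exchange, which calls for intellectual property security. Information is exchanged over the Internet frequently, and because communication channels are not always secure, protecting data is essential to ensure its reliability. This thesis proposes an efficient watermarking technique that uses low frequencies, named modified selective embedding in low frequency (M-SELF), based on the Fast Fourier Transform (FFT). A coordinate calculation computes the optimal radial value for embedding the watermark at the encoder side. The decoder side follows the same process in reverse to extract the watermark. Transmitting sensitive information in the form of a watermarked image is an important task to consider. Impairments in the form of noise such as AWGN can corrupt the embedded watermark bits and eventually degrade image quality. A denoiser is used to avoid such situations: DnCNN is integrated directly after the encoder, i.e., the M-SELF technique. The performance of the proposed integrated M-SELF watermarking technique is evaluated using full-reference image quality assessment (IQA) metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), and by the percentage of watermark data successfully extracted. Its performance is also benchmarked against and analyzed alongside another DL approach and conventional filtering techniques.

Recent trends in multimedia services have highlighted the demand for continuous monitoring and supervision of systems based on users' quality of experience (QoE). Fast and robust video quality measurement is essential to maximize QoE and to control resource allocation within networks. Compression and transmission over an error-prone communication channel can degrade video stream quality. There are two approaches to estimating the visual quality of a video stream: subjective quality assessment (by human observers) and objective video quality metrics (VQM). A typical video transmission system consists of the video, a transmission channel, an encoder, and a decoder.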
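The encoder encodes the video stream at a specific bitrate according to network conditions (bandwidth). The encoding process generally degrades video quality through compression. Today, most online video websites and broadcasters produce video content at high-definition (HD) resolutions. The success of HD content is leading to the development of 4K ultra-high-definition (UHD) content, regarded as the future standard in video applications. UHD offers viewers an improved experience through a wider field of view (FOV) in both the vertical and horizontal directions of the screen. Higher frame rates (HFR) are directly related to video quality. For HFR and high-resolution video content, subjective analysis is more accurate. The video encoding industry advances every year; High Efficiency Video Coding (HEVC/H.265) and the open-source VP9 encoder save 50% of bandwidth compared with the previous encoder, Advanced Video Coding (AVC/H.264). To investigate users' perceptual quality, a detailed subjective analysis is conducted on UHD videos at frame rates of 30 fps and 60 fps, compressed with five different quantization parameters (QP). This thesis also includes a preliminary study that uses three encoders (H.264, HEVC/H.265, and VP9) at five QP levels for different frame rates, with the extensive subjective analysis expressed as differential mean opinion scores (DMOS). In addition, encoding efficiency, measured as each encoder's encoding time, and quality performance using full-reference (FR) quality metrics are presented. Results on the correlation between the subjective quality assessment model and the FR quality metrics are also provided.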
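The comparison between subjective scores and FR metrics mentioned at the end of the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: DMOS is taken as the reference mean opinion score minus the test mean opinion score (one common convention; the thesis may define it differently), and agreement is measured with the Pearson linear correlation coefficient. All function names and numbers are made up for illustration and are not results from the thesis.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    if not values:
        raise ValueError("values must be non-empty")
    return sum(values) / len(values)


def differential_mos(reference_ratings, test_ratings):
    """DMOS under the assumed convention: MOS(reference) - MOS(test).
    Larger values mean the test video was judged worse."""
    return mean(reference_ratings) - mean(test_ratings)


def pearson(xs, ys):
    """Pearson linear correlation coefficient between two sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical values for five QP levels (not data from the thesis).
    dmos = [0.4, 0.9, 1.6, 2.5, 3.1]
    psnr_db = [42.0, 39.5, 36.0, 32.5, 30.0]
    print(f"Pearson correlation: {pearson(dmos, psnr_db):.3f}")
```

Under this convention, a strongly negative correlation between DMOS and PSNR indicates that the objective metric tracks viewer opinion well, because higher PSNR should coincide with a smaller perceived quality loss.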
