RISS (Research Information Sharing Service)

      Resource-Efficient Machine Learning Systems: From Natural Behavior to Natural Language.

      https://www.riss.kr/link?id=T17162625

      • Author
      • Publication details

        Ann Arbor : ProQuest Dissertations & Theses, 2024

      • Degree-granting institution

        Columbia University Neurobiology and Behavior

      • Year conferred

        2024

      • Language

        English

      • Keywords
      • Country of publication

        United States of America

      • Degree

        Ph.D.

      • Pages

        233 p.

      • Advisor/Committee

        Advisor: Cunningham, John P.

      Multilingual Abstract

      Contemporary machine learning models exhibit unprecedented performance in the text, vision, and time-series domains, but at the cost of significant computational and human resources. Applying these technologies for science requires balancing accuracy and resource allocation, which I investigate here via three unique case studies.

      In Chapter 1, I present a deep learning system for animal pose estimation from video. Existing approaches rely on frame-by-frame supervised deep learning, which requires extensive manual labeling, fails to generalize to data far outside of its training set, and occasionally produces scientifically critical errors that are hard to detect. The solution proposed here includes semi-supervised learning on unlabeled videos, video-centric network architectures, and a post-processing step that combines network ensembling and state-space modeling. These methods improve performance both with scarce and abundant labels, and are implemented in an easy-to-use software package and cloud application.

      In Chapter 2, I turn to the Gaussian process, a canonical nonparametric model known for its poor scaling with dataset size. Existing methods accelerate Gaussian processes at the cost of modeling biases. I analyze two common techniques, early truncated conjugate gradients and random Fourier features, showing that they find hyperparameters that underfit and overfit the data, respectively. I then propose to eliminate these biases in exchange for increased variance, via randomized truncation estimators.

      In Chapter 3, I investigate continual learning, or "finetuning", in large language models (LLMs) with billions of weights. Training these models requires more memory than is typically available in academic clusters. Low-Rank Adaptation (LoRA) is a widely used technique that saves memory by training only low-rank perturbations to selected weight matrices in a so-called "base model". I compare the performance of LoRA and full finetuning on two target domains, programming and mathematics, across different data regimes. I find that in most common settings, LoRA underperforms full finetuning, but it nevertheless exhibits a desirable form of regularization: it better maintains the base model's performance on tasks outside the target domain. I then propose best practices for finetuning with LoRA.

      In summary, applying state-of-the-art models to large scientific datasets necessitates taking computational shortcuts. This thesis highlights the implications of these shortcuts and emphasizes the need for careful empirical and theoretical investigation to find favorable trade-offs between accuracy and resource allocation.
