Leveraging Unlabeled Client Data for Federated Sensor-based Human Activity Recognition

Korean Abstract

Federated learning is well suited to sensor-based human activity recognition with wearable devices, but practical deployment is hindered by three problems: (1) labeling large sensor data streams is costly and impractical (label scarcity); (2) the content and format of the data differ across users, devices, and sensor placements (data heterogeneity); and (3) computing and network resources are unevenly distributed across wearable devices (system heterogeneity). This thesis proposes three federated learning frameworks that exploit unlabeled client data through semi-supervised and self-supervised learning in a label-at-server setting, where only the server holds a small labeled dataset and clients hold only unlabeled data. The first method, ATCoFed (activity transition consistency federated learning), addresses label scarcity by combining pseudo-labeling with MixUp-based warm-up. It introduces activity transition consistency filtering, in which an LSTM and a large language model (LLM) evaluate whether each candidate pseudo-label forms a plausible activity transition, maintaining pseudo-label quality even at low confidence thresholds. The second method, FedSuDoC (federated subject domain clustering), addresses data heterogeneity through reconstruction-based federated self-supervised learning. It trains a generalizable feature extractor using lightweight autoencoders on clients and then clusters clients in a latent domain space. Federated aggregation and classifier fine-tuning are performed per cluster, producing an encoder and classifier suited to each cluster and thereby handling cross-subject, cross-device, and cross-placement variability more effectively. The third method, FedDeLoC (federated delta loss clustering), addresses label scarcity, data heterogeneity, and system heterogeneity simultaneously by replacing gradient-based clustering with simple loss-based features such as initial reconstruction loss and relative delta loss. It also introduces a delta loss gating mechanism that selects clients with high utility and low latency, reducing communication and training time while preserving accuracy. Evaluated on several activity recognition benchmark datasets under highly heterogeneous conditions, the three proposed frameworks consistently outperformed existing supervised learning methods trained only on labeled data. By learning from unlabeled client data and implementing temporal consistency, cluster-based personalization, and loss- and latency-aware scheduling, they are expected to offer a practical path toward accurate, communication-efficient, and efficient training for wearable sensor-based human activity recognition.

Multilingual Abstract

Federated learning (FL) is a natural fit for sensor-based human activity recognition (HAR) with wearable devices, but realistic deployment is hindered by three intertwined challenges: (1) label scarcity, because annotating raw sensor streams is costly; (2) data heterogeneity, due to diverse users, devices, and sensor placements; and (3) system heterogeneity, arising from unequal compute and network resources across wearables. This thesis adopts a label-at-server setting in which only the server has a small labeled dataset while clients hold only unlabeled data, and proposes three FL frameworks that exploit unlabeled client data through semi-supervised and self-supervised learning. ATCoFed (Activity Transition Consistency Federated Learning) addresses label scarcity by combining pseudo-labeling with MixUp-based warm-up on the server. It also introduces Activity Transition Consistency filtering, in which an LSTM and a large language model (LLM) evaluate whether each candidate pseudo-label forms a plausible activity transition, thus maintaining pseudo-label quality even at lower confidence thresholds. FedSuDoC (Federated Subject Domain Clustering) focuses on data heterogeneity through reconstruction-based federated self-supervised learning, using lightweight autoencoders on clients to learn generalizable feature extractors and then clustering clients in a latent subject domain space. FL aggregation and classifier fine-tuning are performed per cluster, yielding cluster-specific encoders and classifiers that better handle cross-subject, cross-device, and cross-placement variability.
FedDeLoC (Federated Delta Loss Clustering) jointly addresses label scarcity, data heterogeneity, and system heterogeneity. It replaces gradient-based clustering with compact loss-based features, namely initial reconstruction loss and relative delta loss, and introduces a latency-aware delta loss gating mechanism that selects high-utility, low-latency clients for aggregation, thereby reducing communication and training time while preserving accuracy. Across multiple HAR benchmarks under highly heterogeneous conditions, the three frameworks consistently outperform conventional supervised FL trained only on limited labeled data. These results indicate that unlabeled client data, temporal consistency, cluster-based personalization, and loss- and latency-aware scheduling together provide a practical path toward accurate, communication-efficient, and system-aware FL for wearable sensor-based HAR.
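
The abstract does not give formulas for the delta loss gate, so the following is only an illustrative sketch of how a latency-aware gate of this kind might look. It assumes that relative delta loss means the fractional drop in a client's reconstruction loss over one local training pass, and that a client is kept when this drop exceeds a threshold and its latency stays under a budget. The names `ClientReport`, `relative_delta_loss`, `gate_clients`, and both thresholds are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class ClientReport:
    """Per-round statistics a client could send to the server (hypothetical)."""
    client_id: str
    initial_loss: float  # reconstruction loss before local training
    final_loss: float    # reconstruction loss after local training
    latency_s: float     # observed round-trip latency in seconds


def relative_delta_loss(report: ClientReport) -> float:
    """Fractional loss reduction; an assumed definition, not the thesis's."""
    if report.initial_loss <= 0:
        return 0.0
    return (report.initial_loss - report.final_loss) / report.initial_loss


def gate_clients(reports, min_delta=0.05, max_latency_s=2.0):
    """Keep clients whose update is useful enough and arrives fast enough."""
    return [
        r.client_id
        for r in reports
        if relative_delta_loss(r) >= min_delta and r.latency_s <= max_latency_s
    ]


reports = [
    ClientReport("watch-a", 1.00, 0.70, 0.8),  # useful and fast
    ClientReport("watch-b", 1.00, 0.99, 0.5),  # fast but barely improves
    ClientReport("band-c", 0.80, 0.40, 5.0),   # useful but too slow
]
print(gate_clients(reports))
```

Under these assumptions only the first client passes both checks. A real system would likely tune the two thresholds per round or rank clients instead of applying hard cut-offs.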

Table of Contents

Chapter 1: Introduction 1
1.1. Background and Motivation 1
1.2. Technical Challenges 2
1.3. Research Goals 4
1.4. Scope of the Thesis 4
1.5. Contributions 5
Chapter 2: Literature Review 6
2.1. Notation 6
2.2. Sensor-based Human Activity Recognition 6
2.2.1. Input Sensor Data 7
2.3. Deep Learning under Limited Labeled Data 9
2.3.1. Semi-supervised Learning 9
2.3.2. Self-supervised Learning 10
2.4. Federated Learning for Sensor-based HAR 11
2.4.1. Supervised Federated Learning 12
2.4.2. Semi-supervised Federated Learning 13
2.4.3. Federated Self-Supervised Learning 13
Chapter 3: Activity Transition for Semi-Supervised FL 15
3.1. Preliminary and Problem Definition 17
3.2. Proposed Framework 17
3.2.1. Warm-up Rounds: Pseudo-labeling with Mixup Augmentation 18
3.2.2. Main Rounds: Pseudo-label Filtering with ATCo Method 19
3.2.3. Generating Prompt for LLM 21
3.2.4. ATCoFed Framework 23
3.3. Experiment and Results 24
3.3.1. Experiment Setting 25
3.3.2. Experiment Results 26
3.3.3. Comparison with More Label Scarcity 29
3.3.4. Ablation Study 30
3.3.5. Training Time Analysis 30
3.3.6. Effectiveness of Data-Driven Evaluators 32
3.4. Discussion 33
Chapter 4: Subject Domain Classification for FSSL 34
4.1. Preliminary and Problem Definition 35
4.2. Proposed Method 37
4.2.1. FL Clustering Stage 37
4.2.2. FL Training Stage 40
4.2.3. Server Fine-tuning Phase 41
4.3. Experiments and Results 42
4.3.1. Experiment Setting 42
4.3.2. Experiment Results 43
4.3.3. Computational Cost Analysis 44
4.3.4. Effectiveness of Clustering 45
4.4. Discussion 46
Chapter 5: Delta-Loss Clustering and Gating for FSSL 47
5.1. Preliminary and Problem Definition 48
5.2. Proposed Framework 49
5.2.1. FL Clustering Stage 49
5.2.2. FL Training Stage 50
5.2.3. Fine-tuning Classifier Model Stage 52
5.3. Experiments and Results 52
5.3.1. Simulation Setup 52
5.3.2. Performance Comparison 53
5.3.3. Computational and Communication Cost 54
5.3.4. Comparison with FedSuDoC Framework 55
5.4. Discussion 57
Chapter 6: Final Remarks 58
6.1. Conclusion 58
6.2. Future Work 60
Bibliography 61