RISS (Research Information Sharing Service)

      KCI-indexed journal

      Processing large-scale data with Apache Spark (original Korean title: Apache Spark를 활용한 대용량 데이터의 처리)


      https://www.riss.kr/link?id=A105348483


      Additional information

      Multilingual Abstract

      Apache Spark is a fast, general-purpose cluster computing package. It provides a new abstraction, the resilient distributed dataset (RDD), which supports fault tolerance while keeping data in memory. This abstraction yields a significant speedup over MapReduce, the legacy large-scale data framework. In particular, Spark is well suited to iterative machine learning applications such as logistic regression and K-means clustering, and to interactive data querying. Thanks to its versatility, Spark also provides high-level libraries for applications such as machine learning, streaming data processing, database querying, and graph data mining. In this work, we introduce the concept and programming model of Spark and show implementations of some simple statistical computing applications. We also review the machine learning package MLlib and the R language interface SparkR.
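The fault-tolerance idea behind the resilient distributed dataset can be illustrated with a toy, single-process sketch. This is not Spark's API or implementation: the class `ToyRDD` and all of its methods are hypothetical names invented for illustration. The sketch assumes that each dataset remembers how it was derived from its parent (its lineage), keeps computed partitions in memory, and rebuilds a lost partition by replaying that lineage instead of relying on replicated copies.

```python
class ToyRDD:
    """Hypothetical toy dataset: in-memory partitions plus the lineage to rebuild them."""

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source    # base data, one list per partition (root dataset only)
        self._parent = parent    # parent dataset in the lineage chain
        self._fn = fn            # per-partition function applied to the parent's data
        self._cache = {}         # partition index -> materialized list held in memory
        self.computations = 0    # how many times a partition had to be (re)built

    @classmethod
    def from_partitions(cls, partitions):
        return cls(source=[list(p) for p in partitions])

    def num_partitions(self):
        if self._parent is None:
            return len(self._source)
        return self._parent.num_partitions()

    def map(self, f):
        # Lazy: only records the transformation, nothing is computed yet.
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def partition(self, i):
        # Serve from memory when possible, otherwise rebuild from lineage.
        if i in self._cache:
            return self._cache[i]
        self.computations += 1
        if self._parent is None:
            data = list(self._source[i])
        else:
            data = self._fn(self._parent.partition(i))
        self._cache[i] = data
        return data

    def collect(self):
        return [x for i in range(self.num_partitions()) for x in self.partition(i)]

    def lose_partition(self, i):
        # Simulates losing the in-memory copy of one partition.
        self._cache.pop(i, None)


base = ToyRDD.from_partitions([[1, 2, 3], [4, 5, 6]])
squares_of_evens = base.filter(lambda x: x % 2 == 0).map(lambda x: x * x)

first = squares_of_evens.collect()
squares_of_evens.lose_partition(1)      # drop one cached partition
second = squares_of_evens.collect()     # only the lost partition is rebuilt
print(first, second, squares_of_evens.computations)
```

In this sketch only the lost partition is recomputed, and it is rebuilt from the parent's cached data rather than from scratch. A real cluster framework distributes partitions across machines and schedules the recomputation; none of that is modeled here.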



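      Journal history

      Date        Event type            Details                                                      Index status
      2027        Evaluation scheduled  Eligible to apply for re-accreditation (re-accreditation)
      2021-01-01  Evaluation            Indexed journal status retained (re-accreditation)           KCI-indexed
      2018-01-01  Evaluation            Indexed journal status retained (retention)                  KCI-indexed
      2015-01-01  Evaluation            Indexed journal status retained (retention)                  KCI-indexed
      2011-01-01  Evaluation            Indexed journal status retained (retention)                  KCI-indexed
      2009-01-01  Evaluation            Indexed journal status retained (retention)                  KCI-indexed
      2007-01-01  Evaluation            Indexed journal status retained (retention)                  KCI-indexed
      2005-01-01  Evaluation            Indexed journal status retained (retention)                  KCI-indexed
      2002-07-01  Evaluation            Selected as indexed journal (candidate, 2nd round)           KCI-indexed
      2000-01-01  Evaluation            Selected as candidate journal (new evaluation)               KCI candidate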

      Journal citation information (base year 2016)

      Metric                              Value
      WOS-KCI combined IF (2-year)        0.38
      KCIF (2-year)                       0.38
      KCIF (3-year)                       0.38
      KCIF (4-year)                       0.35
      KCIF (5-year)                       0.34
      Centrality index (3-year)           0.565
      Immediacy index                     0.17
