Extreme-scale distributed science workflows play an essential role in scientific discovery. Today’s large scientific experimental facilities generate unprecedented amounts of data, and scientific data analysis has grown significantly across scientific centers in recent years. As a result, the scientific community faces new challenges in data-intensive computing. The performance of extreme-scale distributed science depends heavily on high-performance, adaptive, and robust network service infrastructures. Supporting data transfer for extreme-scale distributed science requires high-performance, scalable, end-to-end, programmable networks that enable scientific applications to use the network efficiently.
The existing network paradigm that supports extreme-scale distributed science workflows consists of three major components: terabit networks that provide high network bandwidth; Data Transfer Nodes (DTNs) and the Science DMZ architecture, which bypass the performance hotspots in typical campus networks; and on-demand secure circuit/path reservation systems, such as ESnet OSCARS and Internet2 AL2S, which provide automated, guaranteed-bandwidth service in the WAN. This network paradigm has proven to be very successful. However, we claim that, to reach its full potential, the existing network paradigm for extreme-scale distributed science must address three major problems: the last-mile problem; the scalability problem; and the agility, automation, and programmability problem.
Software-Defined Networking (SDN) is a recently emerged networking concept that introduces new methods for configuring and managing networks. In SDN, the underlying network devices are treated simply as packet-forwarding elements, and the network's control logic is managed centrally by a software program that dictates the behavior of the entire network. To address the problems mentioned above, this thesis proposes a solution called AmoebaNet. AmoebaNet applies SDN technology to provide “QoS-guaranteed” network services in campus or local area networks. AmoebaNet complements the existing network paradigm for extreme-scale distributed science: it allows applications to program networks at run-time for optimum performance and, in conjunction with WAN circuit/path reservation systems such as ESnet OSCARS and Internet2 AL2S, it solves the last-mile problem, the scalability problem, and the agility, automation, and programmability problem. In this thesis, we also present the Congestion Aware Multipath Optimal Routing (CAMOR) solution, which can serve as an additional service for AmoebaNet.