An Optimized Splitting Attribute Algorithm for Inconsistent Conflict in Context Lattice
Zhou Zhong,Junzhong Gu 보안공학연구지원센터 2014 International Journal of Database Theory and Appli Vol.7 No.5
With the emergence of cloud computing and the Internet of Things, context-aware applications face new challenges. One of them is the big data produced by large numbers of context applications and sources. Mainstream applications use not only real-time versions but also historical versions of context data. This paper is concerned with optimization techniques for storage and reasoning in the CMS (context management system). To store context data from different sources, an FCA lattice is employed as a storage schema that supports modeling and fusion of these different context data. Further, context conditions on the data are essential to logical reasoning. Under different context conditions, context data can be promoted to knowledge, which makes context reasoning easier. In a dynamic environment, reasoning services need their input to stay consistent under changing conditions in order to produce reasonable results. The changing conditions can be represented as context attributes, intervals, relations, etc. To make consistent knowledge available under these conditions, our previous work analyzed incremental caching and checking of consistent intervals and proposed a context lattice-based distributed optimized update algorithm. In this paper, building on that algorithm, we optimize the split function. A split is needed when the current lattice has no condition under which the knowledge is consistent. The main aim of this paper is to improve the time performance of splitting attributes, intervals, or fuzzy relations that can be refined in more detail. We propose a new parallel split algorithm. The algorithm computes the priorities of the split candidates. To reduce time cost, it narrows the split scope by choosing the candidate with the highest priority value. To shorten the full lattice update during the split, it generates the sub-lattices produced by the candidates concurrently and merges them afterwards. On the theoretical side, we analyze the feasibility of the algorithm.
In testing, as a new part of the whole update algorithm, it is compared with the naïve one and shows better time performance. Moreover, it lets multiple threads execute on the same lattice, which avoids the extra memory cost of copying the lattice for each independent thread.
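The three steps named in the abstract (pick the highest-priority candidate, build the parts concurrently, merge them) can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not say how priorities are computed or how a lattice is represented, so the priority scores are given as input, a "sub-lattice" is reduced to the set of objects falling in a sub-interval, and every name below (`pick_candidate`, `split_interval`, `build_sub_part`, `parallel_split`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def pick_candidate(priorities):
    # Highest priority wins; ties go to the first candidate seen.
    return max(priorities, key=priorities.get)

def split_interval(low, high, parts):
    # Cut [low, high) into `parts` equal-width sub-intervals.
    step = (high - low) / parts
    return [(low + i * step, low + (i + 1) * step) for i in range(parts)]

def build_sub_part(readings, attribute, interval):
    # Read-only pass over the shared data, so no per-thread copy is needed.
    # Stand-in for building a sub-lattice: objects whose value is in the interval.
    lo, hi = interval
    members = {obj for obj, attrs in readings.items() if lo <= attrs[attribute] < hi}
    return interval, members

def parallel_split(readings, priorities, bounds, parts=2):
    # 1. Narrow the split scope to the single best candidate attribute.
    attribute = pick_candidate(priorities)
    intervals = split_interval(*bounds[attribute], parts)
    # 2. Build each part concurrently on the same shared structure.
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda iv: build_sub_part(readings, attribute, iv), intervals)
        # 3. Merge the parts into one result.
        merged = dict(results)
    return attribute, merged

if __name__ == "__main__":
    readings = {
        "r1": {"temp": 18.0, "noise": 40.0},
        "r2": {"temp": 26.0, "noise": 42.0},
        "r3": {"temp": 29.0, "noise": 70.0},
    }
    priorities = {"temp": 0.8, "noise": 0.3}  # assumed scores, not derived
    bounds = {"temp": (10.0, 30.0), "noise": (30.0, 80.0)}
    print(parallel_split(readings, priorities, bounds))
```

The worker threads only read `readings`, which mirrors the abstract's point about running several threads on one lattice instead of copying it per thread.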
PFIN : A Parallel Frequent Itemset Mining Algorithm Using Nodesets
Chen Lin,Junzhong Gu 보안공학연구지원센터 2016 International Journal of Database Theory and Appli Vol.9 No.6
Frequent Itemset Mining (FIM) is one of the most fundamental techniques in data mining, with extensive applications to a variety of data mining problems such as association rule mining, correlation analysis, clustering, and classification. Since frequent itemset mining was first proposed, numerous serial algorithms have been developed to improve mining performance, yet most of them cannot scale to the massive datasets that are common nowadays. In this paper, we propose a new parallel FIM algorithm named PFIN, based on Nodeset, a more efficient data structure for mining frequent itemsets. PFIN decomposes a large-scale FIM problem into a set of tasks, each of which can be executed in parallel without unnecessary communication overhead. Moreover, a hash-based load balancing strategy is adopted to optimize resource use and maximize throughput. To evaluate the performance of PFIN, we have conducted extensive experiments on Spark, an emerging distributed in-memory processing framework, comparing it against PFP, one of the state-of-the-art parallel FIM algorithms, on a range of real datasets. The experimental results demonstrate that PFIN is highly competitive with PFP in scalability and outperforms PFP in speed.
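The idea of decomposing a mining problem into independent tasks with hash-based assignment can be shown with a minimal sketch. It does not implement PFIN, Nodesets, or anything Spark-specific: it counts frequent pairs naively, and the ownership rule (a pair belongs to the group of its lexicographically smaller item) and the names `group_of`, `mine_group`, and `mine_pairs` are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from zlib import crc32

def group_of(item, n_groups):
    # crc32 is stable across runs, unlike Python's salted built-in hash() for str.
    return crc32(item.encode("utf-8")) % n_groups

def mine_group(transactions, group, n_groups, min_support):
    # One independent task: count only the pairs owned by this group.
    counts = Counter()
    for t in transactions:
        for a, b in combinations(sorted(set(t)), 2):
            if group_of(a, n_groups) == group:  # owner = group of the smaller item
                counts[(a, b)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}

def mine_pairs(transactions, n_groups, min_support):
    # Tasks share no state, so each iteration could run on a separate worker.
    result = {}
    for g in range(n_groups):
        result.update(mine_group(transactions, g, n_groups, min_support))
    return result

if __name__ == "__main__":
    transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "c", "b"]]
    print(mine_pairs(transactions, n_groups=3, min_support=2))
```

Because every pair has exactly one owning group, the tasks never need to exchange counts, and the merged result is the same for any number of groups.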