Multiclass Least Squares Twin Support Vector Machine for Pattern Classification
Divya Tomar,Sonali Agarwal 보안공학연구지원센터 2015 International Journal of Database Theory and Appli Vol.8 No.6
This paper proposes a Multiclass Least Squares Twin Support Vector Machine (MLSTSVM) classifier for multi-class classification problems. The formulation of MLSTSVM is obtained by extending that of the recently proposed binary Least Squares Twin Support Vector Machine (LSTSVM) classifier. For an M-class classification problem, the proposed classifier seeks M non-parallel hyperplanes, one for each class, by solving M systems of linear equations. A regularization term is also added to improve generalization ability. MLSTSVM works well on both linear and non-linear datasets. It is a relatively simple and fast algorithm compared with other existing approaches. The performance of the proposed approach has been evaluated on twelve benchmark datasets. The experimental results demonstrate the validity of the proposed MLSTSVM classifier in comparison with typical multi-class classifiers based on Support Vector Machine and Twin Support Vector Machine. A statistical comparison of the proposed classifier with existing classifiers is also performed using the Friedman test and the Nemenyi post hoc test.
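The idea of one hyperplane per class, each obtained from a linear system, can be illustrated with a minimal one-versus-rest sketch. This is not the paper's exact formulation: it assumes the standard linear LSTSVM closed form applied once per class, with a small ridge term standing in for the regularization the abstract mentions. The function names and the parameters `c` and `reg` are hypothetical.

```python
import numpy as np

def fit_mlstsvm_linear(X, y, c=1.0, reg=1e-4):
    """Fit one hyperplane per class (one-versus-rest, linear case).

    For class k, with A = samples of class k and B = all other samples,
    solve for u = [w; b] in
        (F^T F + (1/c) E^T E + reg * I) u = -F^T e,
    where E = [A 1] and F = [B 1]. This is an assumed LSTSVM-style
    closed form, not necessarily the authors' exact system.
    """
    planes = {}
    for k in np.unique(y):
        A = X[y == k]
        B = X[y != k]
        E = np.hstack([A, np.ones((A.shape[0], 1))])
        F = np.hstack([B, np.ones((B.shape[0], 1))])
        lhs = F.T @ F + (1.0 / c) * (E.T @ E) + reg * np.eye(E.shape[1])
        rhs = -F.T @ np.ones(F.shape[0])
        u = np.linalg.solve(lhs, rhs)
        planes[k] = (u[:-1], u[-1])  # (w, b) for class k
    return planes

def predict_mlstsvm_linear(planes, X):
    """Assign each sample to the class whose hyperplane is nearest."""
    classes = list(planes)
    dists = []
    for k in classes:
        w, b = planes[k]
        # Perpendicular distance from each sample to the class-k hyperplane.
        dists.append(np.abs(X @ w + b) / np.linalg.norm(w))
    nearest = np.argmin(np.column_stack(dists), axis=1)
    return np.array(classes)[nearest]
```

Each class costs one small linear solve whose size depends on the number of features rather than the number of samples, which is consistent with the abstract's claim that the method is simple and fast. A non-linear variant would replace the raw features with kernel evaluations.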
A Hybrid Approach of Clustering and Time-Aware Based Novel Test Case Prioritization Technique
Geetanjali Chaurasia,Sonali Agarwal 보안공학연구지원센터 2016 International Journal of Database Theory and Appli Vol.9 No.4
Regression testing is a maintenance-phase activity that validates changes made to the software and ensures that these changes do not affect previously verified code or functionality. Regression testing is often performed with limited computing resources and a limited time budget, so fully comprehensive testing is not possible at this stage. Test-case prioritization techniques execute test cases in a prioritized order to achieve specific goals in the minimum possible time, such as increasing the rate of fault detection or detecting the most critical faults as early as possible. The main objectives of this paper are to achieve a higher average percentage of faults detected, to execute higher-priority test cases before lower-priority ones, and to decrease the execution time needed to reach the maximum average percentage of faults detected. We propose a new prioritization technique that uses a clustering approach and considers several factors to reorder the execution of test cases: the execution time of every test case, a code coverage metric, the fault detection ratio, the test case failure rate and a code complexity metric. The results of this work show the importance of the clustering technique and of the factors taken into consideration for achieving effective prioritization of test cases. In the experiments, the proposed approach achieved a higher average percentage of faults detected than existing clustering-based and coverage-based prioritization techniques. The approach also reduces the execution time taken by the prioritized test cases.
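The "average percentage of faults detected" metric is usually computed with the standard APFD formula, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test in the order that reveals fault i. The sketch below assumes that standard definition; the function name and the input format (a dict mapping each test to the set of faults it detects) are hypothetical.

```python
def apfd(order, detects):
    """Compute APFD for a test execution order.

    order   -- list of test ids in execution order
    detects -- dict: test id -> set of fault ids that test reveals
    Assumes every fault is revealed by at least one test in `order`.
    """
    faults = set().union(*detects.values()) if detects else set()
    n, m = len(order), len(faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one fault")

    # Record the 1-based position of the first test revealing each fault.
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in detects.get(test, set()):
            first_position.setdefault(fault, position)

    if len(first_position) != m:
        raise ValueError("some faults are never detected by this order")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)
```

For example, with `detects = {"t1": {"f1"}, "t2": set(), "t3": {"f2"}}`, the order `["t1", "t2", "t3"]` gives 0.5, while moving `t3` ahead of `t2` raises the score because the second fault is found earlier.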
Link Prediction for Authorship Association in Heterogeneous Network Using Streaming Classification
Harshal Singh,Divya Tomar,Sonali Agarwal 보안공학연구지원센터 2016 International Journal of Grid and Distributed Comp Vol.9 No.4
Predicting links or relations between the objects in a network is no longer a new task; it has become a highly active area of research and has attracted many contributors. Research output has grown rapidly in recent years, and active researchers readily collaborate with others working in the same domain irrespective of geographic location. This has in turn generated a very complex network of objects and links that needs to be analyzed. Co-authorship prediction is a sub-domain of link prediction, and with the increasing complexity of co-authorship networks, authors are treated as heterogeneous rather than homogeneous entities. The usual procedure is simple: analyze the data, preprocess it, train the classifier according to the desired classification rules and then obtain the classified data. However, irrelevant features degrade the classifier that is built, and that effect carries over to the classification results. This paper therefore proposes a streaming classification algorithm combined with correlation-based feature selection as a solution to this problem. Consistent and relevant features are chosen by the feature selection algorithm, and the data described by these features are then classified by the streaming classification algorithm Very Fast Decision Tree (VFDT), which takes the dataset as input in the form of a continuous stream. The experimental results demonstrate the effectiveness of the proposed algorithm.
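VFDT decides when to split a leaf using the Hoeffding bound: after n examples, the true mean of a quantity with range R lies within ε = sqrt(R² ln(1/δ) / (2n)) of the observed mean with probability 1 - δ. The sketch below shows only that split test, not a full tree or the paper's pipeline; the function names and the default tie threshold are illustrative assumptions.

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Return epsilon such that the observed mean of n samples is within
    epsilon of the true mean with probability at least 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, value_range, delta, n,
                 tie_threshold=0.05):
    """Split when the best attribute beats the runner-up by more than
    epsilon, or when epsilon is so small that the two are effectively
    tied (tie_threshold is an assumed illustrative value)."""
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold
```

For information gain, `value_range` is commonly taken as log2 of the number of classes. Because ε shrinks as n grows, a leaf that cannot be split confidently now may become splittable after more stream examples arrive.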
Comparative Study of Big Data Computing and Storage Tools: A Review
Bakshi Rohit Prasad,Sonali Agarwal 보안공학연구지원센터 2016 International Journal of Database Theory and Appli Vol.9 No.1
As a result of the tremendous rise in internet usage (social media and forums, mail systems, scholarly and research articles, and daily online transactions from sources such as health care systems and meteorological and environmental organizations), the volume of data collected has shot up exponentially. This vast collection of data, called Big Data, has rendered traditional tools inadequate for managing it from a storage, computing or analytical perspective. There is an immense need for architectures, platforms, tools, techniques and algorithms to handle Big Data. The available technologies address two broad aspects of Big Data, namely Big Data Storage Management and Big Data Computing, and aim to overcome challenges such as scalability, processing speed, multi-format data processing, availability, response time and analytics. This paper reviews recent trends in storage and computing tools, together with their relative capabilities, their limitations and the environments they are suited to.
Stream Data Mining: Platforms, Algorithms, Performance Evaluators and Research Trends
Bakshi Rohit Prasad,Sonali Agarwal 보안공학연구지원센터 2016 International Journal of Database Theory and Appli Vol.9 No.9
Streaming data are potentially infinite sequences of data that arrive at very high speed and may evolve over time. This poses several challenges for mining large-scale, high-speed data streams in real time, and the field has therefore attracted considerable research attention in recent years. This paper discusses the challenges associated with mining such data streams. Several available stream data mining algorithms for classification and clustering are described along with their key features and significance. The performance evaluation measures relevant to streaming data classification and clustering are also explained, and their comparative significance is discussed. The paper presents the streaming data computation platforms that have been developed and discusses each of them chronologically along with its major capabilities. It also identifies open research directions in high-speed, large-scale data stream mining from the points of view of algorithms, the evolving nature of streams and performance evaluation. Finally, the Massive Online Analysis (MOA) framework is used as a use case to show the results of key streaming data classification and clustering algorithms on a sample benchmark dataset, and their performance is critically compared and analyzed using evaluation parameters specific to streaming data mining.
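One evaluation scheme widely used for stream classifiers is prequential (test-then-train) evaluation, in which each arriving example is first used to test the model and then to update it. The abstract does not say which measures the paper covers, so the sketch below is only an assumed illustration of this style of evaluation; the class and function names are hypothetical and the learner is a trivial majority-class baseline.

```python
from collections import Counter

class MajorityClassLearner:
    """Baseline stream learner that predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # No prediction is possible before the first label has been seen.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1

def prequential_accuracy(stream, learner):
    """Test on each example first, then train on it; return the accuracy."""
    correct = total = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        learner.learn(x, y)
        total += 1
    return correct / total if total else 0.0
```

Because every example is scored before the model has seen its label, no separate hold-out set is needed, which suits streams that are unbounded and may drift over time.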