Data mining refers to extracting knowledge from a huge amount of data. It is a process of identifying or discovering meaningful correlations, patterns, and trends using simple database queries together with statistical and mathematical techniques. Big data is a huge volume of structured, unstructured, complex, and growing data sets from multiple sources. Useful information has to be filtered and extracted from such a source, which is why the concept of big data mining has emerged. This paper surveys the most powerful hybrid data mining algorithms from a research perspective and compares a few significant hybrid techniques that have been used for big data implementation.
Keywords: Data Mining; Data mining algorithms; Hybrid approach; Clustering; Hybrid Data Mining Algorithms; Big data.
Data mining in cloud computing applications is the retrieval of data from huge collections of data sets. It draws on massive amounts of data from search and browsing activity, social interaction, transaction-level records, and health care data, along with the enormous volume of sensor data from the Internet of Things. Collectively, this is called big data.
Many organizations deal with large amounts of data but do not know how to extract valuable information from it. They want insights, which is why data mining is getting a lot of attention. The main motivation for studying data mining is the significant demand for talent that knows how to manage, process, analyze, predict, and discover insight from massive data, using quantitative and technical expertise to solve business, social, and economic problems.
In this study, we have discussed a few popular hybrid data mining algorithms: Hybrid evolution clustering with empty clustering solution (H(EC)2S), the Hybrid Clustering Algorithm (HBCA) using BIRCH and K-Means, the GA/DT hybrid data mining algorithm, the hybrid GA-SVM model, the VAMR algorithm, and the Apriori-MapReduce algorithm. We compared the drawbacks of several hybrid techniques that have already been applied to image classification, namely GA-SVM, KEM-EELM, NB-SVM, SVM-CART, and DT-NB. The performance of the hybrid algorithms was analyzed in terms of classification accuracy. The performance of the hybrid techniques was also examined in terms of clustering accuracy.
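To make the Apriori-MapReduce idea mentioned above more concrete, the following is a minimal single-process sketch in Python. It is not the implementation from any of the surveyed papers and does not use Hadoop or any cluster framework. The function names (`map_phase`, `reduce_phase`, `apriori`) and the sample transactions are hypothetical, chosen only for illustration. The sketch assumes the usual level-wise scheme: a map step emits a count of 1 for each candidate itemset found in a transaction, and a reduce step sums those counts and keeps the itemsets that meet a minimum support.

```python
from collections import Counter
from itertools import combinations


def map_phase(transactions, candidates):
    """Map step: emit (itemset, 1) for every candidate contained in a transaction."""
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                yield candidate, 1


def reduce_phase(pairs, min_support):
    """Reduce step: sum the counts per itemset and keep those meeting min_support."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return {s: n for s, n in counts.items() if n >= min_support}


def apriori(transactions, min_support):
    """Level-wise frequent itemset mining built on the two phases above."""
    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        level = reduce_phase(map_phase(transactions, candidates), min_support)
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    sample = [
        ["bread", "milk"],
        ["bread", "butter"],
        ["bread", "milk", "butter"],
        ["milk"],
    ]
    for itemset, support in apriori(sample, min_support=2).items():
        print(sorted(itemset), support)
```

In a real MapReduce deployment the map step would run in parallel over partitions of the transaction set and the reduce step would aggregate counts per itemset key. Here both run in one process so that the data flow is easy to follow.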