Article ; Online: Improved Parallel Random Forest Algorithm Combining Information Theory and Norm
Jisuanji kexue yu tansuo, Vol 16, Iss 5, Pp 1064-1075 (2022)
Abstract | To address the excess of redundant and irrelevant features, the low information content of training features, and the low parallelization efficiency of MapReduce-based random forest algorithms for big data, this paper proposes a parallel random forest algorithm based on information theory and norm (PRFITN). First, the algorithm uses the DRIGFN (dimension reduction based on information gain and Frobenius norm) strategy to reduce the number of redundant and irrelevant features. Second, a feature grouping strategy based on information theory (FGSIT) groups the features, and stratified sampling is then used to preserve the information content of the training features when each decision tree in the random forest is built, which improves classification accuracy. Finally, to improve the parallel efficiency of the cluster, a redistribution of key-value pairs (RSKP) strategy distributes key-value pairs quickly and evenly and produces the global classification results. Experimental results show that the algorithm achieves better classification performance in a big data environment, especially on datasets with a larger number of features. |
---|---|
Keywords | MapReduce; random forest (RF); DRIGFN strategy; feature grouping strategy based on information theory (FGSIT); redistribution of key-value pairs (RSKP) strategy; Electronic computers. Computer science; QA75.5-76.95 |
Subject/Category (code) | 006 |
Language | Chinese |
Publication date | 2022-05-01 |
Publisher | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press |
Document type | Article ; Online |
Data source | BASE - Bielefeld Academic Search Engine (life sciences selection) |
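
The abstract names information gain, feature grouping, and stratified sampling but gives no formulas or thresholds. The following is only a minimal Python sketch of those general ideas, not the PRFITN algorithm itself: the function names, the two-group split, the `threshold`, and the `high_share` parameter are all assumptions made for illustration, and the Frobenius-norm step and the MapReduce key-value redistribution are not modeled.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete feature column with respect to the labels."""
    total = len(labels)
    by_value = {}
    for value, label in zip(column, labels):
        by_value.setdefault(value, []).append(label)
    conditional = sum(len(s) / total * entropy(s) for s in by_value.values())
    return entropy(labels) - conditional


def group_features(gains, threshold):
    """Split feature indices into a high-gain and a low-gain group.

    The single threshold is a hypothetical grouping rule, not the paper's FGSIT.
    """
    high = [i for i, g in enumerate(gains) if g >= threshold]
    low = [i for i, g in enumerate(gains) if g < threshold]
    return high, low


def stratified_feature_sample(high, low, k, high_share, rng):
    """Draw up to k feature indices, taking a fixed share from the high-gain group."""
    n_high = min(len(high), round(k * high_share))
    n_low = min(len(low), k - n_high)
    return rng.sample(high, n_high) + rng.sample(low, n_low)


if __name__ == "__main__":
    # Toy data: three discrete features (stored column-wise) and binary labels.
    columns = [[0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 0, 1]]
    labels = [0, 0, 1, 1]
    gains = [information_gain(c, labels) for c in columns]
    high, low = group_features(gains, threshold=0.3)
    print(stratified_feature_sample(high, low, k=2, high_share=0.5, rng=random.Random(0)))
```

Sampling a fixed share from each group, rather than sampling uniformly over all features, is one way to ensure that every tree sees at least some informative features; the actual grouping and sampling rules are defined in the full paper.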