DNA microarray gene expression data enable researchers to measure the expression levels of several thousand genes simultaneously. These expression levels are important for classifying different types of tumors. In this work, we focus on gene selection, an essential data pre-processing step for cancer classification. Gene selection retains a small subset of genes from a large set and eliminates redundant, irrelevant, or noisy genes. The combinatorial nature of the selection problem calls for specific techniques such as filters, wrappers, or hybrids that combine several optimization processes. In this context, we propose two hybrid approaches, RBPSO-1NN and FBPSO-SVM, for the gene selection problem. They combine filter methods (the Fisher criterion and the ReliefF algorithm), the binary particle swarm optimization (BPSO) metaheuristic, and the Backward algorithm, using the SVM and 1NN classifiers to evaluate the relevance of candidate subsets. To assess the performance of our methods, we tested them on eight well-known high-dimensional microarray datasets, ranging from 2308 to 11225 genes. The experiments on these datasets show that our methods are very competitive with existing work.
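As an illustration of two building blocks named in the abstract, the following minimal Python sketch shows a Fisher-score filter and one BPSO bit-update step. It is not the authors' implementation. The function names `fisher_scores` and `bpso_step`, the multi-class form of the Fisher score (between-class over within-class scatter), the inertia weight `w`, the coefficients `c1` and `c2`, and the velocity clamp `v_max` are all assumptions chosen for illustration; the sigmoid transfer rule follows the standard binary PSO formulation.

```python
import math
import random


def fisher_scores(X, y):
    """Return one Fisher score per gene (column of X).

    Assumed form: sum_c n_c (mean_c - mean)^2 / sum_c n_c var_c.
    X is a list of samples (rows), y the list of class labels.
    """
    n_genes = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_genes):
        col = [row[j] for row in X]
        mean_all = sum(col) / len(col)
        between = 0.0
        within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            mean_c = sum(vals) / len(vals)
            var_c = sum((v - mean_c) ** 2 for v in vals) / len(vals)
            between += len(vals) * (mean_c - mean_all) ** 2
            within += len(vals) * var_c
        # A gene that is constant within every class gets score 0 here
        # (an arbitrary convention for this sketch).
        scores.append(between / within if within > 0 else 0.0)
    return scores


def bpso_step(position, velocity, pbest, gbest, rng,
              w=0.9, c1=2.0, c2=2.0, v_max=4.0):
    """One binary-PSO update for a single particle.

    Each bit marks whether a gene is selected. The velocity is updated
    as in continuous PSO, clamped to [-v_max, v_max], then mapped
    through a sigmoid to the probability that the bit is set to 1.
    """
    new_pos, new_vel = [], []
    for x, v, p, g in zip(position, velocity, pbest, gbest):
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))
        prob_one = 1.0 / (1.0 + math.exp(-v))
        new_vel.append(v)
        new_pos.append(1 if rng.random() < prob_one else 0)
    return new_pos, new_vel


# Toy data: gene 0 separates the two classes, gene 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.9], [3.1, 3.1]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)

rng = random.Random(0)
pos, vel = bpso_step([0, 1, 0, 1], [0.0] * 4, [1, 1, 0, 0], [1, 0, 0, 1], rng)
```

In a hybrid scheme of the kind the abstract describes, a filter score such as this one would typically shrink the gene pool first, and the fitness driving `pbest` and `gbest` would come from a classifier (SVM or 1NN) evaluated on each candidate subset; that evaluation step is omitted here.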
DOI: 10.1051/ro/2018059
Keywords: Gene selection, cancer classification, BPSO, backward generation, SVM, 1NN, ReliefF, Fisher criterion, DNA microarray
@article{RO_2019__53_1_269_0,
  author = {Bir-Jmel, Ahmed and Mohamed Douiri, Sidi and Elbernoussi, Souad},
  title = {Gene selection via {BPSO} and {Backward} generation for cancer classification},
  journal = {RAIRO - Operations Research - Recherche Op\'erationnelle},
  pages = {269--288},
  publisher = {EDP-Sciences},
  volume = {53},
  number = {1},
  year = {2019},
  doi = {10.1051/ro/2018059},
  zbl = {1418.62388},
  mrnumber = {3912473},
  language = {en},
  url = {http://www.numdam.org/articles/10.1051/ro/2018059/}
}
Bir-Jmel, Ahmed; Mohamed Douiri, Sidi; Elbernoussi, Souad. Gene selection via BPSO and Backward generation for cancer classification. RAIRO - Operations Research - Recherche Opérationnelle, Volume 53 (2019) no. 1, pp. 269-288. doi: 10.1051/ro/2018059. http://www.numdam.org/articles/10.1051/ro/2018059/