Journal Article, 2025
RSTHFS: A Rough Set Theory-Based Hybrid Feature Selection Method for Phishing Website Classification
Jahanggir Hossain Setu, Nabarun Halder, Ashraful Islam, M. Ashraful Amin
IEEE Access
Institute of Electrical and Electronics Engineers (IEEE), Vol. 13, pp. 68820-68830, ISSN: 2169-3536
References
- 1.Ying Xue. (2019). An Overview of Overfitting and its Solutions. Journal of Physics Conference Series, 1168, 022022[10.1088/1742-6596/1168/2/022022]
- 2.Steven J. Rigatti. (2017). Random Forest. Journal of Insurance Medicine, 47(1), 31–39[10.17849/insm-47-01-31-39.1]
- 3.John Hancock, Taghi M. Khoshgoftaar. (2020). CatBoost for big data: an interdisciplinary review. Journal Of Big Data, 7(1), 94[10.1186/s40537-020-00369-8]
- 4.Özgür Koray Şahingöz, Ebubekir Buber, Önder Demir, Banu Diri. (2018). Machine learning based phishing detection from URLs. Expert Systems with Applications, 117, 345–357[10.1016/j.eswa.2018.09.029]
- 5.Željko Vujović. (2021). Classification Model Evaluation Metrics. International Journal of Advanced Computer Science and Applications, 12(6)[10.14569/ijacsa.2021.0120670]
- 6.Abdul Basit, Maham Zafar, Xuan Liu, Abdul Rehman Javed, Zunera Jalil, Kashif Kifayat. (2020). A comprehensive survey of AI-enabled phishing attacks detection techniques. Telecommunication Systems, 76(1), 139–154[10.1007/s11235-020-00733-2]
- 7.Xuelian Deng, Yuqing Li, Jian Weng, Jilian Zhang. (2018). Feature selection for text classification: A review. Multimedia Tools and Applications, 78(3), 3797–3816[10.1007/s11042-018-6083-5]
- 8.Kang Leng Chiew, Choon Lin Tan, KokSheik Wong, Kelvin S. C. Yong, Wei King Tiong. (2019). A new hybrid ensemble feature selection framework for machine learning-based phishing detection system. Information Sciences, 484, 153–166[10.1016/j.ins.2019.01.064]
- 9.Brij B. Gupta, Krishna Yadav, Imran Razzak, Kostas E. Psannis, Arcangelo Castiglione, Xiaojun Chang. (2021). A novel approach for phishing URLs detection using lexical based machine learning in a real-time environment. Computer Communications, 175, 47–57[10.1016/j.comcom.2021.04.023]
- 10.L. Lakshmi, M. Purushotham Reddy, Chukka Santhaiah, Ummadi Janardhan Reddy. (2021). Smart Phishing Detection in Web Pages using Supervised Deep Learning Classification and Optimization Technique ADAM. Wireless Personal Communications, 118(4), 3549–3564[10.1007/s11277-021-08196-7]
- 11.Asit Kumar Das, Shampa Sengupta, Siddhartha Bhattacharyya. (2018). A group incremental feature selection for classification using rough set theory based genetic algorithm. Applied Soft Computing, 65, 400–411[10.1016/j.asoc.2018.01.040]
- 12.Mahendra Prasad, Sachin Tripathi, Keshav Dahal. (2019). An efficient feature selection based Bayesian and Rough set approach for intrusion detection. Applied Soft Computing, 87, 105980[10.1016/j.asoc.2019.105980]
- 13.Abdelhakim Hannousse, Salima Yahiouche. (2021). Towards benchmark datasets for machine learning based website phishing detection: An experimental study. Engineering Applications of Artificial Intelligence, 104, 104347[10.1016/j.engappai.2021.104347]
- 14.Erzhou Zhu, Yuyang Chen, Chengcheng Ye, Xuejun Li, Feng Liu. (2019). OFS-NN: An Effective Phishing Websites Detection Model Based on Optimal Feature Selection and Neural Network. IEEE Access, 7, 73271–73284[10.1109/access.2019.2920655]
- 15.Chidimma Opara, Yingke Chen, Bo Wei. (2023). Look before you leap: Detecting phishing web pages by exploiting raw URL and HTML characteristics. Expert Systems with Applications, 236, 121183[10.1016/j.eswa.2023.121183]
- 16.Saad Al-Ahmadi, Afrah Alotaibi, Omar Alsaleh. (2022). PDGAN: Phishing Detection With Generative Adversarial Networks. IEEE Access, 10, 42459–42468[10.1109/access.2022.3168235]
- 17.Youness Mourtaji, Mohammed Bouhorma, Daniyal Alghazzawi, Ghadah Aldabbagh, Abdullah Alghamdi. (2021). Hybrid Rule‐Based Solution for Phishing URL Detection Using Convolutional Neural Network. Wireless Communications and Mobile Computing, 2021(1)[10.1155/2021/8241104]
- 18.Ali Fahad Al-Qahtani, Stefano Cresci. (2022). The COVID‐19 scamdemic: A survey of phishing attacks and their countermeasures during COVID‐19. IET Information Security, 16(5), 324–345[10.1049/ise2.12073]
- 19.Luka Jovanović, Dijana Jovanovic, Miloš Antonijević, Boško Nikolić, Nebojša Bačanin, Miodrag Živković, Ivana Strumberger. (2023). Improving Phishing Website Detection Using a Hybrid Two-level Framework for Feature Selection and XGBoost Tuning. Journal of Web Engineering[10.13052/jwe1540-9589.2237]
- 20.Rubul Kumar Bania, Anindya Halder. (2021). R-HEFS: Rough set based heterogeneous ensemble feature selection method for medical data classification. Artificial Intelligence in Medicine, 114, 102049[10.1016/j.artmed.2021.102049]
- 21.Yi Wei, Yuji Sekiya. (2022). Sufficiency of Ensemble Machine Learning Methods for Phishing Websites Detection. IEEE Access, 10, 124103–124113[10.1109/access.2022.3224781]
- 22.Parvathapuram Pavan Kumar, T. Jaya, V. Rajendran. (2021). SI-BBA – A novel phishing website detection based on Swarm intelligence with deep learning. Materials Today Proceedings, 80, 3129–3139[10.1016/j.matpr.2021.07.178]
- 23.Mohsena Ashraf, Farzana Anowar, Jahanggir Hossain Setu, Atiqul Islam Chowdhury, Eshtiak Ahmed, Ashraful Islam, Abdullah Al Mamun. (2023). A Survey on Dimensionality Reduction Techniques for Time-Series Data. IEEE Access, 11, 42909–42923[10.1109/access.2023.3269693]
- 24.Mahdieh Sabahno, Fatemeh Safara. (2021). ISHO: improved spotted hyena optimization algorithm for phishing website detection. Multimedia Tools and Applications, 81(24), 34677–34696[10.1007/s11042-021-10678-6]
- 25.Choon Lin Tan, Kang Leng Chiew, Kelvin S. C. Yong, Yakub Sebastian, Joel Chia Ming Than, Wei King Tiong. (2023). Hybrid phishing detection using joint visual and textual identity. Expert Systems with Applications, 220, 119723[10.1016/j.eswa.2023.119723]
- 26.Ashish Kumar Jha, Raja Muthalagu, Pranav M. Pawar. (2023). Intelligent phishing website detection using machine learning. Multimedia Tools and Applications, 82(19), 29431–29456[10.1007/s11042-023-14731-4]
- 27.Routhu Srinivasa Rao, Alwyn Roshan Pais. (2019). Two level filtering mechanism to detect phishing sites using lightweight visual similarity approach. Journal of Ambient Intelligence and Humanized Computing, 11(9), 3853–3872[10.1007/s12652-019-01637-z]
- 28.Khairan Rajab. (2017). New Hybrid Features Selection Method: A Case Study on Websites Phishing. Security and Communication Networks, 2017, 1–10[10.1155/2017/9838169]
- 29.Jimmy Moedjahedy, Arief Setyanto, Fawaz Khaled Alarfaj, Mohammed Alreshoodi. (2022). CCrFS: Combine Correlation Features Selection for Detecting Phishing Websites Using Machine Learning. Future Internet, 14(8), 229[10.3390/fi14080229]
- 30.Issa Qabajeh, Fadi Thabtah. (2014). An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods. , 125–132[10.1109/acsat.2014.29]
- 31.Sonkarlay J. Y. Weamie. (2022). Cross-Site Scripting Attacks and Defensive Techniques: A Comprehensive Survey. International Journal of Communications Network and System Sciences, 15(08), 126–148[10.4236/ijcns.2022.158010]
- 32.Mengli Wang, Lipeng Song, Luyang Li, Yuhui Zhu, Jing Li. (2024). Phishing webpage detection based on global and local visual similarity. Expert Systems with Applications, 252, 124120[10.1016/j.eswa.2024.124120]
- 33.Kibreab Adane, Berhanu Beyene. (2022). Machine Learning and Deep Learning Based Phishing Websites Detection: The Current Gaps and Next Directions. Review of Computer Engineering Research, 9(1), 13–29[10.18488/76.v9i1.2983]
- 34.Mohamad Asraf Daniel, Siew-Chin Chong, Lee-Ying Chong, Kuok-Kwee Wee. (2025). Optimising Phishing Detection: A Comparative Analysis of Machine Learning Methods with Feature Selection. Journal of Informatics and Web Engineering, 4(1), 200–212[10.33093/jiwe.2025.4.1.15]
- 35.Shehan Vidyakeerthi, Mohamed Nabeel, Charith Elvitigala, Chamath Keppitiyagama. (2022). Demo: PhishChain: A Decentralized and Transparent System to Blacklist Phishing URLs. Companion Proceedings of the Web Conference 2022, 286–289[10.1145/3487553.3524235]
- 36.Younis A. Younis, Mohamed S. Musbah. (2020). A Framework to Protect Against Phishing Attacks. Proceedings of the 6th International Conference on Engineering & MIS 2020, 1–6[10.1145/3410352.3410825]
- 37.Sayak Saha Roy, Unique Karanjit, Shirin Nilizadeh. (2021). Evaluating the Effectiveness of Phishing Reports on Twitter. , 1–13[10.1109/ecrime54498.2021.9738786]
- 38.Kehan Gao, Taghi M. Khoshgoftaar, Amri Napolitano. (2009). Exploring Software Quality Classification with a Wrapper-Based Feature Ranking Technique. , 67–74[10.1109/ictai.2009.24]
- 39.Yousif Al-Tamimi, Mohammad Shkoukani. (2022). Employing cluster-based class decomposition approach to detect phishing websites using machine learning classifiers. International Journal of Data and Network Science, 7(1), 313–328[10.5267/j.ijdns.2022.10.002]
- 40.Ongoma Jackson, Alilah David Anekeya, Okuto Erick. (2021). Optimal Allocation in Small Area Mean Estimation Using Stratified Sampling in the Presence of Non-Response. International Journal of Statistical Distributions and Applications, 7(1), 13[10.11648/j.ijsd.20210701.13]
- 41.Arpit Singh, Subhas Chandra Misra. (2022). A Comparison of Performance of Rough Set Theory with Machine Learning Techniques in Detecting Phishing Attack. Lecture notes in networks and systems, 631–650[10.1007/978-3-030-87049-2_22]
- 42.(2024). Random forests. Physics Subject Headings (PhySH)[10.29172/7c2a6982-6d72-4cd8-bba6-2fccb06a7011]
