Optimization of SVM and Gradient Boosting Models Using GridSearchCV in Detecting Fake Job Postings

  • Rofik Rofik Universitas Negeri Semarang, Semarang, Indonesia
  • Roshan Aland Hakim Universitas Negeri Semarang, Semarang, Indonesia
  • Jumanto Unjung Universitas Negeri Semarang, Semarang, Indonesia
  • Budi Prasetiyo Universitas Negeri Semarang, Semarang, Indonesia
  • Much Aziz Muslim Universiti Tun Hussein Onn Malaysia, Batu Pahat, Malaysia
Keywords: Fake job recruitment, Fraud detection, Gradient Boosting, GridSearchCV, Support Vector Machine

Abstract

Online job searching is one of the most efficient ways to find work, and it is widely used worldwide because recruitment information is transferred automatically. This ease and speed of transferring information has also led to a rise in fraudulent job postings. Several studies have sought to predict fake job vacancies, with a focus on improving accuracy. However, a key problem in prediction is poorly chosen parameters, which prevent the classification algorithm from performing optimally. This research aimed to increase the accuracy of fake job vacancy prediction by tuning parameters with GridSearchCV. The method used SVM and Gradient Boosting, with parameters tuned to find the combination best suited to each model's characteristics. The research process was divided into preprocessing, feature extraction, data splitting, and modeling stages. The models were tested on the EMSCAD dataset. The SVM algorithm achieved the highest accuracy, 98.88%, while Gradient Boosting achieved 98.08%. These results indicate that optimizing the SVM model with GridSearchCV can increase accuracy in predicting fake job postings.
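
The stages named in the abstract (feature extraction, data splitting, and modeling with GridSearchCV) can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation. The abstract does not state the feature extractor, the parameter grids, or the split ratio, so the TF-IDF vectorizer, the grid values, the 75/25 split, and the tiny hand-written corpus standing in for EMSCAD are all assumptions made for illustration.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus (NOT the EMSCAD data): 0 = genuine, 1 = fake.
genuine = [
    "software engineer with python experience for backend team",
    "registered nurse needed for hospital night shift",
    "accountant position requires degree and three years experience",
    "marketing manager to lead regional campaign planning",
    "warehouse supervisor responsible for inventory and staff schedules",
    "high school teacher for mathematics department full time",
    "mechanical engineer to design industrial pump systems",
    "customer support agent for telecom company office based",
]
fake = [
    "earn money fast from home no experience needed",
    "work from home and earn thousands weekly send bank details",
    "easy cash job no interview pay registration fee today",
    "make money online instantly just send your account number",
    "no skills required earn big money weekly from home",
    "urgent hiring pay small fee to start earning cash now",
    "get rich quick data entry job send personal details",
    "unlimited income from home no experience pay deposit first",
]
texts = genuine + fake
labels = [0] * len(genuine) + [1] * len(fake)

# Data splitting: the 75/25 stratified split is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Candidate models with illustrative (assumed) parameter grids.
# The "clf__" prefix routes each parameter to the pipeline's classifier step.
candidates = {
    "svm": (SVC(), {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "gradient_boosting": (
        GradientBoostingClassifier(random_state=0),
        {"clf__n_estimators": [20, 50], "clf__learning_rate": [0.1, 0.5]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Feature extraction sits inside the pipeline, so each CV fold fits
    # the vectorizer on its own training portion only.
    pipe = Pipeline([("tfidf", TfidfVectorizer()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": search.score(X_test, y_test),
    }

for name, summary in results.items():
    print(name, summary)
```

Placing the vectorizer inside the pipeline is a design choice rather than something the abstract specifies: it keeps held-out folds from influencing the vocabulary during the search. Accuracies on a toy corpus this small say nothing about the figures reported for EMSCAD.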

Published
2024-03-26
How to Cite
Rofik, R., Hakim, R. A., Unjung, J., Prasetiyo, B., & Muslim, M. A. (2024). Optimization of SVM and Gradient Boosting Models Using GridSearchCV in Detecting Fake Job Postings. MATRIK : Jurnal Manajemen, Teknik Informatika Dan Rekayasa Komputer, 23(2), 419-430. https://doi.org/10.30812/matrik.v23i2.3566
Section
Articles