Detecting Hidden Illegal Online Gambling on .go.id Domains Using Web Scraping Algorithms

  • Muchlis Nurseno Universitas Nusa Putra, Sukabumi, Indonesia
  • Umar Aditiawarman Universitas Nusa Putra, Sukabumi, Indonesia
  • Haris Al Qodri Maarif Universitas Nusa Putra, Sukabumi, Indonesia
  • Teddy Mantoro Sampoerna University, Jakarta, Indonesia
Keywords: Black Hat SEO, Government Website, Online Gambling, Stealthy Defacement, Web Scraper

Abstract

The profitability of gambling has encouraged operators to promote online gambling through black hat SEO, targeting official sites such as government websites. Operators use various techniques to keep search engines from distinguishing genuine content from illegal content. This research aims to determine whether websites under the .go.id domain have been compromised with hidden URLs affiliated with online gambling sites. The method is an experiment using a FOFA.info dataset containing a complete list of 450,000 .go.id domains. A web scraping algorithm developed in Python was used to identify potentially compromised websites in the targeted list by analyzing gambling-related keywords in local languages, such as 'slot', 'judi', 'gacor', and 'togel'. The results showed that 958 of the 1,482 suspected .go.id sites had been compromised, with an accuracy rate of 99.1%. This implies that illegal online gambling operators have exploited security gaps, posing a reputational risk to the government. Lastly, the scraping tool developed in this research can also detect illegal online gambling content hidden in other domains such as .ac.id, .or.id, and .sch.id, and can help authorities take the necessary action.
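The keyword analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: only the four keywords come from the abstract, while the names `AnchorCollector`, `find_suspicious_links`, and `SAMPLE_HTML`, the matching rule (a keyword not embedded in a longer alphabetic word), and the choice to inspect only anchor tags are assumptions made for illustration. The sketch parses an HTML string that has already been fetched and reports links whose URL or anchor text contains a keyword.

```python
import re
from html.parser import HTMLParser

# Keywords from the abstract; the matching rule below is an assumption.
KEYWORDS = ("slot", "judi", "gacor", "togel")
# Match a keyword only when it is not part of a longer alphabetic word,
# so "judi88" and "slot-gacor" match but "timeslot" does not.
KEYWORD_RE = re.compile(
    r"(?<![a-z])(" + "|".join(KEYWORDS) + r")(?![a-z])", re.IGNORECASE
)


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs, including anchors in hidden elements."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None  # href of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_suspicious_links(html):
    """Return (href, matched keywords) for every anchor that mentions a keyword."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    hits = []
    for href, text in parser.anchors:
        found = {m.group(1).lower() for m in KEYWORD_RE.finditer(href + " " + text)}
        if found:
            hits.append((href, sorted(found)))
    return hits


# Hypothetical page: one legitimate link, one hidden gambling link,
# and one benign link whose text merely contains "slot" inside a word.
SAMPLE_HTML = """
<html><body>
<a href="/layanan">Layanan Publik</a>
<div style="display:none"><a href="https://example.com/slot-gacor">Situs Judi Online</a></div>
<a href="/jadwal">Timeslot pendaftaran</a>
</body></html>
"""

if __name__ == "__main__":
    for href, words in find_suspicious_links(SAMPLE_HTML):
        print(href, words)
```

Because the parser reads the raw markup rather than the rendered page, a link placed inside an element styled with `display:none` is reported like any other. A production scanner would also need page fetching, error handling, and manual verification of hits, none of which are shown here.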

Published
2024-03-08
How to Cite
Nurseno, M., Aditiawarman, U., Al Qodri Maarif, H., & Mantoro, T. (2024). Detecting Hidden Illegal Online Gambling on .go.id Domains Using Web Scraping Algorithms. MATRIK : Jurnal Manajemen, Teknik Informatika Dan Rekayasa Komputer, 23(2), 365-378. https://doi.org/10.30812/matrik.v23i2.3824
Section
Articles
