Leveraging Vector Quantized Variational Autoencoder for Accurate Synthetic Data Generation in Multivariate Time Series
DOI:
https://doi.org/10.30812/matrik.v24i3.4514
Keywords:
Financial Market, Multivariate Time Series, Synthetic Data Generation, Variational Autoencoder, Vector Quantized
Abstract
This study addresses the challenge of generating high-quality synthetic financial time series data, a
critical issue in financial forecasting due to limited access to complete and reliable historical datasets.
The aim of this research was to compare the performance of the standard Variational Autoencoder and
the Vector Quantized Variational Autoencoder (VQ-VAE) in generating synthetic multivariate time series
data using the Adaro Energy Indonesia stock dataset. The VQ-VAE incorporates a discrete latent
space to improve the structure and control of the data generation process, whereas the standard VAE
utilizes a continuous latent space. Both models were implemented and then evaluated
quantitatively using statistical metrics, including mean absolute error
(MAE), mean squared error (MSE), root mean squared error (RMSE), and R² score. The results
showed that the VQ-VAE outperformed the standard VAE in replicating the statistical characteristics
of stock prices, as shown by lower error values and higher R² scores across all tested features. The discrete
latent space of the VQ-VAE led to the generation of more structured and statistically consistent
synthetic data. These findings suggest that the VQ-VAE model is highly suitable
for financial forecasting applications and point to possible future enhancements through
integration with attention mechanisms or generative adversarial networks in hybrid models.
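
The two technical ideas named in the abstract, the nearest-codebook lookup behind a discrete latent space and the four evaluation metrics, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `quantize` and `regression_metrics`, the toy codebook, and the toy price values are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math


def quantize(z, codebook):
    """Return (index, vector) of the codebook entry nearest to z.

    Uses squared Euclidean distance, the usual choice for VQ-VAE lookups.
    The codebook here is a plain list of lists (an illustrative assumption).
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
    return index, codebook[index]


def regression_metrics(real, synthetic):
    """Compute MAE, MSE, RMSE and R^2 between real and synthetic values."""
    if len(real) != len(synthetic) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(real)
    errors = [r - s for r, s in zip(real, synthetic)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_real = sum(real) / n
    ss_tot = sum((r - mean_real) ** 2 for r in real)  # total sum of squares
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


if __name__ == "__main__":
    # Toy values only; these are not taken from the paper's dataset.
    toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(quantize([0.9, 0.1], toy_codebook))
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

In a comparison like the one described, metrics of this kind would typically be computed once per feature (for example, per price column), with lower MAE, MSE, and RMSE and a higher R² indicating synthetic values that track the real series more closely.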
License
Copyright (c) 2025 Mohammad Diqi, Ema Utami, Kusrini Kusrini, Ferry Wahyu Wibowo

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.











