Adaptive Gated Networks for Intelligent Diagnosis in Complex Systems
International Journal of Computer Science Engineering Techniques
ISSN: 2455-135X
Volume 10, Issue 2 (IJCSE-V10I2P20)
Author: Harsh Hemantkumar Patel
Abstract
This research presents a novel gated neural architecture tailored for high-efficiency feature extraction and intelligent decision processes in complex prediction tasks. By dynamically prioritizing salient inputs, the model strengthens both learning precision and system resilience. Comprehensive evaluations across multiple benchmark datasets reveal that the framework consistently outperforms conventional deep learning models. Results underscore its capability to capture intricate data relationships and adaptive patterns. Additionally, the study highlights the model's scalability and its potential integration into real-world intelligent systems, offering a robust pathway toward advanced automation solutions.
Conclusion
This study has presented several novel modifications to the original TabTransformer architecture [6], aimed at enhancing binary classification performance across three distinct datasets. Our proposed adjustments yielded consistent improvements exceeding 1% in the area under the receiver operating characteristic curve (AUC-ROC), demonstrating their effectiveness in practical scenarios. A key innovation involves replacing the original final multilayer perceptron (MLP) block of the TabTransformer with a linear projection mechanism inspired by gated multilayer perceptrons (gMLP) [7], which refines the way the final logits are produced.
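The gMLP-inspired projection head can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the split-and-gate scheme, the sigmoid gate, and all layer sizes are assumptions chosen for clarity.

```python
import numpy as np

def gated_projection_head(h, W_gate, b_gate, W_out, b_out):
    """Hypothetical gMLP-style head: split the embedding in two halves,
    use one half to gate the other, then project linearly to logits."""
    u, v = np.split(h, 2, axis=-1)                        # two halves of the contextual embedding
    gate = 1.0 / (1.0 + np.exp(-(v @ W_gate + b_gate)))   # sigmoid gate computed from one half
    gated = u * gate                                      # element-wise gating of the other half
    return gated @ W_out + b_out                          # linear projection instead of a final MLP

rng = np.random.default_rng(0)
d = 8                                                     # assumed half-width of the embedding
h = rng.normal(size=(4, 2 * d))                           # batch of 4 contextual embeddings
W_gate, b_gate = rng.normal(size=(d, d)), np.zeros(d)
W_out, b_out = rng.normal(size=(d, 1)), np.zeros(1)
logits = gated_projection_head(h, W_gate, b_gate, W_out, b_out)
print(logits.shape)  # (4, 1): one binary-classification logit per row
```

Replacing the final MLP with such a gated linear projection keeps the head cheap while still letting the network modulate which features reach the logit.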
Furthermore, we conducted an extensive hyperparameter tuning process, systematically experimenting with various activation functions, learning rates, hidden layer sizes, and overall layer configurations. These explorations revealed crucial insights about how architectural choices and training parameters influence the model's predictive capabilities on tabular data. Our findings underscore the importance of carefully adapting model components and training strategies to optimize tabular data modeling tasks.
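A search of this kind can be sketched as a simple grid sweep. The grid values and the placeholder scoring function below are invented for illustration; the actual search space and scores are not reproduced here.

```python
from itertools import product

# Hypothetical search space over the dimensions named above.
activations = ["relu", "gelu", "selu"]
learning_rates = [1e-4, 3e-4, 1e-3]
hidden_sizes = [64, 128, 256]

def evaluate(activation, lr, hidden):
    """Placeholder standing in for a training run that returns validation AUC-ROC.
    Deterministic toy score so the sweep below is reproducible."""
    score = 0.5
    score += 0.05 if activation == "gelu" else 0.0
    score += 0.03 if lr == 3e-4 else 0.0
    score += hidden / 10000
    return score

configs = list(product(activations, learning_rates, hidden_sizes))
best = max(configs, key=lambda cfg: evaluate(*cfg))
print(len(configs), best)  # 27 configurations; best is ('gelu', 0.0003, 256)
```

In practice a distributed tuner such as Tune [16] replaces the exhaustive loop, but the structure of the search is the same: enumerate configurations, score each on a validation metric, and keep the best.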
To facilitate reproducibility and encourage further research, we have made our implementation publicly available as open-source software. Additionally, we have demonstrated the practical applicability of our enhanced TabTransformer in real-world use cases, reinforcing its potential value for practitioners working with tabular datasets. Overall, this work contributes meaningful advancements in the field of tabular data representation learning, and we anticipate that future investigations can build upon our approach to develop even more robust and accurate models.
References
[1] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785-794.
[2] J. Fiedler, "Simple modifications to improve tabular neural networks," arXiv preprint arXiv:2108.03214, 2021.
[3] A. Abutbul, G. Elidan, L. Katzir, and R. El-Yaniv, "DNF-Net: A neural architecture for tabular data," arXiv preprint arXiv:2006.06465, 2020.
[4] S. Arik and T. Pfister, "TabNet: Attentive interpretable tabular learning," arXiv preprint arXiv:1908.07442, 2019.
[5] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Advances in Neural Information Processing Systems, 2017, pp. 5998-6008.
[6] X. Huang, A. Khetan, M. W. Cvitkovic, and Z. S. Karnin, "TabTransformer: Tabular data modeling using contextual embeddings," arXiv preprint arXiv:2012.06678, 2020.
[7] H. Liu, Z. Dai, D. R. So, and Q. V. Le, "Pay attention to MLPs," arXiv preprint arXiv:2105.08050, 2021.
[8] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "MobileNetV2: Inverted residuals and linear bottlenecks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4510-4520.
[9] A. de Brebisson and G. Montana, "Deep neural networks for anatomical brain segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015, pp. 20-28.
[10] Z. Li, W. Cheng, Y. Chen, H. Chen, and W. Wang, "Interpretable click-through rate prediction through hierarchical attention," in Proceedings of the 13th International Conference on Web Search and Data Mining, 2020, pp. 313-321.
[11] A. Kadra, M. Lindauer, F. Hutter, and J. Grabocka, "Well-tuned simple nets excel on tabular datasets," in Thirty-Fifth Conference on Neural Information Processing Systems, 2021.
[12] A. Paszke et al., "PyTorch: An imperative style, high-performance deep learning library," in Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 2019, pp. 8024-8035.
[13] NVIDIA, P. Vingelmann, and F. H. Fitzek, "CUDA, release: 10.2.89," 2020. [Online]. Available: https://developer.nvidia.com/cuda-toolkit
[14] W. McKinney, "Data structures for statistical computing in Python," in Proceedings of the 9th Python in Science Conference, Stéfan van der Walt and Jarrod Millman, Eds., 2010, pp. 56-61.
[15] M. L. Waskom, "seaborn: statistical data visualization," Journal of Open Source Software, vol. 6, no. 60, p. 3021, 2021. [Online]. Available: https://doi.org/10.21105/joss.03021
[16] R. Liaw, E. Liang, R. Nishihara, P. Moritz, J. E. Gonzalez, and I. Stoica, "Tune: A research platform for distributed model selection and training," arXiv preprint arXiv:1807.05118, 2018.
[17] A. P. Bradley, "The use of the area under the ROC curve in the evaluation of machine learning algorithms," Pattern Recognition, vol. 30, no. 7, pp. 1145-1159, 1997.
[18] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, "Scikit-learn: Machine learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.
[19] J. D. Hunter, "Matplotlib: A 2D graphics environment," Computing in Science & Engineering, vol. 9, no. 3, pp. 90-95, 2007.
[20] T. Elsken, J. H. Metzen, and F. Hutter, "Neural architecture search: A survey," Journal of Machine Learning Research, vol. 20, no. 1, pp. 1997-2017, 2019.
[21] D. So, Q. Le, and C. Liang, "The evolved transformer," in International Conference on Machine Learning. PMLR, 2019, pp. 5877-5886.
[22] P. Yin, G. Neubig, W.-t. Yih, and S. Riedel, "TaBERT: Pretraining for joint understanding of textual and tabular data," arXiv preprint arXiv:2005.08314, 2020.
[23] I. Padhi, Y. Schiff, I. Melnyk, M. Rigotti, Y. Mroueh, P. Dognin, J. Ross, R. Nair, and E. Altman, "Tabular transformers for modeling multivariate time series," in ICASSP 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 3565-3569.
