REVIEW OF THE RESEARCH ON NOTE RECOGNITION FROM AUDIO DATA USING NEURAL NETWORKS

DOI: 10.31673/2786-8362.2025.018603

Authors

  • Я. В. Бай (Bai Y. V.), State University of Information and Communication Technologies, Kyiv
  • Ю. І. Катков (Katkov Yu. I.), State University of Information and Communication Technologies, Kyiv

Abstract

This article reviews modern neural network approaches to automatic musical note recognition
from audio recordings. The study examines several methods: the Fast Fourier Transform (FFT) as a
traditional signal processing technique, Convolutional Neural Networks (CNN), and Residual
Shuffle-Exchange Networks (RSE). Each approach is analyzed in terms of accuracy, adaptability, and
real-time performance. The paper highlights
the strengths of deep learning techniques in handling challenges such as background noise, polyphonic
textures, and variability in note articulation across different instruments. Experimental results show that
CNNs and RSE-based models significantly outperform traditional signal processing methods, achieving
precision rates exceeding 80%. Special attention is given to data preprocessing, feature extraction, and the
design of deep architectures suited for musical tasks. The research also emphasizes the importance of
diverse datasets, including both real and synthetic recordings, for improving generalization. The findings
indicate the strong potential of neural networks for real-time, highly accurate note recognition in
applications such as music transcription, live performance analysis, and music education.
Keywords: Note recognition, audio analysis, neural networks, convolutional neural network, RSE
network, music transcription, deep learning, sound processing, machine learning, artificial intelligence
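
As an illustration of the traditional FFT-based approach that the abstract contrasts with CNN and RSE models, the following is a minimal sketch rather than the authors' implementation. It assumes a clean, monophonic signal whose length is a power of two; the function names (fft, dominant_frequency, frequency_to_note, sine_tone) and the 8000 Hz sample rate are hypothetical choices made for this example only. It takes the strongest spectral peak and maps it to the nearest equal-tempered note, which is exactly the step that tends to break down under noise and polyphony.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Estimate the frequency (Hz) of the strongest non-DC spectral peak."""
    n = len(samples)
    # A Hann window reduces spectral leakage from the finite frame.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    mags = [abs(c) for c in fft(windowed)[: n // 2]]
    peak = max(range(1, n // 2 - 1), key=lambda k: mags[k])
    # Parabolic interpolation around the peak refines the bin position.
    a, b, c = mags[peak - 1], mags[peak], mags[peak + 1]
    denom = a - 2 * b + c
    offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return (peak + offset) * sample_rate / n


def frequency_to_note(freq_hz):
    """Map a frequency to the nearest equal-tempered note name (A4 = 440 Hz)."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def sine_tone(freq_hz, sample_rate, n_samples):
    """Synthesize a pure sine tone, used here only as test input."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]
```

For example, frequency_to_note(dominant_frequency(sine_tone(440.0, 8000, 4096), 8000)) maps a synthetic 440 Hz tone to A4. Real recordings with overtones stronger than the fundamental, background noise, or several simultaneous notes would need more than a single-peak estimate, which is the gap the learned models discussed in the article aim to close.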

Published

2025-06-21

Section

Articles