Understanding CNNs for Musical Instrument Recognition
Author Information
Author(s): Chen Rujia, Ghobakhlou Akbar, Narayanan Ajit
Primary Institution: Auckland University of Technology
Research Question
How do different spectrogram representations impact the decision-making process of convolutional neural networks in musical instrument recognition?
Conclusion
The study found that MFCC and Log-Mel spectrograms generally yield the best recognition performance, while other spectrogram types capture complementary features that offer additional insight into the model's decisions.
Supporting Evidence
- MFCC and Log-Mel spectrograms showed superior performance across most instruments.
- Heatmap analysis provided insights into the model's focus during classification.
- Different spectrogram types captured unique features essential for instrument recognition.
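The paper does not specify which heatmap technique was used, but the idea behind such analyses can be sketched with occlusion sensitivity: mask out patches of the input spectrogram and measure how much the model's score drops. The patch size, the toy scoring function, and the `occlusion_heatmap` helper below are illustrative assumptions, not the authors' method.

```python
import numpy as np

def occlusion_heatmap(spec, score_fn, patch=4):
    """Zero out square patches of a spectrogram and record how much the
    model's score drops for each one. Regions whose removal hurts the
    score most are the ones the model relies on (assumed sketch)."""
    base = score_fn(spec)
    heat = np.zeros_like(spec)
    for i in range(0, spec.shape[0], patch):
        for j in range(0, spec.shape[1], patch):
            masked = spec.copy()
            masked[i:i + patch, j:j + patch] = 0.0  # occlude one patch
            heat[i:i + patch, j:j + patch] = base - score_fn(masked)
    return heat

# Toy "model": the score is simply the energy in one frequency band,
# so the heatmap should light up only inside that band.
spec = np.random.rand(16, 16)
heat = occlusion_heatmap(spec, lambda s: s[:, 4:8].sum())
```

With the toy scorer, the heatmap is positive exactly over the frequency band the "model" attends to, which is the kind of localization the study's heatmap analysis aims to surface for real CNN classifiers.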
Takeaway
This study examines how computers can recognize musical instruments by analyzing different kinds of sound images, called spectrograms, to determine which representations work best.
Methodology
The study used convolutional neural networks to analyze various spectrogram representations and employed heatmap analysis to assess feature importance.
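To make the spectrogram step concrete, here is a minimal NumPy sketch of one such time-frequency representation: frame the signal, window each frame, take the FFT, and log-compress the magnitudes. The function name, frame sizes, and the plain log-magnitude output are illustrative assumptions; the study's actual Log-Mel and MFCC features add mel filterbanks (and, for MFCC, a DCT) on top of this, typically via a library such as librosa.

```python
import numpy as np

def log_spectrogram(signal, n_fft=512, hop=256):
    """Frame, Hann-window, FFT, and log-compress a 1-D signal.
    Returns a (time, frequency) array a CNN could take as input."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    mags = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrum per frame
    return np.log1p(mags)                       # log compression

# Example: one second of a 440 Hz tone at a 16 kHz sample rate
sr = 16000
t = np.arange(sr) / sr
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The resulting 2-D array is image-like, which is what lets standard convolutional architectures be applied directly to audio.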
Limitations
The study used a small subsample of the NSynth dataset and relied on synthesized audio, which may not fully represent real-world audio characteristics.