Interpreting CNN models for musical instrument recognition using multi-spectrogram heatmap analysis: a preliminary study
2024

Understanding CNNs for Musical Instrument Recognition

Sample size: 10 · Publication evidence: moderate

Author Information

Author(s): Chen Rujia, Ghobakhlou Akbar, Narayanan Ajit

Primary Institution: Auckland University of Technology

Research Question

How do different spectrogram representations impact the decision-making process of convolutional neural networks in musical instrument recognition?

Conclusion

The study found that MFCC and Log-Mel spectrograms generally yield the best recognition accuracy, while other spectrogram types capture complementary features that offer additional insight into the model's decisions.

Supporting Evidence

  • MFCC and Log-Mel spectrograms showed superior performance across most instruments.
  • Heatmap analysis provided insights into the model's focus during classification.
  • Different spectrogram types captured unique features essential for instrument recognition.

Takeaway

This study looks at how computers can recognize musical instruments by analyzing different types of sound pictures, called spectrograms, to see which ones work best.

Methodology

The study trained convolutional neural networks on several spectrogram representations of the same audio and used heatmap analysis to identify which time-frequency regions the models rely on when classifying instruments.
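To make the input representations concrete, here is a minimal sketch of how two of the spectrogram types the study compares, Log-Mel and MFCC, can be computed from raw audio. This is an illustration using numpy/scipy with assumed parameters (1024-sample FFT, 40 mel bands, 13 coefficients), not the authors' actual pipeline; in practice a library such as librosa is typically used.

```python
import numpy as np
from scipy.signal import stft
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels=40):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, center, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, center):
            fb[i - 1, k] = (k - lo) / max(center - lo, 1)
        for k in range(center, hi):
            fb[i - 1, k] = (hi - k) / max(hi - center, 1)
    return fb

def log_mel_and_mfcc(y, sr, n_fft=1024, n_mels=40, n_mfcc=13):
    # Power spectrogram via the short-time Fourier transform.
    _, _, Z = stft(y, fs=sr, nperseg=n_fft)
    power = np.abs(Z) ** 2
    # Project onto the mel filterbank, then take the log (in dB).
    mel = mel_filterbank(sr, n_fft, n_mels) @ power
    log_mel = 10.0 * np.log10(mel + 1e-10)
    # MFCCs: discrete cosine transform of log-mel along the frequency axis.
    mfcc = dct(log_mel, type=2, axis=0, norm="ortho")[:n_mfcc]
    return log_mel, mfcc

# Example: one second of a 440 Hz sine wave as a stand-in for a note.
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440.0 * t)
log_mel, mfcc = log_mel_and_mfcc(y, sr)
print(log_mel.shape, mfcc.shape)
```

Either matrix can then be treated as a single-channel image and fed to a CNN, which is the basic setup the study's heatmap analysis interrogates.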

Limitations

The study used a small subsample of the NSynth dataset and relied on synthesized audio, which may not fully represent real-world audio characteristics.

Digital Object Identifier (DOI)

10.3389/frai.2024.1499913
