2024. 5. 20. 12:29ㆍAudio Signal Processing for ML
As we have seen before, audio features can be roughly divided into time domain features and frequency domain features.
Let's take a deeper look at several types of time domain features and how to calculate them in Python.
Amplitude Envelope
The envelope of a sound wave is a curve outlining its extremes.
Thus, it gives us a rough idea of loudness.
Since it uses the maximum value (amplitude) of each frame, it is sensitive to outliers: a single spike sets the value for the whole frame.
It is used in tasks such as onset detection and music genre classification.
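The frame-wise maximum described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the function name `amplitude_envelope`, the parameters `frame_size` and `hop_length`, and the choice to take the absolute value of each sample are all assumptions made for this example.

```python
def amplitude_envelope(signal, frame_size, hop_length):
    """Return the largest absolute amplitude found in each frame.

    Assumption: frames start every `hop_length` samples, and the last
    frames may be shorter than `frame_size`.
    """
    envelope = []
    for start in range(0, len(signal), hop_length):
        frame = signal[start:start + frame_size]
        # One value per frame: the peak, regardless of sign.
        envelope.append(max(abs(sample) for sample in frame))
    return envelope


# Toy signal (made-up values), 8 samples long.
signal = [0.1, -0.4, 0.2, 0.9, -0.3, 0.05, -0.6, 0.2]
print(amplitude_envelope(signal, frame_size=4, hop_length=2))
# → [0.9, 0.9, 0.6, 0.6]
```

Note how the single sample 0.9 determines the value of both frames that contain it, which is exactly the outlier sensitivity mentioned above. On real audio the same loop would normally be written with NumPy arrays for speed.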
RMSE (Root Mean Square Energy)
While the amplitude envelope is sensitive to outliers, RMSE is much more stable because it is computed from the energy of every sample in the frame, not just the largest one.
It is used in tasks such as audio segmentation and music genre classification.
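As a rough sketch, RMSE for a frame is the square root of the mean of the squared samples in that frame. The code below is an illustration under assumptions: the function name `rms_energy` and the parameters `frame_size` and `hop_length` are hypothetical, and the two toy frames are made-up values chosen to show the effect of one spike.

```python
import math


def rms_energy(signal, frame_size, hop_length):
    """Return the root mean square of the samples in each frame.

    Assumption: frames start every `hop_length` samples, and a short
    final frame is averaged over its actual length.
    """
    rms = []
    for start in range(0, len(signal), hop_length):
        frame = signal[start:start + frame_size]
        mean_square = sum(sample * sample for sample in frame) / len(frame)
        rms.append(math.sqrt(mean_square))
    return rms


# Two toy frames that differ by a single spike (made-up values).
clean = [0.1, -0.1, 0.1, -0.1]
spiky = [0.1, -0.1, 1.0, -0.1]

for frame in (clean, spiky):
    peak = max(abs(sample) for sample in frame)
    rms = rms_energy(frame, frame_size=4, hop_length=4)[0]
    print(peak, round(rms, 3))
# → 0.1 0.1
# → 1.0 0.507
```

The spike raises the frame maximum tenfold but RMSE only about fivefold, since every sample contributes to the average. If librosa is available, `librosa.feature.rms` computes the same kind of frame-wise feature on real audio.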
Zero Crossing Rate
The zero crossing rate indicates the number of times that a signal crosses the horizontal axis, that is, changes sign, in a given time interval.
It is commonly used to distinguish percussive sounds from pitched sounds, and for monophonic pitch estimation.
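Counting sign changes between consecutive samples gives a simple way to compute this. The sketch below is one possible version, not a definitive one: the function name `zero_crossings_per_frame` and its parameters are hypothetical, a sample equal to zero is treated as non-negative (an assumption), and crossings that fall between two frames are not counted because each frame is analyzed on its own.

```python
def zero_crossings_per_frame(signal, frame_size, hop_length):
    """Return the number of sign changes inside each frame.

    Assumption: a sample of exactly 0 counts as non-negative.
    """
    counts = []
    for start in range(0, len(signal), hop_length):
        frame = signal[start:start + frame_size]
        crossings = 0
        # Compare every sample with the one that follows it.
        for previous, current in zip(frame, frame[1:]):
            if (previous >= 0) != (current >= 0):
                crossings += 1
        counts.append(crossings)
    return counts


# Toy signal (made-up values): an oscillating half, then a flat half.
signal = [1, -1, 1, -1, 1, 1, 1, 1]
print(zero_crossings_per_frame(signal, frame_size=4, hop_length=4))
# → [3, 0]
```

Dividing each count by the frame length turns it into a rate that can be compared across frame sizes. Noisy or percussive frames tend to give high values and steady pitched frames lower ones. If librosa is available, `librosa.feature.zero_crossing_rate` provides a frame-wise version for real audio.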