Time Domain features

2024. 5. 20. 12:29Audio Signal Processing for ML

As we have seen before, audio features can be roughly divided to time domain features and frequency domain features.

Let's take a deeper look at several types of time domain features and the ways to calculate them using python codes.

 

Amplitude Envelope

The envelope of a soundwave is a curve outlining it's extremes.

Thus, it gives us rough ideas of loudness.

Since it uses the maximum value(amplitude) of a given frame, it is sensitive to outliers.

This concept can be utilized to dynamic fields such as onset detection, music genre classification.

 

https://en.wikipedia.org/wiki/Envelope_(waves)

 

Google Colab Notebook

Run, share, and edit Python notebooks

colab.research.google.com

 

RMSE(Root Mean Square Energy)

While amplitude envelope is sensitive to outliers, RMSE is much more stable because it uses overall energy during computation.

It can be used in fields such as audio segmentation, music genre classification.

 

Zero Crossing Rate

The zero crossing rate indicates the number of times that a signal crosses the horizontal axis in a given time interval.

It is usually tapped into in recognition of percussive and pitched sounds, or monophonic pitch estimation.

 

Google Colab Notebook

Run, share, and edit Python notebooks

colab.research.google.com