Audio Processing and Normalization

Zetta® can process audio in several ways at different points of play. It can analyze audio, add transition markers, and normalize audio on a station-by-station basis, giving the user more control over how the same piece of audio is played on air on different stations.

In this Topic:

Audio Processing

Audio Normalization

How Normalization works in Zetta

See also the Audio Utility section of the guide.

Audio Processing

The processing of audio starts with the settings defined in the Audio Processing tab of the System Configuration menu. Once the user has chosen which analysis method to use, audio added to Zetta is processed using the selected method.

The Method drop-down lets the user select how audio that is imported, added, captured or recorded is processed. It offers the following options:

No Analysis – The No Analysis option will not make any changes to processed audio. Normalization values are not calculated. If No Analysis is selected, Normalization and Marks Analysis can be done later using the Audio Utility found in the Library toolbar.

Normalization and Waveform only – The Normalization and Waveform only option will normalize the audio and create the waveform. It will not locate any markers such as Trim In and Trim Out.

Marks Analysis – Marks Analysis is the default and recommended method. It analyzes the audio and sets the marks based on the following Marks Analysis Settings:

Trim-in threshold – Trim-in is the opening marker that defines where “silence” ends and actual audio begins. (Default -50 dB)
Trim-out threshold – Trim-out is the closing marker that defines where audio becomes “silence”. (Default -50 dB)
Segue threshold – Segue is the transition point in the audio where one event is mixed with the next event. (Default -18 dB)
Determine intros – Zetta will determine where the accents in the introduction of each audio event are and mark up to three Intros.
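
As a rough illustration of how threshold-based marks could be located, the Python sketch below compares sample magnitudes against the default thresholds listed above. It is not Zetta's implementation: the function names (to_db, find_marks), the per-sample comparison (a real analyzer would more likely measure level over short windows), and the rule that places the segue at the last sample at or above the segue threshold are all assumptions made for illustration.

```python
import math


def to_db(sample: float) -> float:
    """Convert a linear sample amplitude (full scale = 1.0) to dBFS."""
    magnitude = abs(sample)
    return -math.inf if magnitude == 0.0 else 20.0 * math.log10(magnitude)


def find_marks(samples, trim_db=-50.0, segue_db=-18.0):
    """Return (trim_in, trim_out, segue) as sample indices.

    trim_in  : first sample at or above trim_db
    trim_out : last sample at or above trim_db
    segue    : last sample at or above segue_db (assumed rule, see lead-in)
    Returns None when every sample is below trim_db.
    """
    audible = [i for i, s in enumerate(samples) if to_db(s) >= trim_db]
    if not audible:
        return None
    trim_in, trim_out = audible[0], audible[-1]
    above_segue = [
        i for i in range(trim_in, trim_out + 1) if to_db(samples[i]) >= segue_db
    ]
    # Fall back to the trim-out point if nothing reaches the segue threshold.
    segue = above_segue[-1] if above_segue else trim_out
    return trim_in, trim_out, segue


# Tiny hypothetical signal: silence, a quiet lead-in, loud body, fade-out.
example = [0.0, 0.001, 0.5, 0.8, 0.3, 0.05, 0.002, 0.0]
marks = find_marks(example)  # (2, 5, 4)
```

In this example the samples at -60 dB and about -54 dB fall below the -50 dB trim threshold and are treated as silence, while the fade-out sample at about -26 dB is kept but lies after the segue point.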

Voice Track audio is always analyzed using the peak analysis method. The trim is set based on the trim threshold settings set in the Voice Track Settings section of the System | Audio Processing or Station | Audio Processing tab.

In addition to the processing of newly added audio, existing audio can be processed using the Audio Utility in the Library Module Toolbar. The Audio Utility will analyze chosen audio based on values set in the Marks Analysis Settings section of the Configuration | System | Audio Processing tab.

The following flow chart shows the path an audio file can take through audio processing in Zetta®.

Audio Normalization

Audio normalization in Zetta is non-destructive: all normalization is done within Zetta and the actual audio file is left untouched. The following normalization algorithms are used by Zetta and Zetta2GO:

Absolute Peak – (As used in AFC 4.0) Absolute peak normalization checks every sample in the audio file and calculates the largest audio peak value.

Root Mean Square – (As used in Selector/MC 15) Root Mean Square calculates the level of the audio as the square root of the mean (average) of the squares of the sample values.

Filtered Peak – (As used in AFC 3.0) Filtered peak normalization analyzes all the samples within an audio file and calculates the peak level while ignoring the highest 5% of samples.

Average Peak – Average peak normalization, which is new with Zetta, analyzes all the samples within an audio file and calculates the median (middle value) of the top 5% of the peaks.

Loudness (EBU R128) ITU-R BS.1770rev0 – Loudness normalization balances audio according to its perceived loudness. EBU R128 recommends a target loudness level of -23 LUFS. The ITU-R BS.1770 standards provide an algorithm for quantifying the loudness of a given audio asset in loudness units (LU or LUFS).

Loudness (Gated) ITU-R BS.1770rev3 – Loudness (Gated) ITU-R BS.1770rev3 measures loudness over gated blocks of audio, so that blocks falling below the gate do not pull down the measurement. This is the default setting and is recommended for broadcast audio.
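
The four sample-based measures above can be sketched in a few lines of Python. This is an illustrative reading of the descriptions, not Zetta's code: the function names are hypothetical, samples are assumed to be a non-empty sequence of linear amplitudes with full scale at 1.0, and "peaks" is interpreted simply as sample magnitudes. The two loudness algorithms are omitted because ITU-R BS.1770 also requires a K-weighting filter and block gating, which do not fit in a short example.

```python
import math
import statistics


def absolute_peak(samples):
    """Largest sample magnitude in the file."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Square root of the mean of the squared sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def filtered_peak(samples, fraction=0.05):
    """Peak magnitude after discarding the highest `fraction` of samples."""
    ordered = sorted(abs(s) for s in samples)
    discard = int(len(ordered) * fraction)
    kept = ordered[: len(ordered) - discard]
    return kept[-1]


def average_peak(samples, fraction=0.05):
    """Median of the highest `fraction` of sample magnitudes."""
    ordered = sorted(abs(s) for s in samples)
    count = max(1, int(len(ordered) * fraction))
    return statistics.median(ordered[-count:])


# Hypothetical test signal: 100 samples ramping from 0.01 to 1.00.
ramp = [i / 100 for i in range(1, 101)]
peak = absolute_peak(ramp)        # 1.0
filtered = filtered_peak(ramp)    # 0.95 (the top five samples are ignored)
average = average_peak(ramp)      # 0.98 (median of 0.96 .. 1.00)
level = rms(ramp)                 # roughly 0.58
```

On the same material the four measures differ noticeably, which is why the measured value and the target value must always refer to the same algorithm.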

These newer, smarter normalization methods became necessary in response to a loudness race, in which audio content providers master audio (music, commercials, etc.) to be increasingly loud. The phenomenon is known as the Loudness War. It can be experienced first-hand: take a CD released in the early 1980s (or earlier) and play a song from it back-to-back with a recently released song. The "younger" material is noticeably louder. That is the result of heavy dynamic compression followed by normalization of the newer song, a practice embraced by the music industry since the mid-1990s or so. The difference can also be seen by comparing the waveforms of the two songs: the older song shows a waveform with clear ups and downs, while the new material will most likely look like a solid rectangle with hardly any visible ups and downs until the user zooms in.

The new normalization methods deal with this issue because the algorithm calculates perceived loudness (rather than a peak-based level), i.e. it factors in the way the human ear and brain perceive the audio, regardless of how dynamically compressed the material actually is. In other words, the algorithm adjusts the volume in much the same way it would be adjusted by hand if the two (old and new) songs were played back to back. Similar technology has been around for years in consumer electronics (commonly called ReplayGain), and EBU R128 loudness is the equivalent for the broadcast industry. The results are impressively good.

How Normalization works in Zetta

Normalization is calculated for every audio file that is added to Zetta, whether through importing, auto load, recording or drag and drop. Because normalization is non-destructive, Zetta calculates the normalization value for each of the supported algorithms and saves it to the database. In Zetta, you can set a station normalization type and value for each type of audio. Once set, all audio of that type on that station is played at that normalization value. Changing the value is a simple configuration change that can be made from any Zetta workstation; it takes effect right away, with no restarts and no changes to the audio files.
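
The idea of storing measurements once and applying a station-specific target at playback can be sketched as follows. Everything here is hypothetical and only illustrates the principle: the StoredAnalysis layout, the field names, the station settings, and the assumption that levels are stored in dB-style units are not taken from Zetta's database or configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredAnalysis:
    """Levels measured once at import and saved per asset (hypothetical layout)."""
    absolute_peak_db: float
    rms_db: float
    loudness_lufs: float


def playback_gain_db(measured_level: float, target_level: float) -> float:
    """Gain in dB that moves the measured level onto the station's target."""
    return target_level - measured_level


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to the linear factor applied to samples at playback."""
    return 10.0 ** (gain_db / 20.0)


def gain_for(asset: StoredAnalysis, setting) -> float:
    """Look up the stored value for the station's algorithm and compute the gain."""
    field, target = setting
    return playback_gain_db(getattr(asset, field), target)


# One asset, analyzed once; the audio file itself is never rewritten.
asset = StoredAnalysis(absolute_peak_db=-0.5, rms_db=-12.0, loudness_lufs=-9.0)

# Two hypothetical stations sharing the asset with different settings.
station_a = ("loudness_lufs", -23.0)
station_b = ("absolute_peak_db", -3.0)

gain_a = gain_for(asset, station_a)  # -14.0 dB
gain_b = gain_for(asset, station_b)  # -2.5 dB
factor_a = db_to_linear(gain_a)      # roughly 0.2
```

Because only the target in the station setting changes, a new value can take effect on the next play without re-analyzing or touching the file.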

These values can be displayed in the Library Module by adding the desired columns to the Library Module Layout. See the section Changing the Library Module Layout for more information on adding and removing columns. The Calculated Gain column can also be displayed to show the percentage of gain for each piece of audio.

Important Note!

A Full Analysis, using the Audio Utility, must be performed for assets that were imported into Zetta prior to version 2.9 in order to calculate the values for the Loudness (EBU R128) ITU-R BS.1770rev0 and Loudness (Gated) ITU-R BS.1770rev3 algorithms.

On-air playback of audio for each station is based on the normalization chosen for each media asset type in the Configuration | Station | Audio Processing tab.
