Mention audio processing to most iOS developers, and they’ll give you a look of fear and trepidation. That’s because, prior to iOS 8, it meant diving into the depths of the low-level Core Audio framework, a trip only a few brave souls dared to make. Thankfully, that all changed in 2014 with the release of iOS 8 and AVAudioEngine. This AVAudioEngine tutorial will show you how to use Apple’s higher-level audio toolkit to make audio processing apps without needing to dive into Core Audio.
That’s right! No longer do you need to search through obscure pointer-based C/C++ structs and memory buffers to gather your raw audio data.
In this AVAudioEngine tutorial, you’ll use AVAudioEngine to build the next great podcasting app: Raycast. More specifically, you’ll add the audio functionality controlled by the UI: play/pause button, skip forward/back buttons, progress bar and playback rate selector. When you’re done, you’ll have a fantastic app for listening to Dru and Janie.
Getting Started
To get started, download the materials for this tutorial (you can find a link at the top or bottom of this tutorial). Build and run your project in Xcode, and you’ll see the basic UI.
The controls don’t do anything yet, but they’re all connected to IBOutlets and associated IBActions in the view controllers.
iOS Audio Framework Introduction
Before jumping into the project, here’s a quick overview of the iOS Audio frameworks:
- CoreAudio and AudioToolbox are the low-level C frameworks.
- AVFoundation is an Objective-C/Swift framework.
- AVAudioEngine is a part of AVFoundation.
AVAudioEngine is a class that defines a group of connected audio nodes. You’ll be adding two nodes to the project: AVAudioPlayerNode and AVAudioUnitTimePitch.
Setup Audio
Open ViewController.swift and take a look inside. At the top, you’ll see all of the connected outlets and class variables. The actions are also connected to the appropriate outlets in the storyboard.
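The starter project already declares the outlets and audio-related properties that the rest of this tutorial refers to. Roughly, and with the exact details assumed rather than copied from the project, they look something like this:

// Sketch of the properties the rest of this tutorial assumes
// (outlets omitted; the exact declarations in the project may differ).
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let rateEffect = AVAudioUnitTimePitch()

var audioFormat: AVAudioFormat?
var needsFileScheduled = true
var skipFrame: AVAudioFramePosition = 0
var currentPosition: AVAudioFramePosition = 0
var audioLengthSamples: AVAudioFramePosition = 0
var audioSampleRate: Float = 0
var audioLengthSeconds: Float = 0
let minDb: Float = -80.0
var updater: CADisplayLink?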
Add the following code to setupAudio():
// 1
audioFileURL = Bundle.main.url(forResource: "Intro", withExtension: "mp4")
// 2
engine.attach(player)
engine.connect(player, to: engine.mainMixerNode, format: audioFormat)
engine.prepare()
do {
  // 3
  try engine.start()
} catch let error {
  print(error.localizedDescription)
}
Take a closer look at what’s happening:
- This gets the URL of the audio file in the bundle. When set, it instantiates audioFile in audioFileURL‘s didSet block in the variable declaration section above (a sketch of those properties follows this list).
- Attach the player node to the engine, which you must do before connecting other nodes. Nodes either produce, process or output audio. The audio engine provides a main mixer node, and you connect the player node to it. By default, the main mixer connects to the engine’s default output node (the iOS device speaker). prepare() preallocates needed resources.
- Start the engine. start() can throw, so it’s wrapped in a do/catch; here, any error is simply printed.
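For reference, the audioFileURL and audioFile properties mentioned in the first step might be declared along these lines. This is a sketch based on how the rest of the tutorial uses them; the starter project’s actual code may differ:

// Assumed sketch: load the file and cache its length and sample rate.
var audioFileURL: URL? {
  didSet {
    if let audioFileURL = audioFileURL {
      audioFile = try? AVAudioFile(forReading: audioFileURL)
    }
  }
}

var audioFile: AVAudioFile? {
  didSet {
    if let audioFile = audioFile {
      audioFormat = audioFile.processingFormat
      audioLengthSamples = audioFile.length
      audioSampleRate = Float(audioFile.processingFormat.sampleRate)
      audioLengthSeconds = Float(audioLengthSamples) / audioSampleRate
    }
  }
}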
Next, add the following to scheduleAudioFile():
guard let audioFile = audioFile else { return }
skipFrame = 0
player.scheduleFile(audioFile, at: nil) { [weak self] in
  self?.needsFileScheduled = true
}
This schedules playback of the entire audioFile. The at: parameter is the time (an AVAudioTime) in the future at which you want the audio to start playing; passing nil starts playback immediately. The file is only scheduled to play once, so tapping the Play button again doesn’t restart it from the beginning; you’ll need to reschedule it to play it again. When the audio file finishes playing, the completion block sets the needsFileScheduled flag.
There are other variants of scheduling audio for playback:
- scheduleBuffer(AVAudioPCMBuffer, completionHandler: AVAudioNodeCompletionHandler? = nil): This provides a buffer preloaded with the audio data (see the sketch after this list).
- scheduleSegment(AVAudioFile, startingFrame: AVAudioFramePosition, frameCount: AVAudioFrameCount, at: AVAudioTime?, completionHandler: AVAudioNodeCompletionHandler? = nil): This is like scheduleFile, except you specify which audio frame to start playing from and how many frames to play.
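You won’t use the buffer variant in this app, but as a rough sketch of how it works (assuming the same audioFile and player properties as above, and a clip short enough to hold in memory):

// Hedged sketch: read the whole file into a PCM buffer, then schedule it.
if let audioFile = audioFile,
  let buffer = AVAudioPCMBuffer(pcmFormat: audioFile.processingFormat,
                                frameCapacity: AVAudioFrameCount(audioFile.length)) {
  do {
    try audioFile.read(into: buffer)
    player.scheduleBuffer(buffer) {
      print("buffer finished playing")
    }
  } catch {
    print(error.localizedDescription)
  }
}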
Then, add the following to playTapped(_:):
// 1
sender.isSelected = !sender.isSelected
// 2
if player.isPlaying {
  player.pause()
} else {
  if needsFileScheduled {
    needsFileScheduled = false
    scheduleAudioFile()
  }
  player.play()
}
Here’s the breakdown:
- Toggle the selection state of the button, which changes the button image as set in the storyboard.
- Use player.isPlaying to determine whether the player is currently playing. If so, pause it; if not, play it. You also check needsFileScheduled and reschedule the file if required.
Build and run, then tap the playPauseButton. You should hear Ray’s lovely intro to The raywenderlich.com Podcast. :] But, there’s no UI feedback; you have no idea how long the file is or where you are in it.
Add Progress Feedback
Add the following to the end of viewDidLoad():
updater = CADisplayLink(target: self, selector: #selector(updateUI))
updater?.add(to: .current, forMode: .defaultRunLoopMode)
updater?.isPaused = true
CADisplayLink is a timer object that synchronizes with the display’s refresh rate. You instantiate it with the updateUI selector, then add it to a run loop, in this case the default run loop. Finally, it doesn’t need to start running yet, so set isPaused to true.
Replace the implementation of playTapped(_:) with the following:
sender.isSelected = !sender.isSelected
if player.isPlaying {
  disconnectVolumeTap()
  updater?.isPaused = true
  player.pause()
} else {
  if needsFileScheduled {
    needsFileScheduled = false
    scheduleAudioFile()
  }
  connectVolumeTap()
  updater?.isPaused = false
  player.play()
}
The key thing here is to pause the UI with updater.isPaused = true when the player pauses. You’ll learn about connectVolumeTap() and disconnectVolumeTap() in the VU meter section below.
Replace var currentFrame: AVAudioFramePosition = 0 with the following:
var currentFrame: AVAudioFramePosition {
  // 1
  guard
    let lastRenderTime = player.lastRenderTime,
    // 2
    let playerTime = player.playerTime(forNodeTime: lastRenderTime)
  else {
    return 0
  }
  // 3
  return playerTime.sampleTime
}
currentFrame returns the last audio sample rendered by player. Here’s a closer look:
- player.lastRenderTime returns the time relative to the engine’s start time. If engine is not running, lastRenderTime returns nil.
- player.playerTime(forNodeTime:) converts lastRenderTime to a time relative to the player’s start time. If player is not playing, playerTime returns nil.
- sampleTime is the time expressed as a number of audio samples within the audio file.
Now for the UI updates. Add the following to updateUI():
// 1
currentPosition = currentFrame + skipFrame
currentPosition = max(currentPosition, 0)
currentPosition = min(currentPosition, audioLengthSamples)
// 2
progressBar.progress = Float(currentPosition) / Float(audioLengthSamples)
let time = Float(currentPosition) / audioSampleRate
countUpLabel.text = formatted(time: time)
countDownLabel.text = formatted(time: audioLengthSeconds - time)
// 3
if currentPosition >= audioLengthSamples {
  player.stop()
  updater?.isPaused = true
  playPauseButton.isSelected = false
  disconnectVolumeTap()
}
Let’s step through this:
- The property skipFrame is an offset added to or subtracted from currentFrame, initially set to zero. Make sure currentPosition doesn’t fall outside the range of the file.
- Update progressBar.progress to reflect currentPosition within audioFile. Compute time by dividing currentPosition by the sampleRate of audioFile. Update the countUpLabel and countDownLabel text to the current time within audioFile.
- If currentPosition is at the end of the file, then:
  - Stop the player.
  - Pause the timer.
  - Reset the playPauseButton selection state.
  - Disconnect the volume tap.
Build and run, then tap the playPauseButton. Once again, you’ll hear Ray’s intro, but this time the progressBar and timer labels supply the missing status information.
Implement the VU Meter
Now it’s time for you to add the VU meter functionality. It’s a UIView positioned to fit between the pause icon’s bars. The height of the view is determined by the average power of the playing audio. This is your first opportunity for some audio processing.
You’ll compute the average power on a 1k buffer of audio samples. A common way to determine the average power of a buffer of audio samples is to calculate the Root Mean Square (RMS) of the samples.
Average power is the representation, in decibels, of the average value of a range of audio sample data. There’s also peak power, which is the max value in a range of sample data.
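To make the math concrete, here’s a tiny standalone example (not part of the project) of computing the RMS of a few samples and converting it to decibels:

import Foundation

// Toy example of the RMS and average-power math used below.
let samples: [Float] = [0.5, -0.5, 0.25, -0.25]
let rms = sqrt(samples.map { $0 * $0 }.reduce(0, +) / Float(samples.count))
let avgPower = 20 * log10(rms)  // roughly -8 dB for these values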
Add the following helper method below connectVolumeTap():
func scaledPower(power: Float) -> Float {
  // 1
  guard power.isFinite else { return 0.0 }
  // 2
  if power < minDb {
    return 0.0
  } else if power >= 1.0 {
    return 1.0
  } else {
    // 3
    return (fabs(minDb) - fabs(power)) / fabs(minDb)
  }
}
scaledPower(power:) converts the negative power decibel value to a positive value that adjusts the volumeMeterHeight.constant value above. Here’s what it does:
- power.isFinite checks that power is a valid value, i.e., not NaN, and returns 0.0 if it isn’t.
- This sets the dynamic range of the VU meter to 80 dB. For any value below -80.0, return 0.0. Decibel values on iOS have a range of -160 dB, near silence, to 0 dB, maximum power. minDb is set to -80.0, which provides a dynamic range of 80 dB. You can alter this value to see how it affects the VU meter.
- Compute the scaled value between 0.0 and 1.0.
Now, add the following to connectVolumeTap():
// 1
let format = engine.mainMixerNode.outputFormat(forBus: 0)
// 2
engine.mainMixerNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, when in
  // 3
  guard
    let channelData = buffer.floatChannelData,
    let updater = self.updater
  else {
    return
  }

  let channelDataValue = channelData.pointee
  // 4
  let channelDataValueArray = stride(from: 0,
                                     to: Int(buffer.frameLength),
                                     by: buffer.stride).map { channelDataValue[$0] }
  // 5
  let rms = sqrt(channelDataValueArray.map { $0 * $0 }.reduce(0, +) / Float(buffer.frameLength))
  // 6
  let avgPower = 20 * log10(rms)
  // 7
  let meterLevel = self.scaledPower(power: avgPower)

  DispatchQueue.main.async {
    self.volumeMeterHeight.constant = !updater.isPaused ?
      CGFloat(min((meterLevel * self.pauseImageHeight), self.pauseImageHeight)) : 0.0
  }
}
There’s a lot going on here, so here’s the breakdown:
- Get the data format for the mainMixerNode‘s output.
- installTap(onBus: 0, bufferSize: 1024, format: format) gives you access to the audio data on the mainMixerNode‘s output bus. You request a buffer size of 1024 bytes, but the requested size isn’t guaranteed, especially if you request a buffer that’s too small or too large; Apple’s documentation doesn’t specify what those limits are. The completion block receives an AVAudioPCMBuffer and an AVAudioTime as parameters. You can check buffer.frameLength to determine the actual buffer size, and when provides the capture time of the buffer.
- buffer.floatChannelData gives you an array of pointers to each channel’s sample data, and channelDataValue is an UnsafeMutablePointer<Float> to the first channel’s samples.
- Converting that pointer into an array of Float makes later calculations easier. To do that, use stride(from:to:by:) to create an array of indexes into channelDataValue, then map { channelDataValue[$0] } to access and store the data values in channelDataValueArray.
- Computing the RMS involves a map/reduce/divide operation. First, the map operation squares all of the values in the array, which the reduce operation then sums. Divide the sum of the squares by the buffer size and take the square root, producing the RMS of the audio sample data in the buffer. This is a value between 0.0 and 1.0; for a silent buffer it’s 0.0.
- Convert the RMS to decibels (acoustic decibel reference). This should be a value between -160 and 0, but if rms is 0, log10 returns negative infinity, which the isFinite check in scaledPower(power:) handles.
- Scale the decibels into a value suitable for your VU meter.
Finally, add the following to disconnectVolumeTap():
engine.mainMixerNode.removeTap(onBus: 0)
volumeMeterHeight.constant = 0
AVAudioEngine allows only a single tap per bus. It’s good practice to remove it when not in use.
Build and run, then tap playPauseButton. The VU meter is now active, providing average power feedback of the audio data.
Implementing Skip
Time to implement the skip forward and back buttons. skipForwardButton jumps ahead 10 seconds in the audio file, and skipBackwardButton jumps back 10 seconds.
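Both buttons are already wired up in the starter project and presumably just call seek(to:) with plus or minus 10 seconds. A hedged sketch of what those actions might look like (the actual action names in the project are assumptions):

// Assumed sketch of the skip actions already present in the starter project.
@IBAction func didTapSkipForward(_ sender: UIButton) {
  seek(to: 10.0)
}

@IBAction func didTapSkipBackward(_ sender: UIButton) {
  seek(to: -10.0)
}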
Add the following to seek(to:):
guard
  let audioFile = audioFile,
  let updater = updater
else {
  return
}

// 1
skipFrame = currentPosition + AVAudioFramePosition(time * audioSampleRate)
skipFrame = max(skipFrame, 0)
skipFrame = min(skipFrame, audioLengthSamples)
currentPosition = skipFrame

// 2
player.stop()

if currentPosition < audioLengthSamples {
  updateUI()
  needsFileScheduled = false

  // 3
  player.scheduleSegment(audioFile,
                         startingFrame: skipFrame,
                         frameCount: AVAudioFrameCount(audioLengthSamples - skipFrame),
                         at: nil) { [weak self] in
    self?.needsFileScheduled = true
  }

  // 4
  if !updater.isPaused {
    player.play()
  }
}
Here's the play-by-play:
- Convert time, which is in seconds, to a frame position by multiplying by audioSampleRate, and add it to currentPosition. Then, make sure skipFrame is not before the start of the file and not past the end of the file.
- player.stop() not only stops playback, it also clears all previously scheduled events. Call updateUI() to set the UI to the new currentPosition value.
- player.scheduleSegment(_:startingFrame:frameCount:at:) schedules playback starting at the skipFrame position of audioFile. frameCount is the number of frames to play; you want to play to the end of the file, so set it to audioLengthSamples - skipFrame. Finally, at: nil specifies starting playback immediately instead of at some time in the future.
- If player was playing before skip was called, then call player.play() to resume playback. updater.isPaused is convenient for determining this, because it is only true if player was previously paused.
Build and run, then tap the playPauseButton. Tap skipBackwardButton and skipForwardButton to skip back and forward. Watch as the progressBar and count labels change.
Implementing Rate Change
The last thing to implement is changing the rate of playback. Listening to podcasts at higher than 1x speeds is a popular feature these days.
In setupAudio(), replace the following:
engine.attach(player)
engine.connect(player, to: engine.mainMixerNode, format: audioFormat)
with:
engine.attach(player)
engine.attach(rateEffect)
engine.connect(player, to: rateEffect, format: audioFormat)
engine.connect(rateEffect, to: engine.mainMixerNode, format: audioFormat)
This attaches and connects rateEffect, an AVAudioUnitTimePitch node, to the audio graph. This node type is an effect node; specifically, it can change the playback rate and pitch-shift the audio.
The didChangeRateValue() action handles changes to rateSlider. It computes an index into the rateSliderValues array and sets rateValue, which in turn sets rateEffect.rate. rateSlider has a value range of 0.5x to 3.0x.
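That action is already implemented in the starter project; a rough sketch of the idea (the exact rateSliderValues and the rateValue plumbing in the project are assumed here) might look like this:

// Hedged sketch: snap the slider to the nearest supported rate.
// The real project routes this through a rateValue property whose
// didSet assigns rateEffect.rate; the values below are assumptions.
let rateSliderValues: [Float] = [0.5, 1.0, 1.25, 1.5, 1.75, 2.0, 2.5, 3.0]

@IBAction func didChangeRateValue(_ sender: UISlider) {
  let nearestRate = rateSliderValues.min {
    abs($0 - sender.value) < abs($1 - sender.value)
  } ?? 1.0
  rateEffect.rate = nearestRate
}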
Build and run, then tap the playPauseButton. Adjust rateSlider to hear what Ray sounds like when he has had too much or too little coffee.
Where to Go From Here?
You can download the final project using the link at the top or bottom of this tutorial.
Look at the other effects you can add in setupAudio(). One option is to wire up a pitch shift slider to rateEffect.pitch and make Ray sound like a chipmunk. :]
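As a quick experiment using the existing rateEffect node: AVAudioUnitTimePitch measures pitch in cents, with 100 cents per semitone and a valid range of -2400 to 2400.

// Raise the pitch one octave (1,200 cents) for the chipmunk effect.
rateEffect.pitch = 1200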
To learn more about AVAudioEngine and related iOS audio topics, check out:
- WWDC 2014 Session 502: AVAudioEngine in Practice
- Apple's "Working with Audio"
- Beginning Audio with AVFoundation: Audio Effects
- Audio Tutorial for iOS: File and Data Formats
We hope you enjoyed this tutorial on AVAudioEngine. If you have any questions or comments, please join the forum discussion below!