Wednesday, May 14, 2008

Adobe Is Making Some Noise Part 2

[Update: I have updated the code sample to match the API changes in build 10.0.1.525]

The public beta of Adobe® Flash® Player 10, code named Astro, has been released. When you read the release notes you'll notice a small section talking about audio:

"Dynamic Sound Generation — Dynamic sound generation extends the Sound class to play back dynamically created audio content through the use of an event listener on the Sound object."

Yes, in Flash Player 10 you will be able to dynamically create audio. It's not an all-powerful API; it is designed to provide a low-level abstraction of the native sound driver, hence providing the most flexible platform to build your music applications on. The API has one big compromise which I can't address without large infrastructural changes, and that is latency. Latency is horrible, to the point where some applications will simply not be possible. Improving latency will require profound changes in the Flash Player which I will tackle for the next major revision. But for now this simple API will likely change the way you think about sound in the Flash Player.

Programming dynamic sound is all about how quickly and consistently you can deliver the data to the sound card. Most sound cards work using a ring buffer: you as the programmer push data into that ring buffer while the sound card feeds from it at the same time. The high-level APIs for dealing with this revolve around two concepts: 1. the device model and 2. the interrupt model.

For model 1 we run in a loop (usually in a thread) and write sound data to the device. The write blocks if the ring buffer is full. The loop continues until the sound ends. This is the most common method of playing back sound on Unix-like systems such as Linux.

In model 2 we have a function which is called by the system (usually from an interrupt on older systems) in which the application fills part of the ring buffer. The callback function is called whenever the sound card hits a point where it runs low on samples in the ring buffer. On an OS without real threading this is usually the only way to make sound playback work. Mac OS 9 or older and Windows 98 or older used this system, and OS X continues to provide a way to do this in CoreAudio. As ActionScript has no threading, it is advisable to use this model. We could use frame events to implement a loop, but that would represent an odd programming model.

Flash Player 10, code named Astro, supports a new event on the Sound object: "sampleData". It is dispatched at a regular interval to request more audio data. In the event handler you have to fill a given ByteArray (SampleDataEvent.data) with a certain amount of sound data. The amount is variable, from 512 samples to 8192 samples per event. That is something you decide on, and it is a balance between performance and latency in your application. The less data you provide per event, the more overhead is spent in the Flash Player. The more data you provide, the longer the latency in your application will be. If you just play continuous audio we suggest using the maximum amount of data per event, as the difference in overall performance can be quite large.

I should note that this API changed slightly (mostly name changes) between the original beta and the final release of the Flash Player; as mentioned at the top of this post, the code below has been updated to match build 10.0.1.525.

Now some real code; some of you on the beta program have seen it already. Here is how you play a continuous sine wave in Flash Player 10 with the smallest amount of code:

import flash.media.Sound;
import flash.events.SampleDataEvent;

var sound:Sound = new Sound();
function sineWavGenerator(event:SampleDataEvent):void {
    // write 1234 stereo sample frames per event (any count from 512 to 8192 works)
    for (var c:int = 0; c < 1234; c++) {
        var sample:Number = Math.sin((Number(c + event.position) / Math.PI / 2)) * 0.25;
        event.data.writeFloat(sample); // left channel
        event.data.writeFloat(sample); // right channel
    }
}
sound.addEventListener(SampleDataEvent.SAMPLE_DATA, sineWavGenerator);
sound.play();

That's it. That simple. You can't get any more low level or flexible than this. The sample above is kept simple; actual code would probably not call Math.sin() in the inner loop. You would rather prepare a ByteArray or Array outside the handler and copy the data from there, as in the sketch below.
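Here is a minimal sketch of that approach: one sine period is computed once into a lookup table, and the handler only copies values out of it. The table size and phase step are arbitrary illustrative values, not anything the API requires.

import flash.media.Sound;
import flash.events.SampleDataEvent;

const TABLE_SIZE:int = 8192;
var table:Vector.<Number> = new Vector.<Number>(TABLE_SIZE, true);
for (var i:int = 0; i < TABLE_SIZE; i++) {
    // one full sine period at a quarter of full amplitude
    table[i] = Math.sin(2 * Math.PI * i / TABLE_SIZE) * 0.25;
}

var phase:int = 0;
var tableSound:Sound = new Sound();
tableSound.addEventListener(SampleDataEvent.SAMPLE_DATA,
    function(event:SampleDataEvent):void {
        for (var c:int = 0; c < 8192; c++) {
            event.data.writeFloat(table[phase]); // left channel
            event.data.writeFloat(table[phase]); // right channel
            phase = (phase + 64) % TABLE_SIZE;   // the step size sets the pitch
        }
    });
tableSound.play();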

The sound format is fixed at a sample rate of 44100Hz, 2 channels (stereo) and 32-bit floating-point normalized samples. This is currently the highest quality format possible within the Flash Player. We will be targeting a more flexible system in a future Flash Player; it was not possible to offer different sample rates and more or fewer channels in this version. If you need to resample you can either use pure ActionScript 3 or even Adobe Pixel Bender.
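To illustrate the pure ActionScript 3 route, here is a minimal linear-interpolation resampler sketch; the function name and the assumption of a mono Vector.<Number> source are mine for illustration, not part of any Flash API.

function resampleTo44100(source:Vector.<Number>, sourceRate:Number):Vector.<Number> {
    var ratio:Number = sourceRate / 44100;
    var outLength:int = int(source.length / ratio);
    var result:Vector.<Number> = new Vector.<Number>(outLength, true);
    for (var i:int = 0; i < outLength; i++) {
        var pos:Number = i * ratio; // fractional read position in the source
        var idx:int = int(pos);
        var frac:Number = pos - idx;
        var next:Number = (idx + 1 < source.length) ? source[idx + 1] : source[idx];
        // blend the two nearest source samples
        result[i] = source[idx] * (1 - frac) + next * frac;
    }
    return result;
}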

The SampleDataEvent.position property which is passed in is the sample position, not the time, of the segment of audio being requested. You can convert this value to milliseconds by dividing it by 44.1.
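For example, inside the event handler:

// position counts sample frames; at 44100 frames per second,
// dividing by 44.1 yields the position in milliseconds
var milliseconds:Number = event.position / 44.1;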

Your event handler has to provide at least 512 samples each time it is dispatched, and at most 8192. If you provide fewer, the Flash Player assumes that you have reached the end of the sound; it will play the remaining samples and dispatch a SOUND_COMPLETE event. If you provide more than 8192, an exception occurs. In the sample above I use 1234 to make it clear that it can be any value between 512 and 8192.
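Here is a minimal sketch of a finite sound, assuming we want roughly one second of audio; the frame budget and tone are arbitrary. Once a dispatch writes fewer than 512 samples the sound winds down and the SoundChannel fires Event.SOUND_COMPLETE.

import flash.media.Sound;
import flash.media.SoundChannel;
import flash.events.Event;
import flash.events.SampleDataEvent;

var written:int = 0;
var finiteSound:Sound = new Sound();
finiteSound.addEventListener(SampleDataEvent.SAMPLE_DATA,
    function(event:SampleDataEvent):void {
        // write full batches until one second (44100 frames) has been queued;
        // the first batch shorter than 512 samples signals the end of the sound
        var count:int = int(Math.min(8192, 44100 - written));
        for (var c:int = 0; c < count; c++) {
            var sample:Number = Math.sin((written + c) * 0.05) * 0.25;
            event.data.writeFloat(sample); // left channel
            event.data.writeFloat(sample); // right channel
        }
        written += count;
    });
var channel:SoundChannel = finiteSound.play();
channel.addEventListener(Event.SOUND_COMPLETE, function(e:Event):void {
    trace("sound finished");
});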

The event is dispatched in real time, which means you can inject new audio data interactively. The key part to understand here is that we are not dealing with long stretches of sound data at any given time.
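As a sketch of that interactivity, assume a frequency variable which any other code (a slider, the keyboard) may change at will; the generator picks up the new value on its next dispatch. The names and the batch size are illustrative.

import flash.media.Sound;
import flash.events.SampleDataEvent;

var frequency:Number = 440;  // change this at any time, e.g. from UI code
var phaseAngle:Number = 0;
var liveSound:Sound = new Sound();
liveSound.addEventListener(SampleDataEvent.SAMPLE_DATA,
    function(event:SampleDataEvent):void {
        // a smaller batch shortens the reaction time, at some CPU cost
        var step:Number = frequency * 2 * Math.PI / 44100;
        for (var c:int = 0; c < 2048; c++) {
            var sample:Number = Math.sin(phaseAngle) * 0.25;
            event.data.writeFloat(sample); // left channel
            event.data.writeFloat(sample); // right channel
            phaseAngle += step;
        }
    });
liveSound.play();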

There is an internal buffer in the Flash Player, about 0.2 to 0.5 seconds depending on the platform, which prevents drop-outs. It is automatically increased if drop-outs occur. This internal buffer is the key source of the high latency I was alluding to earlier. You should never depend on a certain latency with this API in your application. To enforce this there is a slight random factor in choosing the buffer size when the Flash Player launches.

Continue to read Part 3, which talks about one more new Sound API in Flash Player 10.

10 Comments:

OpenID jkozniewski said...

Hi

Great news indeed :)

Although one sentence needs some clarification -
"If you need to resample you can either use pure ActionScript 3 or even Adobe Pixel Bender"

Could you give an example of how to use Pixel Bender to modify sound? I thought that Pixel Bender is completely focused on processing image data...

Regards.

Thursday, May 15, 2008 8:39:00 AM  
Blogger Makc said...

Why, will writing to byte array outside of event be not possible?

Thursday, May 15, 2008 9:39:00 AM  
Blogger Frédéric said...

Like jkozniewski, I'm really interested in processing sound with Pixel Bender.

Does it mean that the computing will be done by the GPU?

Friday, May 16, 2008 1:02:00 AM  
Blogger Bryan Gale said...

All Pixel Bender does is take some numbers and spit out other numbers. It is primarily intended for image processing, but it's no great stretch of the imagination to say it could be used for audio data.

Will there be any way to determine the overall latency? Even if we can't lessen it, just knowing what it is will make a lot of things possible that wouldn't be otherwise.

Friday, May 16, 2008 4:00:00 AM  
Blogger goann said...

Has Adobe begun to implement MIDI in Flash Player and ActionScript? If not, are there plans to? The present lack of MIDI is an astonishing oversight ... especially for multimedia products that frequently need to deliver music quickly and efficiently.

QuickTime has had functional MIDI since version 3 or earlier. Java has a MIDI synthesizer and a sequencer API. If MIDI is not coming to ActionScript Flash/Flex anytime soon, could an AIR application access APIs such as QuickTime or MacOS's CoreAudio and tap into those MIDI resources?


zq

Sunday, May 18, 2008 11:41:00 PM  
Blogger Stephen said...

Tinic,

Great set of articles here on the new sound capabilities. Have been playing around for a couple days now. A lot of fun.

I think there is a need for a correction in the article on this line:
"The sound format is fixed at a sample rate of 44100Khz, 2 channels (stereo) and using 32bit"

should read either:
"The sound format is fixed at a sample rate of 44.1KHz, 2 channels (stereo) and using 32bit"

or

"The sound format is fixed at a sample rate of 44100Hz, 2 channels (stereo) and using 32bit"

Otherwise this could be very confusing for those who don't understand much about sound. Your math later in the article for converting samples into milliseconds is correct and supports the change.

Keep bringing the noise!

Tuesday, May 20, 2008 1:44:00 PM  
Blogger fabien said...

"Will there be any way to determine the overall latency? Even if we can't lessen it, just knowing what it is will make a lot of things possible that wouldn't be otherwise."


very simple:

latency in seconds = (1/samplerate) * bufferlength

Wednesday, May 21, 2008 12:03:00 PM  
Blogger CRS said...

How would you go about finding the total number of samples in an mp3 and then using that number to extract all of its samples into a ByteArray for processing later?

Monday, June 30, 2008 3:32:00 AM  
Blogger parker said...

How might the sound buffer be manipulated in such a way as to enforce time-syncing between various audio sources into a globally-referenced object (e.g. "turntable-style" mixing of sound objects with pitch control). Any related info would be much appreciated. Thank you in advance!

Tuesday, November 04, 2008 2:29:00 PM  
Blogger starpause said...

why do you event.data.writeFloat(sample); twice?

also, i'm trying to compile the above code as a pure as3 application, here's the source

http://pastebin.com/d745da114

i see a single trace of ' ! SimpleSine()' and ' ! sineWavGenerator()' but just one. i would expect to see a series of ' ! sineWavGenerator()' traces in response to many "sampleData" events.

any ideas on where this simple implementation is failing?

Monday, November 17, 2008 9:37:00 PM  
