Monday, March 01, 2010

Timing it right

Status quo

During the Flash Player 10.1 time frame, I was tasked with taking a look at the timing system we use in the Flash Player. Until now the Flash Player has used a poll-based system: everything that happens in the player is served from a single thread and a single entry point, driven by a periodic timer which polls the run-time. In pseudo code, the top-level function in the Flash Player looked like this:
while ( sleep ( 1000/120 milliseconds ) ) {
    // Every browser provides a different timer interval
    ...
    if ( timerPending ) { // AS2 Intervals, AS3 Timers
        handleTimers();
    }
    if ( localConnectionPending ) {
        handleLocalConnection();
    }
    if ( videoFrameDue ) {
        decodeVideoFrame();
    }
    if ( audioBufferEmpty ) {
        refillAudioBuffer();
    }
    if ( nextSWFFrameDue ) {
        parseSWFFrame();
        if ( actionScriptInSWFFrame ) {
            executeActionScript();
        }
    }
    if ( needsToUpdateScreen ) {
        updateScreen();
    }
    ...
}
The periodic timer is not driven by the Flash Player; it is driven by the browser. In the case of Internet Explorer there is an API for this purpose. In the case of Safari on OS X it is hard coded to 50 frames/sec. Every browser implements this slightly differently, and things become very complex quickly once you go into the details. This has caused a lot of frustration among designers, who could never count on consistent cross-platform behavior.

Another challenging issue with this approach has been that limiting the periodic timer to the SWF frame rate is not acceptable. The problem becomes obvious when you think of a SWF with a frame rate of, let's say, 8 which plays back a video running at 30 frames/sec. To get good video playback you really need to drive the periodic timer at a very high frequency; otherwise video frames will appear late. In the end the Flash Player always used the highest frequency available on a particular platform and/or browser environment.
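To put rough numbers on the 8 vs. 30 frames/sec example, here is a small sketch. It assumes an idealized model which is not taken from the player code: poll ticks are perfectly periodic, and a video frame can only be shown at the first tick at or after its due time. The function name and parameters are made up for illustration.

```python
from fractions import Fraction
import math


def worst_video_lateness_ms(poll_hz, video_fps, duration_s=1):
    """Worst delay (ms) between a video frame's due time and the next poll tick.

    Idealized model (an assumption): ticks are perfectly periodic and a frame
    is shown at the first tick at or after the time it is due.
    """
    poll_period = Fraction(1000, poll_hz)      # ms between poll ticks
    frame_period = Fraction(1000, video_fps)   # ms between video frames
    worst = Fraction(0)
    for i in range(duration_s * video_fps):
        due = i * frame_period
        # First poll tick that is not earlier than the frame's due time.
        tick = math.ceil(due / poll_period) * poll_period
        worst = max(worst, tick - due)
    return float(worst)


if __name__ == "__main__":
    print(worst_video_lateness_ms(8, 30))    # polling at the SWF frame rate
    print(worst_video_lateness_ms(120, 30))  # polling at 120 Hz
```

Under this model, polling at 8 Hz lets a 30 frames/sec video frame show up almost 117 ms late, which is more than three video frames, while polling at 120 Hz keeps the delay below one poll period of about 8.3 ms.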

The wrong path

The obvious way to re-architect this is to get rid of the polling and design an event-based system instead. The new player code would have looked like this, with different subclasses of an Event base class encapsulating what the polling code had done before:
Event e;
while ( e=waitForNextEvent() ) {
    e.dispatch();
}
This approach failed miserably:
  • CPU usage turned out to be much higher than expected due to the abstraction involved.
  • In some cases the queue would grow unbounded.
  • The queue needed a prioritization scheme which turned out to be almost impossible to tune properly.
  • Most existing SWF content depends on things happening in a certain sequence. Out-of-order events broke the majority of the SWFs out there.
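To illustrate the ordering problem, here is a minimal sketch of such an event queue. It is not the code that was tried in the player: the class names, the priority values and the heap-based queue are all assumptions made up for this example.

```python
import heapq
import itertools


class Event:
    priority = 0  # hypothetical scheme: lower number is dispatched first

    def __init__(self, log):
        self.log = log

    def dispatch(self):
        # A real event would do the work; here we only record the order.
        self.log.append(type(self).__name__)


class TimerEvent(Event):
    priority = 2  # made-up value


class AudioEvent(Event):
    priority = 0  # made-up value


class FrameEvent(Event):
    priority = 1  # made-up value


class EventQueue:
    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO within a priority

    def post(self, event):
        heapq.heappush(self._heap, (event.priority, next(self._order), event))

    def run(self):
        while self._heap:
            _, _, event = heapq.heappop(self._heap)
            event.dispatch()


def demo():
    log = []
    queue = EventQueue()
    # Posted in the order the polling loop would have handled them.
    for event in (TimerEvent(log), AudioEvent(log), FrameEvent(log)):
        queue.post(event)
    queue.run()
    return log


if __name__ == "__main__":
    print(demo())
```

The polling loop always handled timers before audio and audio before the SWF frame. With the made-up priorities above, the same three pieces of work are dispatched as audio, frame, timer. Content which silently relies on the old order can break in exactly this way.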
It's not all bad

Back to the drawing board. This time my focus was on the actual problem: the Flash Player polls up to 120 times a second even if nothing is happening. Modifying the original code slightly, I came up with this:

while ( sleepuntil( nextEventTime ) OR externalEventOccured() ) {
    ...
    if ( timerPending ) { // AS2 Intervals, AS3 Timers
        handleTimers();
        nextEventTime = nextTimerTime();
    }
    if ( localConnectionPending ) {
        handleLocalConnection();
        nextEventTime = min(nextEventTime , nextLocalConnectionTime());
    }
    if ( videoFrameDue ) {
        decodeVideoFrame();
        nextEventTime = min(nextEventTime , nextVideoFrameTime());
    }
    if ( audioBufferEmpty ) {
        refillAudioBuffer();
        nextEventTime = min(nextEventTime , nextAudioRebufferTime());
    }
    if ( nextSWFFrameDue ) {
        parseSWFFrame();
        if ( actionScriptInSWFFrame ) {
            executeActionScript();
        }
        nextEventTime = min(nextEventTime , nextFrameTime());
    }
    if ( needsToUpdateScreen ) {
        updateScreen();
    }
    ...
}
This approach solves several problems:
  • There is no abstraction overhead.
  • In most cases it reduces the polling frequency to a fraction of what it was.
  • It is fairly backwards compatible.
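To get a feel for how much the polling frequency drops, here is a rough sketch which counts wake-ups over one second when the loop sleeps until the earliest pending deadline. It assumes every source is strictly periodic and ignores external events. The source names and rates are made up for the example and do not come from the player.

```python
from fractions import Fraction


def wakeups_per_second(rates_hz):
    """Count wake-ups in one second when sleeping until the earliest deadline.

    rates_hz maps a (hypothetical) source name to how often it needs service.
    Sources which are due at the same instant share a single wake-up.
    """
    period = {name: Fraction(1, hz) for name, hz in rates_hz.items()}
    next_due = dict(period)  # first deadline of each source
    wakeups = 0
    while True:
        next_event_time = min(next_due.values())
        if next_event_time > 1:
            break
        wakeups += 1
        # Service everything which is due now and schedule its next deadline.
        for name in list(next_due):
            if next_due[name] <= next_event_time:
                next_due[name] += period[name]
    return wakeups


if __name__ == "__main__":
    print(wakeups_per_second({"swf_frame": 8, "timer": 10}))
```

For a SWF running at 8 frames/sec with a single timer firing 10 times a second, this model wakes up 16 times a second (two deadlines coincide twice), compared to up to 120 wake-ups with the fixed-rate polling loop.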
More importantly, I replaced the browser timer with a cross-platform timer which can wait until a particular point in time. This not only yields more consistent cross-platform behavior, it also allows us to tune it in a way I could not do before, which leads me to the most important change you will see in Flash Player 10.1: the way we behave when a SWF is not visible.

Implications for user experience

In Flash Player 10.1, SWFs on hidden tabs are limited in the resources they can use. Whereas they would run at full speed in Flash Player 10.0 and before (note though that we NEVER rendered; we only continued to run ActionScript, audio decoding and video decoding), we now throttle the Flash Player when a SWF instance is not visible. Making this change was not easy, as I had to add many exceptions to avoid breaking old content. Here is a list of some of the new rules:

Visible:
  • SWF frame rates are limited and aligned to jiffies, i.e. 60 frames a second. (Note that Flash Player 10.1 Beta 3 still has an upper limit of 120, which will be changed before the final release.)
  • timers (AS2 Interval and AS3 Timers) are limited and aligned to jiffies.
  • local connections are limited and aligned to jiffies. That means a full round trip from one SWF to another will take at least 33 milliseconds. Some reports we get say it can be up to 40ms.
  • video is NOT aligned to jiffies and can play at any frame rate. This increases video playback fidelity.
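To make "aligned to jiffies" concrete, here is a small sketch of the arithmetic. It assumes a jiffy is exactly 1/60 of a second and that alignment means rounding a requested time up to the next jiffy boundary. The helper names are made up, and the real player logic may well differ.

```python
from fractions import Fraction
import math

JIFFY_MS = Fraction(1000, 60)  # one jiffy at 60 per second, about 16.67 ms


def align_to_jiffy(t_ms):
    """Round a time in milliseconds up to the next jiffy boundary (assumed rule)."""
    return math.ceil(Fraction(t_ms) / JIFFY_MS) * JIFFY_MS


def min_round_trip_ms():
    """Two hops, assuming each direction waits for one jiffy."""
    return float(2 * JIFFY_MS)


if __name__ == "__main__":
    print(float(align_to_jiffy(10)))  # a timer asking for 10 ms
    print(float(align_to_jiffy(20)))  # a timer asking for 20 ms
    print(min_round_trip_ms())
```

Under these assumptions a timer asking for 10 ms fires after about 16.7 ms, one asking for 20 ms fires after about 33.3 ms, and a local connection round trip of two jiffies comes out at about 33.3 ms, which matches the 33 milliseconds quoted above.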
Invisible:
  • SWF frame rate is clocked down to 2 frames/sec. No rendering occurs unless the SWF becomes visible again.
  • timers (AS2 Interval and AS3 Timers) are clocked down to 2 a second.
  • local connections are clocked down to 2 a second.
  • video is decoded (not rendered or displayed) using idle CPU time only.
  • For backwards compatibility reasons we override the 2 frames/sec frame rate to 8 frames/sec when audio is playing.
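The frame rate rules above can be summarized in a few lines. This is only a sketch of the rules as listed here, not the player's code: the function and parameter names are made up, and treating each limit as a simple cap on the requested rate is an assumption. It also ignores the many exceptions for old content mentioned earlier.

```python
VISIBLE_CAP_FPS = 60      # jiffy limit for visible SWFs
HIDDEN_CAP_FPS = 2        # hidden SWF without audio
HIDDEN_AUDIO_CAP_FPS = 8  # hidden SWF while audio is playing


def effective_frame_rate(requested_fps, visible, audio_playing):
    """Frame rate a SWF would get under the rules listed above (sketch)."""
    if visible:
        return min(requested_fps, VISIBLE_CAP_FPS)
    cap = HIDDEN_AUDIO_CAP_FPS if audio_playing else HIDDEN_CAP_FPS
    # Assumption: a SWF asking for less than the cap keeps its own rate.
    return min(requested_fps, cap)


if __name__ == "__main__":
    print(effective_frame_rate(30, visible=True, audio_playing=False))
    print(effective_frame_rate(30, visible=False, audio_playing=False))
    print(effective_frame_rate(30, visible=False, audio_playing=True))
```

A 30 frames/sec SWF keeps its 30 frames/sec while visible, drops to 2 on a hidden tab, and runs at 8 on a hidden tab while audio is playing.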
This marks a pretty dramatic change from previous Flash Player releases. It's one of those changes which are painful for designers and developers but unavoidable for a better user experience. Let me show you a CPU usage comparison with content running on a hidden tab (test URL was this CPU intensive SWF):

Flash Player 10.0


Flash Player 10.1


In this test case the frame rate in the background tab has been reduced to 8 frames/sec as audio effects are playing. If there were no audio, the improvement would be even more pronounced. The test machine was an Acer Aspire Revo AR1600.

PS: You'll notice in the two screenshots that the memory usage also shows a quite dramatic difference. That's for another blog post.