Friday, September 15, 2006

linux lite and fonts

Mike has talked about some of this before, but this week I went over our Linux code and tried to make it more robust in some areas. Some parts have not been touched in a while, so a bit of house cleaning is a good thing. Part of this for me is testing different kinds of scenarios, some of which might look a little weird. Well, let me provide a screenshot and you'll see what I mean:



We are still discussing whether we should require GTK/GDK/glib to run at all, as this would allow us to do many things we can't otherwise. Some of them are more essential than others. Something non-critical and certainly optional would be the ability to access GNOME-specific font settings (which are usually slightly different from the ones specified in the global and per-user fonts.conf files). The biggest problem with requiring GTK/GDK/glib is that right now using any sort of GTK/GDK function within nspluginwrapper simply does not work, and we haven't spent any time addressing this. That leaves Konqueror pretty crippled, even though it is not crashing anymore. Simple things like copy&paste spit out errors like this:
(process:22016): Gtk-CRITICAL **: gtk_clipboard_get_for_display: assertion `GDK_IS_DISPLAY (display)' failed

Yes, I know, we could add support for X11 selections, but that is beside the point. KISS (keep it simple, stupid) is our coding method of choice for this release, otherwise we will never be done.
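
To illustrate the failure mode, here is a minimal sketch of the kind of defensive check a plugin could do before touching the GTK clipboard. The helper name is hypothetical and this is not how the player handles it today; it just shows why the assertion above fires when no GDK display has been opened:

#include <stdio.h>
#include <gtk/gtk.h>

/* Hypothetical guard: only ask GTK for the clipboard when a GDK display
   actually exists. Inside nspluginwrapper there is none right now, which
   is exactly what trips the Gtk-CRITICAL assertion shown above. */
static GtkClipboard *get_clipboard_if_available(void)
{
    GdkDisplay *display = gdk_display_get_default();
    if (display == NULL)
        return NULL;   /* no display: skip copy&paste instead of asserting */
    return gtk_clipboard_get_for_display(display, GDK_SELECTION_CLIPBOARD);
}

int main(int argc, char *argv[])
{
    /* gtk_init_check() does not abort when no display can be opened */
    gboolean have_gtk = gtk_init_check(&argc, &argv);
    printf("display: %s, clipboard: %s\n",
           have_gtk ? "yes" : "no",
           get_clipboard_if_available() ? "yes" : "no");
    return 0;
}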

Speaking of fonts: I recently took the liberty of revamping device text support in the Flash Player. The braindead font fallback system we had in the past is now only a last resort, invoked when fontconfig is not available. That means that if you are on a simplified Chinese locale you'll get the default font configured for the 'zh-Hans' locale on your system. Finally. I am still in shock that we ever called a list of pre-baked font names inside the player a solution.
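
For the curious, the fontconfig side of this is pretty small. The following is only a rough sketch under my own assumptions (the "zh-cn" lang tag is just an example value, and this is not the exact code in the player), but it shows how you ask fontconfig which family it would pick for a given locale:

#include <stdio.h>
#include <fontconfig/fontconfig.h>

int main(void)
{
    FcInit();

    /* build a pattern that only constrains the language */
    FcPattern *pattern = FcPatternCreate();
    FcPatternAddString(pattern, FC_LANG, (const FcChar8 *)"zh-cn");

    /* apply the config's substitution rules and the built-in defaults */
    FcConfigSubstitute(NULL, pattern, FcMatchPattern);
    FcDefaultSubstitute(pattern);

    /* ask for the best matching font */
    FcResult result;
    FcPattern *match = FcFontMatch(NULL, pattern, &result);

    FcChar8 *family = NULL;
    if (match && FcPatternGetString(match, FC_FAMILY, 0, &family) == FcResultMatch)
        printf("default family: %s\n", (const char *)family);

    if (match)
        FcPatternDestroy(match);
    FcPatternDestroy(pattern);
    return 0;
}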

Sunday, September 10, 2006

gcc challenges

Dealing with compilers can be quite a challenge sometimes. Recently Mike stumbled over a pretty bad crash on one of his older machines. It looked innocent enough: to enable SSE1/SSE2 intrinsics, the only option in gcc is to pass '-msse -msse2'. While gcc did not use SSE for floating point math, it does emit cvttss2si instructions for general float-to-integer conversions when these options are selected. Since his machine was using an Athlon Model 4, an illegal instruction error killed the browser; SSE1 was only introduced with the Athlon XP and Pentium III generations. This behavior makes it impossible to safely do runtime detection of SSE1/SSE2 with gcc unless you split the various architectures (x86, MMX, SSE1, SSE2 and SSE3) into separate files and customize the compile options per file, which is also the hack suggested in the gcc man pages:

"These options will enable GCC to use these extended instructions in generated code, even without -mfpmath=sse. Applications which perform runtime CPU detection must compile separate files for each supported architecture, using the appropriate flags. In particular, the file containing the CPU detection code should be compiled without these options."

There is also an old gcc bug about this which was essentially rejected. Why is there no option to simply turn on the intrinsics without changing code generation elsewhere? This would make things so much easier for developers and guarantee better portability of source code.

I see two practical options right now: either we totally disable the SSE1/SSE2 optimizations, which would severely cripple performance (yes, some of the rendering code would see a 30-60% slowdown, and that would affect most users), or we use the Intel compiler to build these files. Using ICC would also allow us to compile some of the inline assembly for which I have not created intrinsics yet. The real challenge for me now is modifying the autoconf scripts to use two compilers, something they do not seem to support out of the box. Google and the various forums I looked at have not been much help on this subject either.

Why not use separate files as the gcc man pages suggest? Well, it would add another few weeks of refactoring and stabilizing code to make that happen. I fear I will be tasked with it eventually though.
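
If we do go down that road, the usual shape of it is a tiny dispatcher: the generic and the SSE2 versions of a routine live in separate files built with different flags, and a function pointer gets patched at startup based on the CPUID check above. A hypothetical sketch (the blend_row names are made up for illustration, and the stub bodies are only here so the example is self-contained):

#include <stdio.h>

typedef void (*blend_row_fn)(unsigned int *dst, const unsigned int *src, int n);

/* would live in blend_c.c, built with plain flags -- always safe to run */
static void blend_row_c(unsigned int *dst, const unsigned int *src, int n)
{
    while (n--)
        *dst++ = *src++;   /* placeholder body */
}

/* would live in blend_sse2.c, built with -msse -msse2 -- only called after detection */
static void blend_row_sse2(unsigned int *dst, const unsigned int *src, int n)
{
    while (n--)
        *dst++ = *src++;   /* placeholder body */
}

/* would live in dispatch.c, built WITHOUT -msse/-msse2 */
static blend_row_fn blend_row = blend_row_c;

static void init_blend_dispatch(int have_sse2)
{
    if (have_sse2)
        blend_row = blend_row_sse2;
}

int main(void)
{
    unsigned int src[4] = { 1, 2, 3, 4 }, dst[4] = { 0, 0, 0, 0 };
    init_blend_dispatch(0 /* pretend runtime detection said no SSE2 */);
    blend_row(dst, src, 4);
    printf("%u %u %u %u\n", dst[0], dst[1], dst[2], dst[3]);
    return 0;
}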

We also discovered recently that '-O2' can generate bad code and had to switch to '-O1' for the time being. It triggered badness in our support math routines for the JIT and, funnily enough, in some rather simple C code (our MMX code works great here):

for ( ; n ; n-- ) {
    uint32 srcP = src[0];

    // split the four source bytes into two words, two bytes per word,
    // each widened into its own 16-bit lane
    uint32 src0 = ((srcP&0x0000FF00)<< 8)|
                  ((srcP&0x000000FF)<< 0);
    uint32 src1 = ((srcP&0xFF000000)>> 8)|
                  ((srcP&0x00FF0000)>>16);

    // the destination is already stored in the same two-lanes-per-word layout
    uint32 dst0 = dst[0];
    uint32 dst1 = dst[1];

    // scale both destination lanes by the top source byte, then add the source lanes
    dst0 = ((dst0*(srcP>>24))>>8) + src0;
    dst1 = ((dst1*(srcP>>24))>>8) + src1;

    dst[0] = dst0;
    dst[1] = dst1;

    src+=1;   // one packed source pixel
    dst+=2;   // two destination words per pixel
}

This is not reproducible in a standalone test app; I've tried. Why would gcc make it easy on us anyway? :-) We are trying to figure out which of the options implied by '-O2' is the issue so we can disable just that one. Also, the same code compiles fine using Apple's gcc build 5431, versus the gcc 4.0.3 I am using on Ubuntu right now. And even if it were to fail in a future revision of Apple's gcc, this code path would never be triggered there, since SSE1 and SSE2 are always guaranteed to be available on MacIntel machines.