01-03-2009 03:02 AM
I'm working on porting a very* resource-intensive game/application from generic J2ME to an 8330 (126.96.36.199), using BlackBerry-specific features to improve performance and usability.
I've gotten its performance to "painfully playable" by making more intelligent use of image data: mostly replacing Image with Bitmap and pre-allocating and reusing those religiously, along with skipping "unnecessary" screen updates. Most of the application's time is spent doing serious number crunching; there is also significant access to fairly large byte arrays.
My question boils down to: what are some tips for optimizing applications in general, and math/memory access in particular?
Things I'm working on:
-replacing all division with shifts or subtler tricks
-more intelligent partial updates of the main screen (finer grained than a single box around all changes frame-to-frame)
-aggressively re-using objects
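A minimal sketch of the division-to-shift replacement from the list above, assuming a divisor of 8. The class and method names are made up for illustration. The catch is that a plain shift only matches `/` for non-negative values, because the shift rounds toward negative infinity while Java's `/` rounds toward zero.

```java
// Hypothetical helper, not from any BlackBerry API.
final class ShiftDivide {
    // Arithmetic right shift by 3. Equals x / 8 only when x >= 0;
    // for negative x it rounds down instead of toward zero.
    static int divBy8NonNegative(int x) {
        return x >> 3;
    }

    // Matches x / 8 for every int: negative inputs get a bias of 7
    // (divisor minus one) added before the shift.
    static int divBy8(int x) {
        int bias = (x >> 31) & 7; // 7 when x is negative, otherwise 0
        return (x + bias) >> 3;
    }
}
```

Whether the shift actually beats a division on the device is worth measuring rather than assuming.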
Things I've considered, but am not sure will help:
-packing byte into int and unpacking as necessary; is it faster to operate on int vs byte? Enough to overcome unpacking costs?
Any additional ideas and observations greatly appreciated.
*For a mobile application
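For the byte-packing idea, the unpacking cost in question is one shift plus one mask or cast per access. A rough sketch, assuming four bytes per int with byte 0 in the low bits (the layout and the names are my own choice):

```java
// Hypothetical helper illustrating the pack/unpack cost.
final class PackedBytes {
    // Packs four bytes into one int, b0 in the lowest 8 bits.
    // Each byte is masked with 0xFF so sign extension cannot
    // smear into the neighbouring fields.
    static int pack(byte b0, byte b1, byte b2, byte b3) {
        return (b0 & 0xFF)
             | ((b1 & 0xFF) << 8)
             | ((b2 & 0xFF) << 16)
             | ((b3 & 0xFF) << 24);
    }

    // Extracts byte number index (0..3); the cast keeps the low 8 bits.
    static byte unpack(int packed, int index) {
        return (byte) (packed >>> (index * 8));
    }
}
```

This mainly pays off if the algorithm can work on whole ints most of the time and only unpack occasionally.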
01-03-2009 05:06 AM
int is processed faster than byte.
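One language-level reason behind this, independent of any particular JVM: Java widens byte operands to int before doing arithmetic, so byte math needs an extra cast or mask anyway. A small sketch with made-up names:

```java
// Hypothetical examples of byte-to-int promotion.
final class BytePromotion {
    // a + b is computed as an int; the cast back to byte
    // is mandatory and silently wraps on overflow.
    static byte addBytes(byte a, byte b) {
        return (byte) (a + b);
    }

    // Bytes are signed in Java; masking gives the 0..255 value.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Sums a byte array as unsigned values using an int accumulator.
    static int sumUnsigned(byte[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i] & 0xFF;
        }
        return sum;
    }
}
```

How much this costs on the 8330 specifically is an assumption until it is timed on the device.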
Check the link below to get answers to your questions:
01-03-2009 08:30 AM
I used to work at Eyewonder, where we had a Java multimedia applet for delivering ads to browsers, and I spent a lot of time optimizing code for both size and speed (this, along with preventing browser hangs, reduced the number of death threats we got, so it was quite important, LOL).
Anyway, there are at least two things to keep in mind: the runtime environment and the underlying NATIVE architecture. And never ignore empirical results, as obvious things don't always work. For example, I'm not entirely sure what the int versus byte issue entails: doing things in native bus-size increments can certainly be helpful, but on desktop systems with caches the 4x change in size can make cache misses a huge problem (even on the C++ lists, people tend to ignore this issue).
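To get the empirical result rather than guess, something as crude as the following timing loop is usually enough. It is only a sketch: the class and method names are invented, and `System.currentTimeMillis()` is coarse, so the repetition count has to be large enough to get a measurable time.

```java
// Hypothetical timing harness for comparing two variants of a loop.
final class TimingSketch {
    // Runs the task reps times and returns the elapsed wall-clock
    // milliseconds for the whole batch.
    static long timeMillis(Runnable task, int reps) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < reps; i++) {
            task.run();
        }
        return System.currentTimeMillis() - start;
    }

    // Candidate A: data kept in a byte array (1 byte per element).
    static int sumBytes(byte[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Candidate B: the same values kept in an int array (4x the memory).
    static int sumInts(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }
}
```

Wrap each candidate in a `Runnable`, time both on the actual handset with realistic array sizes, and keep the results of the sums so the work cannot be optimized away.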
Looking at the link you provide, and remembering as best I can: sure, we did many of these things, including loading tables from, IIRC, a string (there turned out to be something like an automatic 2x size reduction, in addition to whatever compression you could get from clever string encoding). If you dig through jad output (decompiled code) or a class inspector, or just dump the binary class files, it is easy to figure out size. Speed is more confusing, as it depends on at least the two machines mentioned above.
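The table-from-a-string trick looks roughly like this. It is a sketch with an invented five-entry table, not the original applet code. The size saving comes from the fact that an array initializer compiles to bytecode that stores each element one at a time, while a string literal is a single constant-pool entry.

```java
// Hypothetical lookup table stored as a string constant.
final class StringTable {
    // Each char holds one table value: 0, 49, 90, 117, 127.
    // Avoid \u000A, \u000D, \u0022 and \u005C here: unicode escapes
    // are expanded before the literal is parsed, so those break it.
    static final String ENCODED = "\u0000\u0031\u005A\u0075\u007F";

    // Expands the string into an int array once, at startup.
    static int[] decode(String encoded) {
        int[] table = new int[encoded.length()];
        for (int i = 0; i < table.length; i++) {
            table[i] = encoded.charAt(i);
        }
        return table;
    }
}
```

Decode once into a static array and index that afterwards; calling `charAt` in the inner loop would give back the time that was saved in size.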
I would also suggest stepping through a few system calls. These are often designed as various compromises (and of course most JREs can't inline/compile too well), and I did just note that the RIM GUI components are a bit OCD about checking "do I have the event lock" etc. Inheriting (which makes more calls) isn't always a good way to go if you need a certain type of optimization and can give up something like a false sense of security.
Anything CPU-intensive is going to be a trick in Java. The "portable to the last bit" requirement usually means all floating point is software emulated. Divisions are always bad on most FPUs, but remember that bit shifts may not execute as a single native instruction either.
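If the number crunching currently uses float or double, one common workaround (my suggestion, not something confirmed for this application) is fixed-point arithmetic on ints. A minimal 16.16 sketch with invented names:

```java
// Hypothetical 16.16 fixed-point helper: 16 integer bits, 16 fraction bits.
final class Fixed16 {
    static final int SHIFT = 16;
    static final int ONE = 1 << SHIFT; // the value 1.0

    static int fromInt(int value) {
        return value << SHIFT;
    }

    // Drops the fraction bits (rounds toward negative infinity).
    static int toInt(int fixed) {
        return fixed >> SHIFT;
    }

    // The product of two 16.16 values has 32 fraction bits, so it is
    // computed in a long and shifted back down.
    static int mul(int a, int b) {
        return (int) (((long) a * b) >> SHIFT);
    }

    // The dividend is pre-shifted so the quotient keeps 16 fraction bits.
    static int div(int a, int b) {
        return (int) (((long) a << SHIFT) / b);
    }
}
```

The long intermediate may itself be slow on a 32-bit device, so this too needs timing before committing to it.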
01-03-2009 09:39 AM
01-03-2009 11:46 AM
01-04-2009 08:01 AM
01-06-2009 04:27 PM
Here are some links that discuss optimizing code for BlackBerry handhelds.