Friday, 28 December 2012

Error running aapt, resource does not already exist in overlay

I added a definition for a new style to my strings.xml file.

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <string name="app_name">ApplicationName</string>
    <style name="SpinnerStyle" parent="@android:style/Theme.Translucent.NoTitleBar">
        <item name="android:windowFrame">@null</item>
        <item name="android:windowBackground">@android:color/transparent</item>
        <item name="android:windowIsFloating">true</item>
        <item name="android:windowContentOverlay">@null</item>
        <item name="android:windowTitleStyle">@null</item>
        <item name="android:windowAnimationStyle">@android:style/Animation.Dialog</item>
        <item name="android:windowSoftInputMode">stateUnspecified|adjustPan</item>
        <item name="android:backgroundDimEnabled">false</item>
        <item name="android:background">@android:color/transparent</item>
    </style>
</resources>

Then aapt started showing me the following error messages.

android_res\values\strings.xml:7: error: Resource does not already exist in overlay at 'SpinnerStyle'; use <add-resource> to add.
android_res\values\strings.xml:8: error: Resource at SpinnerStyle appears in overlay but not in the base package; use <add-resource> to add.
Error running aapt.exe (1)

Here is how to use the <add-resource> tag.

To fix this you need to add a tag with the type and the name of the desired resource. In my case the resource type is "style", and the name is "SpinnerStyle". I added this line just before the <style> tag, and everything worked.

<add-resource type="style" name="SpinnerStyle"></add-resource>
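With the tag in place, the relevant part of the values file looks roughly like this (a sketch assembled from the snippets above; the rest of your file stays unchanged):

```xml
<resources>
    <string name="app_name">ApplicationName</string>
    <add-resource type="style" name="SpinnerStyle"></add-resource>
    <style name="SpinnerStyle" parent="@android:style/Theme.Translucent.NoTitleBar">
        <!-- ...items as above... -->
    </style>
</resources>
```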

Thursday, 20 December 2012

Fixed point math to win performance on mobiles

Though fixed point calculations don't make sense as an optimization technique on x86 processors, where FPU became a standard years ago, on ARM machines it is still an approach to consider.

Although an FPU (called VFP) on ARM processors was introduced in ARMv5TE, there are a lot of pitfalls with it. First, there is still a considerable number of devices without VFP. Second, VFP is an optional extension to ARM. There is no VFP unit in some newer phones, like the iPhone 4. (However, the iPhone 4 has the NEON unit. NEON has a single-precision instruction set that can make short floating point calculations faster, but it is not a replacement for VFP.) So, if you are targeting just one platform you may know what you need - like using short floating point when targeting the iPhone 4. But if you are looking for optimization across a variety of different ARM devices you may want to use fixed point calculations.

When does such optimization make sense? When your app does a lot of number crunching. For example, a 3D game may well be a candidate if for every frame you need to push, say, 10'000 vertices through a floating point transformation matrix.

Next I'm going to describe how to replace floating point calculations with fixed point. The process is simple: scale up the float or double value and fit it into an integral type, perform the calculations on the integral type, and when you are done, scale the result back down.

Suppose we need to multiply two floating point numbers.

long Scale = 20;
double a = .324, b = 3.34344;

// Scale up (a * 2^20 = a * 1048576)
signed long long aFixed = a * (1 << Scale);
// Scale up (b * 1048576)
signed long long bFixed = b * (1 << Scale);

// aFixed * bFixed carries a factor of 2^20 twice, so shift right once
// to bring the product back to a single 2^20 scale
signed long long ResultFixed = (aFixed * bFixed) >> Scale;
// Scale down (ResultFixed / 1048576) - truncates to the integer part
long Result = ResultFixed >> Scale;

Why have I chosen to scale the floating point numbers by a power of 2 (2^20)? Because multiplications and divisions by the scale can then be done with simple bit shifts.

I once had a run-in with a fixed point library that nearly drove me off my rocker. It presented a nice fixed point class with overloaded math operators, very handy for replacing double or float data types. Except when I did that I didn't get any performance increase! I looked inside and saw that the author had chosen 1000000 as the scale. So, while I gained speed on the fixed point calculations, I lost all of that gain when scaling back, because division and multiplication by 1000000 can't be done with bit shift operations.

To look at fixed point math rules in detail visit
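In short, the rules work out like this (a sketch of my own with 20 fractional bits; ToFixed, Mul and Div are illustrative names, not from any particular library): addition works directly, multiplication doubles the scale factor so the product must be shifted back down, and division cancels the scale so the dividend must be shifted up first.

```cpp
#include <cstdint>

// Scale factor 2^20, i.e. 20 fractional bits
const int Scale = 20;

int64_t ToFixed(double x)   { return (int64_t)(x * (1LL << Scale)); }
double  ToDouble(int64_t f) { return (double)f / (1LL << Scale); }

// a and b each carry one factor of 2^20, so the raw product carries
// 2^40; shift back down to keep a single factor of 2^20
int64_t Mul(int64_t a, int64_t b) { return (a * b) >> Scale; }

// Shift the dividend up first, otherwise the scales cancel and the
// fractional part is lost. Beware of overflow in (a << Scale).
int64_t Div(int64_t a, int64_t b) { return (a << Scale) / b; }
```

Note that Mul overflows for operands much larger than about 2^21 in this format; production fixed point code widens the intermediate or picks a smaller scale.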

Now for the benchmark. I took an iPhone 4 as the test machine. Suppose I have 200'000 numbers and want to multiply each of them by some floating point coefficient. Let's do it in both the fixed and floating point manner and compare the performance. The test code:

#include <algorithm>
#include <vector>

double g_Coef = 0.0098942347923576;

signed long long g_CoefFixed;

int PutNumber()
{
    static int Number = 1000;
    return Number;
}

int ModifyNumber(int i)
{
    return (int)(i * g_Coef);
}

int ModifyNumberFixed(int i)
{
    int result = (int)(((signed long long)i * g_CoefFixed) >> 20);
    return result;
}

void Test()
{
    std::vector<int> Numbers(200000);
    std::vector<int> Result1(200000);
    std::vector<int> Result2(200000);
    std::generate(Numbers.begin(), Numbers.end(), PutNumber);

    unsigned long Time1 = ::GetTickCount();
    std::transform(Numbers.begin(), Numbers.end(), Result1.begin(), ModifyNumber);
    unsigned long Time2 = ::GetTickCount() - Time1;

    g_CoefFixed = (signed long long)(g_Coef * (1 << 20));

    Time1 = ::GetTickCount();
    std::transform(Numbers.begin(), Numbers.end(), Result2.begin(), ModifyNumberFixed);
    Time2 = ::GetTickCount() - Time1;
}

Running this code 11 times (why not 10? because butter never spoils the porridge!) produced the following results.

Clearly, the fixed point version gives us a 6x performance increase.

Friday, 7 September 2012

Dependency problems in Windows CE

What a wonderful operating system Windows CE is! Except for the fact that it is built with Platform Builder by countless eastern manufacturers. They all tend to have their own opinion on what to have on board: which DLLs to include, and even which functions to exclude from a DLL. And then the majority of applications won't start on these devices. I suspect they are even encouraged to strip as many DLLs and functions from a device as possible, since that means lower licensing fees.

As Windows Mobile's market share started to diminish I thought Windows CE would follow. By the way, Windows CE is mostly found in GPS car navigators, while Windows Mobile - that's the name for Windows CE with a strict subset of features - was mostly installed on smartphones. But Windows CE is still widely used in navigators. It was a surprise to me when in 2011 Gartner recommended manufacturers to "Remain with Windows Mobile for ruggedized handheld-computer solutions ... ". I can clearly understand the benefit for manufacturers of staying with the old OS, as they won't need to rewrite drivers and so on, but what about consumers? For the same price as an Android navigation device you get a device with Windows CE that is generally half as fast and, what is more, you are locked in. You'll be able to use the software that is preinstalled, but most other winmobile apps won't run on it.

But enough of this palaver. I intended this post to be of some help to developers and users who can't run their programs on a specific Windows CE device. First off, I recommend reading this wonderful post by Lao K. It addresses the three most common reasons why a specific program fails to run on a device.

The first two reasons, or heartaches, namely "Check the Platform" and "Check the SubSystem", are straightforward. I would only add that you can alternatively use the program ExecutabilityCheck to solve the "Check the SubSystem" issue. This application can rewrite the OS version in an exe file.

The third issue, "DLL HELL", I'm going to discuss now. When an application doesn't start due to a missing DLL, or a missing function in a DLL, we need to track down what exactly is absent in this OS build. I'm going to propose three ways of doing this.

1. Use ExecutabilityCheck. It can also check for missing exports. The problem is, I found that it doesn't work on a lot of newer CE devices.

2. Use an app called TestWM5. It dumps all DLLs from the device ROM to an SD card. Then you transfer them to a desktop machine and put them in the directory with your program. Now launch Depends.exe, open your program in it and see what is missing. It is important to check your program against these downloaded-from-device DLLs, not the ones that reside on desktop Windows under the same names! By the way, the DLLs dumped from ROM are not usable as such, but they are fine for an imports check.

3. Use an app that I've written - CEExports download link . Put CEExports.exe and the file input.txt in the same directory on the device. Input.txt is the file you should edit: put all the imports from your exe there. You can use the dumpbin utility that ships with Visual Studio for this purpose, or alternatively use Depends.exe.

This is the format used in input.txt:



You can also put a function ordinal instead of a name. See the input.txt provided with the program for an example.

Run CEExports.exe and it will produce the file log.txt, listing the missing functions and/or DLLs.

If a DLL couldn't be opened, the last error code will be mentioned. This means you can also check your own DLLs for compatibility. If a DLL can't be opened for compatibility reasons the last error might be 193 (ERROR_BAD_EXE_FORMAT); if a DLL can't be found you should see something like 126 (ERROR_MOD_NOT_FOUND). When you run CEExports with the provided input.txt you should see log contents like:

10000 function ordinal is not exported from COREDLL.dll
SUPERDUPERDLL.DLL library not found
Last error code: 126
Total number of functions checked: 5
Total number of functions missing: 1

I intentionally checked for function ordinal 10000 in coredll, which isn't exported. Then I check for the presence of SUPERDUPERDLL.DLL, which can't be found. All other DLLs were successfully opened, and the 4 functions I check for are in place.

Hope you'll find these techniques for locating missing Windows CE exports useful.

Thursday, 16 August 2012

Android: No USB driver? No problem.

Got another Android tablet and can't find a USB driver for it, in order to use ADB? I'm going to show a step-by-step guide for this case, suggested by my colleague. This approach has worked on every device I have tested so far.

1. Connect the device to a computer with a USB cable.

2. On the Android device turn on the Settings->Applications->Development->USB Debugging option.

3. Open Device Manager on Windows; there you should see an Android Phone node with an "Android Composite ADB Interface" leaf. Click Properties, go to the Details tab and select "Hardware Ids" from the dropdown box. In the "Value" window you'll see something like this (that's for my Altina tablet):


Copy these values to notepad.

4. Now open ..\Android\android-sdk\extras\google\usb_driver\android_winusb.inf with Notepad. This is the standard Android USB driver from the SDK. As you can see, it covers some phones, like the HTC Dream and Google Nexus One. We are going to try to add a new device to it. Go to the [Google.NTx86] section if you are on 32-bit Windows, or to [Google.NTamd64] if you have a 64-bit OS.

5. Copy the HTC Dream strings and paste the copy just before the "; HTC Dream" comment:

%SingleAdbInterface% = USB_Install, USB\VID_0BB4&PID_0C01
%CompositeAdbInterface% = USB_Install, USB\VID_0BB4&PID_0C02&MI_01

Then replace USB\VID_0BB4&PID_0C01 and USB\VID_0BB4&PID_0C02&MI_01 with your device's strings saved in Notepad. For instance, in my case it looks like this:

%SingleAdbInterface% = USB_Install, USB\VID_18D1&PID_0003&MI_01
%CompositeAdbInterface% = USB_Install, USB\VID_18D1&PID_0003&REV_9999&MI_01

Save and close android_winusb.inf.
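Putting steps 4 and 5 together, the edited section of android_winusb.inf ends up looking roughly like this (a sketch using my device's IDs; yours will differ):

```
[Google.NTx86]
; My Altina tablet
%SingleAdbInterface%        = USB_Install, USB\VID_18D1&PID_0003&MI_01
%CompositeAdbInterface%     = USB_Install, USB\VID_18D1&PID_0003&REV_9999&MI_01
; HTC Dream
%SingleAdbInterface%        = USB_Install, USB\VID_0BB4&PID_0C01
%CompositeAdbInterface%     = USB_Install, USB\VID_0BB4&PID_0C02&MI_01
```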

6. Done. To check, go to android-sdk\platform-tools\ and type adb devices. If everything worked out you should see your device in the list of attached devices.

Tuesday, 24 July 2012

RVCT vs GCC - performance comparison on mergesort

I have heard about the RVCT compiler produced by ARM many times. Some people say it produces 30% faster code than the GCC ARM compiler. Our real-time mobile application is compiled with GCC, so I decided to give RVCT a try.

I downloaded the 30-day evaluation of the RVDS Toolchain 4.1, which includes the RVCT compiler. The GCC I use is the somewhat aged version 4.4.1 (July 22, 2009).

As the project we are working on is quite big, with heavy template usage and so on, I would need to spend some time making it compile with RVCT ("And then you discover that there are no two compilers that implement templates the same way"). For now I have decided to make small test apps to compare these compilers at first glance.

I am going to test the speed of two versions of a mergesort algorithm. The first is an iterative mergesort from here (replacing line 34 with: l_max = (r - 1 < size) ? r - 1 : size - 1;), and the second is a recursive one from there . I'm going to test both versions on the same input file with 100000 unsorted integers. I generated the content for this file with heavy use of the rand() function.
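For reference, here is a minimal recursive mergesort in the same spirit (a sketch of my own; the linked implementations differ in details, but the calling shape matches the benchmark below):

```cpp
#include <cstring>

// Merge the sorted halves [lo, mid) and [mid, hi) of a through tmp
static void Merge(int* a, int* tmp, int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    std::memcpy(a + lo, tmp + lo, (hi - lo) * sizeof(int));
}

static void MergeSortRec(int* a, int* tmp, int lo, int hi)
{
    if (hi - lo < 2) return;            // 0 or 1 elements: already sorted
    int mid = lo + (hi - lo) / 2;
    MergeSortRec(a, tmp, lo, mid);
    MergeSortRec(a, tmp, mid, hi);
    Merge(a, tmp, lo, mid, hi);
}

// Same calling shape as the benchmark: array, scratch buffer, element count
void mergeSort(int* array, int* temp, int size)
{
    MergeSortRec(array, temp, 0, size);
}
```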

The actual timing code looks like this, so I measure only the time it takes the mergesort to complete:
DWORD t1 = GetTickCount();
mergeSort(arrayOne, arrayTwo, 100000);
DWORD t2 = GetTickCount() - t1;

On my desktop machine (Core i5-2400) the code compiled with the Microsoft CL compiler with /O2 optimization took about 9-12 ms, and with optimization turned off 17-20 ms. But that's just for comparison; let's move on to GCC ARM and RVCT. I set both GCC and RVCT to the -O3 optimization level. Here are the timings for 5 consecutive runs, acquired on an LG Optimus One P500 mobile phone with a 600 MHz ARM11 processor.

Iterative mergesort, milliseconds
RVCT - 133 135 133 135 140
GCC - 149 149 143 146 145

Recursive mergesort, milliseconds
RVCT - 92 89 88 88 91
GCC - 93 98 92 101 92

Results speak for themselves. It is apparent that in the above case the implementation details of the same algorithm far outweigh the choice of compiler. Anyway, I still have 20-something days of the RVCT evaluation left, and I am going to try it on a larger project. Upgrading GCC is also an option.

Wednesday, 18 July 2012

Marmalade splash screen issue resolved (for Android)

The glorious hour's gone forth - I finally managed to display the splash screen in a Marmalade-based Android application in a proper way.

As you may know, there is a problem with splash screens in the Marmalade SDK. The problem is not new - it has been around a year since it first emerged (link to the marmalade forum). Basically the problem manifests itself (at least on Android) as a long blank-screen period after program launch. After that a splash screen is shown (either yours or the standard one), followed by a short blank-screen period, and only after that is your main() function triggered. This might be due to initialization of the GL context; in any case, it takes longer for apps with more resources.

Some people suggest showing the splash screen manually from main(), but I cannot agree with this proposal, because it is always too late to launch fireworks from main() see this . I've decided to approach this issue differently. For Android it is possible to override Marmalade's standard LoaderActivity with your own custom activity. I think you have already guessed what that means - and yes, this approach works.

Now for the details and sample code. My custom activity is based on the "examples\AndroidJNI\s3eAndroidLVL" source provided with Marmalade.

import com.ideaworks3d.marmalade.LoaderActivity;
import android.os.Bundle;
import android.view.View;
import android.widget.Button;
import android.content.Context;
import android.widget.ImageView;
import android.view.ViewGroup.LayoutParams;
import java.util.Timer;
import java.util.TimerTask;
import android.view.ViewGroup;

public class MainActivity extends LoaderActivity {
    private static MainActivity m_Activity;
    private ImageView m_ImgView;

    private final Runnable m_UiThreadStartLogo = new Runnable() {
        public void run() {
            m_ImgView = new ImageView((Context)LoaderActivity.m_Activity);
            // Set the splash image here by its resource id, e.g.:
            // m_ImgView.setImageResource(<numerical id of your drawable>);
            LoaderActivity.m_Activity.addContentView(m_ImgView,
                new LayoutParams(LayoutParams.WRAP_CONTENT, LayoutParams.WRAP_CONTENT));
        }
    };

    private final Runnable m_UiThreadStopLogo = new Runnable() {
        public void run() {
            if (m_ImgView != null) {
                ViewGroup vg = (ViewGroup)(m_ImgView.getParent());
                vg.removeView(m_ImgView);
                m_ImgView = null;
            }
        }
    };

    public class CustomTimerTask extends TimerTask {
        public void run() {
            // UI changes must happen on the UI thread
            runOnUiThread(m_UiThreadStopLogo);
        }
    }

    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        m_Activity = this;
        runOnUiThread(m_UiThreadStartLogo);
        Timer timer = new Timer();
        TimerTask updateProfile = new CustomTimerTask();
        timer.schedule(updateProfile, 4000); // fallback: hide the splash after 4 seconds
    }

    protected void onDestroy() {
        super.onDestroy();
    }
}

After building the jar file do not forget to mention it in the mkb file, like this:

In the "onCreate" method of the custom MainActivity I launch the ImageView, which is set to show the splash image. In the above case, for simplicity, I have put in a numerical id of an image that is included in my Marmalade project through the Gallery Icon setting (it doesn't have to be 170x170 if you check the "Allow non-standard icon sizes" checkbox). When the project is built for the android target this image will be located in "...\android\release\intermediate_files\res\drawable",
and you can find its numerical id in
"...\android\release\intermediate_files\src\com\proj\myproj\".

Now the splash image is shown immediately after launch. The only trouble is that it won't go away automatically, effectively covering your whole app. In the above code I hide it with a timer set to a 4 second delay. A better approach is to use the JNI interface and call stopLogo as the first instruction in your main(). Even then, I would still keep the timer as a last resort, to be sure the logo screen hides eventually.

I have used the Marmalade 6.0.5 SDK.


Q:Can you give the c++ code + java modification to call stopLogo as JNI call instead of timer?

A: In your MainActivity declare a "static MainActivity m_Activity" field and initialize it to "this" in onCreate(); also declare a "public boolean StopMyLogo()" method that actually hides the logo view. To call this method from C++ take \Marmalade\6.0\examples\AndroidJNI\s3eAndroidLVL\source\s3eAndroidLVL.cpp as a reference. You should write something like this (error checking omitted):
void StopMyLogo()
{
    if (!s3eAndroidJNIAvailable()) return;
    JavaVM* jvm = (JavaVM*)s3eAndroidJNIGetVM();
    JNIEnv* env = NULL;
    jvm->GetEnv((void**)&env, JNI_VERSION_1_6);
    jclass Activity = env->FindClass("com/android/mainactivity/MainActivity");
    jfieldID fid = env->GetStaticFieldID(Activity, "m_Activity", "Lcom/android/mainactivity/MainActivity;");
    jobject activity = env->GetStaticObjectField(Activity, fid);
    jmethodID pMethod = env->GetMethodID(Activity, "StopMyLogo", "()Z");
    env->CallBooleanMethod(activity, pMethod); // StopMyLogo() returns boolean, hence "()Z"
}

Update 2 (as of September 2013) :

If you follow Google's optimization advice for tablets and change targetSdkVersion to "14", your activity will be restarted on screen rotation, triggering the onCreate() method again! You don't want the splash screen to reappear when the user rotates the phone, so make sure you run m_UiThreadStartLogo just once. For example, you can add android:configChanges="orientation|screenSize" to the activity's settings in the manifest; this way the activity is not restarted when the screen is about to rotate.
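The manifest entry might look like this (a sketch; the activity name is from the example above, and the rest of your manifest stays as generated):

```xml
<activity android:name=".MainActivity"
          android:configChanges="orientation|screenSize">
    <!-- intent filters etc. as generated by the deploy tool -->
</activity>
```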

Thursday, 10 May 2012

Uncompressing Pkzip files with Zlib and Minizip

I had a compressed file which I needed to decompress inside a C++ program. The file had been zipped inside a Java program. "Not a big deal" - that is what I was thinking while adding zlib to the C++ project. Then I tried zlib's uncompress() function, and all I got was this lousy t-shirt with Z_DATA_ERROR. I evolved further and gave the inflate() function a chance, but to no avail. I came across this stackoverflow post and gave inflate() a second chance, this time with an inflateInit2() call. But nothing worked.

Soon I looked inside my zipped file and saw that the first two bytes were 0x50 0x4B, which correspond to the chars "PK". That got me thinking that this is the PKZIP file format: a single zip archive that may include multiple inner files (or just one, as in my case). This makes sense if we look at the Java code and notice that there could be more than one ZipEntry.

     ZipOutputStream zs = new ZipOutputStream(new FileOutputStream(file));
     zs.putNextEntry(new ZipEntry("file1.txt"));

Can we use zlib to extract pkzip files? Zlib's docs beat around the bush, but soon I realized that it simply doesn't understand the header of pkzip files in the first place. There is a wonderful additional library built on top of zlib, called Minizip, and it knows how to deal with pkzip files. The guys from here suggest using the cleaned-up version of Minizip by Sam Soff (and his friends). For the purpose of just unzipping you will need 4 files from there: unzip.c, unzip.h, ioapi.c and ioapi.h. Of course, zlib needs to be included in your project as well.

The quick method to include those Minizip files in your cpp project would be to:

1. Rename unzip.c and ioapi.c to unzip.cpp and ioapi.cpp
2. Comment out void fill_fopen_filefunc (pzlib_filefunc_def){...} from ioapi.cpp
3. #define NOUNCRYPT on top of unzip.h
4. Use like this:

#include "unzip.h"
#include <cstring>

// Extracts the first inner file of a pkzip archive
bool Unzip(const char* fileNameIn, const char* fileNameOut)
{
    unzFile hFile = unzOpen(fileNameIn);
    if (!hFile) return false;

    unz_global_info globalInfo = {0};
    if (unzGetGlobalInfo(hFile, &globalInfo) != UNZ_OK) return false;
    if (unzGoToFirstFile(hFile) != UNZ_OK) return false;
    if (unzOpenCurrentFile(hFile) != UNZ_OK) return false;

    const int SizeBuffer = 32768;
    unsigned char* Buffer = new unsigned char[SizeBuffer];
    ::memset(Buffer, 0, SizeBuffer);

    int ReadSize, Totalsize = 0;
    while ((ReadSize = unzReadCurrentFile(hFile, Buffer, SizeBuffer)) > 0)
    {
        Totalsize += ReadSize;
        //... Write Buffer to fileNameOut here
    }

    delete [] Buffer;
    unzCloseCurrentFile(hFile);
    unzClose(hFile);
    return (Totalsize > 0);
}
This is for the case of a single inner file, but it can easily be extended for pkzips with many files (use the unzGoToNextFile() function).

Sunday, 22 April 2012

Struct member alignment errors

Suspicious member variable values? Uninitialized fields? Stack corruption? <Bad ptr>? These symptoms are likely signs of structure alignment conflicts: parts of the program are compiled with different alignment settings, and then communicate and use each other as if nothing had happened.

Even if you didn't bother with alignment settings at all, some third-party modules included in your project might joyously do it on your behalf.

So, what is structure member alignment all about? A simple example:
class A
{
public:
    void CalculateVars();
    long   m_Var1;
    double m_Var2;
};
While it is easy to assume that the size of this class should be 12 (sizeof(long) + sizeof(double)), sizeof(A) yields 16, because the default struct member alignment is 8 in Visual Studio 2008. That is why the compiler adds 4 bytes of empty space after the long member variable.
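The padding is easy to verify with sizeof and offsetof (a small sketch of mine; it uses int32_t instead of long so the numbers also hold on 64-bit platforms, where long may be 8 bytes):

```cpp
#include <cstdint>
#include <cstddef>

#pragma pack(push, 8)   // the Visual Studio default
struct A8 { std::int32_t m_Var1; double m_Var2; };
#pragma pack(pop)

#pragma pack(push, 4)   // what a 4-byte-packed translation unit sees
struct A4 { std::int32_t m_Var1; double m_Var2; };
#pragma pack(pop)

// With 8-byte packing the double is padded out to offset 8 -> sizeof == 16.
// With 4-byte packing it sits right after the int at offset 4 -> sizeof == 12.
```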

So far so good. But what if different parts of the program have different member alignment values? This may lead to weird program behaviour. Consider this code:
// A.h
class A
{
public:
    void CalculateVars();
    long   m_Var1;
    double m_Var2;
};

// A.cpp
#include "A.h"
void A::CalculateVars() { int size2 = sizeof(*this); m_Var1 = 5; m_Var2 = 4.43; }

// Main.cpp
#include "stdafx.h"
#pragma pack(4)
#include "A.h"
int _tmain(int argc, _TCHAR* argv[]) { int size1 = sizeof(A); A a; a.CalculateVars(); return 0; }

Compiling and running this program will 1) make m_Var2 = 8.6792272474287212e+209 and 2) produce a "Run-Time Check Failure #2 - Stack around the variable 'a' was corrupted" message at exit. This is because inside Main.cpp I maliciously put a #pragma that changes the alignment to 4 bytes. This way, inside Main.cpp class A's size is 12 bytes, while inside CalculateVars() it is thought to be 16 bytes. As can be seen, A.cpp knows nothing about the change Main.cpp made to the alignment, as it was compiled with the default-from-the-compiler-settings 8-byte alignment.
This picture shows how the class looks in reality and in CalculateVars()' delusion.

It is obvious now that CalculateVars() writes m_Var2 at an 8-byte offset and lands right in the middle of the double field, producing an invalid value for it; what's more, as it writes 8 bytes there, the stack gets corrupted by the 4-byte overwrite.

Sometimes it is hard to debug memory-alignment issues, especially if the program doesn't tend to fail early. It just works and works, with a few invalid values waiting for an occasion to be noticed. So it would be nice if the compiler would at least throw a warning. Given the code above, the cl compiler doesn't warn. However, if I move this nasty #pragma pack into a header, it throws "warning C4103: alignment changed after including header, may be due to missing #pragma pack(pop)". This suggests never putting #pragma pack in implementation files, only in headers.

Third-party libraries with different alignment options should normally keep those changes local to themselves and restore the previous memory alignment at the end of their headers (using #pragma pack(pop)). For example, the stlport library I used did just that. Its libs were built with 8-byte alignment, and the devs wanted to ensure the libs would be used with identical alignment. Their <vector> header looked something like this:
#  pragma pack(push, 8)
#  include <stl/_vector.h>
#  pragma pack(pop)

Otherwise the alignment would change for the rest of the included headers, like in the picture below.

To conclude: when you suspect a structure alignment issue in your code, I'd advise first testing the structure size with sizeof(*this) at the problem site. Then go down the call stack to the object that called the function of the disputed class and check its size there. If the sizes don't match, it's likely an alignment issue - just as in the above code, where sizeof(A) and sizeof(*this) differed.

Also, when including someone else's headers, keep a close watch for warning C4103.

Friday, 6 April 2012

Using sox to batch convert .wav files to .raw

I needed to convert a lot of wav files to raw format (for use within the Marmalade SDK). As is often advised, I decided to use the sox command-line utility for audio manipulation. It is easy to convert a single audio file with sox, but to convert all the files in a folder at once one should write a batch file. I found batch-example.bat in the sox folder and modified it this way:

cd %~dp0
mkdir converted
FOR %%A IN (%*) DO sox "%%A" -e signed-integer -b 16 "converted/%%~nA.raw" rate 22050 channels 1

Then you drag-and-drop a group of wav files onto this .bat, and voila: the subfolder \converted will appear in the sox folder, containing all the converted files with the same names but .raw extensions.