Welcome!

Welcome to the official BlackBerry Support Community Forums.

This is your resource to discuss support topics with your peers, and learn from each other.

inside custom component

Java Development

Reply
Developer
Posts: 27
Registered: ‎04-10-2009
My Device: Not Specified

UTF8 encoding and SQLite database problems

I'm really stumped with this problem.  The situation is as follows:

 

Using BB 5.0 I am creating an application that uses SQLite.  As part of my app I bundle an existing SQLite database.   In the final product this will be zipped, but currently it is simply packaged as is.

 

Now using a normal SQLIte manager on a PC I can see the contents of the columns quite clearly.  And it is clear that the data encoding format being used is UTF8.  This is easily verified by simply editing the database file in a simple text editor.

 

So I bundle the database with my app in the resource section.  Now to actually use it I have to copy it out to the file system of the device (either sdcard or device memory).  So in short, I use and InputStream type to get bytes out of this resource and an OutputStream type to write to the device/sdcard memory.  I THINK this is doing a simple byte by byte copy (given the use of InputStream and OutputStream)

 

Now the actual SQLite access uses this newly copied file - and it appears to work just fine EXCEPT....

The UTF8 characters that are not part of ASCII display incorrect - as if in ASCII.  This is most notable with common glyphs such as ' or '' - which get encoded into UTF8 multi-byte equivalents.  So something like the text TEST' gets displayed as TEXT’.

 

So I don't know where the issue lies.  Possibilities include:

 

1.  when I copy the file from resources, some unknown translation is occuring that I just don't understand - in spite of my thinking InputStream and OutputStreams are raw bytes

2.  SQLite is not recognizing the encoding of the columns as UTF8

 

I'm pretty much up against a wall now and am runing out of things to google and code to try.

 

 

Developer
Posts: 27
Registered: ‎04-10-2009
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

Well I can save everyone some pain and grief as I've finally figured out the problem.

 

As far as I'm concerned, this is a bug in the SQLite implemenation of 5.0

 

The java type 'String' supports a number of character encodings.  The default in the BB platform is ISO-8859-1.  What happens is the SQLite library returns text fields as type 'String' via a method like 'getString()' on class 'Row'.  The problem is that the type of String is the BB platform default (i.e. ISO-8859-1).  This in SPITE of the fact, the actual SQLite database indicates that text fields are encoded as UTF8.

 

As a result when you do something like 'getString()' on a 'Row' - you then have to manually convert that string from the default encoding (ISO-8859-1) into UTF-8.

 

You do this by converting the returned 'String' to 'byte[]' - then create a new 'String' from those 'byte[]'s where you specify the encoding as "UTF-8".

 

More than I ever cared to know about the guts of character encoding/sqlite and Java String class

 

Developer
Posts: 153
Registered: ‎09-03-2009
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

HI. can you share the how to copy the sqlite database code?

I also need use own sqlite database.But i can not get it. Pls help me

my code like this:

private String dbpath = "file:///SDCard/BlackBerry/test/database/";

	private String dbname = "test.sqlite";

	public void run() {
		try {
			FileConnection fc = (FileConnection) Connector.open(
					dbpath, Connector.READ_WRITE);
			if (!fc.exists()) {
				fc.mkdir();
				fc.close();
			}
			FileConnection fc2 = (FileConnection) Connector.open(
					dbpath+dbname, Connector.READ_WRITE);
			if(!fc2.exists()){
				InputStream input=getClass().getResourceAsStream("/mallMapv2.sqlite");
				OutputStream output=fc2.openOutputStream();
				byte[] buffer = new byte[1024];
				int length;
				while ((length = input.read(buffer)) > 0) {
					output.write(buffer, 0, length);
				}
				output.write(buffer);
				output.flush();
				output.close();
				input.close();
				fc2.close();
			}
		} catch (Exception e) {

		}
	}

 

 

Developer
Posts: 1,305
Registered: ‎01-21-2009
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

Have you tried executing the statement PRAGMA ENCODING="UTF-8" before accessing TEXT columns?




Solved? click "Accept as solution". Helpful? give kudos by clicking on the star.
Developer
Posts: 153
Registered: ‎09-03-2009
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

 

private String dbpath = "file:///SDCard/BlackBerry/test/";

	private String dbname = "test.sqlite";

	public void run() {
		try {
			FileConnection fc = (FileConnection) Connector.open(
					dbpath, Connector.READ_WRITE);
			if (!fc.exists()) {
				fc.mkdir();
				fc.close();
			}
			FileConnection fc2 = (FileConnection) Connector.open(
					dbpath+dbname, Connector.READ_WRITE);
			if(!fc2.exists()){
				InputStream input=getClass().getResourceAsStream("/test.sqlite");
				OutputStream output=fc2.openOutputStream();
				byte[] buffer = new byte[1024];
				int length;
				while ((length = input.read(buffer)) > 0) {
					output.write(buffer, 0, length);
				}
				output.flush();
				output.close();
				input.close();
				fc2.close();
			}
		} catch (Exception e) {

		}
	}

 

 

Actually, my code like this.And i have test it in device. This app can copy the database file to sdcard.

But when i try to read the data. I can not get the data. 

if i just paste the database file in the sdcard. i can read the data. But can not read the data from the database which is copyed by this code

Developer
Posts: 19,636
Registered: ‎07-14-2008
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

Jus hit this problem myself.

 

Ted - re

 

PRAGMA ENCODING="UTF-8"

According to the SQLite doc, this statement is silently ignored on existing databases. 

Developer
Posts: 27
Registered: ‎04-10-2009
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

The saga continues.

 

Some devices began displaying foreign characters incorrectly.  After some digging, I discovered that this malodorous behaviour was only happening in newer versions of the OS.  Specifically I noticed that in OS5.0.0.411 my kludge to overcome the bug in SQLITE worked fine.  However on a device running OS5.0.0.681 ? was being displayed rather than the proper accented characters.

 

As it turns out RIM HAS fixed this bug...BUT WHO KNOW WHEN?  All I can say for sure is this:

 

With OS5.0.0.411 or less, then this encoding bug is present.  With anything OS5.0.0.681 or higher, the bug has been fixed - and a kludge to undo the damage caused by the SQLITE package will only corrupt your string.

 

I have no idea what to do with the devices in the vast netherland between 411 and 681

 

Jim

Highlighted
Developer
Posts: 19,636
Registered: ‎07-14-2008
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

Apologies, I should have posted this before.  I have a workaround that works in all OS levels. 

 

I can't give you the code, but here are the significant parts.

 

a) Request data in hex

 

String SQLSelect = "SELECT hex(description) " + "FROM ProductTable ';

 

b) Get the data as a String

 

String description = r.getString(0);

 

c) Convert that String - which is a sequence of hex like

"4C6974746C6520566F6D656E"

into bytes

byte [] descriptionBytes = hexStringToByteArray(description);

 

sample code, no guarantees:

    public static final byte [] hexStringToByteArray(final String hex) {
        byte [] bytes = new byte[(hex.length() / 2)];
        int j = 0;
        for ( int i=0; i<bytes.length; i++ ) {
            j = i * 2;
            String hex_pair = hex.substring(j,j+2);
            byte b = (byte) (Integer.parseInt(hex_pair, 16) & 0xFF);
            bytes [i] = b;
        }
        return bytes;
    }

 

d) Process this as a UTF8 String

 

description = fromUTF8(descriptionBytes);

 

more sample code, still no guarantees!

    static public String fromUTF8(byte [] bytesToConvert) {
        String returnString = null;
        try {
            returnString = new String(bytesToConvert,"UTF-8");
        } catch (Exception e) {
            // Will never happen, UTF-8 is supported
        }
        return returnString;
    }

 

This workaround is probably not needed in OS 6.0 and above.  But I've not tested. 

Developer
Posts: 19,636
Registered: ‎07-14-2008
My Device: Not Specified

Re: UTF8 encoding and SQLite database problems

[ Edited ]

Not looked seriously at your response yet, but your comment about storing in SQLite reminded my of these problems attach I have been involved in:

http://supportforums.blackberry.com/t5/Java-Development/UTF-8-Encoding-in-SQLite/m-p/557782

 

http://supportforums.blackberry.com/t5/Java-Development/UTF8-encoding-and-SQLite-database-problems/m...

 

I wonder if you have an SQLite encoding problem. 

 

The code at the end of the the second Thread should be useful for you to try to figure out where the problem is introduced.  Can you use it to see if the problem is the data coming out of the SQLite database with the incorrect encoding?

 

Edit:

Apologies, this post was supposed to go on a different Thread, in fact here:

http://supportforums.blackberry.com/t5/Java-Development/Character-Encoding-is-Different-on-Different...

 

Ignore this post for this Thread.....

Contributor
Posts: 11
Registered: ‎10-02-2010
My Device: Curve 8900
My Carrier: Rogers

Re: UTF8 encoding and SQLite database problems

Having also noticed this problem while testing my app on a Storm 9530 (OS 5.0.0.328), I created the following fix for it: 

 

1. Create a table with one column and a row containing a single character: "é"

 

2. When the app is launched, get the value of this column and set the value of a boolean I called m_bDBReturnsUTF8 depending on the value returned for the checked row: if the string returned for the "é" value has a length that's more than 1 character, m_bDBReturnsUTF8 is set to false as it means the value was not properly returned as a UTF-8 string (thus generating two characters looking like garbage if displayed on screen).

 

3. In my coder, everywhere I call row.getstring(x) , I've replaced the call with EnsureUTF8( row.getstring(x) ). The code for EnsureUTF8() can be found below. Depending on the value of m_bDBReturnsUTF8, that function either returns the string untouched or it re-encodes it for support on older BB devices.

 

Doing it this way, I didn't had to change my SQL queries and when the app is run on newer OS, the fix does not cause slow downs with a forced converson from hexadecimal data. Hope it helps.

 

//This function is useful on older versions of OS5 (like builds 3xx/4xx) as strings returned from the DB are not UTF-8
static public String EnsureUTF8(String sText)
{
	try
	{
		if(m_bDBReturnsUTF8)
			return sText;
		else
		{
			//Convert the string to UTF8
			byte[] aChars = sText.getBytes("ISO-8859-1");
			return new String(aChars, "UTF-8");
		}
	}
	catch(Exception e)
	{
		System.out.println("EXCEPTION: EnsureUTF8. " + e.getClass());
		return sText;
	}
}