10-05-2009 07:04 PM
You ought to be able to write a test app, or include a menu option "send test pattern",
and see whether the changes to the data make sense for some purpose, like CR-LF everywhere.
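A "send test pattern" option could be sketched roughly as follows. The class and method names are hypothetical, not taken from the app discussed in this thread. The pattern cycles through every byte value, so a substitution such as LF becoming CR-LF should show up either as a mismatch or as a change in length.

```java
// Hypothetical helper (not from the original app): builds a known byte
// pattern and reports where a received copy first differs from it.
final class TestPattern {

    // Every byte value 0x00..0xFF in order, repeated to fill `length` bytes.
    static byte[] make(int length) {
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++) {
            out[i] = (byte) (i & 0xFF);
        }
        return out;
    }

    // Index of the first differing byte, or -1 if the arrays are identical.
    // A pure length difference is reported at the length of the shorter array.
    static int firstMismatch(byte[] sent, byte[] received) {
        int n = Math.min(sent.length, received.length);
        for (int i = 0; i < n; i++) {
            if (sent[i] != received[i]) {
                return i;
            }
        }
        return sent.length == received.length ? -1 : n;
    }
}
```

The server would need to echo the pattern back, or hold its own copy of it, so that both directions can be checked separately.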
10-05-2009 07:47 PM
Very interesting reading.
Sorry, I don't think I can add anything useful here. So why am I wasting your bandwidth? Basically, just in case the following is useful. I don't think it is, so apologies in advance.
I concur with what klyubin has said and would encourage you to dump out the data sent and received to confirm the corruption. However it also doesn't surprise me that it is being corrupted. AFAIK, Direct TCP, regardless of whether it is a socket connection or not, still goes through the carrier's gateway. And who knows what the carrier will do.
I have two questions:
a) Is this specific to one carrier or all carriers? I have experience with Vodafone screwing up data; since they are related, it would not surprise me if Verizon tried something similar.
b) What port are you using for your socket connection? If you are using port 80 or 443 (and this is sometimes easier because they will go through firewalls), then the carrier may well think it is browser traffic and attempt to optimize it.
One issue for me is why this would change with an OS upgrade. But then we don't know what goes on under the covers. In Vodafone, deviceside=true suddenly meant use WAP unless there wasn't a ConnectionUID, so I guess anything is possible. For details of this, see:
But the biggest issue for me, and one which to me points at the device, is that you are seeing this corruption on data transmitted. I would not have thought anybody's gateway would play with data sent.
I've not heard of the connectionhandler=none trick that klyubin was talking about - klyubin - can you give us any more info on this?
10-06-2009 03:39 AM - edited 10-06-2009 04:00 AM
peter_strange, with Direct TCP, the device generates TCP/IP traffic which is supposed to travel via the IP tunnel between your device and the GGSN, and the GGSN is then supposed to simply emit the IP packets generated by the device to the attached Packet Data Network (PDN) (e.g., typically a private carrier network with Internet access). The PDN, at the point of connection to the Internet, can have a firewall and some form of NAT (which may modify TCP/UDP/IP headers and may fragment the packets), but beyond that you shouldn't expect the payload of your IP packets to be modified. Furthermore, same as in any ISP scenario, these firewalls and NATs should not be modifying the contents of TCP streams that traverse them (Comcast's RST case comes to mind, but even there the actual contents of TCP streams weren't tampered with since that would have violated certain laws).
In principle though modification is possible (even when unintentional as in the case of the Vodafone UK WAP gateways), and that's why I prefer not to trust anybody along the path between my device and my server, be it the mobile operator or the ISP. Some people claim that mobile networks secure the data anyway, but that's not really true: GPRS and 3G data traffic is end-to-end secured by the SIM+carrier (as the carrier knows the shared key stored in your SIM) only between the device and the SGSN, but after that it is no longer secured/encrypted, meaning that the data travel via the carrier's core network unsecured.
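One way to act on "not trusting the path" is to make each message self-checking, so the receiver can tell when something in between has altered it. The following is a minimal sketch under stated assumptions: the frame layout (length, payload, CRC-32), the class name and the size cap are all invented for illustration, and `java.util.zip.CRC32` is a Java SE class, so on the device an equivalent checksum routine would have to be substituted. A checksum only detects accidental modification; it is no substitute for TLS against deliberate tampering.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical framing helper: each message carries its own checksum.
final class CheckedFrame {
    // Arbitrary sanity limit for this sketch, not a protocol requirement.
    private static final int MAX_LENGTH = 1 << 20;

    static long crcOf(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Assumed frame layout: int length, payload bytes, long CRC-32.
    static void write(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload, 0, payload.length);
        out.writeLong(crcOf(payload));
        out.flush();
    }

    // Reads one frame and throws if the length or checksum looks wrong.
    static byte[] read(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > MAX_LENGTH) {
            throw new IOException("Implausible frame length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        long expected = in.readLong();
        if (expected != crcOf(payload)) {
            throw new IOException("Checksum mismatch in received frame");
        }
        return payload;
    }
}
```

With framing like this on both ends, the server can log exactly which outbound message arrived damaged, which also helps separate transport corruption from an application bug.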
As to the connectionhandler=none workaround, it was suggested by RIM to force Direct TCP to go via the APN specified by the user/connection URL (essentially, the issue is the same as outlined by ttahir in the thread you referenced: http://supportforums.blackberry.com/rim/board/mess
One of the biggest issues with Direct TCP going through WAP 2.0 TCP is that some WAP 2.0 gateways (e.g., Vodafone UK), when accessed in this way, break the protocol. For example, they sometimes return an XHTML-formatted error page when something goes wrong. As a result, your non-XHTML network code using Direct TCP might all of a sudden get an XHTML response, even when you are connecting via TLS/SSL (in which case at least the TLS layer notices that something is wrong).
The workaround works only for socket:// and http:// connections.
10-06-2009 04:23 AM
Just a wild guess... I guess you are using multiple threads in your app? Could it be caused by a thread race condition?
It could be that the OS updates are using a slightly different thread scheduling mechanism. Dormant threading problems, which happened to work on the previous OS, can suddenly become 'active'.
Like I said at the start, this is just a wild guess, but threading problems can give you strange symptoms. I agree with the others that you must first take the time to see if the data is really corrupted by the transport layer.
10-06-2009 05:05 AM
10-07-2009 01:25 PM
I'm trying to use stream wrappers as you suggest; however, I use the DataInputStream and make extensive use of things like readByte() and readLong() etc.
If I extend DataInputStream I can't seem to override those methods, however. I can override read(), but the debugger never trips there, so I assume that readByte() et al. do not call down to read() as I hoped they would.
I did a fair bit of google searching hoping that someone else had done the dirty work but did not see anything promising either.
I am writing some (almost) hard coding into the server side to see if I can shed more light on the issue. Just wanted to give an update to let you know the problem is still a major problem. It has persisted over a blackberry wipe as well (unlike all my personal data) just in case anyone else is interested.
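On the stream wrapper question above: in the standard `java.io.DataInputStream`, `readByte()`, `readLong()` and friends are declared `final`, and they read from the wrapped stream rather than from the subclass's own `read()`, which would explain why the breakpoint is never hit. A way around this is to put the wrapper underneath the `DataInputStream` instead of extending it. The sketch below assumes that approach; the class name is hypothetical, and it extends plain `InputStream` on the assumption that this is the most portable base class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: keeps a copy of every byte handed to the caller.
final class RecordingInputStream extends InputStream {
    private final InputStream in;
    private final ByteArrayOutputStream log = new ByteArrayOutputStream();

    RecordingInputStream(InputStream in) {
        this.in = in;
    }

    // Single-byte path, used by calls such as readByte().
    public int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            log.write(b);
        }
        return b;
    }

    // Bulk path, used by calls such as readFully() and readLong().
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            log.write(buf, off, n);
        }
        return n;
    }

    public void close() throws IOException {
        in.close();
    }

    // Copy of everything read so far, for dumping or comparing.
    byte[] recorded() {
        return log.toByteArray();
    }
}
```

Usage would then be along the lines of `new DataInputStream(new RecordingInputStream(rawInput))`, where `rawInput` stands for whatever stream the connection returns. The same idea applies to an `OutputStream` wrapper underneath a `DataOutputStream` for the send side.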
10-07-2009 01:38 PM
Sorry, Peter, I did not see your message before.
Answer a) I believe it is cross carrier but I know it is on Bell and Telus
Answer b) I can use any port but am using port 55555 at present - I doubt shifting ports will make a difference
It is definitely related to an OS upgrade. There is a version of the OS that "suddenly" started draining batteries in about 3 hours, which is forcing users to upgrade to fix that issue. That "fix" is what clobbers my app and causes this "problem" to occur.
To date to try to solve this I have bought a new development machine, migrated development from Windows to Mac, moved from EclipseME to MotoDEV and compiled on both platforms multiple times. I've even moved back to RIM's JDE and did a compile there plus tried it as a COD file upload via the desktop loader. The problem is consistent and not going away anytime soon. I was hoping it was a carrier issue and they would respond but it has persisted over 3 months now at least.
I'll repeat that the app works perfectly in the emulator no matter how much data I shovel through the connection.
I've tried using the emulators against the development environment server and a production server as well but again it all works.
Only certain (current) versions of the Blackberry app do this...
10-07-2009 01:50 PM
Very good question, and I thought about that for a while. However, I had huge threading problems when I started writing this app and took great pains to set up distinct and unique threads to do each task.
There is a SINGLE thread responsible for WRITING to the connection and another single thread for READING. There are threads for each and every other task as well, like updating the screen or the message databases. And there is a watchdog thread to keep everything running, just in case.
All threads "sleep" unless they have something specific to do. Based on what I know of the code, I am not sure how anything could affect the outbound writes: everything is placed as objects into an outbound array, so either I have a complete message or I do not, and there is a sync block around the "write object to output stream" section of the write thread. I could perhaps bump the thread priority up during that section, but I can't see that helping. Generally, I find that the less mucking you do with things, the better.
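As a rough illustration of the single-writer arrangement described above (not the actual code of the app), the sketch below has producers hand over complete messages and one thread do all the writing. All names are hypothetical. It uses a raw `Vector` with `wait()`/`notify()` on the assumption that generics and `java.util.concurrent` are not available on the device.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Vector;

// Hypothetical single writer: the only code that ever touches `out`.
final class OutboundWriter implements Runnable {
    private final Vector queue = new Vector(); // pending complete messages (byte[])
    private final OutputStream out;
    private boolean stopped = false;
    private volatile Exception failure; // set if the writer thread dies

    OutboundWriter(OutputStream out) {
        this.out = out;
    }

    // Called from any thread. The caller must not modify `message` afterwards.
    void enqueue(byte[] message) {
        synchronized (queue) {
            queue.addElement(message);
            queue.notify();
        }
    }

    // Asks the writer to finish once the queue has been drained.
    void stop() {
        synchronized (queue) {
            stopped = true;
            queue.notify();
        }
    }

    Exception failure() {
        return failure;
    }

    public void run() {
        try {
            while (true) {
                byte[] next;
                synchronized (queue) {
                    while (queue.isEmpty() && !stopped) {
                        queue.wait(); // sleep until there is work or a stop request
                    }
                    if (queue.isEmpty()) {
                        return; // stopped and nothing left to send
                    }
                    next = (byte[]) queue.elementAt(0);
                    queue.removeElementAt(0);
                }
                // Written outside the lock; safe because this is the only writer.
                out.write(next, 0, next.length);
                out.flush();
            }
        } catch (IOException e) {
            failure = e;
        } catch (InterruptedException e) {
            failure = e;
        }
    }
}
```

One thing worth checking in a design like this is whether any producer reuses or modifies a buffer or object after queuing it. That would corrupt outbound data without any two threads ever writing to the stream at the same time, and it could be sensitive to scheduling changes between OS versions.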
I highly doubt it is a thread issue; however, assuming it was, what would be your advice for debugging it? The emulator is useless because it works every time, so there is nothing to debug.
10-07-2009 02:50 PM
Eh, did you try to isolate the problem like I suggested? What data pattern(s) trigger this, what
is the net result, and how does it differ from what you meant?
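To answer those questions it helps to look at the sent and received bytes side by side in hex, around the point where they first differ. A minimal sketch, with a hypothetical class name, avoiding `String.format` on the assumption that it is unavailable on the device:

```java
// Hypothetical helper for printing a window of bytes as hex.
final class HexDump {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    // Formats up to `length` bytes starting at `offset` as "0D 0A FF".
    // The window is clipped to the bounds of the array.
    static String toHex(byte[] data, int offset, int length) {
        StringBuffer sb = new StringBuffer();
        int start = Math.max(0, offset);
        int end = Math.min(data.length, offset + length);
        for (int i = start; i < end; i++) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(DIGITS[(data[i] >> 4) & 0x0F]); // high nibble
            sb.append(DIGITS[data[i] & 0x0F]);        // low nibble
        }
        return sb.toString();
    }
}
```

Logging, say, 16 bytes either side of the first mismatch on both the device and the server should make it clear whether bytes are being inserted, dropped, or replaced.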