Extracting WhatsApp message logs from a WhatsApp database

Just a small update.

After returning from a long trip abroad, during which I used WhatsApp heavily, I had lots of conversations I wanted to export to a message log and keep for my older days. (“Wow, I can’t believe she wrote me THAT! ;)”)

These chat logs included lots of media (images, audio clips and videos), which I obviously also wanted to keep. I used an Android phone, so I looked for an existing solution in the Google Play store. I found an application called WhatsApp to Text, which is quite nice but does not offer the option of exporting the media. I found no other solution in the Play store.

I then looked online for another solution. The team of D. Cortjens, A. Spruyt and W.F.C. Wieringa from the University of Amsterdam published the results of a research project titled WhatsApp Database Encryption. Based on the results of this project, a Python script called WhatsApp Xtract was written to generate WhatsApp message logs, this time with all media intact. Great.

Only thing was, some features weren’t working correctly. Media wasn’t automatically detected in some cases, the generated log files were humongous, the names of contacts were sometimes not displayed, and there was no way to see all media files (like in the actual WhatsApp software). It was also supposed to repair corrupt databases (and salvage what it could from them) but didn’t really. Generally, it didn’t satisfy my requirements.

So, I updated it a bit:

v2.5 (updated by Alon Diamant – Mar 14, 2013)
– Improved encrypted Android database detection and decryption code
– Can now repair malformed Android databases (depends on availability of sqlite3 executable)

v2.4 (updated by Alon Diamant – Mar 06, 2013)
– Generates media index file – but crappily, we should set this up better..

v2.3 (updated by Alon Diamant – Mar 05, 2013)
– now generates separate file for each contact
– fixed file search to also look for image files in the days prior to the given date (fixes a bug where, because of timezone differences, the image file exists but is not found)
– fixed message counts for contacts
– does not list contacts with 0 messages
– now writes version number in generated files
– (Android Version) displays WhatsApp name (server based) if no display name is found for a contact
– (Android Version) Supports Python 2.6
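
The timezone fix in v2.3 is easier to picture with a small example. What follows is only a minimal sketch of the idea, not the code from WhatsApp Xtract: it assumes media files carry a date stamp in their name in the style IMG-YYYYMMDD-WA0001.jpg, and the helper names (candidate_prefixes, find_media) and the days_back parameter are made up for illustration.

```python
from datetime import date, timedelta


def candidate_prefixes(message_date, days_back=1, kind="IMG"):
    """Return file-name prefixes for the message date and the days before it.

    Assumes names look like IMG-YYYYMMDD-WA0001.jpg (an assumption of this
    sketch). The message date comes first, then earlier days in order.
    """
    prefixes = []
    for offset in range(days_back + 1):
        day = message_date - timedelta(days=offset)
        prefixes.append(kind + "-" + day.strftime("%Y%m%d") + "-")
    return prefixes


def find_media(filenames, message_date, days_back=1, kind="IMG"):
    """Return file names matching the message date or up to days_back days earlier.

    Matches for the exact date are listed first, so a caller can prefer them
    and only fall back to earlier days when nothing else is found.
    """
    matches = []
    for prefix in candidate_prefixes(message_date, days_back, kind):
        matches.extend(sorted(n for n in filenames if n.startswith(prefix)))
    return matches
```

With days_back=0 this behaves like a strict same-day search, which is where a file stamped one day earlier (because the phone and the database disagree on the timezone) would be missed.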

It is still mediocre, and I am not happy with the way it works nor the way it is coded, but it’ll serve for now. I do feel like coding an Android application to do this properly on the phone, with well-formatted output files that include all media. We’ll see.

For now, enjoy. You can also find updates in the project repository.

Memcached patch: Item expiration signal to client and extension

I work as a developer at Metacafe. We use memcached as our caching system, and have been using it for a few years now with great results. A while ago, we thought of a neat little concept that made working with memcached much more convenient for us. We patched the memcached code in-house, and have been running the patched code for a while now. We are now offering the concept (and the source code) to the project.

The idea is as follows: currently, whenever an item which has expired is requested from a memcached server, the server immediately unlinks (deletes) the item internally and returns an empty (or null) item to the client. We propose, instead, returning an empty item to that client and extending the expiration time of the item by X seconds (we use 60 seconds), so that the expired item is returned to all other clients who ask for it in the next X seconds. Once those X seconds have passed, the behavior repeats: the next client to ask gets an empty item, the expiration is extended again, and so forth. This behavior should be controlled by a command-line argument, and is off by default.

The benefit is that the client which receives the empty item can refresh the item however it deems fit. There is no race condition and no database “stampede”. The client can access a database and create a new item, storing it on the server, knowing it is not racing another client. (Of course, if the server has just been launched or if the item is brand new, there might be a race, but this idea is not an attempt to fix that different issue.)
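
To make the behavior concrete, here is a toy in-memory model of it in Python. This is not the patch itself (the real change lives in the memcached server code) and it does not use any memcached API: the GraceCache and FakeClock classes and the grace_seconds parameter are names invented for this sketch, with grace_seconds standing in for the X above and 0 standing in for "feature switched off".

```python
import time


class FakeClock:
    """Manually advanced clock, so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


class GraceCache:
    """Toy model of the proposed expiry behavior (illustration only)."""

    def __init__(self, grace_seconds=60, clock=time.monotonic):
        self.grace_seconds = grace_seconds  # the "X" from the text; 0 disables
        self.clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._items[key] = (value, self.clock() + ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        now = self.clock()
        if now < expires_at:
            # Still fresh, or inside a grace window opened by an earlier miss.
            return value
        if self.grace_seconds <= 0:
            # Stock behavior: unlink the expired item and report a miss.
            del self._items[key]
            return None
        # Proposed behavior: keep the stale value for another X seconds and
        # signal a miss to this one caller, who is expected to refresh it.
        self._items[key] = (value, now + self.grace_seconds)
        return None
```

In this model, the first get() after expiry returns None while every other get() in the following grace_seconds returns the stale value, so only one caller goes off to rebuild the item. When that caller stores the fresh item with set(), the stale entry is simply overwritten.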

More information and the patch can be found here