The first step in reversing any binary, for any purpose, is to extract whatever meaningful information is easiest to retrieve. One such source is the clear-text strings in the binary. They may disclose a lot if the programmer did not take care to remove them. The funniest cases are when a programmer wants to stay anonymous (say, a malware author) yet leaves in the binary the various bits of information put there by the compiler/linker (Microsoft Visual Studio is notorious for that), which may include his host folder structure, his operating system username, local time, etc. Just as an example, let’s look at the Hacknet https://en.wikipedia.org/wiki/Hacknet game (a cute “hacking” environment and ambience emulator for those who want to feel like a “hacker”), which is on sale at Steam right now for $3, and see what we can deduce from its binary.
I am using the HIEW hex editor, but of course any hex editor or even the plain Linux strings tool will do. But before looking at the strings, let’s have a peek at the executable file headers:
[Image: Hacknet binary analysis headers]
From it we can say:
  1. Linker version 11.0 says that the file was built with the Microsoft toolchain released as part of Visual Studio 2012. en.wikipedia.org/wiki/Microsoft_Visual_C
  2. Optional header Magic of 010B means it is a 32-bit executable; a 64-bit one would have 020B.
  3. OS version and subsystem version of 4.0 most probably mean that these numbers were set artificially for compatibility (my guess is by Steam, which bundled this binary): internal Windows version 4.0 is Windows NT 4.0, and I doubt the author wrote the software on Windows NT in Visual Studio 2012.
  4. Subsystem Console means that the software was defined from the beginning as a Windows console project in Visual Studio, that is, the executable is marked to start with a console attached rather than as a pure GUI application.
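
The header fields above can also be pulled out programmatically. Below is a minimal sketch in C, assuming the standard PE layout (e_lfanew at offset 0x3C, a 4-byte signature, a 20-byte COFF header, then the optional header). The struct pe_info and the function parse_pe_header are hypothetical names made up for this illustration; they are not part of HIEW or any other tool mentioned here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the header fields discussed above. */
struct pe_info {
    uint16_t magic;         /* 0x010B = PE32 (32-bit), 0x020B = PE32+ (64-bit) */
    uint8_t  linker_major;
    uint8_t  linker_minor;
    uint16_t os_major;
    uint16_t os_minor;
    uint16_t subsys_major;
    uint16_t subsys_minor;
    uint16_t subsystem;     /* 2 = Windows GUI, 3 = Windows console */
};

/* Read little-endian values without relying on host byte order. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer does not look like a PE file. */
int parse_pe_header(const unsigned char *buf, size_t len, struct pe_info *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;

    uint32_t pe_off = rd32(buf + 0x3C);   /* e_lfanew: offset of "PE\0\0" */

    /* Need signature (4) + COFF header (20) + optional header up to Subsystem (70). */
    if (pe_off > len || len - pe_off < 4 + 20 + 70)
        return -1;
    if (memcmp(buf + pe_off, "PE\0\0", 4) != 0)
        return -1;

    const unsigned char *opt = buf + pe_off + 4 + 20;  /* start of optional header */
    out->magic        = rd16(opt + 0);
    out->linker_major = opt[2];
    out->linker_minor = opt[3];
    out->os_major     = rd16(opt + 40);
    out->os_minor     = rd16(opt + 42);
    out->subsys_major = rd16(opt + 48);
    out->subsys_minor = rd16(opt + 50);
    out->subsystem    = rd16(opt + 68);
    return 0;
}
```

To use it, read the first few hundred bytes of the executable into a buffer, pass the buffer in, and print the fields. A Magic of 0x010B with linker version 11.0 and Subsystem 3 would match what the hex editor shows above.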

Not bad for a mere header. Now to the strings. The default minimum string length of 4 characters finds us 16162 strings, which is too many, so I will increase the minimum string length to 25 characters. Almost immediately we can see this string:
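
What a strings-like tool does with the minimum length can be sketched roughly as follows. This is a simplification that assumes only ASCII strings and treats bytes 0x20..0x7E as printable; count_strings is a hypothetical helper name, and real tools handle more cases (wider encodings, for instance).

```c
#include <assert.h>
#include <stddef.h>

/* Count runs of printable ASCII (0x20..0x7E) that are at least min_len bytes long. */
size_t count_strings(const unsigned char *buf, size_t len, size_t min_len)
{
    assert(buf != NULL || len == 0);

    if (min_len == 0)
        min_len = 1;            /* an empty run is never a string */

    size_t count = 0;
    size_t run = 0;             /* length of the current printable run */

    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 0x20 && buf[i] <= 0x7E) {
            run++;
        } else {
            if (run >= min_len)
                count++;
            run = 0;
        }
    }
    if (run >= min_len)         /* a run that reaches the end of the buffer */
        count++;

    return count;
}
```

Raising min_len from 4 to 25 simply drops every short run, which is why the number of hits shrinks so quickly.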
[Image: hacknet binary analysis xna string]
Which confirms our guess that it was written in Microsoft Visual Studio on Windows: the XNA game development platform from Microsoft says so en.wikipedia.org/wiki/Microsoft_XNA . Also, most probably the author used the FNA framework to port the game to Linux and OS X, for which it is also available. This also suggests the game was written in C#. This string indeed proves it is a Steam uploaded game:
Here we can even see the page URL for sending victory mail. By the way, malware writers sometimes use this as a trap: a URL that no one will visit except those who found it by looking inside the binary:
[Image: hacknet binary analysis email]
And the final piece of information is here:
[Image: hacknet binary analysis final]
which confirms all we guessed earlier, giving us in addition the full path to the MS Visual Studio project/debug location on the author's machine and his user name, Matt, which coincides with the author's real name, Matt Trobbiani. And of course, not mentioned in this post, there are tens of thousands of strings of in-game text and function names.
Now to the string obfuscation itself. There are a few ways to obfuscate/encrypt strings in the binary so that they are deobfuscated/decrypted at run time, just before they are actually used in the code flow. Of course you cannot really protect anything this way, which is why it is called obfuscation, but it can make the work of a reverser a bit harder, and that is it.
The first and easiest way to hide strings is to add or subtract an integer value to/from each character before compiling the file, then have a routine do the reverse subtraction/addition to get the needed string in memory, use it, then discard it or obfuscate it again. This turns the string into gibberish, but it will still look like a suspicious string. Of course it will only work with ASCII strings, whose characters are treated here as integers. The source code is in C to give an example. I mangle/obfuscate the string “secret password” via the preprocessor macro HIDE_LETTER, then de-obfuscate it using UNHIDE_STRING at run time. I plan on running a series of posts about obfuscation, so stay tuned.


#include <stdio.h>

// Fixed shift applied to every character, so the clear text never appears in the binary
#define KEY 0x50
#define HIDE_LETTER(a)      ((char)((a) + KEY))
#define UNHIDE_STRING(str)  do { char *ptr = (str); while (*ptr) *ptr++ -= KEY; } while (0)
#define HIDE_STRING(str)    do { char *ptr = (str); while (*ptr) *ptr++ += KEY; } while (0)

int main(void)
{
	// store the "secret password" as a mangled byte array in the binary
	char str1[] = { HIDE_LETTER('s'), HIDE_LETTER('e'), HIDE_LETTER('c'), HIDE_LETTER('r'), HIDE_LETTER('e'),
		HIDE_LETTER('t'), HIDE_LETTER(' '), HIDE_LETTER('p'), HIDE_LETTER('a'), HIDE_LETTER('s'),
		HIDE_LETTER('s'), HIDE_LETTER('w'), HIDE_LETTER('o'), HIDE_LETTER('r'), HIDE_LETTER('d'), '\0' };

	UNHIDE_STRING(str1);  // unmangle the string in place
	printf("Here goes the secret we hide: %s\n", str1);
	HIDE_STRING(str1);    // mangle it back so the clear text does not linger in memory

	return 0;
}
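
To see why this only slows a reverser down, here is a rough sketch of how such a shift could be brute-forced: try every possible single-byte key and keep those that turn the whole buffer into printable ASCII. The helper names unshift and candidate_keys are hypothetical, and the sketch assumes the same one-byte additive scheme as in the example above.

```c
#include <assert.h>
#include <stddef.h>

/* Undo an additive shift: out[i] = in[i] - key (mod 256). */
void unshift(const unsigned char *in, size_t len, unsigned char key, unsigned char *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++)
        out[i] = (unsigned char)(in[i] - key);
}

/* Return 1 if every byte is printable ASCII (0x20..0x7E), 0 otherwise. */
static int all_printable(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] < 0x20 || buf[i] > 0x7E)
            return 0;
    return 1;
}

/*
 * Try keys 1..255 and store every key that yields printable text in keys[]
 * (the caller provides room for 255 entries). Returns the number of
 * candidates. Buffers longer than 256 bytes are rejected; that limit is
 * only an assumption of this sketch.
 */
size_t candidate_keys(const unsigned char *in, size_t len, unsigned char *keys)
{
    unsigned char tmp[256];
    size_t found = 0;

    if (len == 0 || len > sizeof tmp)
        return 0;

    for (unsigned k = 1; k < 256; k++) {
        unshift(in, len, (unsigned char)k, tmp);
        if (all_printable(tmp, len))
            keys[found++] = (unsigned char)k;
    }
    return found;
}
```

For the mangled “secret password” bytes this leaves only a handful of candidate keys, 0x50 among them, and a quick look at the decoded candidates is enough to pick the right one.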
keywords: languages, programming, assembly-language, reversing, software-protection