EXE Haxing – Replacing Strings, The Hard Way
September 28, 2014
I was originally just going to explain things to Aroduc, but I figured I needed to make it pretty elaborate so I may as well just make it public. I had a hell of a time finding any information online on how to do this and kind of just winged it, so I figured I’d share some of that knowledge. Note that I’m pretty bad at this, so what I say may not always be right or the best way to do things!
So let’s say you’ve got a game you’re working on translating as part of a fanTL project. Due to the awesomeness of the internet and VN community, you’ve got all these tools available to localize a game, but not all the text you need to modify is in the resources those tools can handle. Some of it is in the game’s own executable file. How do you handle that? Well, it’s complicated.
So take this image from this totally secret project. Unluckily for us, the text in the dialog box we’re seeing right now is all stored in the game’s executable, so we’re going to need to get our hands dirty to change it out.
We’re going to need some tools before we can dive in. You’ll need a hex editor, preferably one that can display different text encodings. I’m using 010 Editor, which isn’t free, but I’m sure there are plenty of other good alternatives. You’re also going to need the tool Lord PE. I’m using version 1.41, and you can probably find it pretty easily on Google. And lastly, a calculator that handles hexadecimal. I’m just using the Windows calculator set to programmer mode, but I’m sure there’s better out there that I should be using.
We’re only going to focus on doing one of these strings, but the process is easily repeatable for the rest. We’re going to focus on the starting bit “入手した金” (“Obtained Gold”). Open up the executable in your hex editor, and we’ll see if we can find the string in question.
You might need to search in several different encodings to find the text you’re looking for. In 010 Editor, your searches are based on the selected character set, so you can check out the usual culprits. Japanese text is often stored in SJIS, but more modern games may use Unicode or UTF-8. In this case, we found the text in the Unicode character set (what 010 Editor calls Unicode here is two-byte UTF-16). Every character in our string takes two bytes, and so does the string end delimiter (the 00 00 after the string). If our English text was shorter than the Japanese, we could just edit it here and be done, but since we’re in Unicode, English characters are also two bytes each, so we clearly don’t have enough space. Also, we found the string with some leading values (the [s4c4e0]), which don’t appear to be displayed in game. These are probably control characters, possibly to control the text color. Either way, we’ll need to include them when we make our changes.
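If you’d rather not flip through character sets by hand, here’s a rough Python sketch of the same search. It isn’t what I actually used; `find_string` is a name I made up for illustration, and the sample buffer just stands in for the exe’s contents.

```python
def find_string(data: bytes, text: str,
                encodings=("shift_jis", "utf-16-le", "utf-8")):
    """Return {encoding: [offsets]} for every encoding in which `text` appears."""
    hits = {}
    for enc in encodings:
        try:
            needle = text.encode(enc)
        except UnicodeEncodeError:
            continue  # the text can't be represented in this encoding
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        if offsets:
            hits[enc] = offsets
    return hits


# Stand-in for the exe: 4 filler bytes, the UTF-16 string, then its 00 00 terminator.
sample = b"\x00" * 4 + "入手した金".encode("utf-16-le") + b"\x00\x00"
result = find_string(sample, "入手した金")
print(result)
```

For a real file you’d read the exe with `open(path, "rb").read()` and pass that in as `data`.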
Write down the position where the string starts in the exe; we’ll need this value later. In this case, accounting for the leading control characters, the string starts at 0x7349C.
So now that we know where the text is, how do we add English text to replace it? In order to add in English text, we’re going to need a place to put it.
Fire up Lord PE and open up the executable. You should see something like this:
This screen contains all kinds of interesting information, but most of it isn’t useful for us. What is useful is the ImageBase value. This is the base virtual memory address that the executable will be loaded into, and we’ll need this value for some calculations later. Click on the sections button on the right side.
These are the different sections that make up the executable file. These vary a bit depending on the compiler, but you’ll almost always have a .text section, which contains the executable’s compiled code, and a .data, which contains any data and memory that isn’t allocated dynamically. In our case, we also have an .rdata section, which contains read only values.
There’s also a series of columns here. There’s the VOffset and VSize, referring to the virtual offset and virtual size of the section, and ROffset and RSize, which are the raw offset and raw size of the section within the executable file. To understand this a bit more: when an executable is run, each section is loaded from its raw offset and size in the executable file into virtual memory, at the memory address equal to VOffset + ImageBase.
So what does this mean for us? Well, it tells us where our string lives. Our string is at 0x7349C, which puts it within the .rdata section, which starts at 0x6E200 and has a length of 0x13C00 (so an ending offset of 0x81E00). Since there isn’t enough room to lengthen the string in place, we’re going to need to create a new section for it to live in.
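The “which section is my offset in” check is just a range test on ROffset and RSize. Here’s a small sketch in Python; `section_for_offset` is a hypothetical helper, and the table only contains .rdata because that’s the only section whose numbers are given here.

```python
# (name, raw_offset, raw_size). Only .rdata's values appear in this post;
# add the other rows from Lord PE's section table for a real exe.
sections = [
    (".rdata", 0x6E200, 0x13C00),
]


def section_for_offset(sections, offset):
    """Return the name of the section whose raw range contains `offset`, or None."""
    for name, raw_offset, raw_size in sections:
        if raw_offset <= offset < raw_offset + raw_size:
            return name
    return None


print(section_for_offset(sections, 0x7349C))
```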
Right click on any of the entries, and you’ll get the following dropdown:
So what I’ve done is change the name to something a bit more meaningful, .rdata2, and set both the virtual size and raw size to 0x8000, more than enough space to fill with a number of different strings. I also used the same flags as the original .rdata section. The flags selected here specify the section as being read only and containing only pre-initialized data. It’s probably not too important what they’re set to, as long as the flags include the permissions of the original section. You can click the … button to get more details about the flag values.
So if you’re impatient and try to run the exe now, you’re going to see a screen that looks something like this. Windows just won’t run it. The reason is that we defined a new section in the file’s headers, but the file doesn’t actually contain any data for it yet, so when Windows tries to load the section into memory it can’t find it and figures your file is corrupt.
So let’s fix this. Open up the executable in the hex editor, and jump down to the bottom.
You can see that the end of the file is at 0xBC512. If you have a keen eye, you’ll see that this position is the ROffset + RSize of the previous last section (the .patch section) of the exe. Now that we’ve added a new section, we’ll need to fill in space for it. If you look above, we’ve set our ROffset to be 0xBC600. That means we’re going to need to fill up the space up to 0xBC600, and then add another 0x8000 bytes. If you do the math, that means you need to add 0xEE bytes to reach 0xBC600 (0xBC600 – 0xBC512), and then add another 0x8000 bytes to fill our new section, for 0x80EE bytes in total.
In 010 Editor, this can be done from the Edit->Insert/Overwrite->Insert Bytes option. Specify the start address, the size, make sure hex is selected, and make sure you’re filling in with 0’s, and click Insert.
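If you want to double check the padding math without a calculator, it looks like this in Python. `padding_needed` is just a name I picked for the sketch; the numbers are the ones from this exe.

```python
def padding_needed(file_size, section_raw_offset, section_raw_size):
    """Return (gap, total): zero bytes needed to reach the section, and in all."""
    gap = section_raw_offset - file_size
    if gap < 0:
        raise ValueError("the section's raw offset is inside the existing file")
    return gap, gap + section_raw_size


gap, total = padding_needed(0xBC512, 0xBC600, 0x8000)
print(hex(gap), hex(total))

# Appending that many zero bytes to the file contents would do the same job
# as the Insert Bytes dialog, e.g. new_data = old_data + b"\x00" * total
```

The file should end at 0xBC600 + 0x8000 = 0xC4600 once the padding is in.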
Jump to the start of our new section (ctrl-G) at 0xBC600 and we’ll fill in the new text. In our case here we need to reproduce the control characters that were there originally. It should look something like this:
Alright! Now we’re rocking. In the future, if we want to add more text, we can just stick it in after this string. You might want to leave some space between each one in case you need to make changes later, or it could be messy.
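Here’s a sketch of what the bytes for one entry look like, assuming the string is two-byte Unicode (UTF-16, little endian) like the original. `build_entry` and the 0x40-byte slot size are my own inventions for the example, and the control characters are left as an empty placeholder since you need to copy those bytes from the original string.

```python
def build_entry(prefix: bytes, text: str, slot_size: int) -> bytes:
    """Control bytes + UTF-16LE text + 00 00 terminator, zero-padded to slot_size."""
    raw = prefix + text.encode("utf-16-le") + b"\x00\x00"
    if len(raw) > slot_size:
        raise ValueError("string does not fit in its slot")
    # The zero padding is the spare room left for later edits.
    return raw + b"\x00" * (slot_size - len(raw))


control_bytes = b""  # placeholder: copy the real leading bytes from the exe
entry = build_entry(control_bytes, "Obtained Gold", 0x40)
print(entry.hex())
```

Giving every string a fixed slot like this means a later rewording won’t shove the strings after it to new addresses.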
But, even though we now have our text in our new section in the exe, we still need to get the exe to load our string instead of the original. To do this, we’ll have to edit the pointer to the original string to point to our new one. How do we do that then? With a little bit of math.
We need a few different numbers to figure this out. The original string was located at 0x7349C. We noted that the ImageBase address was 0x400000. We also know that the Virtual Offset of the .rdata section it resides in is 0x6F000, and the Raw Offset is 0x6E200. We can calculate the difference between a file position and its virtual memory address with the following formula: ImageBase + Virtual Offset – Raw Offset. In our case, we get 0x400E00. This number works for all items in the .rdata section. If we add that to our address of 0x7349C, we get 0x47429C. This will be the virtual address where our string resides when the game is running.
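The same calculation as a few lines of Python, in case you have a lot of strings to convert. `raw_to_va` is a made-up name, and I’m using 0x6F000 for the .rdata Virtual Offset since that’s the value that produces the 0x400E00 figure.

```python
def raw_to_va(raw_address, image_base, virtual_offset, raw_offset):
    """Convert a position in the exe file to its address in virtual memory."""
    return image_base + virtual_offset - raw_offset + raw_address


IMAGE_BASE = 0x400000
RDATA_VOFFSET = 0x6F000
RDATA_ROFFSET = 0x6E200

delta = IMAGE_BASE + RDATA_VOFFSET - RDATA_ROFFSET
va = raw_to_va(0x7349C, IMAGE_BASE, RDATA_VOFFSET, RDATA_ROFFSET)
print(hex(delta), hex(va))
```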
If you search for that value in the exe though, you’re going to come up short. Pointers in an executable are stored in Little Endian, which means the bytes are stored in reverse order. So let’s take our address, 0x47429C. First we need to make sure it’s a full 4 bytes, so pad our number with 0’s until we have 4 bytes (8 hex characters), giving us 0x0047429C. Then we take each of the 4 bytes, 00 47 42 9C, and reverse their order. What you should get is 9C 42 47 00. That’s what a pointer to our string will look like in the file.
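Python’s `struct` module will do the pad-and-reverse for you, which is handy for checking your work. This is only a sketch; `to_pointer_bytes` is a hypothetical helper name.

```python
import struct


def to_pointer_bytes(address: int) -> bytes:
    """Pack an address as a 4-byte little endian value, as stored in the exe."""
    return struct.pack("<I", address)


old_pointer = to_pointer_bytes(0x47429C)
print(old_pointer.hex())
```

The `<` asks for little endian and the `I` for an unsigned 4-byte integer, so the zero padding comes for free.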
We’ll do the same thing to get the virtual address of our new string. Take ImageBase + Virtual Offset – Raw Offset, but use the values for our .rdata2 section. We should get an offset of 0x403A00. This offset will be the same for everything in .rdata2. Add that to our string’s position, 0xBC600, and you get 0x4C0000. If you pad that to 4 bytes and reverse the bytes to get a little endian value, you get 00 00 4C 00. Now search the exe for the old pointer, 9C 42 47 00, and overwrite it with the new one, 00 00 4C 00.
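Swapping the old pointer for the new one can be sketched in Python too. This is not the exact process I followed, just the same idea in code: `patch_pointer` is a hypothetical helper, and the 6-byte buffer is made-up stand-in data rather than bytes from the real exe. Keep in mind that a 4-byte pattern can turn up by coincidence, so look at each hit before trusting it.

```python
import struct


def patch_pointer(data: bytearray, old_va: int, new_va: int):
    """Overwrite every little endian occurrence of old_va with new_va.

    Returns the list of file offsets that were patched.
    """
    old = struct.pack("<I", old_va)
    new = struct.pack("<I", new_va)
    patched = []
    pos = data.find(old)
    while pos != -1:
        data[pos:pos + 4] = new
        patched.append(pos)
        pos = data.find(old, pos + 4)
    return patched


# Made-up stand-in for the exe: one filler byte, the old pointer, one filler byte.
buf = bytearray(b"\x68\x9c\x42\x47\x00\x90")
hits = patch_pointer(buf, 0x47429C, 0x4C0000)
print(hits, buf.hex())
```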
It works! We’ve successfully added a longer-than-original string into the executable. Now we just need to do this for all the strings we’re adding that are longer than their original counterparts, and we’re good! Obviously you don’t need to mess around with adding a section again for the other strings; you just need to repeat the rest of the process for each of them.