This is your last chance. After this, there is no turning back. You take the blue pill – the story ends, you wake up in your bed and believe whatever you want to believe. You take the red pill – you stay in Wonderland and I show you how deep the rabbit hole goes.
– Morpheus, The Matrix
Alright. So, this guide is meant for people who want to learn how to approach game hacking. You won’t be able to hack Valorant after reading this article, so if that’s your expectation, I’m sorry but I won’t deliver on that. I will not cover hacking of multiplayer games in this article, but I do think this guide is worth a read regardless. I want to focus on the general principles behind game hacking/modding, and to teach you to think for yourself so that, one day, you can do whatever you want. But stay away from Valorant because I need my RR.
Introduction
To be able to hack something, or modify something as you see fit, you need to have a thorough understanding of how the system works. The system can be anything: a game, a person, industrial control systems, etc. Since we’re interested in game hacking, that means we need to understand how our target game works. How do we gain that knowledge? Well, we need to learn how to reverse engineer software. That will be the whole premise of this article.
I assume that you have a basic understanding of computers. If you don’t know what memory is, and if you have no programming experience whatsoever, this article will probably go over your head. But you can try to follow along anyway.
I will start with the basics, so if you feel you know the basics already, feel free to skip further ahead in the guide. And with that, let’s get started.
Cheat Engine
Don’t click off just yet. Cheat Engine is a lot more powerful than you’d think, and we’re not stopping there, so don’t worry. But if you’re completely new to game hacking, this is a great place to start. Cheat Engine is obviously known for its memory features, but there’s so much more you can do.
We’ll start with how to edit memory, using Sonic Adventure as our game of choice. This is the “hello world” of memory editing: we’re going to learn how to search through the memory of the game for values that are of interest to us, so that we can change them. And what better value to start off with than the ring counter in Sonic Adventure?
We want to give ourselves an infinite amount of rings. Let’s open up Cheat Engine, and select the Sonic Adventure DX process.
This will be very straightforward. We’re going to search for the current amount of rings in our possession (3), then change the amount of rings in the game, and search again for the new value. We’ll keep doing this until only a few memory addresses remain.
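To make the scan-and-narrow loop concrete, here’s a rough sketch in C++ of what a value scanner does under the hood. This is not Cheat Engine’s actual implementation: it works on a plain byte buffer standing in for the game’s memory, the names firstScan and nextScan are my own, and I’m assuming we search for 4-byte values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Read a 32-bit value at a byte offset without assuming alignment.
static std::int32_t readInt32(const std::vector<std::uint8_t>& memory, std::size_t offset) {
    std::int32_t value = 0;
    std::memcpy(&value, memory.data() + offset, sizeof(value));
    return value;
}

// First scan: collect every offset whose 4 bytes currently equal `value`.
std::vector<std::size_t> firstScan(const std::vector<std::uint8_t>& memory, std::int32_t value) {
    std::vector<std::size_t> hits;
    for (std::size_t off = 0; off + sizeof(std::int32_t) <= memory.size(); ++off) {
        if (readInt32(memory, off) == value) hits.push_back(off);
    }
    return hits;
}

// Next scan: keep only the earlier candidates that now hold the new value.
std::vector<std::size_t> nextScan(const std::vector<std::uint8_t>& memory,
                                  const std::vector<std::size_t>& candidates,
                                  std::int32_t value) {
    std::vector<std::size_t> hits;
    for (std::size_t off : candidates) {
        assert(off + sizeof(std::int32_t) <= memory.size());
        if (readInt32(memory, off) == value) hits.push_back(off);
    }
    return hits;
}
```

The first scan usually returns a lot of hits, because small numbers like 3 are everywhere in memory. Each follow-up scan after the value changes throws away the coincidences, which is why the list shrinks so quickly.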
Well, that was easy… Let’s add one of the addresses to our address list, and let’s try changing the value by double clicking it, and typing in a random number.
Okay, so this was really easy. Are you ready to move on to something harder? Same here.
Our next goal will be the following: we’ll locate the code in the game responsible for increasing our ring count when we’re touching the rings, and we’ll change that code to add an arbitrary amount of rings instead of just one at a time.
Unfortunately, we’re going to have to learn a bit of theory before we can just jump straight into it.
The Basics of Reverse Engineering
We’re going to have to learn what a program looks like at a lower level. More specifically, we need to learn how to read assembly code, and we need to learn about the stack, and CPU registers. This section of the guide right here will probably be the most frustrating for you if you don’t have any prior knowledge. I encourage you to stick with it because once you get over this hurdle, you’ll find yourself with the ability to do a lot more than before.
Basics of Assembly Code
You might’ve heard about assembly code. Assembly code is pretty much just the mnemonic representation of machine code. So, assembly code is simply a bunch of CPU instructions that, when executed in some sequence, will perform some task. When we’re reverse engineering some piece of software, we’re most likely going to be working at this lower level most of the time.
Here’s an example C++ program.
#include <iostream>
int increaseRings(int rings)
{
rings += 1;
return rings;
}
int main()
{
int rings = 0;
std::cout << increaseRings(rings) << std::endl;
}
Here's what the assembly code of this program would look like.
#include <iostream>
int increaseRings(int rings)
{
00007FF6AACA18B0 mov dword ptr [rsp+8],ecx
00007FF6AACA18B4 push rbp
00007FF6AACA18B5 push rdi
00007FF6AACA18B6 sub rsp,0E8h
00007FF6AACA18BD lea rbp,[rsp+20h]
00007FF6AACA18C2 lea rcx,[__09178418_ConsoleApplication1@cpp (07FF6AACB2067h)]
00007FF6AACA18C9 call __CheckForDebuggerJustMyCode (07FF6AACA1370h)
rings += 1;
00007FF6AACA18CE mov eax,dword ptr [rings]
00007FF6AACA18D4 inc eax
00007FF6AACA18D6 mov dword ptr [rings],eax
return rings;
00007FF6AACA18DC mov eax,dword ptr [rings]
}
00007FF6AACA18E2 lea rsp,[rbp+0C8h]
00007FF6AACA18E9 pop rdi
00007FF6AACA18EA pop rbp
00007FF6AACA18EB ret
int main()
{
00007FF6AACA1910 push rbp
00007FF6AACA1912 push rdi
00007FF6AACA1913 sub rsp,108h
00007FF6AACA191A lea rbp,[rsp+20h]
00007FF6AACA191F lea rcx,[__09178418_ConsoleApplication1@cpp (07FF6AACB2067h)]
00007FF6AACA1926 call __CheckForDebuggerJustMyCode (07FF6AACA1370h)
int rings = 0;
00007FF6AACA192B mov dword ptr [rings],0
std::cout << increaseRings(rings) << std::endl;
00007FF6AACA1932 mov ecx,dword ptr [rings]
00007FF6AACA1935 call increaseRings (07FF6AACA105Fh)
00007FF6AACA193A mov edx,eax
00007FF6AACA193C mov rcx,qword ptr [__imp_std::cout (07FF6AACB0170h)]
00007FF6AACA1943 call qword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (07FF6AACB0158h)]
00007FF6AACA1949 lea rdx,[std::endl<char,std::char_traits<char> > (07FF6AACA1037h)]
00007FF6AACA1950 mov rcx,rax
00007FF6AACA1953 call qword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (07FF6AACB0150h)]
}
00007FF6AACA1959 xor eax,eax
00007FF6AACA195B lea rsp,[rbp+0E8h]
00007FF6AACA1962 pop rdi
00007FF6AACA1963 pop rbp
00007FF6AACA1964 ret
Now, this will probably look like gibberish to you. A lot of this code is actually not relevant because some of the code pertains to debugging-related things. Let’s take a look at the code in the increaseRings function first, more specifically these four lines of code:
00007FF6AACA18CE mov eax,dword ptr [rings]
00007FF6AACA18D4 inc eax
00007FF6AACA18D6 mov dword ptr [rings],eax
return rings;
00007FF6AACA18DC mov eax,dword ptr [rings]
First of all, what’s this eax stuff? EAX is the name of one of the CPU registers in the x86 architecture. A CPU register is a small piece of very fast storage inside the processor itself. Registers can be accessed a lot faster than RAM, so you’ll find them being used as working storage for most small operations. They are not to be confused with the CPU cache; those are two different things.
In the code above, you’ll see the instruction: mov eax, dword ptr [rings]. Here, “rings” is the debugger’s name for the memory location of our rings variable, and dword ptr means we’re reading a 32-bit value from it. So this instruction copies our ring count from memory into the eax register. In the next instruction:
00007FF6AACA18D4 inc eax
This just increments the value stored in EAX by one.
00007FF6AACA18D6 mov dword ptr [rings],eax
This instruction moves what’s stored in eax (our updated ring counter number) back into the rings value in memory.
00007FF6AACA18DC mov eax,dword ptr [rings]
The function stores the return value in the eax register. If you’re following along, you might notice that this last line of code seems redundant since eax already has the updated ring value in it. That’s because this is an unoptimized debug build (hence the __CheckForDebuggerJustMyCode calls): the compiler translates each source statement on its own, so “return rings;” reloads the value into eax even though it’s already there. An optimizing build would typically drop the extra instruction.
Hopefully, this brief explanation of assembly code was easy to understand. There are quite a few assembly instructions, and explaining every single one would take too much time, so check out this resource for more detailed information on assembly code. We’re going to be learning as we go because that’s the most fun, at least in my opinion.
Hacking The Ring Function
I remember reading a blog post about reverse engineering a few years ago. I can’t remember exactly where I read it, but there was one thing from that blog post that stood out to me. The author talked about how, when you’re reverse engineering something, you need to find the “edges” of an application. I’m paraphrasing, but I really liked this way of explaining it. When you’re reversing something, you’re not going through every line of code one by one to understand the entirety of some software. Instead, you have to locate the code relevant to whatever your goals are, and then understand what that code does.
In our case, we want to locate the code responsible for updating our ring counter whenever we grab a new ring. Depending on what your goals are, you might use various methods for getting to the relevant code. In this case, it’s actually super simple to get to the relevant code because we already have the ring counter variable available for us, so we can use a cool Cheat Engine feature. Let me show you how it works.
Right now, we have the memory address of our ring counter variable. We tried changing it to 69, and it worked. And now, we’re going to find out what code writes to this memory address. What code is writing to our ring counter variable? We do this by right-clicking the memory address in our address list, and clicking: “find out what writes to this address”.
As you can tell from the video, a little window pops up, and you can already see one entry in there. That’s probably not the one that we want because we haven’t touched any rings yet. I grabbed some rings, and another entry popped up. That one looks a lot more promising. Let’s go to that code real quick.
As you can see, there’s quite a bit to go through here. The line of code that’s highlighted is the code that updates our ring variable. Here’s some of the code:
push ebp
mov ebp,esp
sub esp,0C
mov ax,["Sonic Adventure DX.exe"+573EBB8]
mov [ebp-0C],ax
movsx ecx,word ptr [ebp+08]
movsx edx,word ptr ["Sonic Adventure DX.exe"+573EBB8]
add edx,ecx
mov ["Sonic Adventure DX.exe"+573EBB8],dx
The last line is the code that updates our ring variable. Looking beyond that is probably not necessary, so let’s have a look at what happens before our ring counter variable is updated. The first thing I notice is an “add” instruction. This is interesting to me.
add edx, ecx
This line of code adds whatever’s stored in ecx to edx. I suspect that this code is responsible for adding 1 to our ring counter. Let’s place a breakpoint on that add instruction. Breakpoints are useful because the entire application halts when the breakpoint is hit, and you can look at the state of the memory, the CPU registers, and everything else for as long as you’d like, and you can step through every line of code one by one to see what happens.
Do you see how the entire game froze when we hit the ring? That’s because our breakpoint was hit! Here’s what the CPU registers look like at that moment in time:
In the add instruction, we see that whatever’s stored in ecx is added to edx. When we look at the registers, we can see that the value of ECX is 1, and the value of EDX is 3. Our current ring count is 3!
This tells me that if we were to change this instruction so that a number of our choosing is added to EDX instead of what’s in ECX, we could add any amount that we want, instead of incrementing the ring count by 1 each time.
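If assembly is still hard to read, here’s roughly what that routine looks like when I translate it back into C++ by hand. This is my own reconstruction from the instructions above, not the game’s source code: the names g_ringCount, addRings and addRingsPatched are made up, and the global is just a stand-in for the counter at "Sonic Adventure DX.exe"+573EBB8.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the 16-bit ring counter the game keeps in memory.
static std::int16_t g_ringCount = 0;

// Mirrors: movsx ecx,[ebp+08] / movsx edx,[counter] / add edx,ecx / mov [counter],dx
void addRings(std::int16_t amount) {
    std::int32_t ecx = amount;       // movsx: sign-extend the argument
    std::int32_t edx = g_ringCount;  // movsx: sign-extend the current count
    edx += ecx;                      // add edx,ecx
    g_ringCount = static_cast<std::int16_t>(edx);  // only the low 16 bits (dx) are stored
}

// The same routine after `add edx,ecx` is replaced with `add edx,<constant>`.
void addRingsPatched(std::int16_t /*amount*/, std::int32_t constant) {
    std::int32_t edx = g_ringCount;
    edx += constant;                 // the argument that was in ecx is now ignored
    g_ringCount = static_cast<std::int16_t>(edx);
}
```

Notice that the counter is only 16 bits wide (the code uses word-sized reads and writes), so adding a big constant over and over will eventually wrap around.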
Modifying the game’s code
Okay, so how do we do this? Well, we can accomplish this by doing something called Code Injection. Thankfully, Cheat Engine has made this really easy for us. Let’s highlight the add instruction, and then let’s go to Tools -> Auto Assemble.
This window will pop up. Click on Template -> Full Injection.
Click OK, and some code will pop up.
define(address,"Sonic Adventure DX.exe"+154BDB)
define(bytes,03 D1 66 89 15 B8 EB B3 05)
[ENABLE]
assert(address,bytes)
alloc(newmem,$1000)
label(code)
label(return)
newmem:
code:
add edx,ecx
mov ["Sonic Adventure DX.exe"+573EBB8],dx
jmp return
address:
jmp newmem
nop 4
return:
[DISABLE]
address:
db bytes
// add edx,ecx
// mov ["Sonic Adventure DX.exe"+573EBB8],dx
dealloc(newmem)
{
// ORIGINAL CODE - INJECTION POINT: Sonic Adventure DX.exe+154BDB
Sonic Adventure DX.exe+154BBD: 5D - pop ebp
Sonic Adventure DX.exe+154BBE: C3 - ret
Sonic Adventure DX.exe+154BBF: CC - int 3
Sonic Adventure DX.exe+154BC0: 55 - push ebp
Sonic Adventure DX.exe+154BC1: 8B EC - mov ebp,esp
Sonic Adventure DX.exe+154BC3: 83 EC 0C - sub esp,0C
Sonic Adventure DX.exe+154BC6: 66 A1 B8 EB B3 05 - mov ax,["Sonic Adventure DX.exe"+573EBB8]
Sonic Adventure DX.exe+154BCC: 66 89 45 F4 - mov [ebp-0C],ax
Sonic Adventure DX.exe+154BD0: 0F BF 4D 08 - movsx ecx,word ptr [ebp+08]
Sonic Adventure DX.exe+154BD4: 0F BF 15 B8 EB B3 05 - movsx edx,word ptr ["Sonic Adventure DX.exe"+573EBB8]
// ---------- INJECTING HERE ----------
Sonic Adventure DX.exe+154BDB: 03 D1 - add edx,ecx
// ---------- DONE INJECTING ----------
Sonic Adventure DX.exe+154BDD: 66 89 15 B8 EB B3 05 - mov ["Sonic Adventure DX.exe"+573EBB8],dx
Sonic Adventure DX.exe+154BE4: 0F BF 45 F4 - movsx eax,word ptr [ebp-0C]
Sonic Adventure DX.exe+154BE8: 99 - cdq
Sonic Adventure DX.exe+154BE9: B9 64 00 00 00 - mov ecx,00000064
Sonic Adventure DX.exe+154BEE: F7 F9 - idiv ecx
Sonic Adventure DX.exe+154BF0: 89 45 FC - mov [ebp-04],eax
Sonic Adventure DX.exe+154BF3: 0F BF 05 B8 EB B3 05 - movsx eax,word ptr ["Sonic Adventure DX.exe"+573EBB8]
Sonic Adventure DX.exe+154BFA: 99 - cdq
Sonic Adventure DX.exe+154BFB: B9 64 00 00 00 - mov ecx,00000064
Sonic Adventure DX.exe+154C00: F7 F9 - idiv ecx
}
All you need to worry about is what comes directly after the code: label.
code:
add edx,ecx
mov ["Sonic Adventure DX.exe"+573EBB8],dx
jmp return
Right now, the code is just equal to the original code. But we can now change the add instruction to whatever we want, and we can add code as we see fit. Let’s change the add instruction to the following:
add edx, 45
Click execute at the bottom of the window, and let’s see if it works!
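Wow. We now get 69 rings for every ring we pick up. Cheat Engine’s auto assembler reads numbers as hexadecimal by default, so the 45 we typed is 0x45, which is 69 in decimal. I’d say we accomplished something here.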
I don’t expect you to understand everything we’ve done so far in the guide right away, but just keep at it.
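One part of the script that’s worth a closer look is the “jmp newmem” followed by “nop 4”. The original two instructions take up 9 bytes (the ones listed in the bytes define), and a near jump in 32-bit code takes 5, so the remaining 4 bytes need to be padded. Here’s a small sketch of how those 9 bytes can be built. The function name buildHookBytes is my own, and I’m padding with single-byte NOPs (0x90); Cheat Engine may well pick a different NOP encoding.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Build the 9 bytes that overwrite the injection point:
// a 5-byte near jump (E9 + rel32) followed by 4 single-byte NOPs (90).
std::array<std::uint8_t, 9> buildHookBytes(std::uint32_t patchAddress,
                                           std::uint32_t targetAddress) {
    // rel32 is measured from the end of the 5-byte jmp instruction.
    const std::uint32_t rel = targetAddress - (patchAddress + 5);
    std::array<std::uint8_t, 9> bytes{};
    bytes[0] = 0xE9;
    for (int i = 0; i < 4; ++i) {
        bytes[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));  // little-endian
    }
    for (int i = 5; i < 9; ++i) {
        bytes[i] = 0x90;  // padding so the next original instruction stays aligned
    }
    return bytes;
}
```

The padding matters: without it, the CPU would return into the leftover bytes of the old mov instruction and try to decode them as something else entirely.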
Advanced Section: Reverse Engineering a Save File
Alright. I think it’s time to get our hands dirty. Let’s tackle something a bit more challenging. Let’s move on to another game: Sonic Adventure 2. We’re going to reverse engineer the checksum for the chao save file. I know for a fact that there is a checksum because I have tried editing the chao save file before, and this is what comes up…
If we can figure out how the checksum is calculated, we can write our own program that adds a checksum after we have modified the save file. That’s what we’re going to be doing.
Before we get started reversing, let’s take a quick look at the chao save file.
There’s a SAVEDATA folder in the game’s directory, and you’re met with a few different files. Let’s try and open up some of these files in a hex editor to see if we see something interesting.
That’s interesting. In the SONIC2B_ALF file, the first few bytes seem to be the signature for the save file: “SONIC ADVENTURE 2 BATTLE CHAO GARDEN DATA”. Something tells me this is the correct file…
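If you’d rather check this from code than eyeball it in a hex editor, a signature test could look something like the sketch below. The helper name hasSignatureNearStart and the idea of searching a small window at the start of the file are my own assumptions; I’m not claiming the game does it this way, or that the signature sits at a particular offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Returns true if `signature` occurs within the first `window` bytes of `data`.
bool hasSignatureNearStart(const std::vector<std::uint8_t>& data,
                           std::string_view signature,
                           std::size_t window) {
    const std::size_t limit = std::min(window, data.size());
    if (signature.empty() || signature.size() > limit) return false;
    const auto first = data.begin();
    const auto last = data.begin() + static_cast<std::ptrdiff_t>(limit);
    // Compare raw bytes against the characters of the signature.
    return std::search(first, last, signature.begin(), signature.end(),
                       [](std::uint8_t byte, char ch) {
                           return byte == static_cast<std::uint8_t>(ch);
                       }) != last;
}
```

You would load the file into a byte vector first and pass the signature string from the hex editor as the second argument.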
Now comes the hard part. We need to apply our philosophy of finding the “edges” of an application like I talked about earlier. We need to find a way to get to the code that writes a checksum to the save file. If we can find that code, we can take apart the code and figure out what it does.
We’ll be using two new tools this time around: x64dbg and Ghidra. x64dbg is a debugger we can use to analyze an executable dynamically. We could use Cheat Engine for this too, but I prefer using x64dbg when I’m doing primarily dynamic analysis. Cheat Engine is great for memory-related reversing, but I find x64dbg’s UI to be better as a debugger.
Ghidra is great for static analysis. We’re not running an executable, but instead, we’re looking at just the code of the executable. Ghidra also features a decompiler which tries to convert assembly code to C code to the best of its ability. It’s not perfect, but it’s a really nice tool to have to make sense of things. I use both of these tools at the same time when I’m reversing, usually.
Okay, so how do we get to the relevant code? Well, we know that the game has to get a file handle for our chao save file in order to write a checksum to it upon saving. This means that if we can set a breakpoint on the CreateFile function that returns a file handle, we should be able to step through the code, and eventually get to the code that deals with the checksum. Let me show you how to do this.
Let’s open up x64dbg, and attach the debugger to our Sonic game.
If we press ALT + E, we jump to the symbols tab. Here we can look at the different modules used by the executable, along with the functions that are inside each module.
Modules & imports
Sometimes, your program needs to use code in other external libraries, usually in the form of DLLs. DLLs, or dynamic-link libraries, are files that contain code and data that can be shared by multiple programs. They allow programs to use common functionality without having to duplicate code or waste memory. You want to deal with GUI stuff? You’ll be dealing with user32.dll. You want to read the contents of a file? Kernel32.dll. You get my point.
The useful thing to know here is that we can look at the DLLs that are loaded by the executable, and the functions that they have. If we take a look at the kernel32.dll module by highlighting it in x64dbg, you’ll see that CreateFileA is one of the functions that it exports. If it exports a function, that means it’s available for the program to use. If we take a look at the sonic2app.exe module, we’ll see some imports. These imports are functions that the game already uses that are a part of other libraries.
This is useful to know because most of the things that we are interested in reverse engineering can be traced back to some library call, like CreateFile, for instance.
Let’s go into the Sonic2app.exe module, and look through its imports. We can search for “CreateFile”. This is what comes up.
If we highlight one of these functions – let’s pick the first one – and press F2, it’ll set a breakpoint on that function.
Let’s enter the chao garden and step through the program when the breakpoint is hit to see how the game handles the save file. The first thing you’ll see as you enter the chao garden is that the program is creating file handles for many files, not just the save file. In the window under the registers, you can see the function arguments being passed into the CreateFile function. We’re only interested in inspecting the game’s behaviour when the first argument is the path to our save file, not some other file.
The Call stack
We’re now going to take a look at something called the call stack. You see, when a function is called, something called a stack frame is created on the stack. A stack frame holds the return address and, depending on the calling convention, the function arguments, saved registers, and local variables. When a breakpoint is hit, we can inspect the call stack and see a chain of functions that eventually lead to the current function call – in our case, the CreateFile function. The call stack can also be useful for understanding the logic and the hierarchy of function calls at certain places in the code.
As we go further down the call stack, we go further back up the chain of callers. We know that the program will eventually return to the code that called the CreateFileA function, and we’re obviously interested in what’s going to happen after the file has been loaded, not necessarily during or before the loading. Our goal is to go far enough back to where we can see the code that deals with the actual contents of the save file. Let’s go one step back in the call stack to the second entry in the list with the address: 07BBA0C.
Combining static and dynamic analysis using ghidra & x64dbg
What I’ll do now is load up Ghidra, and add our game executable to Ghidra so that we can analyze the code and take advantage of its decompiler. After it’s loaded up, I like to jump to specific places of interest (using the memory addresses that we can see from the debugger), and look at the decompiled code. It’s not perfect of course, but it can make things a lot easier. So, let’s do that with the address 07BBA0C by pressing G inside Ghidra.
/* WARNING: Function: __SEH_prolog4 replaced with injection: SEH_prolog4 */
/* WARNING: Function: __SEH_epilog4 replaced with injection: EH_epilog3 */
/* Library Function - Single Match
__sopen_helper
Library: Visual Studio 2008 Release */
errno_t __cdecl
__sopen_helper(char *_Filename,int _OFlag,int _ShFlag,int _PMode,int *_PFileHandle,int _BSecure)
{
int *piVar1;
errno_t eVar2;
undefined4 local_20 [5];
undefined4 uStack12;
undefined *local_8;
local_8 = &DAT_00910660;
uStack12 = 0x7bb9a1;
local_20[0] = 0;
if (((_PFileHandle == (int *)0x0) || (*_PFileHandle = -1, _Filename == (char *)0x0)) ||
((_BSecure != 0 && ((_PMode & 0xfffffe7fU) != 0)))) {
piVar1 = __errno();
eVar2 = 0x16;
*piVar1 = 0x16;
__invalid_parameter(0,0,0,0,0);
}
else {
local_8 = (undefined *)0x0;
eVar2 = __tsopen_nolock(local_20,_Filename,_OFlag,_ShFlag,_PMode);
local_8 = (undefined *)0xfffffffe;
FUN_007bba33();
if (eVar2 != 0) {
*_PFileHandle = -1;
}
}
return eVar2;
}
This is what Ghidra shows me. If you look at the function definition, you can see that Ghidra recognizes this to be a library function called __sopen_helper. We’re not really interested in reverse engineering library functions, only functions that pertain to the actual handling of the chao data.
Let’s go even further down the call stack in our debugger, and then look up those new addresses in Ghidra like we did just now.
/* Library Function - Single Match
__sopen_s
Library: Visual Studio 2008 Release */
errno_t __cdecl
__sopen_s(int *_FileHandle,char *_Filename,int _OpenFlag,int _ShareFlag,int _PermissionMode)
{
errno_t eVar1;
eVar1 = __sopen_helper(_Filename,_OpenFlag,_ShareFlag,_PermissionMode,_FileHandle,1);
return eVar1;
}
Another library function… Hm… This means we need to step even further back into the call stack. I’ll save you the trouble, and just tell you straight up that we need to go all the way back. Let’s see what that last entry looks like in Ghidra.
/* WARNING: Could not reconcile some variable overlaps */
void FUN_00429480(undefined4 param_1,undefined4 param_2)
{
uint uVar1;
int **in_FS_OFFSET;
undefined auStack76 [3];
undefined local_49;
undefined local_48 [4];
uint local_44;
undefined4 local_34;
uint local_30;
uint local_28;
undefined4 local_18;
uint local_14;
uint local_10;
int *local_c;
undefined *puStack8;
undefined4 local_4;
local_4 = 0xffffffff;
puStack8 = &LAB_0086f690;
local_c = *in_FS_OFFSET;
local_10 = DAT_00917300 ^ (uint)auStack76;
uVar1 = DAT_00917300 ^ (uint)&stack0xffffffa4;
*in_FS_OFFSET = (int *)&local_c;
local_14 = 0xf;
local_18 = 0;
local_28 = local_28 & 0xffffff00;
FUN_00401f00(param_2,0,0xffffffff);
local_4 = 0;
FUN_0086ba60(uVar1);
local_4 = CONCAT31(local_4._1_3_,1);
FUN_00401f00(local_48,0,0xffffffff);
local_49 = FUN_00869cc0();
if (0xf < local_30) {
FUN_007a5974(local_44);
}
local_30 = 0xf;
local_34 = 0;
local_44 = local_44 & 0xffffff00;
if (0xf < local_14) {
FUN_007a5974(local_28);
}
*in_FS_OFFSET = local_c;
FUN_007a597f();
return;
}
Alright, we see some function calls and some XOR operations. This looks interesting; could this be the checksum calculation code? It could be, but I’ve already looked through this code, so I know for a fact that it’s not. This is why it can be so hard to explain reverse engineering in a linear way: it’s really not linear in any shape or form. Reverse engineering is a lot of jumping back and forth, and making educated guesses. If you know what you’re looking for, it makes things a bit easier, but it’s still a time-consuming and confusing process. Let’s set a new breakpoint at the last entry of the call stack, so that we can look at that new call stack from there.
If we look in Ghidra at the address of the second entry in the call stack, we’ll see some really interesting code.
/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */
void UndefinedFunction_0052dd90(int param_1)
{
undefined4 *puVar1;
void *_Dst;
undefined *puVar2;
undefined4 uVar3;
int iVar4;
void *_Src;
size_t sVar5;
undefined4 *puVar6;
float10 fVar7;
int iStack1032;
char acStack1028 [512];
char acStack516 [512];
uint uStack4;
uStack4 = DAT_00917300 ^ (uint)&iStack1032;
puVar6 = *(undefined4 **)(param_1 + 0x40);
iStack1032 = param_1;
if (puVar6 == (undefined4 *)0x0) {
puVar1 = (undefined4 *)(**DAT_01d19cac)(0x28,s_..\..\src\Chao\al_confirmload.c_01366cb8,0xcd);
*puVar1 = 0x12345678;
puVar6 = puVar1 + 1;
*puVar6 = 0;
puVar1[2] = 0;
puVar1[3] = 0;
puVar1[4] = 0;
puVar1[5] = 0;
puVar1[6] = 0;
puVar1[7] = 0;
puVar1[8] = 0;
puVar1[9] = 0;
*(undefined4 **)(param_1 + 0x40) = puVar6;
*(undefined **)(param_1 + 0x18) = &LAB_0052dd00;
FUN_00544a70();
FUN_005438a0();
}
LAB_0052de10:
switch(*puVar6) {
case 0:
FUN_0052f110();
iVar4 = DAT_0173d06c;
puVar6[2] = DAT_0173d06c;
if ((iVar4 == 0) || (iVar4 == 1)) {
*puVar6 = 1;
}
else {
*puVar6 = 0xf;
}
goto LAB_0052de10;
case 1:
puVar6[8] = 0x2000;
_DAT_01a557e0 = 0;
*puVar6 = 2;
goto LAB_0052de10;
case 2:
_DAT_01a557e0 = 0;
*puVar6 = 3;
case 3:
puVar6[4] = 1;
*puVar6 = 4;
goto LAB_0052de10;
case 4:
iVar4 = FUN_00426ac0();
if (iVar4 == -4) {
LAB_0052dfb9:
*puVar6 = 7;
}
else {
if (iVar4 == -1) goto switchD_0052de1b_caseD_10;
if (iVar4 != 0) {
FUN_00426740(s_load_error_:_open_01333e44);
FUN_0052dc20();
if ((puVar6[1] != 0) && (FUN_0052dc90(), puVar6[1] != 0)) {
FUN_0052dc90();
iVar4 = puVar6[1];
LAB_0052dedb:
if (iVar4 != 0) {
FUN_0052fb80();
}
}
goto LAB_0052dee2;
}
puVar6[5] = 1;
*puVar6 = 5;
}
goto LAB_0052de10;
case 5:
_DAT_01a5af28 = 0;
if (puVar6[6] == 0) {
puVar1 = (undefined4 *)
(**DAT_01d19cac)(0xfc04,s_..\..\src\Chao\al_confirmload.c_01366d08,0x19e);
*puVar1 = 0x12345678;
_memset(puVar1 + 1,0,0xfc00);
puVar6[6] = puVar1 + 1;
}
iVar4 = FUN_00426860(0xfc00);
if (iVar4 == -4) {
_DAT_019f6444 = 0;
goto LAB_0052dfb9;
}
if (iVar4 == -1) goto switchD_0052de1b_caseD_10;
if (iVar4 == 0) {
*puVar6 = 6;
}
else {
FUN_00426740(s_load_error_:_load_0_01366d28);
FUN_0052dc20();
if ((puVar6[1] != 0) && (FUN_0052dc90(), puVar6[1] != 0)) {
FUN_0052dc90();
iVar4 = puVar6[1];
goto LAB_0052dedb;
}
LAB_0052dee2:
*puVar6 = 0xf;
}
goto LAB_0052de10;
case 6:
sVar5 = 0xca6c;
_Src = (void *)(puVar6[6] + 0x3040);
_Dst = (void *)FUN_0052e440();
_memcpy(_Dst,_Src,sVar5);
iVar4 = FUN_0052f030();
if (iVar4 != 0) goto LAB_0052e314;
FUN_00426740(s_load_error_:_crc_013340b4);
FUN_0052dc20();
if (((puVar6[1] != 0) && (FUN_0052dc90(), puVar6[1] != 0)) && (FUN_0052dc90(), puVar6[1] != 0))
{
FUN_0052fb80();
*puVar6 = 0xf;
goto LAB_0052de10;
}
break;
case 7:
DAT_0174afd4 = 1;
_DAT_01a5af28 = 1;
uVar3 = FUN_00426960();
switch(uVar3) {
case 0:
*puVar6 = 8;
FUN_0052dc20();
if (puVar6[1] != 0) {
FUN_0052dc90();
}
goto LAB_0052de10;
case 0xfffffff7:
DAT_0174afd4 = 0;
FUN_0052dc20();
puVar2 = &DAT_01335b4c;
if (puVar6[2] != 0) {
puVar2 = &DAT_01335ccc;
}
_sprintf(acStack516,(char *)(*(int *)(DAT_01a259e4 + 0xdc) + DAT_01a259e4),puVar2,
(int)(0xfaab / (longlong)(int)puVar6[8]) + 1);
if ((puVar6[1] != 0) && (FUN_0052dcd0(), puVar6[1] != 0)) {
FUN_0052fb80();
*puVar6 = 0xf;
goto LAB_0052de10;
}
break;
case 0xfffffff8:
DAT_0174afd4 = 0;
FUN_0052dc20();
puVar2 = &DAT_01335404;
if (puVar6[2] != 0) {
puVar2 = &DAT_013357b4;
}
_sprintf(acStack1028,(char *)(*(int *)(DAT_01a259e4 + 0xf0) + DAT_01a259e4),puVar2);
if ((puVar6[1] != 0) && (FUN_0052dcd0(), puVar6[1] != 0)) {
FUN_0052fb80();
*puVar6 = 0xf;
goto LAB_0052de10;
}
break;
default:
DAT_0174afd4 = 0;
FUN_00426740(s_load_error_:_create_0_01366d58);
FUN_0052dc20();
if ((puVar6[1] != 0) && (FUN_0052dc90(), puVar6[1] != 0)) {
FUN_0052fb80();
*puVar6 = 0xf;
goto LAB_0052de10;
}
break;
case 0xffffffff:
goto switchD_0052de1b_caseD_10;
}
break;
case 8:
puVar6[5] = 1;
*puVar6 = 9;
goto LAB_0052de10;
case 9:
*puVar6 = 10;
goto LAB_0052de10;
case 10:
*puVar6 = 0xb;
goto LAB_0052de10;
case 0xb:
*puVar6 = 0xc;
goto LAB_0052de10;
case 0xc:
puVar6[5] = 1;
*puVar6 = 0xd;
goto LAB_0052de10;
case 0xd:
DAT_0174afd4 = 1;
sVar5 = ((int)(0xfaab / (longlong)(int)puVar6[8]) + 1) * puVar6[8];
if (puVar6[6] == 0) {
puVar1 = (undefined4 *)
(**DAT_01d19cac)(sVar5 + 4,s_..\..\src\Chao\al_confirmload.c_01366d88,0x29f);
*puVar1 = 0x12345678;
_memset(puVar1 + 1,0,sVar5);
puVar6[6] = puVar1 + 1;
FUN_0052eee0();
uVar3 = FUN_0052e440();
FUN_005326c0(uVar3);
}
iVar4 = FUN_00426760(sVar5);
if (iVar4 == -1) goto switchD_0052de1b_caseD_10;
if (iVar4 == 0) {
*puVar6 = 0xe;
}
else {
DAT_0174afd4 = 0;
FUN_00426740(s_save_error_:_save_0_01366da8);
FUN_0052dc20();
if ((puVar6[1] == 0) || (FUN_0052dc90(), puVar6[1] == 0)) break;
FUN_0052fb80();
*puVar6 = 0xf;
}
goto LAB_0052de10;
case 0xe:
DAT_0174afd4 = 0;
LAB_0052e314:
_DAT_019f6444 = 1;
break;
case 0xf:
if ((puVar6[1] == 0) || (*(short *)(DAT_01a259fc + 10) < 1)) {
FUN_00402100();
fVar7 = (float10)FUN_0086b320();
if ((ushort)((ushort)(fVar7 - (float10)_DAT_019f6448 < (float10)1000.0) << 8 |
(ushort)(fVar7 - (float10)_DAT_019f6448 == (float10)1000.0) << 0xe) == 0) {
if (puVar6[1] != 0) {
FUN_005437b0();
puVar6[1] = 0;
}
*puVar6 = 0x10;
*(code **)(iStack1032 + 0x10) = FUN_0046f720;
}
}
switchD_0052de1b_caseD_10:
FUN_007a597f();
return;
default:
goto switchD_0052de1b_caseD_10;
}
*puVar6 = 0xf;
goto LAB_0052de10;
}
We see some interesting strings here like “chao/al_confirmload” and “load_error: crc”. This is definitely a hint and a half that we’re going in the right direction. Especially the “load_error: crc” string. In fact, if I were reversing this for the first time, I would just look at the code surrounding that particular string, find the address of the first instruction in that particular case of the switch statement, and set a breakpoint for it in our debugger. I can’t remember if that’s what I did the first time around, but I would definitely do it now. And we will, because why not?
Okay, good. Our breakpoint was hit. Right now, our EDX register clearly contains a pointer to the chao data. If you look at the registers, you can see that. As we step through the program, we’ll see these instructions being executed:
mov edx,dword ptr ds:[edi+18]
push CA6C
add edx,3040
push edx
call sonic2app.52E440
push eax
call sonic2app.7AB860
mov ecx,dword ptr ds:[173D06C]
add esp,C
cmp ecx,1
je sonic2app.52DFEE
xor ecx,ecx
imul ecx,ecx,CA6C
add ecx,sonic2app.19F6460
call sonic2app.52F030
It pushes 0xCA6C to the stack. After that, we can see 0x3040 being added to the address that’s stored in EDX. This is clearly an offset into the chao data. We can then see the chao data (with the added offset of 0x3040) being pushed onto the stack, and a function being called. After a quick look at the function in Ghidra, I can tell that it’s not really relevant, or not relevant enough to spend more time on that particular function, so let’s move on.
We see another function call after that, and if we take a look at that function call in Ghidra, we can see that it recognizes that function call as memcpy.
_Src = (void *)(puVar6[6] + 0x3040);
_Dst = (void *)FUN_0052e440();
_memcpy(_Dst,_Src,sVar5); //sVar5 is defined earlier with the value 0xCA6C
The code copies 0xCA6C bytes out of the chao data, starting at an offset of 0x3040. What could this mean? Let’s take a look at what’s in the chao data file (using a hex editor) around the end of that region, at offset 0x3040 + 0xCA6C = 0xFAAC.
The byte at 0xFAAC is clearly the last byte in that particular sequence of bytes, and everything after that is just zeros. This looks a lot like a checksum to me, I don’t know about you.
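Here’s the same copy written out as a small C++ helper, using the two numbers we just saw in the debugger. The function name extractChaoRegion and the constant names are mine, and I’m assuming the whole save file has already been read into a byte vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kChaoRegionOffset = 0x3040;  // added to the buffer pointer before memcpy
constexpr std::size_t kChaoRegionSize   = 0xCA6C;  // length passed to memcpy

// Copy the region the game hands to its CRC check out of a loaded save file.
// Returns an empty vector if the file is too short to contain the region.
std::vector<std::uint8_t> extractChaoRegion(const std::vector<std::uint8_t>& file) {
    if (file.size() < kChaoRegionOffset + kChaoRegionSize) return {};
    const auto begin = file.begin() + static_cast<std::ptrdiff_t>(kChaoRegionOffset);
    return std::vector<std::uint8_t>(begin,
                                     begin + static_cast<std::ptrdiff_t>(kChaoRegionSize));
}
```

Working on a copy like this is handy later on, because offsets inside the decompiled functions are relative to the start of this region and not to the start of the file.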
If we take a look at the function call after the memcpy function, we’ll see something really interesting.
bool __fastcall FUN_0052f030(int param_1)
{
    undefined uVar1;
    undefined uVar2;
    undefined uVar3;
    undefined uVar4;
    int iVar5;

    uVar1 = *(undefined *)(param_1 + 0xca6b);
    uVar2 = *(undefined *)(param_1 + 0xca64);
    uVar3 = *(undefined *)(param_1 + 0xca66);
    uVar4 = *(undefined *)(param_1 + 0xca69);
    *(undefined *)(param_1 + 0xca69) = 0;
    *(undefined *)(param_1 + 0xca64) = 0;
    *(undefined *)(param_1 + 0xca6b) = 0;
    *(undefined *)(param_1 + 0xca66) = 0;
    *(undefined *)(param_1 + 0xca67) = 0;
    iVar5 = FUN_00549c40();
    return CONCAT31(CONCAT21(CONCAT11(uVar3,uVar1),uVar2),uVar4) == iVar5;
}
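param_1 here is the new buffer that contains the 0xCA6C bytes copied from the chao data, i.e. file offsets 0x3040 up to 0xFAAC. The function first saves the four bytes at 0xca6b, 0xca64, 0xca66 and 0xca69, then sets those bytes (plus the one at 0xca67) to zero and makes another function call. The return value of that call is then compared against the four saved bytes, reassembled into a single 32-bit value.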
uint __fastcall FUN_00549c40(byte *param_1)
{
    uint uVar1;
    int iVar2;

    iVar2 = 0xca6c;
    uVar1 = 0x6368616f;
    do {
        uVar1 = uVar1 >> 8 ^ *(uint *)(&DAT_008a6ff8 + (uVar1 & 0xff ^ (uint)*param_1) * 4);
        param_1 = param_1 + 1;
        iVar2 = iVar2 + -1;
    } while (iVar2 != 0);
    return uVar1 ^ 0x686f6765;
}
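Bingo! This looks like a CRC algorithm to me! It walks over all 0xca6c bytes, uses each one to index a lookup table, and folds the table entry into a running 32-bit value. We finally got to the code, and now we just need to reverse it.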
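Putting the two decompiled functions together, the check can be sketched in C roughly like this. Treat it as my reading of the Ghidra output rather than the game’s actual source: the names SEED, FINAL_XOR, chao_crc, stored_checksum and checksum_matches are invented for this sketch, and the lookup table is passed in as a parameter because its contents live at DAT_008a6ff8 in the game binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEED      0x6368616Fu  /* initial value in FUN_00549c40 */
#define FINAL_XOR 0x686F6765u  /* XORed into the result at the end */

/* Table-driven loop as seen in FUN_00549c40. */
uint32_t chao_crc(const uint8_t *data, size_t len, const uint32_t table[256])
{
    uint32_t crc = SEED;
    for (size_t i = 0; i < len; i++) {
        crc = (crc >> 8) ^ table[(crc & 0xFFu) ^ data[i]];
    }
    return crc ^ FINAL_XOR;
}

/* Rebuild the stored value the way the CONCAT chain in FUN_0052f030 does:
   0xca66 is the top byte, then 0xca6b, then 0xca64, and 0xca69 is the low byte. */
uint32_t stored_checksum(const uint8_t *region)
{
    return ((uint32_t)region[0xCA66] << 24) |
           ((uint32_t)region[0xCA6B] << 16) |
           ((uint32_t)region[0xCA64] << 8)  |
            (uint32_t)region[0xCA69];
}

/* Returns 1 when the stored checksum matches the computed one.
   Like the game, this zeroes the checksum bytes (and 0xca67) in place,
   so pass a scratch copy of the region, not your only copy. */
int checksum_matches(uint8_t *region, size_t len, const uint32_t table[256])
{
    assert(len > 0xCA6B);  /* all checksum bytes must be inside the buffer */
    uint32_t stored = stored_checksum(region);
    region[0xCA69] = 0;
    region[0xCA64] = 0;
    region[0xCA6B] = 0;
    region[0xCA66] = 0;
    region[0xCA67] = 0;
    return chao_crc(region, len, table) == stored;
}
```

For a real save file, region would point at the 0xCA6C bytes starting at file offset 0x3040, and table would be the 256 values dumped from the game.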
If we highlight the function in Ghidra, we can see that it is called from two different places: 0x0052efbe and 0x0052f07c.
The call at 0x0052f07c sits inside the function I posted above, but there’s another caller. If we take a look at that one, it actually contains more code:
void FUN_0052eee0(void)
{
    uint uVar1;
    int iVar2;
    undefined4 uVar3;
    int unaff_ESI;
    ushort in_FPUControlWord;
    uint uVar4;
    undefined local_4;

    *(undefined *)(unaff_ESI + 0xca69) = 0;
    *(undefined *)(unaff_ESI + 0xca64) = 0;
    *(undefined *)(unaff_ESI + 0xca6b) = 0;
    *(undefined *)(unaff_ESI + 0xca66) = 0;
    *(undefined *)(unaff_ESI + 0xca67) = 0;
    uVar1 = _rand();
    uVar4 = uVar1 & 0xffff0000;
    local_4 = (undefined)(int)ROUND((double)uVar1 * 3.0517578125e-05 * 255.9998931884766);
    *(undefined *)(unaff_ESI + 0xca65) = local_4;
    iVar2 = _rand();
    uVar4 = uVar4 & 0xffff0000;
    local_4 = (undefined)(int)ROUND((double)iVar2 * 3.0517578125e-05 * 255.9998931884766);
    *(undefined *)(unaff_ESI + 0xca68) = local_4;
    iVar2 = _rand();
    local_4 = (undefined)(int)ROUND((double)iVar2 * 3.0517578125e-05 * 255.9998931884766);
    *(undefined *)(unaff_ESI + 0xca6a) = local_4;
    uVar3 = FUN_00549c40(uVar4 & 0xffff0000 | (uint)in_FPUControlWord);
    *(char *)(unaff_ESI + 0xca69) = (char)uVar3;
    *(char *)(unaff_ESI + 0xca64) = (char)((uint)uVar3 >> 8);
    *(char *)(unaff_ESI + 0xca6b) = (char)((uint)uVar3 >> 0x10);
    *(char *)(unaff_ESI + 0xca66) = (char)((uint)uVar3 >> 0x18);
    iVar2 = _rand();
    local_4 = (undefined)(int)ROUND((double)iVar2 * 3.0517578125e-05 * 255.9998931884766);
    *(undefined *)(unaff_ESI + 0xca67) = local_4;
    return;
}
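The odd-looking floating point constants in the listing above are less scary than they seem: 3.0517578125e-05 is exactly 1/32768, so each of those expressions scales a rand() result down to a single byte. Here is a minimal sketch of that step in C. The function name scale_to_byte is made up, and it assumes two things that the listing does not prove: that rand() returns values from 0 to 32767 (as the Microsoft C runtime’s does), and that the conversion truncates like a plain C cast.

```c
#include <assert.h>

/* Scale a rand() result in the range 0..32767 down to one byte.
   Truncation is an assumption here; it matches a plain C cast, which may
   differ from what the ROUND in the Ghidra output does in the game. */
unsigned char scale_to_byte(int r)
{
    assert(r >= 0 && r <= 32767);
    return (unsigned char)(int)((double)r * 3.0517578125e-05 * 255.9998931884766);
}
```

With those assumptions, scale_to_byte(0) gives 0 and scale_to_byte(32767) gives 255, so the bytes at 0xca65, 0xca68, 0xca6a and 0xca67 are simply random filler.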
We can see more happening here, so this is most likely the right function. The other one we looked at only checks whether the checksum currently stored in the chao save file is correct; it is not responsible for writing a new checksum to our chao save file. So, how do we continue from here?
I would use Ghidra and x64dbg in conjunction to make sense of what’s happening in here. However, if you’re new to reversing, this bit might be too much for now, so I won’t actually go into detail on how the algorithm works. Instead, I’ll just leave you with my own implementation that I made in C, and if you have Sonic Adventure 2, feel free to test it out. I’ve already tested it, and it works.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
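#define FILENAME "PATH_TO_SONIC2B_ALF"
#define BYTES 0x10000            // size of the chao save file
#define CONSTANT_ONE 0x6368616F  // initial checksum value, ASCII "chao"
#define CONSTANT_TWO 0x686f6765  // final XOR value, ASCII "hoge"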
unsigned int checksum_table[] = {
0x00000000, 0xC9073096, 0x920E612C, 0xA50951BA, 0xFF6DC419, 0xCA6AF48F, 0x9163A535, 0xA66495A3,
0xFEDB8832, 0xCFDCB8A4, 0x94D5E91E, 0xA3D2D988, 0xF9B64C2B, 0xCCB17CBD, 0x97B82D07, 0xA0BF1D91,
0xFDB71064, 0xC4B020F2, 0x9FB97148, 0xA8BE41DE, 0xF2DAD47D, 0xC7DDE4EB, 0x9CD4B551, 0xABD385C7,
0xF36C9856, 0xC26BA8C0, 0x9962F97A, 0xAE65C9EC, 0xF4015C4F, 0xC1066CD9, 0x9A0F3D63, 0xAD080DF5,
0xFB6E20C8, 0xD269105E, 0x896041E4, 0xBE677172, 0xE403E4D1, 0xD104D447, 0x8A0D85FD, 0xBD0AB56B,
0xE5B5A8FA, 0xD4B2986C, 0x8FBBC9D6, 0xB8BCF940, 0xE2D86CE3, 0xD7DF5C75, 0x8CD60DCF, 0xBBD13D59,
0xE6D930AC, 0xDFDE003A, 0x84D75180, 0xB3D06116, 0xE9B4F4B5, 0xDCB3C423, 0x87BA9599, 0xB0BDA50F,
0xE802B89E, 0xD9058808, 0x820CD9B2, 0xB50BE924, 0xEF6F7C87, 0xDA684C11, 0x81611DAB, 0xB6662D3D,
0xF6DC4190, 0xFFDB7106, 0xA4D220BC, 0x93D5102A, 0xC9B18589, 0xFCB6B51F, 0xA7BFE4A5, 0x90B8D433,
0xC807C9A2, 0xF900F934, 0xA209A88E, 0x950E9818, 0xCF6A0DBB, 0xFA6D3D2D, 0xA1646C97, 0x96635C01,
0xCB6B51F4, 0xF26C6162, 0xA96530D8, 0x9E62004E, 0xC40695ED, 0xF101A57B, 0xAA08F4C1, 0x9D0FC457,
0xC5B0D9C6, 0xF4B7E950, 0xAFBEB8EA, 0x98B9887C, 0xC2DD1DDF, 0xF7DA2D49, 0xACD37CF3, 0x9BD44C65,
0xCDB26158, 0xE4B551CE, 0xBFBC0074, 0x88BB30E2, 0xD2DFA541, 0xE7D895D7, 0xBCD1C46D, 0x8BD6F4FB,
0xD369E96A, 0xE26ED9FC, 0xB9678846, 0x8E60B8D0, 0xD4042D73, 0xE1031DE5, 0xBA0A4C5F, 0x8D0D7CC9,
0xD005713C, 0xE90241AA, 0xB20B1010, 0x850C2086, 0xDF68B525, 0xEA6F85B3, 0xB166D409, 0x8661E49F,
0xDEDEF90E, 0xEFD9C998, 0xB4D09822, 0x83D7A8B4, 0xD9B33D17, 0xECB40D81, 0xB7BD5C3B, 0x80BA6CAD,
0xEDB88320, 0xA4BFB3B6, 0xFFB6E20C, 0xC8B1D29A, 0x92D54739, 0xA7D277AF, 0xFCDB2615, 0xCBDC1683,
0x93630B12, 0xA2643B84, 0xF96D6A3E, 0xCE6A5AA8, 0x940ECF0B, 0xA109FF9D, 0xFA00AE27, 0xCD079EB1,
0x900F9344, 0xA908A3D2, 0xF201F268, 0xC506C2FE, 0x9F62575D, 0xAA6567CB, 0xF16C3671, 0xC66B06E7,
0x9ED41B76, 0xAFD32BE0, 0xF4DA7A5A, 0xC3DD4ACC, 0x99B9DF6F, 0xACBEEFF9, 0xF7B7BE43, 0xC0B08ED5,
0x96D6A3E8, 0xBFD1937E, 0xE4D8C2C4, 0xD3DFF252, 0x89BB67F1, 0xBCBC5767, 0xE7B506DD, 0xD0B2364B,
0x880D2BDA, 0xB90A1B4C, 0xE2034AF6, 0xD5047A60, 0x8F60EFC3, 0xBA67DF55, 0xE16E8EEF, 0xD669BE79,
0x8B61B38C, 0xB266831A, 0xE96FD2A0, 0xDE68E236, 0x840C7795, 0xB10B4703, 0xEA0216B9, 0xDD05262F,
0x85BA3BBE, 0xB4BD0B28, 0xEFB45A92, 0xD8B36A04, 0x82D7FFA7, 0xB7D0CF31, 0xECD99E8B, 0xDBDEAE1D,
0x9B64C2B0, 0x9263F226, 0xC96AA39C, 0xFE6D930A, 0xA40906A9, 0x910E363F, 0xCA076785, 0xFD005713,
0xA5BF4A82, 0x94B87A14, 0xCFB12BAE, 0xF8B61B38, 0xA2D28E9B, 0x97D5BE0D, 0xCCDCEFB7, 0xFBDBDF21,
0xA6D3D2D4, 0x9FD4E242, 0xC4DDB3F8, 0xF3DA836E, 0xA9BE16CD, 0x9CB9265B, 0xC7B077E1, 0xF0B74777,
0xA8085AE6, 0x990F6A70, 0xC2063BCA, 0xF5010B5C, 0xAF659EFF, 0x9A62AE69, 0xC16BFFD3, 0xF66CCF45,
0xA00AE278, 0x890DD2EE, 0xD2048354, 0xE503B3C2, 0xBF672661, 0x8A6016F7, 0xD169474D, 0xE66E77DB,
0xBED16A4A, 0x8FD65ADC, 0xD4DF0B66, 0xE3D83BF0, 0xB9BCAE53, 0x8CBB9EC5, 0xD7B2CF7F, 0xE0B5FFE9,
0xBDBDF21C, 0x84BAC28A, 0xDFB39330, 0xE8B4A3A6, 0xB2D03605, 0x87D70693, 0xDCDE5729, 0xEBD967BF,
0xB3667A2E, 0x82614AB8, 0xD9681B02, 0xEE6F2B94, 0xB40BBE37, 0x810C8EA1, 0xDA05DF1B, 0xED02EF8D
};
unsigned char* read_file()
{
    // Allocate 0x10000 bytes (the size of our chao save file)
    unsigned char* buffer = (unsigned char*)malloc(BYTES);
    if (buffer == NULL) {
        printf("Memory allocation failed\n");
        return NULL;
    }
    FILE* fp = fopen(FILENAME, "rb");
    if (fp == NULL) {
        printf("File not found\n");
        free(buffer);
        return NULL;
    }
    // Check the size before reading, so a larger file can't overflow the buffer
    fseek(fp, 0, SEEK_END);
    long filesize = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    if (filesize != BYTES || fread(buffer, 1, BYTES, fp) != BYTES) {
        printf("File read error\n");
        free(buffer);
        fclose(fp);
        return NULL;
    }
    fclose(fp);
    return buffer;
}
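void save_file(unsigned char* buffer)
{
    FILE* fp = fopen(FILENAME, "wb");
    if (fp == NULL) {
        printf("Error: could not open %s\n", FILENAME);
        exit(1);
    }
    // Report a short write instead of silently leaving a truncated save file
    if (fwrite(buffer, sizeof(char), BYTES, fp) != BYTES) {
        printf("Error: could not write %s\n", FILENAME);
    }
    fclose(fp);
}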
unsigned int checksum_func(unsigned char* buffer)
{
    int offset = 0x3040;  // start of the checksummed region in the file
    int length = 0xCA6C;  // number of bytes covered by the checksum
    unsigned int var = CONSTANT_ONE;
    for (int i = 0; i < length; i++)
    {
        unsigned int var2 = buffer[offset + i];  // current data byte
        unsigned int var3 = var & 0x000000ff;    // low byte of the running value
        var3 = var2 ^ var3;                      // table index
        var = var >> 8;
        var = var ^ checksum_table[(var3)];
    }
    return var ^ CONSTANT_TWO;
}
void write_checksum(unsigned char* buffer)
{
    // Clear the checksum bytes (and 0xFAA7) before calculating, like the game does
    buffer[0xFAA9] = 0;
    buffer[0xFAA4] = 0;
    buffer[0xFAAB] = 0;
    buffer[0xFAA6] = 0;
    buffer[0xFAA7] = 0;
    double num1 = 3.0517578125e-05;  // exactly 1/32768
    double num2 = 255.9998931884766;
    srand((unsigned int)time(NULL));
    // Random filler bytes that are written before the checksum is calculated
    int var = (int)((double) rand() * num1 * num2);
    buffer[0xFAA5] = (unsigned char)var;
    var = (int)(rand() * num1 * num2);
    buffer[0xFAA8] = (unsigned char)var;
    var = (int)(rand() * num1 * num2);
    buffer[0xFAAA] = (unsigned char)var;
    // Store the checksum, one byte at a time, in the order the game uses
    unsigned int checksum = checksum_func(buffer);
    buffer[0xFAA9] = (unsigned char)checksum;
    buffer[0xFAA4] = (unsigned char)(checksum >> 0x8);
    buffer[0xFAAB] = (unsigned char)(checksum >> 0x10);
    buffer[0xFAA6] = (unsigned char)(checksum >> 0x18);
    // This last filler byte is written after the checksum, like in the game
    var = (int)(rand() * num1 * num2);
    buffer[0xFAA7] = (unsigned char)var;
    save_file(buffer);
}
int main() {
    unsigned char* buffer = read_file();
    if (buffer != NULL)
    {
        write_checksum(buffer);
        free(buffer);
    }
    return 0;
}
If you have any requests for specific content you’d like to see guides being made on, feel free to leave a comment. And with that, I’ll leave you with these final words:
No, that’s just how I spell thanks. I mean, phanks.