It was a lazy afternoon when I was reading a book on Vedic Mathematics. So I asked a friend sitting next to me:
"Hey, how fast can you add 123456789 and 987654321?"
Clever as he is, he quickly opened the calculator application on his desktop and started typing. At that, I screamed:
"Hey, you are not supposed to use calculators! And anyway, do you think your computer is always right?"
"Well, of course! As far as calculations are concerned! And it's just an addition!"
"Ok, how much will you bet against my computer adding 2 and 2 to give 5? Or 6? Or anything I want?"
"You can bet a 1000 bucks if you want! I just love easy money! :P"
So, I got started. I wrote a program to add 2 and 2, like this:
/* Listing of boozed_desktop.c
   It adds 2 and 2 and then faints */

#include <stdio.h>

void do_magic(void);

int main(){
    do_magic();
    printf("I want to print 4, and printed %d\n",2+2);
}
Ok, so it adds 2 and 2 in the printf() statement. That should convince him. And that means I have to write the function do_magic() for the real trick.
Well, to start off, every operation must have its operands stored somewhere in memory. My job is to change the value of one of the two operands so that the addition gives the result I want. To do that, I need to find where in memory the two operands (the two 2's) are stored. A disassembly can do that for me. This is how the function main() looks when disassembled:
1: pushl %ebp
2: movl %esp, %ebp
3: andl $-16, %esp
4: subl $16, %esp
5: call do_magic
6: movl $.LC0, %eax
7: movl $4, 4(%esp)
8: movl %eax, (%esp)
9: call printf
10: leave
11: ret
So, at line 5, I see do_magic being called, and at line 6, the address of .LC0, which must hold the string literal "I want to print 4, and printed %d\n" to be passed to printf(), is being loaded into eax. But I could not find any add operation in there. Then I realized that gcc must have done the addition itself at compile time and put the result directly into the statement
movl $4, 4(%esp)
at line 7. So there are no 2's left to change; I have to change this 4 directly from inside do_magic()! But how? How am I supposed to know the address of this movl statement from in there?
Well, of course I can! When do_magic() is called, the address of the next instruction is pushed onto the stack. Then the first instruction in that function pushes the value of ebp onto the stack to save it. Next, the value of esp is copied into ebp. Confused? Well, that is what every function is compiled to do here. Check out lines 1 and 2 in the disassembly of main(). That says it, right?
So, after the first two instructions, ebp holds the address of the base of do_magic()'s stack frame, where the previous value of ebp is saved. Just above that, at 4(%ebp) (yes, the stack grows downwards), is the return address: the address of line 6 of main(), that is, the address of
movl $.LC0, %eax
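If you want to peek at a return address without hand-written assembly, GCC and Clang offer the builtin __builtin_return_address(0). This is not what the rest of this post uses (do_magic() reads 4(%ebp) directly), and the helper name below is made up for illustration. It is only a small sketch of the same idea that also compiles outside 32-bit x86:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the address at which execution will resume
   in the caller, i.e. the value the call instruction pushed on the stack.
   noinline keeps the call a real call, so there is a return address to read. */
__attribute__((noinline))
const unsigned char *where_do_i_return(void)
{
    /* GCC/Clang builtin; level 0 means "the current function's caller". */
    const unsigned char *ret = __builtin_return_address(0);
    assert(ret != NULL);
    return ret;
}
```

Calling where_do_i_return() from main() gives a pointer into main()'s own machine code, just past the call instruction, which is exactly the kind of pointer do_magic() needs.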
Once I know the size of this movl $.LC0, %eax instruction, I can calculate the location of the statement I want. Now, how can I find its size? Ok, let's compile this code and disassemble the object file, which shows the raw bytes of every instruction. This can be done as
$ gcc -c -o boozed_desktop.o boozed_desktop.c
$ objdump --disassemble boozed_desktop.o
And the object dump will look like this
The highlighted line, the one for movl $.LC0, %eax, is the one we are looking for. So, it's 5 bytes long. And not just that, the next line is where we need to make the change. That instruction is 8 bytes long. Now, a little intuition says that 0xC7 0x44 0x24 0x04 encodes "move an immediate value to the address 4(%esp)"; the trailing 0x04 says it all. So, the "4" we need to change is the last four bytes, written as a 32-bit int in little-endian format as 0x04 0x00 0x00 0x00. That's it! So, summing it all up, do_magic() has to do these in order:
- Get the return address from stack by using ebp.
- Add 5 bytes + 4 bytes to that address to get the address of "4".
- Modify it!
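The offset arithmetic in these steps can be tried safely on an ordinary byte array before touching real code. The sketch below is only an illustration: the helper names are made up, and the array you pass in stands in for the bytes that follow the call in main().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes taken from the text: movl $.LC0,%eax is 5 bytes long, and the
   immediate of movl $4,4(%esp) starts 4 bytes into that instruction. */
enum { FIRST_INSN_LEN = 5, IMM_OFFSET = 4 };

/* Read a 32-bit little-endian value, whatever the host byte order is. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write a 32-bit value in little-endian byte order. */
void write_le32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Hypothetical helper: retloc plays the role of the return address.
   Skip 5 + 4 bytes and overwrite the 32-bit immediate found there. */
void patch_immediate(unsigned char *retloc, uint32_t value)
{
    assert(retloc != NULL);
    write_le32(retloc + FIRST_INSN_LEN + IMM_OFFSET, value);
}
```

For example, take the 13 bytes B8 00 00 00 00 C7 44 24 04 04 00 00 00 (the B8 00 00 00 00 part is my assumption of how an unrelocated movl $.LC0, %eax shows up in an object file). read_le32() at offset 9 gives 4, and after patch_immediate(code, 5) it gives 5.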
Is that all? No, not quite... The memory region we are going to modify holds executable code, so its page will be mapped read-and-execute only, and we need write permission on it. No problem! mprotect() can do that for me. mprotect() wants the starting address of the page, which we get by address & ~(getpagesize()-1). So the call will be
mprotect((unsigned char *)((unsigned int)address & ~(getpagesize()-1)),1,PROT_READ|PROT_EXEC|PROT_WRITE);
Oh, you've got to notice that! mprotect() is asked to grant all permissions on this page. This is because do_magic() will usually be in the same memory page as main(). If we asked for PROT_WRITE only (and hence revoked PROT_READ and PROT_EXEC), the code of do_magic() itself would stop being executable, and the call to mprotect() would never be able to return!
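To get a feel for the page rounding and for mprotect() without touching code pages, here is a small sketch on a data page. It is my own illustration, not part of the trick: the function names are hypothetical, it uses sysconf(_SC_PAGESIZE) instead of getpagesize(), and it asks only for read and write because some systems refuse pages that are writable and executable at the same time.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of its page.
   The mask trick works because page sizes are powers of two. */
void *page_start(const void *addr)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    assert(page != 0 && (page & (page - 1)) == 0);
    return (void *)((uintptr_t)addr & ~(page - 1));
}

/* Hypothetical demo: map one read-only page, unlock it with mprotect(),
   then store a byte in the middle of it.
   Returns the byte read back, or -1 if anything failed. */
int unlock_and_write(unsigned char value)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *mem = mmap(NULL, page, PROT_READ,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    unsigned char *target = mem + page / 2;   /* not page-aligned on purpose */
    int result = -1;
    if (mprotect(page_start(target), page, PROT_READ | PROT_WRITE) == 0) {
        *target = value;                      /* would fault without mprotect */
        result = *target;
    }
    munmap(mem, page);
    return result;
}
```

unlock_and_write(42) should hand back 42 on a system where the calls succeed. The real trick differs in one way only: the page it unlocks holds main()'s code, so PROT_EXEC has to stay in the mask.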
Okay, having said all that, I wrote it down as a function and added a bit of salt and sugar to cook this up:
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

void do_magic(void);

int main(){
    do_magic();
    printf("I want to print 4, and printed %d\n",2+2);
    return 0;
}

void do_magic(void){
    unsigned char *retloc;
    int number;

    printf("How much do you want \"2+2\" to print?\n");
    scanf("%d",&number);

    /* Fetch our return address: it sits just above the saved ebp (32-bit x86 only) */
    __asm__ volatile("mov 4(%%ebp), %0\n\t":"=r"(retloc):);
    /* Make the page holding main()'s code writable, keeping read and execute */
    mprotect((unsigned char *)((unsigned int)retloc & ~(getpagesize()-1)),1,PROT_READ|PROT_EXEC|PROT_WRITE);
    /* Skip the 5-byte movl $.LC0,%eax and the first 4 bytes of the next movl, then overwrite its immediate */
    *((int *)(retloc+(5+4)))=number;
}
Well, that should be enough to irritate my friend! He is sure to flare up when he gets to see this:
Now, just in case somebody is thinking that these tricks can only be used to goof around with systems, in the next post we will see how such memory protection schemes can be put to some good use.