When I was taking a calculus course in college, I often felt the urge to plot the graphs of various functions, just to see what they look like and get a feel for them. So, back then, I wrote a small program which evaluated y for a set of values of x and then plotted the points. The mathematical expression for y was written into a C function which was called by main() while iterating over the values of x.
But soon, it started to irritate me! Every time I wanted to do this for another expression, I had to rewrite the function, then compile and link it again. But I had no choice. The least I could do was to write a parser which reads the expression from the input, builds a parse tree (don't know much about parsing? check it out here), and then recursively traverses that to evaluate y. Well, no! I wanted something in machine code. Yes, you got it: I wanted a mini-compiler inside my program, which would read an expression string and create a function online (that is, at run time) to be called again and again.
The rest of this post illustrates an example of how to do it. To keep it short and simple, only a single binary operation will be parsed. Also, this example will be absolutely non-portable: it will not even compile on a machine that is not both x86 and Unix-like. Well, adapting it to suit a particular architecture/platform won't be very hard, I believe! For reference to x86 instructions, refer here.
And one more thing! This is going to be extremely dirty down there. Please hold on to it with patience (and this).
The first thing we need is to allocate memory for the function that will be generated. Since we are going to use mprotect() to obtain write (and later read/exec) permissions on this memory, and mprotect() works on whole pages, it's better to allocate an entire page for this. posix_memalign() will do that for us.
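#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
void *ptr;
/* one page, aligned on a page boundary */
posix_memalign(&ptr, getpagesize(), getpagesize());
/* keep read access too; PROT_WRITE alone is not guaranteed to imply it */
mprotect(ptr, getpagesize(), PROT_READ|PROT_WRITE);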
The next job is to write the function into the memory pointed to by ptr. create_function() will do that when called; we will deal with it a little later. Once that is done, the page should be made read-and-execute only and assigned to an appropriate function pointer.
char operator; //this can have '+' or '-'
int (*operation_function_created_online)(int, int);
create_function(ptr, operator);
mprotect(ptr, getpagesize(), PROT_READ|PROT_EXEC);
operation_function_created_online=ptr;
Once this is done, the function can be called just like any other C function.
#include <stdio.h>
int result,val1,val2;
result=operation_function_created_online(val1, val2);
printf("operation %s = %d\n", expr_string, result);
Okay, take a break, get a beer! The next part tells what create_function() does for us.
Coming back, a C function declared like this
int function(int val1, int val2)
will be called like this:
- The arguments are passed on the stack right to left, so val2 is pushed first and then val1.
- The function is called, so the next element on the stack is the return address.
- The function, in its first instruction, pushes ebp onto the stack to save its value, and then copies esp into ebp so that the stack can be addressed using this register.
- The return value is sent back in eax.
- The original value of ebp is popped back before returning.
Callee
Line number | Opcode | Instruction |
---|---|---|
1 | 55 | pushl %ebp |
2 | 89 e5 | movl %esp, %ebp |
... | | rest of the function; store result in EAX |
8 | 5d | popl %ebp |
9 | c3 | ret |
Caller
... some text removed ...
6: mov op2, %eax
7: push %eax
8: mov op1, %eax
9: push %eax
10: call Callee
... do something with EAX ...
... some text removed ...
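To see why the callee finds its arguments at 8(%ebp) and 12(%ebp), it can help to replay the pushes on paper. The sketch below does that with a toy model: an array stands in for memory and two integers stand in for esp and ebp. Nothing here touches the real stack, and every name (toy_stack, toy_push, toy_load, toy_call, TOY_RETURN_ADDRESS) is made up for the illustration.

```c
#include <stdint.h>

#define TOY_RETURN_ADDRESS 0x1234u   /* placeholder, not a real address */

/* Toy model of a 32-bit stack that grows downwards. `esp` and `ebp`
 * are byte offsets into `mem`, not real register values. */
struct toy_stack {
    uint32_t mem[16];
    uint32_t esp;
    uint32_t ebp;
};

static void toy_push(struct toy_stack *s, uint32_t value)
{
    s->esp -= 4;                    /* the stack grows towards lower addresses */
    s->mem[s->esp / 4] = value;
}

/* Models reading offset(%ebp). */
static uint32_t toy_load(const struct toy_stack *s, uint32_t offset)
{
    return s->mem[(s->ebp + offset) / 4];
}

/* Replays the caller's pushes and the callee's first two instructions
 * for a call to function(val1, val2). */
void toy_call(struct toy_stack *s, uint32_t val1, uint32_t val2)
{
    s->esp = (uint32_t)sizeof s->mem;   /* empty stack */
    s->ebp = s->esp;                    /* stands in for the caller's ebp */

    toy_push(s, val2);                  /* caller: rightmost argument first */
    toy_push(s, val1);
    toy_push(s, TOY_RETURN_ADDRESS);    /* pushed by the call instruction */
    toy_push(s, s->ebp);                /* callee: pushl %ebp */
    s->ebp = s->esp;                    /* callee: movl %esp, %ebp */
}
```

After toy_call(&s, 7, 5), toy_load(&s, 0) is the saved ebp, toy_load(&s, 4) is the return address, toy_load(&s, 8) is 7 (val1) and toy_load(&s, 12) is 5 (val2).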
In our case, we need to have a function like this
1: push %ebp
2: movl %esp, %ebp
3: movl 8(%ebp), %eax    # val1
4: movl 12(%ebp), %edx   # val2, in a second register so val1 is not overwritten
5:
6: # do the operation here, leaving the result in EAX
7:
8: pop %ebp
9: ret
All that we need to do in create_function() is to fill the memory with these instructions and the intended operation. Once we are done, the program is ready to compile our expression. A sample program can be downloaded from here (modified to parse all elementary binary operations). The output from this program looks like
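As a rough idea of what filling the memory can look like, here is a minimal sketch of a create_function() that handles only '+' and '-'. It is not the downloadable sample program. The size_t return value (number of bytes written, 0 for an unsupported operator) and the use of edx for the second operand are assumptions of this sketch. The opcode bytes are 32-bit x86 encodings, so the result is only callable from a 32-bit x86 build (for example with gcc -m32); on any other target this merely fills a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Sketch only: writes a tiny cdecl function into `ptr` and returns the
 * number of bytes written, or 0 if `operator` is not '+' or '-'. */
size_t create_function(void *ptr, char operator)
{
    static const unsigned char prologue[] = {
        0x55,               /* pushl %ebp                  */
        0x89, 0xe5,         /* movl  %esp, %ebp            */
        0x8b, 0x45, 0x08,   /* movl  8(%ebp), %eax  (val1) */
        0x8b, 0x55, 0x0c    /* movl  12(%ebp), %edx (val2) */
    };
    static const unsigned char epilogue[] = {
        0x5d,               /* popl %ebp */
        0xc3                /* ret       */
    };
    unsigned char op[2];
    unsigned char *out = ptr;

    switch (operator) {
    case '+': op[0] = 0x01; break;   /* addl %edx, %eax */
    case '-': op[0] = 0x29; break;   /* subl %edx, %eax */
    default:  return 0;              /* nothing written */
    }
    op[1] = 0xd0;   /* ModRM byte: source edx, destination eax */

    memcpy(out, prologue, sizeof prologue); out += sizeof prologue;
    memcpy(out, op, sizeof op);             out += sizeof op;
    memcpy(out, epilogue, sizeof epilogue); out += sizeof epilogue;

    return (size_t)(out - (unsigned char *)ptr);
}
```

The whole function comes to 13 bytes, far less than the page reserved for it. Supporting more operators would mean adding more cases, and some of them (division, for instance) need more than a single two-byte instruction.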
This is a very simple illustration of how a compiler works. I hope it motivates the reader to study compiler design in more depth.