One does not simply ask how something large and complex works without explaining why they want to know.
There is absolutely no "noob" way to explain reflection or injection at all.
When you write a program in a compiled language, it is translated to machine instructions (assembly is the human-readable form of those), which are loaded into memory and run by the CPU. Java is a language that is both compiled and interpreted. The compiler produces bytecode, and the JVM interprets it, sometimes using JIT (Just-In-Time) compilation to compile the bytecode down to machine code on the fly so that the CPU can run it directly.
TL;DR: Java is compiled to bytecode, which can be interpreted or compiled down to machine code. It's all implementation-defined.
Now that we understand this, we can move on to reflection. Brace yourself.. I'll be showing code in compiled (C), assembled (ASM) and interpreted (Java/bytecode) form below..
Let us say you have a structure in a compiled language such as C or ASM:
C Code:
struct foo
{
    short a;   /* 2 bytes */
    int b;     /* 4 bytes */
    char c[6]; /* 6 bytes */
};
int main()
{
    //Do whatever with foo here..
    return 0;
}
When this is compiled to assembly, it can look like:
ASM Code:
section .data
struc Foo ;This name is usually replaced with something the compiler chooses.
.a: resw 1 ;reserve word
.b: resd 1 ;reserve dword
.c: resb 6 ;reserve 6 bytes
endstruc
FooSize: dd Foo_size ;NASM defines Foo_size when the struc is closed.
section .text
global main
extern printf
extern scanf
extern malloc
main:
push ebp
mov ebp,esp
;Do whatever with struct here..
mov esp, ebp
pop ebp ;Restore the register we pushed on entry.
mov eax, 0 ;Return 0.
ret
and a function in C such as:
C Code:
void Bar(int I)
{
//Function's body.
}
ASM Code:
__ZB@4: ;This all depends on the compiler/calling convention.. The compiler will name this function. 4 is the size of the argument aka an integer.
push ebp
mov ebp,esp
;Function's body.
pop ebp
mov eax, 0
ret
Now looking at the above, in memory and registers, the CPU has no idea wtf a structure or a function is.. All it knows is instructions and addresses of things.. In fact, even your compiled program knows nothing about the structure of "Foo". When you access the variables inside the "Foo" structure, you're just telling it to retrieve something stored at a specific address (the start of the structure plus a fixed offset). Notice that __ZB@4 is actually just a label for convenience that you can jump to. That's really what a function looks like.
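To make the "it's all just addresses" point concrete, here is a rough sketch in Java that fakes the memory of a Foo with a ByteBuffer. The class name OffsetDemo and its helpers are made up for illustration, and the offsets assume the packed layout from the comments above (a at 0, b at 2, c at 6, 12 bytes total). A real C compiler would normally add padding so that b sits at offset 4.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class OffsetDemo {
    // Assumed packed layout: no padding between the fields.
    static final int OFFSET_A = 0;  // short a, 2 bytes
    static final int OFFSET_B = 2;  // int b, 4 bytes
    static final int OFFSET_C = 6;  // char c[6], 6 bytes
    static final int SIZE = 12;

    // Writes the three "fields" into a raw block of bytes.
    static ByteBuffer makeFoo(short a, int b, String c) {
        ByteBuffer mem = ByteBuffer.allocate(SIZE).order(ByteOrder.LITTLE_ENDIAN);
        mem.putShort(OFFSET_A, a);
        mem.putInt(OFFSET_B, b);
        byte[] chars = c.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < Math.min(chars.length, 6); i++) {
            mem.put(OFFSET_C + i, chars[i]);
        }
        return mem;
    }

    // "foo.b" is nothing more than "read 4 bytes at base + 2".
    static int readB(ByteBuffer mem) {
        return mem.getInt(OFFSET_B);
    }

    public static void main(String[] args) {
        ByteBuffer foo = makeFoo((short) 7, 123456, "hello");
        System.out.println("b = " + readB(foo));
    }
}
```

Nothing in the buffer says "this is a Foo" or "this int is called b". Only the code that reads it knows the offsets.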
Another example is functions like the below in Java:
Java Code:
void print(String Str) {
    System.out.println(Str);
}

Integer inc(Integer I) {
    return I + 1;
}

void foo(int A, char B, int C) {
    return; //A void function cannot return a value.
}
Turns into:
Java Code:
print(Ljava/lang/String;)V;
  Code:
   0: getstatic #1; //Field java/lang/System.out:Ljava/io/PrintStream;
   3: aload_1
   4: invokevirtual #2; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   7: return

inc(Ljava/lang/Integer;)Ljava/lang/Integer;
  Code:
   0: aload_1
   1: invokevirtual #3; //Method java/lang/Integer.intValue:()I
   4: iconst_1
   5: iadd
   6: invokestatic #4; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   9: areturn

foo(ICI)V; //Notice the arguments this time (I = int, C = char).
  Code:
   0: aload_1
   .
   . //Getting tired typing all out..
   .
   5: return
As you can see, the "class file", which the JVM interprets or compiles, actually keeps the signature of each function. It stores access modifiers as well (private, public, static, protected, etc).. It also stores the names of the functions and fields and whatever else.. Because it stores this information, anyone reading the class can retrieve information about a given type, variable or function. The biggest point I'm trying to make is that the bytecode carries information that you can use.. The compiled code (C/ASM), on the other hand, keeps none of it once debug symbols are stripped: just a bunch of addresses and registers.
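If you want to see those signature strings without reading a class file dump, the standard library can build them for you. This is only a small sketch: DescriptorDemo and descriptor() are names I made up, while MethodType.toMethodDescriptorString() is a real JDK call.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    // Builds the JVM descriptor for a method with the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void print(String)
        System.out.println(descriptor(void.class, String.class));
        // Integer inc(Integer)
        System.out.println(descriptor(Integer.class, Integer.class));
        // void foo(int, char, int)
        System.out.println(descriptor(void.class, int.class, char.class, int.class));
    }
}
```

The last line prints (ICI)V: I is int, C is char and V is void. Z would be boolean, and object types are written as L followed by the class name and a semicolon.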
Retrieving and using this information is known as reflection (you cannot do this in C or ASM or any compiled language that does not store such info).
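As a minimal sketch of what "retrieving this information" looks like, the following lists the fields and methods of a class along with their modifiers. Player and InspectDemo are hypothetical names used only for this example. The reflection calls themselves (getDeclaredFields, getDeclaredMethods, Modifier.toString) are standard.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InspectDemo {
    // Hypothetical class to inspect.
    static class Player {
        private int health = 100;
        public static String name = "demo";
        protected void heal(int amount) { health += amount; }
    }

    // Returns one line per declared field and method, sorted for stable output.
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic()) continue; // skip compiler-generated members
            lines.add("field " + Modifier.toString(f.getModifiers())
                    + " " + f.getType().getSimpleName() + " " + f.getName());
        }
        for (Method m : type.getDeclaredMethods()) {
            if (m.isSynthetic()) continue;
            lines.add("method " + Modifier.toString(m.getModifiers())
                    + " " + m.getReturnType().getSimpleName() + " " + m.getName());
        }
        Collections.sort(lines);
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe(Player.class)) {
            System.out.println(line);
        }
    }
}
```

All of that comes straight out of the class file at runtime. There is no equivalent for the C struct once it has been compiled.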
When you run a Java application, the JVM instantiates a "ClassLoader" that reads the class files and figures out what to do with each instruction and the information. In order to load RS into Smart or any Java bot, the authors need to instantiate a class loader and load all the classes within the jar file. They then need to create an instance of the required classes such as "RS2Applet.class" or "Client.class"..
However, taking it a step further, you can store all loaded classes in an array and, given the "path" to a class or field (usually "PackageName.ClassName" or "StaticClassName.FieldName" or "ClassInstance.FieldName"), you can use the following to access information about a field or function/member:
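Roughly, the "load a class by name and instantiate it" step looks like the sketch below. LoaderDemo and loadAndCreate() are made-up names, and it uses the system class loader with a JDK class so it runs anywhere. A bot would presumably point something like a URLClassLoader at the client jar instead, which is an assumption on my part and not taken from any particular bot's source.

```java
final class LoaderDemo {
    // Loads a class by its binary name and creates an instance via its no-arg constructor.
    static Object loadAndCreate(ClassLoader loader, String className) {
        try {
            Class<?> type = loader.loadClass(className);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        Object instance = loadAndCreate(loader, "java.util.ArrayList");
        System.out.println(instance.getClass().getName());
    }
}
```

Note that nothing in the source mentions ArrayList as a type. The class is found purely by the string name at runtime.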
Java Code:
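Class<?> c = myClassLoaderInstance.loadClass("ClassNameHere");
Field f = c.getDeclaredField("nameOfSomeVariableOrFieldHere");
f.setAccessible(true); //Needed for private/protected fields.
f.get(null); //null only works for static fields; pass the class instance otherwise.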
The above uses the class loader that loaded the applet and all the RS classes. It then gets a field/variable holding some valuable information. You need to make that field accessible before you can read it if it is not public.
That's just the tip of the iceberg but yeah.. That's basically all there is to reflection really.. You just get the class name/class instance and the field you want to access and you use the above to retrieve the information about it.
Injection, however, requires that you insert code into the client, and then you can call your injected function, which can spit out the information it retrieved. Most of the time you do not need to call your injected function yourself because the client calls it for you (this depends entirely on where you inject the code). Usually you inject a getter or setter into a class or class member/function.
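Here is a self-contained version of that pattern you can actually run. Client, loginState and energy are hypothetical stand-ins, not real RS names. It reads one private static field (instance is null) and one private instance field.

```java
import java.lang.reflect.Field;

final class FieldDemo {
    // Hypothetical stand-in for a client class with hidden state.
    static class Client {
        private static int loginState = 30;
        private int energy = 75;
    }

    // Reads a field by name; pass null as the instance for static fields.
    static Object readField(Class<?> type, String fieldName, Object instance) {
        try {
            Field f = type.getDeclaredField(fieldName);
            f.setAccessible(true); // bypass the private modifier
            return f.get(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not read " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readField(Client.class, "loginState", null));
        System.out.println(readField(Client.class, "energy", new Client()));
    }
}
```

This prints 30 and then 75. Primitive values come back boxed (an Integer here), so cast or unbox as needed.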
Take for example the following Java code:
Java Code:
void Foo()
{
//Foo was called.
}
And in ASM:
ASM Code:
__Foo_Func:
push ebp
mov ebp,esp
;Foo was called.
pop ebp
mov eax, 0
ret
Now if we want to inject into that function, what we have to do is make that code call our function. It'd look like this:
Java Code:
void MyFunc()
{
}
void Foo()
{
MyFunc(); //When Foo is called, it automatically calls my function.
}
And in ASM, this is the equivalent of the trampoline technique, which needs two jumps: one into your code and one back.
ASM Code:
__Foo_Func:
push ebp
mov ebp, esp
push dword 0x1000 ;Push the return address before the jump.. Let's say this instruction sits at addr 0x0998.
jmp __MyFunc_Func ;"Call" MyFunc..
;This is the address after the call.. Let's call this addr 0x1000.
pop ebp
mov eax, 0
ret
__MyFunc_Func:
push ebp
mov ebp, esp
;Body of MyFunc..
;Run my code here..
pop ebp
mov eax, 0
ret ;Pops the pushed address (0x1000) and continues there, i.e. back in the caller.
In the above, it works like this:
Foo is called. Foo calls MyFunc by jumping to the address of MyFunc. MyFunc runs and then has two choices: either return the results to Foo or simply jump back to the "address after the call".
In Java, injection is usually done through a Visitor using the ASM framework v4.XX (the bytecode library, not to be confused with assembly). Practically the exact same thing as what I wrote above in assembly is going on under the hood of a Java bot.
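I can't show real bytecode injection without pulling in the ASM library, so the sketch below is NOT injection. It swaps in java.lang.reflect.Proxy from the standard library to imitate the effect: calling foo() ends up running myFunc() first. Game, HookDemo, hook() and myFunc() are all hypothetical names. The difference is that a proxy wraps the object from the outside, whereas injection rewrites the method body itself.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class HookDemo {
    // Hypothetical interface standing in for the class that owns Foo.
    interface Game {
        void foo();
    }

    // Records what ran, in order.
    static final List<String> log = new ArrayList<>();

    // The code we want to run whenever foo() is called.
    static void myFunc() {
        log.add("MyFunc ran");
    }

    // Wraps the original object so that foo() triggers myFunc() before the real body.
    static Game hook(Game original) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("foo")) {
                myFunc();
            }
            return method.invoke(original, args);
        };
        return (Game) Proxy.newProxyInstance(
                Game.class.getClassLoader(), new Class<?>[] { Game.class }, handler);
    }

    public static void main(String[] args) {
        Game game = hook(() -> log.add("Foo was called"));
        game.foo();
        System.out.println(log);
    }
}
```

Running it prints [MyFunc ran, Foo was called], which is the same order of events as the MyFunc()/Foo() example above.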