Introduction
Recently @elfyyy and I both wrote updaters, and I thought: why not release mine? Then it hit us that everyone would just copy it (instead of learning from it), so we decided a tutorial was in order.
Everyone on the RS scene keeps saying that writing an updater is hard work. It really isn't. An updater is more TIME-CONSUMING than hard.
Now this may not be the best way to get an updater done, or the best tutorial (give me suggestions if you want and I will add them), but it's okay for a start. Ideally, you'd want control-flow graphs, which help greatly. Anyway, let's get started.
Table Of Contents
1. An Intro to byte-code.
2. OPCodes/Mnemonics/Instructions.
3. Byte-code patterns and structures.
4. Client structure.
5. About the ASM Library & JDK Internals.
6. Load the client & Writing Analysers.
7. Identifying classes and fields.
8. Byte-code Deobfuscation.
9. Tools.
10. The End + Sample Updater.
Intro to byte-code.
What is byte-code? Byte-code is an intermediate language that Java code is translated to. It's the assembly language of the Java virtual machine. When Java is compiled, it is compiled into byte-code which is then interpreted by the JVM and executed.
Before I show you some byte-code, let's dive right in with a few byte-code signatures:
Byte-code Signature | Java Type
Z | boolean
B | byte
C | char
S | short
I | int
J | long
F | float
D | double
[T | Array of T
[[T | Array of Array of T - 2D array of T
[[[T | Array of Array of Array of T - 3D array of T
Lcls; | Fully qualified name of "cls" - see below for examples
Ljava/lang/String; | java.lang.String
Lpackage/subpackage/Class; | package.subpackage.Class
Lpackage/InterfaceName; | package.InterfaceName
Using the above table, we can describe the following methods using byte-code signatures:
Java Code:
public void a(int a, int b, char c, boolean d);
public int b(int a, byte b, boolean c, long d);
public String c(int[] a);
public static void main(String[] args);

package meh;
public class Foo {}

package bleh;
public class Bar {}

public Bar some_func(int[] a, String b, Foo c, long[][] d, boolean e);
Bytecode Equivalent:
Java Code:
(IICZ)V <-- method a.
(IBZJ)I <-- method b.
([I)Ljava/lang/String; <-- method c.
([Ljava/lang/String;)V <-- method main.
([ILjava/lang/String;Lmeh/Foo;[[JZ)Lbleh/Bar; <-- method some_func.
We can describe the layout of a class using the above information as well:
Java Code:
package com.rs.collection;
public class Node {
Node prev, next;
long uid;
public void unlink();
}
Byte-code layout:
Java Code:
Lcom/rs/collection/Node; {
Lcom/rs/collection/Node; prev;
Lcom/rs/collection/Node; next;
J uid;
()V; unlink;
}
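If you want to double-check a method signature you wrote by hand, the JDK can produce it for you. The sketch below is not part of the updater; it only assumes the standard java.lang.invoke.MethodType class, and the class name DescriptorDemo and the helper describe are made up for this example.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    /** Builds the byte-code signature for a method with the given return type and parameter types. */
    static String describe(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // public void a(int, int, char, boolean)
        System.out.println(describe(void.class, int.class, int.class, char.class, boolean.class));
        // public int b(int, byte, boolean, long)
        System.out.println(describe(int.class, int.class, byte.class, boolean.class, long.class));
        // public String c(int[])
        System.out.println(describe(String.class, int[].class));
        // public static void main(String[])
        System.out.println(describe(void.class, String[].class));
    }
}
```

Running it should print the same four signatures as the list above for methods a, b, c and main.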
OPCodes/Mnemonics/Instructions:
So now that you know how to describe fields, classes, interfaces, and methods, let us learn about OPCodes/instructions. What is an instruction? In technical terms, an instruction is a voltage that is sent to a piece of hardware (CPU for example) to tell it what to do next and what registers to operate on. However, for those that aren't so technical, an instruction is sent to the JVM telling it what it must do next, what to compare, what to execute, etc..
So what is an OPCode then? An OPCode is the byte/integer version of an instruction. For example, in Assembly x86, JMP is an unconditional jump instruction but its OPCode is 0xE9. This means that in memory, the JMP instruction is represented by the byte/integer value: 0xE9. When the CPU sees this, it executes a JUMP/GOTO.
Some instructions may take multiple operands. For example, the JMP instruction takes ONE operand: The address to "goto". JMP 0x1990 would be the equivalent of saying: goto address 0x1990.
A mnemonic is usually a tool that helps you remember things easily, or an acronym that stands for something. MOST instructions stand for something. For example: NOP means No-Operation (aka: do NOTHING), JMP means "Jump", and IFEQ means "If equal" (to zero).
So on and so forth. Below is a table of OPCodes (values 0 to 199, with a few gaps) that we'll be using in our updater. We won't be using all of them, but we may SEE most of them in use by applications. There are more, but they aren't in this table:
Java Code:
public interface Opcodes {
int NOP = 0;
int ACONST_NULL = 1;
int ICONST_M1 = 2;
int ICONST_0 = 3;
int ICONST_1 = 4;
int ICONST_2 = 5;
int ICONST_3 = 6;
int ICONST_4 = 7;
int ICONST_5 = 8;
int LCONST_0 = 9;
int LCONST_1 = 10;
int FCONST_0 = 11;
int FCONST_1 = 12;
int FCONST_2 = 13;
int DCONST_0 = 14;
int DCONST_1 = 15;
int BIPUSH = 16;
int SIPUSH = 17;
int LDC = 18;
int ILOAD = 21;
int LLOAD = 22;
int FLOAD = 23;
int DLOAD = 24;
int ALOAD = 25;
int IALOAD = 46;
int LALOAD = 47;
int FALOAD = 48;
int DALOAD = 49;
int AALOAD = 50;
int BALOAD = 51;
int CALOAD = 52;
int SALOAD = 53;
int ISTORE = 54;
int LSTORE = 55;
int FSTORE = 56;
int DSTORE = 57;
int ASTORE = 58;
int IASTORE = 79;
int LASTORE = 80;
int FASTORE = 81;
int DASTORE = 82;
int AASTORE = 83;
int BASTORE = 84;
int CASTORE = 85;
int SASTORE = 86;
int POP = 87;
int POP2 = 88;
int DUP = 89;
int DUP_X1 = 90;
int DUP_X2 = 91;
int DUP2 = 92;
int DUP2_X1 = 93;
int DUP2_X2 = 94;
int SWAP = 95;
int IADD = 96;
int LADD = 97;
int FADD = 98;
int DADD = 99;
int ISUB = 100;
int LSUB = 101;
int FSUB = 102;
int DSUB = 103;
int IMUL = 104;
int LMUL = 105;
int FMUL = 106;
int DMUL = 107;
int IDIV = 108;
int LDIV = 109;
int FDIV = 110;
int DDIV = 111;
int IREM = 112;
int LREM = 113;
int FREM = 114;
int DREM = 115;
int INEG = 116;
int LNEG = 117;
int FNEG = 118;
int DNEG = 119;
int ISHL = 120;
int LSHL = 121;
int ISHR = 122;
int LSHR = 123;
int IUSHR = 124;
int LUSHR = 125;
int IAND = 126;
int LAND = 127;
int IOR = 128;
int LOR = 129;
int IXOR = 130;
int LXOR = 131;
int IINC = 132;
int I2L = 133;
int I2F = 134;
int I2D = 135;
int L2I = 136;
int L2F = 137;
int L2D = 138;
int F2I = 139;
int F2L = 140;
int F2D = 141;
int D2I = 142;
int D2L = 143;
int D2F = 144;
int I2B = 145;
int I2C = 146;
int I2S = 147;
int LCMP = 148;
int FCMPL = 149;
int FCMPG = 150;
int DCMPL = 151;
int DCMPG = 152;
int IFEQ = 153;
int IFNE = 154;
int IFLT = 155;
int IFGE = 156;
int IFGT = 157;
int IFLE = 158;
int IF_ICMPEQ = 159;
int IF_ICMPNE = 160;
int IF_ICMPLT = 161;
int IF_ICMPGE = 162;
int IF_ICMPGT = 163;
int IF_ICMPLE = 164;
int IF_ACMPEQ = 165;
int IF_ACMPNE = 166;
int GOTO = 167;
int JSR = 168;
int RET = 169;
int TABLESWITCH = 170;
int LOOKUPSWITCH = 171;
int IRETURN = 172;
int LRETURN = 173;
int FRETURN = 174;
int DRETURN = 175;
int ARETURN = 176;
int RETURN = 177;
int GETSTATIC = 178;
int PUTSTATIC = 179;
int GETFIELD = 180;
int PUTFIELD = 181;
int INVOKEVIRTUAL = 182;
int INVOKESPECIAL = 183;
int INVOKESTATIC = 184;
int INVOKEINTERFACE = 185;
int INVOKEDYNAMIC = 186;
int NEW = 187;
int NEWARRAY = 188;
int ANEWARRAY = 189;
int ARRAYLENGTH = 190;
int ATHROW = 191;
int CHECKCAST = 192;
int INSTANCEOF = 193;
int MONITORENTER = 194;
int MONITOREXIT = 195;
int MULTIANEWARRAY = 197;
int IFNULL = 198;
int IFNONNULL = 199;
}
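To make the link between a number and its mnemonic concrete, here is a small sketch that maps a handful of the values from the table back to their names. It is only an illustration: the class OpcodeNames and the method mnemonic are made up for this example, and only seven opcodes are filled in.

```java
import java.util.HashMap;
import java.util.Map;

public class OpcodeNames {
    // A few entries copied from the Opcodes table above.
    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put(0, "NOP");
        NAMES.put(4, "ICONST_1");
        NAMES.put(96, "IADD");
        NAMES.put(104, "IMUL");
        NAMES.put(153, "IFEQ");
        NAMES.put(167, "GOTO");
        NAMES.put(177, "RETURN");
    }

    /** Returns the mnemonic for an opcode value, or a placeholder if it is not in our small table. */
    static String mnemonic(int opcode) {
        return NAMES.getOrDefault(opcode, "UNKNOWN_" + opcode);
    }

    public static void main(String[] args) {
        for (int op : new int[]{4, 4, 96, 177}) {
            System.out.println(op + " -> " + mnemonic(op));
        }
    }
}
```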
Byte-code Patterns & Structures:
Before writing an updater, it is crucial to understand a few blocks and patterns that commonly occur in any Java application. These blocks are: "local variables & parameters", "labels & addresses", "branch-statements", "while-loops", "for-loops", "comparisons", and "assignments".
1. Local Variables & Parameters:
Java Code:
//METHOD: m(ZB)Lcc;
cc m(boolean var1, byte var2) {
int j = 2048;
int k = this.f;
int m = 1;
return inst;
}
Byte-code:
Java Code:
//METHOD: m(ZB)Lcc;
sipush 2048
istore_3 //Store the value 2048 into var 3 -- aka j.
aload_0
getfield f //load the value of field f.
istore 4 //store it in var4 -- aka k.
iconst_1 //load the constant "1"
istore 5 //store it in var 5 -- aka m.
aload_0
getfield inst //load the field "inst" and return it.
areturn
So what happened here? Why are j, k and m numbered instead of being referred to by name? The answer is simple. Local variables LOSE their names. Instead, the variables are numbered in the order they are declared, and that includes parameters, because parameters are LOCAL to that function.
The first parameter is "local var1", the second parameter is "local var2", and any variables declared within the function continue the numbering: "local var(last_param + 1)", "local var(last_var + 1)" and so on. The numbering starts at 1 here because the method is not static: var0 holds "this", which is why aload_0 appears before every getfield.
Now there's one catch to this rule:
Java:
Java Code:
void meh(int a, int b) {
int j = 2048;
int k = this.f;
if (bleh) {
int m = 1;
int L = 5;
}
int z = 3;
}
What happens here? The variables start going in order:
java Code:
a = var1
b = var2
j = var3
k = var4
m = var5
L = var6
z = var5
But you've probably noticed that z and m are both labelled var5.. Why?! Well that's because when a variable dies within its scope, its name is free to use for anything out of scope. Variables "m" and "L" no longer exist once it reaches the end of that if-statement. That's because they are local to that if-statement. They don't exist outside that if-statement and so their numbers go back into the pool to be used.
Think of it like this:
Java Code:
void meh(int a, int b) {
int j = 2048;
int k = this.f;
if (bleh) {
int m = 1; //variable called M inside this if-statement.
int L = 5;
} //variable M inside the if-statement dies here..
int m = 3; //we are allowed to name this variable "M" also because there's no such variable named "M" by the time we get here.
}
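The numbering rule for parameters can be sketched in a few lines. This is a simplified illustration, not updater code: the class LocalSlots and the method parameterSlots are made up for this example. It also covers two details not shown above: static methods have no "this", so their first parameter is var0, and long and double values take up two slots each.

```java
public class LocalSlots {
    /**
     * Returns the local variable number of each parameter.
     * isStatic: whether the method is static (no "this" in var0).
     * descs: the byte-code signature of each parameter, e.g. "I", "J", "Lcc;".
     */
    static int[] parameterSlots(boolean isStatic, String... descs) {
        int[] slots = new int[descs.length];
        int next = isStatic ? 0 : 1; // var0 holds "this" in non-static methods
        for (int i = 0; i < descs.length; i++) {
            slots[i] = next;
            // long (J) and double (D) occupy two slots, everything else one
            next += (descs[i].equals("J") || descs[i].equals("D")) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        // m(ZB)Lcc; from the example above: var1 and var2
        System.out.println(java.util.Arrays.toString(parameterSlots(false, "Z", "B")));
        // a static method taking (int, long, int)
        System.out.println(java.util.Arrays.toString(parameterSlots(true, "I", "J", "I")));
    }
}
```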
2. Labels & Addresses:
Like Simba, Java byte-code contains goto/jump statements. Take a "GOTO" statement with a label, such as below:
Java Code:
ALOAD_0
GETFIELD x
LDC 2036719157
IMUL
ISTORE
GOTO L1
This would translate to something like:
Java Code:
var0 = x * 2036719157
GOTO L1 //Although Java doesn't explicitly allow GOTO so see the pascal example below.. I need a better Java example.
Simba Code:
var
var0: Integer;
x: Integer;
procedure foo;
label L1, L2, L3;
begin
var0 := x * 2036719157;
goto L1;
L2:
writeln('World');
goto L3;
L1:
writeln('Hello');
goto L2;
L3:
writeln('Exiting method.');
end;
begin
foo;
end.
As you can see, L1, L2 and L3 are labels, aka addresses in memory. The code assigns a value to var0 and then immediately starts executing the instructions located at L1.
Thus the flow of the above pascal code goes like this:
Simba Code:
var0 := x * 2036719157;
writeln('Hello');
writeln('World');
writeln('Exiting method.');
It is the exact same in Java and there's nothing more to it than that. The Byte-code GOTO instruction does the exact same thing as the above Simba/Pascal code.
3. Branch-Statements:
The GOTO code as shown above is an unconditional branch statement. It means: jump to some address immediately, without thinking about it. The other form of branch statement is the conditional branch, which says: jump to some address immediately BUT only IF some condition is TRUE, OTHERWISE forget it and continue on as normal.
Most of you know this as the "IF" statement which is the same as a GOTO but with a "condition". The below are examples of that:
Byte-code:
Java Code:
ILOAD_2
ICONST_1
IF_ICMPEQ addr
//Other stuff here.
addr:
Equivalent:
Java Code:
if (var2 != 1) { //var2 = ILOAD_2. var2 != 1 is the condition. It will GOTO the "addr" IF the var2 == 1, otherwise it continues on.
//Other stuff here.
} //addr
GOTO Equivalent:
Java Code:
IF var2 == 1 GOTO addr //Notice that the sign is flipped to a "==" instead of "!="
//Other stuff here.
addr:
A more complicated if-statement found within the RS07 client would be:
Bytecode:
Java Code:
LDC 1955946639
ALOAD_0
GETFIELD d/au I
IMUL
IFEQ addr_9
ILOAD_2
ICONST_1
IF_ICMPEQ addr_10
//Do stuff here..
addr_10:
addr_9:
Java Equivalent:
Java Code:
if (1955946639 * this.au != 0) { //IFEQ jumps to addr_9 when the product is 0.
if (var2 != 1) {
//Do stuff here..
} //addr_10
} //addr_9
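Since the sign flips every time you go from a jump back to an if-statement, it can help to write the flipped pairs down once. The sketch below does that for the integer compare-and-jump instructions. It assumes the jump is there to SKIP the body of the if-statement, as in the examples above; an obfuscator is free to lay the code out differently. The class BranchConditions and the method javaCondition are made up for this example.

```java
public class BranchConditions {
    /**
     * Given a compare-and-jump mnemonic that skips an if-body,
     * returns the operator you would write in the Java if-statement.
     */
    static String javaCondition(String jump) {
        switch (jump) {
            case "IF_ICMPEQ": return "!="; // jump away when equal, so the body runs when NOT equal
            case "IF_ICMPNE": return "==";
            case "IF_ICMPLT": return ">=";
            case "IF_ICMPGE": return "<";
            case "IF_ICMPGT": return "<=";
            case "IF_ICMPLE": return ">";
            default: throw new IllegalArgumentException("not a compare-and-jump: " + jump);
        }
    }

    public static void main(String[] args) {
        // ILOAD_2, ICONST_1, IF_ICMPEQ addr  ==>  if (var2 != 1) { ... }
        System.out.println("if (var2 " + javaCondition("IF_ICMPEQ") + " 1) { ... }");
    }
}
```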
4. While-Loops:
Java Code:
L1: //label for the while-loop.
iload_1 //boolean I.
ifeq L2 //If I != True, GOTO L2
//Other stuff here
goto L1 //Continue looping. GOTO L1
L2: //Return inst.
aload_0
getfield inst
areturn
Java Equivalent:
Java Code:
cc m(boolean i, byte j) {
while(i) {
//Other stuff here..
}
return inst;
}
As you can see, the above byte-code is a loop. It's equivalent to the following Pascal code (the M method):
Simba Code:
var
var0: Integer;
x: Integer;
type
CC = record
dummy: byte;
end;
Function m(i: Boolean; j: Byte): CC;
label L1, L2;
begin
L1:
if (I <> True) then
GOTO L2;
writeln('looping still..');
GOTO L1;
L2:
Result := [0];
end;
begin
m(true, 0);
end.
That's the exact same as:
Simba Code:
function m(I: Boolean; J: Byte): CC;
begin
while(I) do
begin
writeln('looping still..');
end;
Result := [0];
end;
begin
m(true, 0);
end.
You can see that it loops as long as "I" is true. This is how a While-Loop is implemented in byte-code and assembly.
5. For-Loops:
Byte-code:
Java Code:
iconst_0 //load 0 into var I.
istore_1 //int I = 0.
L1: //label for the loop.
iload_1 //load I.
iconst_5 //load 5.
if_icmpge L2 //while I < 5, keep looping.. otherwise goto the return statement.
//Otherstuff here..
iinc 1 1 //Increase I by 1.
goto L1 //keep looping..
L2:
return
Java:
Java Code:
void m() {
for (int i = 0; i < 5; ++i) {
//Other stuff here..
}
}
Pascal:
Simba Code:
procedure m();
label L1, L2;
var
I: Integer;
begin
I := 0; //load 0 into variable I.
L1:
if (I >= 5) then //while (I < 5), keep looping.. otherwise exit.
GOTO L2;
writeln('looping still..', I);
I := I + 1; //inc I, 1
GOTO L1;
L2:
exit(); //return.
end;
begin
m();
end.
That's the exact same as:
Simba Code:
procedure m();
var
I: Integer;
begin
for I := 0 To 4 Do
writeln('looping still..', I);
end;
begin
m();
end.
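If you want to convince yourself that the for-loop byte-code above really loops five times, you can step through it with a toy interpreter. The sketch below is heavily simplified and is NOT how the JVM is implemented: the instruction numbers (ICONST, ISTORE, etc.) are made-up constants rather than real OPCode values, labels are replaced by instruction indexes, and the class MiniInterpreter exists only for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MiniInterpreter {
    // Made-up instruction ids for this sketch (not the real OPCode values).
    static final int ICONST = 0, ISTORE = 1, ILOAD = 2, IF_ICMPGE = 3, IINC = 4, GOTO = 5, RETURN = 6;

    /** Runs the program and returns how many times IINC was executed (one per loop pass). */
    static int run(int[][] code) {
        int[] locals = new int[4];
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0, iterations = 0;
        while (true) {
            int[] insn = code[pc];
            switch (insn[0]) {
                case ICONST: stack.push(insn[1]); pc++; break;
                case ISTORE: locals[insn[1]] = stack.pop(); pc++; break;
                case ILOAD: stack.push(locals[insn[1]]); pc++; break;
                case IF_ICMPGE: {
                    int right = stack.pop(), left = stack.pop();
                    pc = (left >= right) ? insn[1] : pc + 1; // jump only if left >= right
                    break;
                }
                case IINC: locals[insn[1]] += insn[2]; iterations++; pc++; break;
                case GOTO: pc = insn[1]; break;
                case RETURN: return iterations;
                default: throw new IllegalStateException("unknown instruction " + insn[0]);
            }
        }
    }

    /** The for-loop from above, with L1 = index 2 and L2 = index 7. */
    static int[][] forLoop() {
        return new int[][]{
            {ICONST, 0},     // 0: iconst_0
            {ISTORE, 1},     // 1: istore_1
            {ILOAD, 1},      // 2: L1: iload_1
            {ICONST, 5},     // 3: iconst_5
            {IF_ICMPGE, 7},  // 4: if_icmpge L2
            {IINC, 1, 1},    // 5: iinc 1 1
            {GOTO, 2},       // 6: goto L1
            {RETURN},        // 7: L2: return
        };
    }

    public static void main(String[] args) {
        System.out.println("loop passes: " + run(forLoop()));
    }
}
```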
4. Client Structure:
Now that you have absorbed all of that information and are ready to learn about the RS client structure, things should be MUCH easier. You will learn what to look for, how the client is structured and how it works. What does what, etc..
We start by looking at a RS client refactor: https://github.com/Rabrg/refactored-.../com/runescape
One of the first things you'll notice is that it's grouped into categories. Each one of those folders is a category/package. This makes things a lot easier because you can decide what to look for first. Usually, you want to search for collections first.
Most collections inherit or use the NODE class as shown below:
Java Code:
public class Node {
    Node prev, next;
    long id;
}

public class LinkedList {
    Node head, current;
}

public class CacheableNode extends Node {
    CacheableNode prev, next;
}
So one of the first classes you should attempt to find is the NODE class. All other collections will use it or inherit from it. Now whenever you see a class with NO PARENT (no extends), such as "LinkedList" above, it AUTOMATICALLY inherits from java.lang.Object. All classes in Java inherit from java.lang.Object, so "class LinkedList" and "class LinkedList extends Object" are the same.
After finding the Node and all collections, you should find the BufferedStream, Rasteriser3D, and Actor.
Why? Because a LOT of classes use the BufferedStream class, and Animable allows you to find things such as "Actor", "NPC" and "Player", which in turn allow you to find everything to do with those classes, such as the position, the name, etc.
So what's the hierarchy for this?
Java Code:
public class Animable extends CacheableNode { //Animable is sometimes called "Renderable".
}
public class Item extends Animable {
}
public class Actor extends Animable {
}
public class Player extends Actor {
}
public class NPC extends Actor {
}
You can see how it all comes together and how the order in which you find things plays a role in making your life easier.
Another thing about the client is that static methods move around. In one revision, a static method might be in the Actor class and in another revision, it could be in the NPC class, the Node class, or whatever class Jagex's obfuscator puts it in. Thus, when looking for static methods, it is wise to search all the classes (that's what I do anyway).
It is the exact same thing for static fields. They move also. They may be in NPC in one revision or Player in another, or some random class in another revision.
Finally, we get to the dreaded multipliers and control flow. Multipliers are easy to find in a sense but when finding code WITH the multipliers, it can really be a PAIN. The reason is as follows:
Java Code:
void m() {
    if (d * 2147483646 < 36d) {
        System.out.println("DD");
    }
}
Bytecode:
Java Code:
aload_0
getfield d
ldc 2147483646
imul
i2d
ldc2_w 36.0
dcmpg
ifge 12
getstatic java/lang/System/out
ldc "DD"
invokevirtual java/io/PrintStream/println
return
HOWEVER, if the multiplier is switched around:
Java Code:
void m() {
    if (2147483646 * d < 36d) {
        System.out.println("DD");
    }
}
Bytecode:
Java Code:
ldc 2147483646
aload_0
getfield d
imul
i2d
ldc2_w 36.0
dcmpg
ifge 12
getstatic java/lang/System/out
ldc "DD"
invokevirtual java/io/PrintStream/println
return
Then the OPCodes for that IF statement are also in a different ORDER even though the code does the exact same thing! Thus you need to search the client for TWO+ different patterns just to find ONE field with an annoying multiplier.
To fix this, we usually deobfuscate the client so that all of the multipliers are fixed to be ONE way and one way only. That way, the same OPCode pattern is outputted every time. OR you can write a pattern finder that searches for it no matter the order.
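As a rough idea of the second option, here is a sketch of a finder that accepts the multiplication in either order. To keep it self-contained it works on plain int arrays of OPCodes instead of real instruction lists, with the values copied from the Opcodes table earlier. The class MultiplierPatterns and its methods are made up for this example, and it only knows the two orderings shown above.

```java
public class MultiplierPatterns {
    // Values copied from the Opcodes table.
    static final int LDC = 18, ALOAD = 25, IMUL = 104, GETFIELD = 180;

    // The same "field * multiplier" written both ways around.
    static final int[][] ORDERINGS = {
        {ALOAD, GETFIELD, LDC, IMUL}, // d * 2147483646
        {LDC, ALOAD, GETFIELD, IMUL}, // 2147483646 * d
    };

    /** Returns the index where pattern starts inside code, or -1 if it never occurs. */
    static int indexOf(int[] code, int[] pattern) {
        for (int i = 0; i + pattern.length <= code.length; i++) {
            int j = 0;
            while (j < pattern.length && code[i + j] == pattern[j]) j++;
            if (j == pattern.length) return i;
        }
        return -1;
    }

    /** Tries each known ordering in turn and returns the first hit, or -1 if none matches. */
    static int findMultiply(int[] code) {
        for (int[] pattern : ORDERINGS) {
            int i = indexOf(code, pattern);
            if (i != -1) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int I2D = 135;
        System.out.println(findMultiply(new int[]{ALOAD, GETFIELD, LDC, IMUL, I2D})); // first ordering
        System.out.println(findMultiply(new int[]{LDC, ALOAD, GETFIELD, IMUL, I2D})); // second ordering
    }
}
```

Deobfuscating first is still the cleaner route, because every new ordering you run into has to be added to the list by hand.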
Anyway, I have to leave some of the fun and exploring to you, the reader so.. on to the next topic.
5. About the ASM Library & JDK Internals.
Before we move onto actually analysing the client and finding patterns, we need to learn about the ASM library OR JDK Internals. I personally use the JDKInternals which is the exact same thing..
Let us take the NODE class as an example to start off.
Java Code:
public class Node { //ClassNode
Node prev, next; //FieldNode
long uid; //FieldNode
public boolean unlink() { //MethodNode -- ClassMethod
if (this.next != null) { //FieldInsnNode -- Field usage.
return true;
}
return false;
}
public void dummy_unlink() { //MethodNode -- Class Method.
unlink(); //MethodInsnNode -- Method Call.
int i = 0; //NOT a FieldInsnNode.. Just an ISTORE instruction.
}
}
Recognise anything yet? If you haven't, here's an explanation:
1. Every class is known as a "ClassNode".
2. Every FIELD declared in that class is known as a "FieldNode".
3. Every METHOD declared in that class is known as a "MethodNode".
4. Every use of a field WITHIN a method of that class is known as a "FieldInsnNode" (Field Instruction Node). Aka a field being used inside a method.
5. Every method call WITHIN a method of that class is known as a "MethodInsnNode" (Method Instruction Node). Aka a function call within a method.
6. Load the client & Writing Analysers..
Here is where we use all of that knowledge above and dive right into making an updater. The first thing we must decide is the structure of our updater and how we store the things that we find. We need to decide this even before we read the RS jar.
For this, I will use the following structures for holding Class and Field information:
Java Code:
import java.util.ArrayList;

public class JField {
    private String name;     //byte-code name of the field we found.
    private String desc;     //is it an int? a bool? a Node? what is it.
    private String id;       //human readable name of the field we found. "Player Name"? "NPC Name"? What?
    private long multiplier; //multiplier if any. 0 otherwise.

    public JField(String id, String name, String desc) {
        this(id, name, desc, 0);
    }

    public JField(String id, String name, String desc, long multiplier) {
        this.id = id;
        this.name = name;
        this.desc = desc;
        this.multiplier = multiplier;
    }

    public String getId() {
        return id;
    }

    public String getDesc() {
        return desc;
    }

    public String getName() {
        return name;
    }

    public long getMultiplier() {
        return multiplier;
    }

    @Override
    public String toString() {
        if (multiplier == 0) {
            return " " + id + " -> " + name;
        }
        return " " + id + " -> " + name + " * " + String.valueOf(multiplier);
    }
}

public class JInfo {
    private String id;                //Human readable name of the class we found. Is it "NPC"? "Player"? What?
    private String name;              //Name of the class in the jar file and byte-code.
    private ArrayList<JField> fields; //List of fields we found for this class.

    public JInfo(String id, String name) {
        this.id = id;
        this.name = name;
        this.fields = new ArrayList<>();
    }

    public String getId() { //used by ClassParser as the key for this class.
        return id;
    }

    public String getName() {
        return name;
    }

    public void addField(JField field) {
        int i = 0;
        for (; i < fields.size(); ++i) {
            if (fields.get(i).getId().equalsIgnoreCase(field.getId())) {
                fields.remove(i); //replace an existing field with the same id.
                break;
            }
        }
        fields.add(i, field);
    }

    public JField getField(String id) {
        for (JField field : fields) {
            if (field.getId().equalsIgnoreCase(id)) {
                return field;
            }
        }
        return null;
    }

    @Override
    public String toString() {
        String result = "Class " + this.id + ": " + this.name + "\n";
        result += "-------------------------------------------------------\n";
        for (JField field : fields) {
            result += field + "\n";
        }
        return result + "\n";
    }
}
Next, we need to PARSE the jar file into a list of classes we can operate on. For this, I'll be using the following code:
Java Code:
import jdk.internal.org.objectweb.asm.ClassReader;
import jdk.internal.org.objectweb.asm.tree.ClassNode;
import java.io.*;
import java.util.ArrayList;
import java.util.jar.*;

public class JarReader {
    private ArrayList<ClassNode> classes = null;
    private String path = null;

    public JarReader(String path) { //path to the jar file. "C:/Users/****/Desktop/Revision68.jar"
        this.path = path;
    }

    public ArrayList<ClassNode> getClasses() {
        return classes;
    }

    public ArrayList<ClassNode> load() {
        try {
            JarFile file = new JarFile(path);
            classes = readJar(file);
            file.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return classes;
    }

    private ArrayList<ClassNode> readJar(JarFile file) { //Parse the jar for ONLY class files.
        ArrayList<ClassNode> result = new ArrayList<>();
        file.stream().forEach(entry -> {
            if (entry.getName().endsWith(".class")) { //If it is a class file.
                try {
                    ClassReader reader = new ClassReader(file.getInputStream(entry));
                    ClassNode node = new ClassNode();
                    reader.accept(node, ClassReader.SKIP_DEBUG | ClassReader.SKIP_FRAMES);
                    result.add(node);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        });
        return result;
    }
}
The above will allow us to parse a jar file and store all its classes into an ArrayList. We will use this ArrayList to go through each class and identify it, find methods, and patterns.
Now we need a way of analysing the classes and storing the information into our structures above. For this, I will use two classes as shown below:
Java Code:
import jdk.internal.org.objectweb.asm.tree.ClassNode;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;

public abstract class Analyser {
    /** All sub classes must implement their own find method on the given list of nodes. **/
    public abstract ClassNode find(Collection<ClassNode> nodes);

    /** All sub classes must implement their own identify method on the given node. **/
    public abstract JInfo identify(ClassNode node);
}

public class ClassParser {
    private static final String jar_location = "Gamepacks/Revision68.jar";
    private ArrayList<ClassNode> classes = null;       //holds all classes from the jar file.
    private LinkedHashMap<String, JInfo> found = null; //holds all classes we've identified.
    private ArrayList<Analyser> analysers = null;      //holds the analysers that we will run on the jar classes.

    public ClassParser() {
        classes = new JarReader(jar_location).load(); //load all the classes from the jar.
        found = new LinkedHashMap<>();
        analysers = new ArrayList<>();
        loadAnalysers(); //load all analysers into our list.
    }

    public void analyse() {
        analysers.stream().forEach(a -> {
            ClassNode n = a.find(classes); //call Analyser.find and pass it the list of classes.
            if (n != null) { //if the analyser identifies its class.
                JInfo info = a.identify(n); //call method identify and pass it the class node.
                found.put(info.getId().toLowerCase(), info); //Add "HumanReadableClassName, JInfo".
            } else { //else, print the name of the analyser that failed to identify its class.
                System.out.println("Failed to find: " + a.getClass().getSimpleName());
            }
        });
    }

    private void loadAnalysers() {
        analysers.add(new NodeAnalyser()); //add more "collection" analysers other than Node and CacheableNode..
        analysers.add(new CacheableNodeAnalyser());
        analysers.add(new BufferedStreamAnalyser());
        analysers.add(new AnimableAnalyser()); //same as Renderable.
        analysers.add(new ActorAnalyser());
        analysers.add(new PlayerAnalyser());
        analysers.add(new ClientAnalyser());
    }
}
What does the above code do? First we created an abstract class. This allows us to store all of its child classes in a single list and call find/identify on each one without caring which analyser it is (this is known as polymorphism). Any class that is an analyser must implement all of the abstract class's methods.
Next, we create a class for parsing the jar, loading the analysers and analysing each class within that jar. That's a mouthful to swallow but it works as follows:
1. Load the Jar (in the constructor).
2. Load the Analysers (in the constructor).
3. Call Analyser.find(list_of_classes_from_the_jar); to identify which class we are analysing.
4. If Analyser.find returns a ClassNode and not NULL, then we can further analyse that class for fields.
5. Call Analyser.identify. This method will identify as many fields and methods as it can. It will return JInfo that contains the information it identified.
6. Store the identified class and its info into the "found" map.
7. If all else fails, print the name of the class we tried and failed to analyse..
7. Identifying classes and fields.
We now need to use what we have written above to start finding classes and identifying fields. As discussed earlier, it is customary to start with the Node class first. That's what we'll do. However, we need a way of searching a method for a pattern. Let's write a class to find patterns for us.
Finder (extremely dumbed down/simplified for this tutorial):
Java Code:
import jdk.internal.org.objectweb.asm.Opcodes;
import jdk.internal.org.objectweb.asm.tree.*;
public class Finder {
private InsnList instructions = null;
public Finder(MethodNode method) {
this.instructions = method.instructions; //get ALL the instructions for the specified method..
}
/** Search that method for a pattern (array of OPCodes) **/
public int findPattern(int sub[]) {
return findPattern(instructions.toArray(), sub);
}
/**
Translated from C++'s std::search.. Searches for an array in an array.
The ideal findPattern method would allow WILDCARDs, OPTIONALs, Skip OR Follow GOTOs, etc..
But again, there's only so much that fits in a tutorial.. A basic Pattern-Search will work for MOST classes and fields.
**/
private int findPattern(AbstractInsnNode arr[], int sub[]) {
for (int i = 0; i + sub.length <= arr.length; ++i) { //try every possible starting position.
int l = 0;
while (l < sub.length && arr[i + l].getOpcode() == sub[l]) {
++l;
}
if (l == sub.length) return i; //index of the first instruction of the match.
}
return -1; //pattern not found.
}
}
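To give an idea of what the WILDCARD feature mentioned in the comment could look like, here is a sketch on plain int arrays. It is not part of the Finder above: the class WildcardSearch is made up for this example, and the marker value -2 is an arbitrary choice (picked because -1 is already what getOpcode() reports for labels, line numbers and frames).

```java
public class WildcardSearch {
    /** Hypothetical marker meaning "any single instruction may go here". */
    static final int WILDCARD = -2;

    /** Returns the index where pattern first matches inside code, or -1 if it never matches. */
    static int find(int[] code, int[] pattern) {
        for (int i = 0; i + pattern.length <= code.length; i++) {
            int j = 0;
            while (j < pattern.length && (pattern[j] == WILDCARD || code[i + j] == pattern[j])) {
                j++;
            }
            if (j == pattern.length) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int ALOAD = 25, GETFIELD = 180, LDC = 18, IMUL = 104; // values from the Opcodes table
        int[] method = {ALOAD, GETFIELD, LDC, IMUL};
        // "ALOAD, anything, anything, IMUL"
        System.out.println(find(method, new int[]{ALOAD, WILDCARD, WILDCARD, IMUL}));
    }
}
```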
Now we can start looking for some patterns. As shown previously, the Node class looks like:
Java Code:
public class Node {
Node prev, next;
long id;
}
Prev and Next are both of type "Node", meaning that this class contains two fields of its own type plus one long. We are thus going to search all classes and see if they contain "NameOfClass Prev", "NameOfClass Next" and "Long ID".
This is done as follows (using our Analyser class):
Java Code:
import java.util.Collection;
import jdk.internal.org.objectweb.asm.Opcodes;
import jdk.internal.org.objectweb.asm.tree.*;

public class NodeAnalyser extends Analyser { //all analysers must extend Analyser as we have described above.
    @Override
    public ClassNode find(Collection<ClassNode> nodes) { //all analysers must override and implement their own "find" function as per the "Analyser requirements" described above.
        for (ClassNode n : nodes) { //For each class node..
            if (!n.superName.equals("java/lang/Object")) { //if the class parent/super is not the default (all classes inherit java/lang/Object by default)
                continue; //then skip it.. we only want classes that have no parent/super.
            }

            int long_count = 0, node_count = 0; //To identify this class, I'm going to count the fields that are "long" and the fields that are "nodes (self)".
            for (FieldNode f : n.fields) { //for each "field" in the class. Remember that fields inside the class are called "FieldNodes".
                if ((f.access & Opcodes.ACC_STATIC) == 0) { //if it is NOT static.
                    if (f.desc.equals(String.format("L%s;", n.name))) { //Here is where we use our knowledge from chapter 1: Lclass_name; If the field is a "NameOfThisClass"
                        ++node_count; //increase the node count.. might be "prev" or "next"
                    } else if (f.desc.equals("J")) { //else if the field is a "long"
                        ++long_count; //increase the long count.. we might have found the ID.
                    }
                }
            }

            if (long_count == 1 && node_count == 2) { //if the long count = 1 (id) and the node count = 2 (prev, next)
                return n; //we have found the node class!
            }
        }
        return null;
    }

    @Override
    public JInfo identify(ClassNode node) { //all analysers must implement their own identify method.
        JInfo info = new JInfo("Node", node.name); //This class is the "NODE" class! We store "Node" as the human-readable name and node.name as the obfuscated name (actual class name).
        info.addField(findID(node)); //find the ID field.
        info.addField(findNext(node)); //find the Next field.
        info.addField(findPrev(node, info.getField("Next"))); //find the Previous field using the "Next" field. If we have found "id" and "next", the only other field left is "prev".. /Logic.
        return info; //return the container of fields we have identified + the class we've identified.
    }

    private JField findID(ClassNode node) {
        for (FieldNode f : node.fields) { //Remember that class fields are called "FieldNode".
            if ((f.access & Opcodes.ACC_STATIC) == 0) { //if the field is NOT static..
                if (f.desc.equals("J")) { //if the field is a "LONG"
                    return new JField("UID", f.name, f.desc); //then we have found the "ID" field. Store: ("UID", byte_code_field_name, "J") -- "J" because it's a long.
                }
            }
        }
        return new JField("UID", "Broken-Hook", null); //otherwise, we failed to find the "ID" field. Return a blank field.
    }

    private JField findNext(ClassNode node) { //Remember that methods inside a class are called "MethodNode".
        for (MethodNode m : node.methods) { //to identify which of the nodes is the "prev" or "next" node, we need to search the methods..
            if (m.desc.equals("()Z")) { //For this, I will search the "UNLINK" method which has no parameters and returns a boolean.
                int i = new Finder(m).findPattern(new int[]{Opcodes.ALOAD, Opcodes.GETFIELD, Opcodes.IFNONNULL}); //Search for: if (field != null)
                if (i != -1) { //if I found the above pattern. Remember that fields inside a method are called "FieldInsnNode".
                    FieldInsnNode next = (FieldInsnNode) m.instructions.get(i + 1); //then the "Getfield" is the "next" field.
                    return new JField("Next", next.name, next.desc); //return ("Next", byte_code_field_name, "LNode;").
                }
            }
        }
        return new JField("Next", "N/A", null); //Else, we failed to find the "Next" field..
    }

    private JField findPrev(ClassNode node, JField next) {
        for (FieldNode n : node.fields) { //Search all the fields in the class.. Fields inside a class are called "FieldNode".
            if (!n.name.equals(next.getName()) && n.desc.equals(next.getDesc())) { //if the field is a node AND the field is NOT the "Next" node.. then it has to be the "Prev" node.
                return new JField("Prev", n.name, n.desc); //return ("Prev", byte_code_field_name, "LNode;").
            }
        }
        return new JField("Prev", "N/A", null); //Else, we failed to find the "Prev" field..
    }
}
The next class we want to identify is the Actor class (I'm not going in order.. Assume that I have already found all the requirements to identify this class..).
Java Code:
public class Actor extends Analyser {
    @Override
    public ClassNode find(Collection<ClassNode> nodes) {
        //Go through all classes..
        //If the class inherits from Animable/Renderable..
        //and the class is public & abstract..
        int mask = Opcodes.ACC_ABSTRACT | Opcodes.ACC_PUBLIC;
        for (ClassNode n : nodes) {
            if (!n.superName.equals(Main.get("Animable")) || ((n.access & mask) != mask)) { //skip anything that is not BOTH public and abstract.
                continue;
            }
            int int_arr_count = 0, str_count = 0;
            for (FieldNode f : n.fields) {
                if (f.desc.equals("[I") && ((f.access & Opcodes.ACC_STATIC) == 0)) { //count the non-static int arrays.
                    ++int_arr_count;
                } else if (f.desc.equals("Ljava/lang/String;") && ((f.access & Opcodes.ACC_STATIC) == 0)) { //count the non-static strings.
                    ++str_count;
                }
            }
            if (str_count == 1 && int_arr_count >= 5) { //check the number of strings and int arrays..
                return n; //return that we have found the actor class..
            }
        }
        return null;
    }

    @Override
    public JInfo analyse(ClassNode node) {
        JInfo info = new JInfo("Actor", node.name);
        info.addField(findSpokenText(node));
        info.addField(findAnimationID(node));
        return info;
    }

    private JField findSpokenText(ClassNode node) {
        for (FieldNode f : node.fields) {
            if (f.desc.equals("Ljava/lang/String;")) { //the only string FIELD is the overhead (spoken) text.
                return new JField("SpokenText", f.name, f.desc);
            }
        }
        return new JField("SpokenText", "N/A");
    }

    private JField findAnimationID(ClassNode node) {
        for (MethodNode m : node.methods) { //for each method in the class, we're going to search its instructions for a pattern.
            if (m.desc.equals("(IIZI)V") && ((m.access & Opcodes.ACC_FINAL) != 0)) { //if the method has the signature: final void method_name(int, int, bool, int);
                int i = new Finder(m).findPattern(new int[]{Opcodes.ALOAD, Opcodes.GETFIELD, Opcodes.LDC, Opcodes.IMUL}); //Search for: var * multiplier
                if (i != -1) {
                    FieldInsnNode f = (FieldInsnNode) m.instructions.get(i + 1); //GETFIELD is the AnimationID field.
                    long multi = (int) ((LdcInsnNode) m.instructions.get(i + 2)).cst; //LDC is the Multiplier.
                    return new JField("AnimationID", f.name, f.desc, multi); //we have found the AnimationID field and its multiplier!
                }
            }
        }
        return new JField("AnimationID", "N/A");
    }
}
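The access checks in find() are plain bit-mask tests, and it is easy to get them backwards. Here is a minimal standalone sketch of the two kinds of test. The class and helper names (AccessFlags, hasAll, hasAny) are made up for this example; the flag values are the standard JVM ones, which are the same numbers the Opcodes.ACC_* constants hold.

```java
public class AccessFlags {
    // Standard JVM access flag values (the same numbers as Opcodes.ACC_*).
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_STATIC = 0x0008;
    static final int ACC_ABSTRACT = 0x0400;

    // True only if EVERY bit in "mask" is set in "access" (e.g. public AND abstract).
    static boolean hasAll(int access, int mask) {
        return (access & mask) == mask;
    }

    // True if AT LEAST ONE bit in "mask" is set in "access" (e.g. public OR abstract).
    static boolean hasAny(int access, int mask) {
        return (access & mask) != 0;
    }

    public static void main(String[] args) {
        int publicAbstract = ACC_PUBLIC | ACC_ABSTRACT; // hypothetical class flags
        int publicOnly = ACC_PUBLIC;                    // hypothetical field flags

        System.out.println(hasAll(publicAbstract, ACC_PUBLIC | ACC_ABSTRACT)); // both bits set
        System.out.println(hasAll(publicOnly, ACC_PUBLIC | ACC_ABSTRACT));     // abstract bit missing
        System.out.println(!hasAny(publicOnly, ACC_STATIC));                   // "non-static" means the bit is clear
    }
}
```

So "public & abstract" wants the hasAll form, and "non-static" wants the static bit to compare equal to 0.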
And now to find what revision our jar is:
Java Code:
public class ClientAnalyser extends Analyser {
    @Override
    public ClassNode find(Collection<ClassNode> nodes) {
        for (ClassNode n : nodes) { //go through all the classes..
            if (n.name.equals("client")) { //if the name of the class is "client".. we have found the client class.. lol..
                return n; //Client class is found.
            }
        }
        return null;
    }

    @Override
    public JInfo analyse(ClassNode node) {
        JInfo info = new JInfo("Client", node.name); //We found the client class.. Save its info.
        info.addField(findVersion(node)); //add the "revision" field to the list of found "things".
        return info;
    }

    private JField findVersion(ClassNode node) {
        for (MethodNode m : node.methods) { //go through all the methods to find the "init" method..
            if (m.name.equals("init") && m.desc.equals("()V")) { //if this is the client's init() method (a real constructor would be named "<init>" in byte-code)..
                int i = new Finder(m).findPattern(new int[]{Opcodes.SIPUSH, Opcodes.BIPUSH}); //find SIPUSH followed by a BIPUSH.
                if (i != -1) { //only read the instruction if the pattern was actually found.
                    IntInsnNode revision = (IntInsnNode) m.instructions.get(i + 1); //the BIPUSH (aka byte) is the client revision.
                    return new JField("Revision", String.valueOf(revision.operand), "I"); //client revision is an integer.
                }
            }
        }
        return new JField("Revision", "N/A"); //failed to find the client revision.
    }
}
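If you want to play with the revision search without loading a jar, here is a rough standalone sketch of the same idea. It is NOT the Finder class from this tutorial: the Insn class, findPattern and findRevision below are stand-ins written just for this example, and the instruction stream in main (including the 765 and 503 operands) is invented. The opcode numbers are the real JVM values. Note that the search result is checked for -1 before anything is read at i + 1.

```java
import java.util.Arrays;
import java.util.List;

public class RevisionSearch {
    // Real JVM opcode values (the same numbers the Opcodes constants hold).
    static final int BIPUSH = 16;
    static final int SIPUSH = 17;
    static final int ALOAD = 25;

    // Hypothetical stand-in for an instruction node: an opcode plus an integer operand.
    static final class Insn {
        final int opcode;
        final int operand;

        Insn(int opcode, int operand) {
            this.opcode = opcode;
            this.operand = operand;
        }
    }

    // Returns the index where "pattern" first matches, or -1 if it never matches.
    static int findPattern(List<Insn> insns, int[] pattern) {
        for (int i = 0; i + pattern.length <= insns.size(); i++) {
            boolean match = true;
            for (int j = 0; j < pattern.length && match; j++) {
                match = insns.get(i + j).opcode == pattern[j];
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    // Returns the operand of the BIPUSH that follows a SIPUSH, or -1 if the pattern is absent.
    static int findRevision(List<Insn> insns) {
        int i = findPattern(insns, new int[]{SIPUSH, BIPUSH});
        if (i == -1) {
            return -1; // nothing found, so do not touch i + 1
        }
        return insns.get(i + 1).operand;
    }

    public static void main(String[] args) {
        // Invented stream: aload 0, sipush 765, sipush 503, bipush 69.
        List<Insn> init = Arrays.asList(
                new Insn(ALOAD, 0), new Insn(SIPUSH, 765), new Insn(SIPUSH, 503), new Insn(BIPUSH, 69));
        System.out.println("Revision -> " + findRevision(init));
    }
}
```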
That should be enough to get you motivated to find the classes (CacheableNode, Animable/Renderable, BufferedStream) so that you can find the Actor and Player class and some more fields.
8. Byte-code Deobfuscation.
As shown earlier, a multiplier can be found on either side of an expression: on the left or on the right. For example:
Java Code:
int var0 = var1 * multiplier;
//OR:
int var0 = multiplier * var1;
And such an expression can be found anywhere. It can appear in if-statements, for-loops, while-loops, ANYWHERE. In order to find some fields without your updater breaking every few revisions, you need to deobfuscate some of these. By that, I mean that you need to have them all in one specific order. The order is up to you. You may have it like the first example or you can have it like the second example.. just don't keep both, as that defeats the purpose.
For the above example, the byte-code can take many different forms depending on the multiplier and the variable.
If it's a non-static variable, it can be:
Java Code:
//int i = multiplier * var0.
LDC
ALOAD
GETFIELD
IMUL
ISTORE
This says: LoadConstant first.. then load the Variable (field) and multiply the two of them using integer multiplication. The last instruction says to store it in an integer variable.
You'd have to switch the multiplier to the other side by changing the instructions as follows (or vice-versa depending on how you like your instructions served):
Java Code:
//int i = var0 * multiplier.
ALOAD
GETFIELD
LDC
IMUL
ISTORE
Notice that the "LoadConstant" instruction has been moved so that it: LoadVariable (field) first.. then the Constant and finally it multiplies them using integer multiplication. The last instruction says to store it in an integer variable.
If it's static, the ALOAD and GETFIELD would be replaced by a single instruction: "GETSTATIC". Example:
Java Code:
//int i = multiplier * var0.
LDC
GETSTATIC
IMUL
ISTORE
To switch the order of the multiplier for the above example, you can use the following code (for the above example only because there can be many different instructions for the code [int i = multiplier * var0]):
Java Code:
public class MultiplierFix {
    private ArrayList<ClassNode> classes;

    public MultiplierFix(ArrayList<ClassNode> classes) {
        this.classes = classes;
    }

    public void normalise() {
        //for each Class in our ArrayList, for each Method in that class, call "fix(method)".
        classes.forEach(c -> c.methods.forEach(m -> fix(m)));
    }

    private void fix(MethodNode method) {
        //We don't care about where the result is stored. So do not search for the ISTORE instruction.
        int[] pattern = new int[]{
            Opcodes.LDC, Opcodes.ALOAD, Opcodes.GETFIELD, Opcodes.IMUL
        };
        int i = new Finder(method).findPattern(pattern);
        while (i != -1) {
            //Move LDC (i) to AFTER the GETFIELD (i + 2).
            AbstractInsnNode ldc = method.instructions.get(i);
            AbstractInsnNode getfield = method.instructions.get(i + 2);
            method.instructions.remove(ldc); //remove the LDC from the beginning first (a node must not be in the list twice)..
            method.instructions.insert(getfield, ldc); //..then put it back right after the GETFIELD.
            //We have now successfully changed: ldc, aload, getfield, imul to: aload, getfield, ldc, imul.
            i = new Finder(method).findPattern(pattern, i + 1); //find the next occurrence.
        }
    }
}
You can do this for MANY different types of multipliers. Not all multipliers are LDC. Some are BIPUSH, ICONST, or SIPUSH.. Not all multipliers are IMUL (integer multiplication) either.. some are LMUL (long multiplication).
So on and so forth.. The above is a very naive way of doing it as others use advanced methods in their updater (trees).. However, the above is simple to understand and great for starters (I actually still use the above.. I just have a lot more patterns than that ONE above).
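To give an idea of what "a lot more patterns" can look like, here is a rough sketch that handles several constant and multiply instructions in one pass. To keep it runnable on its own it works on a plain list of mnemonic strings instead of real instruction nodes, so MultiplierOrder, normalise and fieldLoadLength are names invented for this example and not part of any library. It only knows LDC, BIPUSH and SIPUSH as constants; the ICONST variants would be added the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MultiplierOrder {
    // Instructions that can push the multiplier, and instructions that can multiply.
    static final Set<String> CONSTANTS = new HashSet<>(Arrays.asList("LDC", "BIPUSH", "SIPUSH"));
    static final Set<String> MULTIPLIES = new HashSet<>(Arrays.asList("IMUL", "LMUL"));

    // Rewrites every "constant, field load, multiply" run into "field load, constant, multiply".
    // Returns how many runs were changed.
    static int normalise(List<String> insns) {
        int changed = 0;
        for (int i = 0; i < insns.size(); i++) {
            if (!CONSTANTS.contains(insns.get(i))) {
                continue;
            }
            int loadLength = fieldLoadLength(insns, i + 1);
            int mul = i + 1 + loadLength; // index where the multiply should be
            if (loadLength > 0 && mul < insns.size() && MULTIPLIES.contains(insns.get(mul))) {
                String constant = insns.remove(i); // take the constant out first..
                insns.add(mul - 1, constant);      // ..then put it back just before the multiply.
                changed++;
            }
        }
        return changed;
    }

    // Length of the field load starting at "start": 1 for GETSTATIC, 2 for ALOAD + GETFIELD, 0 for anything else.
    static int fieldLoadLength(List<String> insns, int start) {
        if (start < insns.size() && insns.get(start).equals("GETSTATIC")) {
            return 1;
        }
        if (start + 1 < insns.size() && insns.get(start).equals("ALOAD") && insns.get(start + 1).equals("GETFIELD")) {
            return 2;
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> insns = new ArrayList<>(Arrays.asList("LDC", "GETSTATIC", "IMUL", "ISTORE"));
        int changed = normalise(insns);
        System.out.println("Changed: " + changed + " -> " + insns);
    }
}
```

Swapping the two operands is safe here because multiplication gives the same result either way and neither load has side effects.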
The same technique can be used to remove fields, exceptions, etc.. To remove redundant fields, you need to go through every class' methods and check whether a given field is used at all. If it is never used, you may call:
Java Code:
cls.fields.remove(f); //f is the field to remove. cls is the class that contains it.
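The "is it used at all" check can be sketched like this. Again this is a simplified model so that it runs by itself: fields are written as "owner.name" strings, and each method is just the list of field references found in it (in a real updater you would collect those from the owner and name of every FieldInsnNode, across ALL classes, not only the class that declares the field). UnusedFields and findUnused are names made up for this example, and "gn.zz" is an invented field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnusedFields {
    // Returns the declared fields ("owner.name") that no method refers to.
    static List<String> findUnused(List<String> declared, List<List<String>> methodFieldRefs) {
        Set<String> used = new HashSet<>();
        for (List<String> refs : methodFieldRefs) {
            used.addAll(refs); // every field touched by this method counts as used
        }
        List<String> unused = new ArrayList<>();
        for (String field : declared) {
            if (!used.contains(field)) {
                unused.add(field); // candidate for cls.fields.remove(f)
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        List<String> declared = Arrays.asList("gn.ef", "gn.ew", "gn.zz");
        List<List<String>> methods = Arrays.asList(
                Arrays.asList("gn.ef"),
                Arrays.asList("gn.ew", "gn.ef"));
        System.out.println("Unused: " + findUnused(declared, methods));
    }
}
```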
9. Tools.
To make things easier, there are a few tools out there that really help a lot when going through the client's code. One of them is called a byte-code editor. This allows you to view the byte-code for classes:
1. CJBE (Continued Java Byte-code Editor): https://github.com/contra/CJBE/tree/...facts/CJBE_jar
2. FERNFLOWER. Fernflower is a DECOMPILER (this means it can turn byte-code back into human-readable Java code, as close to its original form as possible).
Fernflower can be found here: http://forum.xda-developers.com/atta...8&d=1354655856
I ran it with the command line:
Bash Code:
java -jar fernflower.jar -dgs=true Revision69.jar Revision69Decompiled
The idea is simple with these tools. You run fernflower on the jar. You open a tab in your browser and look at the refactored class. Next you look at the fernflower classes and then look at the byte-code for that class using CJBE. Write your analyser code to find that byte-code. Sometimes, if a class is hard to find, look at the reflection include, see which class is the one you want and open the corresponding fernflower-decompiled class. Write code to find that class based on the output of CJBE and your knowledge of the above information..
10. The End.
Sample Updater:
Sample Output:
Java Code:
Using cached jar..
Deobfuscating Multipliers..
Changed: 3180 multipliers of 4760
Node: gn
------------------------------------------
UID -> ef
Next -> ew
Prev -> ep
CacheableNode: gl
------------------------------------------
Next -> cw
Prev -> cs
Client: client
------------------------------------------
Revision -> 69
Process finished with exit code 0