
Thread: slackasm - Partial x86 assembler in Lape

  1. #1

    slackasm - Partial x86 assembler in Simba

    SlackASM - A library that allows you to write (a subset of) x86 assembler purely in Simba 1.2, to put it roughly.

    I've been working on a project that led me to write an assembler, so I figured I'd write a Lape-compatible version of it while I was at it. I figured I'd share it, since someone might find it fun to play around with. I won't say that it's particularly useful, though it could technically be used to speed up certain parts of your code, but yeah..

    • Produces real machine code, no emulation / virtual machine or other bullshit.
    • Resembles Intel syntax. Instruction order follows AT&T (src, dst) structure.
    • Functions are used to "write the assembler", there's no parsing involved (no nothing).
    • Forward jumps are a bitch: you have to patch them manually once you've gotten as far as the jump target.
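To illustrate the forward-jump point: the jump is emitted before its target exists, so a placeholder displacement gets written first and is overwritten once the target offset is known. A minimal sketch in Python (not Lape, and not slackasm's actual API). `emit_jmp_placeholder` and `patch_jmp` are hypothetical helper names; only the `E9 rel32` encoding of `jmp` is taken from the x86 instruction set.

```python
import struct

def emit_jmp_placeholder(code: bytearray) -> int:
    """Append `jmp rel32` (opcode E9) with a zero displacement.

    Returns the offset of the 4 displacement bytes so they can be patched later.
    """
    code.append(0xE9)
    patch_at = len(code)
    code += b"\x00\x00\x00\x00"
    return patch_at

def patch_jmp(code: bytearray, patch_at: int, target: int) -> None:
    """Overwrite the placeholder; rel32 is relative to the end of the jump instruction."""
    rel = target - (patch_at + 4)
    code[patch_at:patch_at + 4] = struct.pack("<i", rel)

code = bytearray()
fixup = emit_jmp_placeholder(code)   # jmp <target not known yet>
code += b"\x90\x90\x90"              # three nops that the jump skips over
target = len(code)                   # the label lives here
code.append(0xC3)                    # ret
patch_jmp(code, fixup, target)       # now the jump can be completed
```

After patching, `code` holds `E9 03 00 00 00 90 90 90 C3`: the jump ends at offset 5 and lands 3 bytes further on, at the `ret`.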

    It's far from a complete x86 assembler, some core language features are in fact lacking, so you might feel a little limited. It's also not the most beautiful implementation, but I think it's fairly straightforward to use.

    Snippet from the Example1.pas file:
    pascal Code:
    with assembler := TSlackASM.Create() do
    try
      code += _mov(imm(0), ebx);            // mov 0 to %ebx (it's kept in %ebx as the loop counter)
     
      var label1 := Location;               // make a label so we can jump back to here
      //loop body -->
      code += _mov(mem(x), eax);            // mov `x` to %eax
      code += _add(mem(y), eax);            // add `y` to %eax
      code += _imul(eax);                   // imul %eax          [EAX *= EAX]
      code += _sub(mem(y), eax);            // sub `y` from %eax  [EAX -= y]
      code += _mov(eax, mem(z));            // mov %eax to `z`
      code += _inc(ebx);                    // inc %ebx           [increase our counter]
      code += _cmp(imm(1000000), ebx);      // compare %ebx to some number (limit)
      code += _jle(RelLoc(label1));         // if (%ebx <= lim) then jump to label1
      //<-- loop end
     
      code += _mov(ebx, mem(i));            // store the value of %ebx in `i`
      code += _ret;                         // return

      Method := Finalize();                 // create a reusable function which can be called any time
    finally
      Free();                               // clean up
    end;

    Method();                               // execute the code!

    So this snippet will pretty much produce something like this:
    pascal Code:
    repeat
      tmp := x + y;
      z := tmp * tmp;
      z := z - y;
      Inc(i);
    until 1000000 < i;
    Except that it produces native x86 machine code, which means it won't be executed on Lape's virtual machine but directly on your CPU.. it's a "little" bit faster :P
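The `_jle(RelLoc(label1))` line in the snippet jumps backward to a location that is already known, so no patching is needed there. A rough Python sketch of how such a relative displacement works out, assuming the near form `0F 8E rel32` of `jle`. Which form slackasm actually emits is not stated in this thread, and `encode_jle_rel32` and the offsets used are hypothetical.

```python
import struct

def encode_jle_rel32(jump_offset: int, target_offset: int) -> bytes:
    """Encode `jle target` (0F 8E rel32) for a jump placed at jump_offset.

    The displacement is measured from the end of the jump instruction.
    """
    length = 6  # 2 opcode bytes + 4 displacement bytes
    rel = target_offset - (jump_offset + length)
    return b"\x0f\x8e" + struct.pack("<i", rel)

label1 = 5      # hypothetical offset of the loop start
jump_at = 40    # hypothetical offset where the jle is emitted
encoded = encode_jle_rel32(jump_at, label1)
```

A backward jump simply ends up with a negative displacement, stored as a little-endian two's complement value.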


    https://github.com/slackydev/slackasm


    Syntax

    Using a pointer stored in a register (%reg) would be done like this (memory operand):
    ref(reg)
    Example:
    _mov(ref(eax), eax);
    Referencing a lape variable (memory operand):
    mem(myVariable)
    Example:
    _mov(mem(myIntVar), eax);
    Referencing a lape variable [pointer] (memory operand):
    ptr(myPtrVariable)
    Example:
    _mov(ptr(@myIntVar), eax);
    Stack variables: Two operators have been overloaded, + and -, so that it's somewhat similar to what would be done using Intel syntax:
    ebp+4 and ebp-4
    Example:
    _mov(ebp+8, eax);
    The above is equivalent to mov eax,[ebp+8] in Intel syntax (note, however, that the operand order here is src, dst).
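For the curious, this is roughly what an `[ebp+8]` operand turns into at the byte level. A small Python sketch (not slackasm's code) of the standard `8B /r` encoding with a ModRM byte and an 8-bit displacement; `mov_ebp_disp8_to_reg` is a hypothetical helper name.

```python
import struct

# 3-bit register numbers used in x86 instruction encoding
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def mov_ebp_disp8_to_reg(disp: int, dst: str) -> bytes:
    """Encode `mov dst, [ebp+disp]` with an 8-bit displacement (opcode 8B /r)."""
    if not -128 <= disp <= 127:
        raise ValueError("displacement does not fit in 8 bits")
    # ModRM: mod=01 (disp8 follows), reg=destination register, rm=ebp
    modrm = (0b01 << 6) | (REG32[dst] << 3) | REG32["ebp"]
    return bytes([0x8B, modrm]) + struct.pack("<b", disp)

first_arg = mov_ebp_disp8_to_reg(8, "eax")    # mov eax,[ebp+8]
local_var = mov_ebp_disp8_to_reg(-4, "eax")   # mov eax,[ebp-4]
```

Larger displacements would need the `mod=10` form with a 32-bit displacement, which this sketch leaves out.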
    For setting the size of memory references (including stack vars) you'd do something like this:
    ref(edx).AsType(i32)
    (ebp+8).AsType(i16)
    (ebp+8).AsType(4) // i32, i16, i8, f32, f64 (alt: szLONG, szWORD, etc..) all translate to a size in bytes, so you can also write the size yourself


    Currently you can also do this, which may not be supported in the future:
    (ebp+8) is i16;
    Example:
    _movzx(ref(edx).AsType(i16), eax);
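The operand size matters because it selects a different opcode. A Python sketch (not slackasm's implementation) of how `movzx` from a register-indirect operand is encoded for 1-byte and 2-byte sources, using the standard `0F B6 /r` and `0F B7 /r` forms; `movzx_from_ref` is a hypothetical name.

```python
# 3-bit register numbers used in x86 instruction encoding
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# source size in bytes -> second opcode byte
MOVZX_OPCODE = {1: 0xB6, 2: 0xB7}

def movzx_from_ref(base: str, size: int, dst: str) -> bytes:
    """Encode `movzx dst, [base]` for a 1- or 2-byte memory source."""
    if base in ("esp", "ebp"):
        # these two need a SIB byte or a displacement; not handled in this sketch
        raise ValueError("esp/ebp as base register is not supported here")
    if size not in MOVZX_OPCODE:
        raise ValueError("movzx source must be 1 or 2 bytes")
    # ModRM: mod=00 (no displacement), reg=destination, rm=base register
    modrm = (0b00 << 6) | (REG32[dst] << 3) | REG32[base]
    return bytes([0x0F, MOVZX_OPCODE[size], modrm])

word_load = movzx_from_ref("edx", 2, "eax")   # movzx eax, word [edx]
byte_load = movzx_from_ref("edx", 1, "eax")   # movzx eax, byte [edx]
```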
    For offsetting a pointer (memory operand) I have overloaded the + (add) operator:
    mem(myVar)+4 //offset ptr by 4 bytes
    Example:
    _mov(mem(myInt64)+4, eax); //move upper half of an int64 to eax
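Why +4 gives the upper half: x86 is little-endian, so the low 32 bits of an int64 sit at the variable's address and the high 32 bits sit 4 bytes later. A quick Python illustration with an arbitrary example value:

```python
import struct

value = 0x1122334455667788
raw = struct.pack("<q", value)               # the 8 bytes as laid out in memory on x86

lower = struct.unpack_from("<I", raw, 0)[0]  # what reading 4 bytes at offset 0 yields
upper = struct.unpack_from("<I", raw, 4)[0]  # what reading 4 bytes at offset 4 yields
```

Here `lower` is `0x55667788` and `upper` is `0x11223344`.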
    Immediate value (immediate operand):
    imm(1000)
    Example:
    _mov(imm(1000), eax);
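As a byte-level illustration, moving an immediate into a 32-bit register uses the `B8+rd` opcode followed by the value in little-endian order. A Python sketch, not slackasm's own code; `mov_imm32_to_reg` is a hypothetical helper.

```python
import struct

# 3-bit register numbers used in x86 instruction encoding
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def mov_imm32_to_reg(value: int, dst: str) -> bytes:
    """Encode `mov dst, imm32`: opcode B8 + register number, then 4 immediate bytes."""
    return bytes([0xB8 + REG32[dst]]) + struct.pack("<I", value & 0xFFFFFFFF)

mov_1000_eax = mov_imm32_to_reg(1000, "eax")   # mov eax, 1000
mov_0_ebx = mov_imm32_to_reg(0, "ebx")         # mov ebx, 0
```

So `_mov(imm(1000), eax)` would correspond to the five bytes `B8 E8 03 00 00`, assuming this short form is the one emitted.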

    Further examples can be found here. In particular `func.pas` utilizes a lot of the mentioned syntax. `func2.pas` is the same, only implemented in a piecewise, reusable manner.

    Registers are named as follows:
    FPU stack: st0..st7
    General purpose (4 bytes): eax, ecx, edx, ebx, esp, ebp, esi, edi
    General purpose (2 bytes): _ax, _cx, _dx, _bx, _sp, _bp, _si, _di
    General purpose (1 byte ): _al, _cl, _dl, _bl, _ah, _ch, _dh, _bh
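The order of the general purpose registers in these lists matches the 3-bit register numbers that x86 uses inside instruction encodings (eax=0 ... edi=7, and likewise for the 2-byte and 1-byte sets). A small Python sketch of that mapping; whether slackasm stores its registers this way internally is an assumption, and `reg_number` / `inc_reg32` are hypothetical helpers.

```python
GP32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
GP16 = ["_ax", "_cx", "_dx", "_bx", "_sp", "_bp", "_si", "_di"]
GP8 = ["_al", "_cl", "_dl", "_bl", "_ah", "_ch", "_dh", "_bh"]

def reg_number(name: str) -> int:
    """Return the 3-bit encoding number of a register, which is its index in its list."""
    for table in (GP32, GP16, GP8):
        if name in table:
            return table.index(name)
    raise KeyError(name)

def inc_reg32(name: str) -> bytes:
    """Encode `inc r32` in 32-bit mode: one byte, 40 + register number."""
    if name not in GP32:
        raise ValueError("expected a 32-bit register")
    return bytes([0x40 + reg_number(name)])

inc_ebx = inc_reg32("ebx")   # the loop counter increment from the example
```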


    SSE instructions have not been implemented, so XMM registers are not even declared.


    Final notes

    - Expect some bugs and lacking features; this utility was never my focus but came to life due to another project of mine.
    - Contribute to it if you can, it would be awesome to end up with a full x86 assembler! Or maybe extend it to support x86-64, or throw in Linux support (should be fairly easy).. or whatever comes to mind!


    https://github.com/slackydev/slackasm
    Last edited by slacky; 08-01-2017 at 12:33 PM.
    !No priv. messages please

  2. #2

    Nice! I'll definitely use this for my code

  3. #3

    For those that do not know, the underlying concept is the same: https://villavu.com/forum/showthread.php?t=109848

    The idea is to use VirtualAlloc/VirtualProtect to make the assembly instructions executable (on Linux & OSX, you can use mmap and mprotect). You write the instructions as hex/binary to a memory location that is executable, then cast the pointer to that memory to a function pointer and call it, and it will execute the instructions that you wrote.
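A rough sketch of that concept using the mmap route, written in Python with `mmap` and `ctypes` rather than Lape or the Windows API. It is not how slackasm itself is implemented. `run_native` is a hypothetical name, it is POSIX only, and it assumes an x86 or x86-64 host where read/write/execute mappings are allowed.

```python
import ctypes
import mmap
import struct

# mov eax, 42 ; ret   (these bytes mean the same in 32-bit and 64-bit mode)
STUB = b"\xB8" + struct.pack("<I", 42) + b"\xC3"

def run_native(code: bytes) -> int:
    """Copy `code` into executable memory and call it as `int f(void)`.

    POSIX only (uses mmap protection flags); the bytes must match the host CPU.
    """
    size = max(len(code), mmap.PAGESIZE)
    # anonymous mapping that is readable, writable and executable
    buf = mmap.mmap(-1, size,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    buf.write(code)
    # take the address of the mapping and treat it as a C function pointer
    address = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    func = ctypes.CFUNCTYPE(ctypes.c_int)(address)
    return func()   # `buf` is still referenced, so the mapping stays alive for the call
```

On a suitable machine, `run_native(STUB)` should hand back the value left in eax. Hardened systems that forbid writable and executable mappings would need the write-then-mprotect approach mentioned above instead.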

    In any case, I want to point out that it is actually interpreted and goes through Lape.. I got confused at first as to how it was NOT using Lape, and got overexcited because I thought Simba was actually using an assembler and assembling into native machine code..

    What he really means is that the "final" execution of the instructions isn't interpreted.
    I.e. _mov, _cmp, _jmp, etc. are all interpreted Lape "functions" that write the native opcodes to executable memory, which is then executed. #onecanonlydream

    Nonetheless, pretty cool and good job.
    Last edited by Brandon; 05-13-2017 at 01:08 AM.
    I am Ggzz..
    Hackintosher

  4. #4