Thread: SCAR Assembler/Emulator

  1. #1
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default SCAR Assembler/Emulator

    This is a collection of a few scripts: an assembler and an emulator (which is made up of a couple of scripts). The assembler script will take an assembly file and compile it into an executable, which can then be run by the emulator.

    Two important things to note:
    1. A processor that will execute this code does not exist (it does however run on a special processor in my brain.. the Smartzgrid 2000)
    2. I don't know assembly very well, so I am open to constructive criticism of anything I have done incorrectly, left out, or done inefficiently.


    LATEST UPDATE (Rev 4)
    Rev 3
    Rev 2


    Rev 1:
    SCAR Code:
    program Assembler;
    const
      SourceFile = 'C:\datfile'; //Input file is .asm, output file is .bxe
    var
      prog: string;
     
    function ReadSource: string;
    var
      h: integer;
    begin
      h := OpenFile(SourceFile + '.asm', false);
      ReadFileString(h, result, FileSize(h));
      CloseFile(h);
    end;

    function Thick2(x: integer): string;
    begin
      result := IntToStr(x);
      if(length(result) mod 2 = 1) then
        result := '0' + result;
    end;

    procedure ParseInstructions(var data: string);
    var
      Instructions: Array of String; //Must be in order!!!
      Registers: Array of String; //Must be in order!!!
      i: integer;
    begin
      Instructions := ['INIT', 'DIS1', 'JMP', 'TERM', 'PUSH',
                       'POP', 'JIE', 'INX', 'MOV'];
      for i := 0 to high(Instructions)do
        data := Replace(data, Instructions[i], Thick2(i));
       
      Registers := ['EAX', 'EBX', 'ECX', 'EDX', 'ESP'];
      for i := 0 to high(Registers)do
        data := Replace(data, Registers[i], Thick2(i));
       
      for i := 0 to 9 do
        data := Replace(data, ' ' + inttostr(i) + Chr(13), ' 0' + inttostr(i) + Chr(13));
    end;

    procedure ConsolidateSpace(var data: string);
    begin
      data := Replace(data, Chr(10), '');
      data := Replace(data, Chr(13), '');
      data := Replace(data, ',', '');
      data := Replace(data, ' ', '');
    end;

    procedure WriteFile(data: string);
    var
      h, i: integer;
    begin
    //  data := '00200800010601021507010142020503';
      h := RewriteFile(SourceFile + '.bxe', false);
      for i := 1 to length(data)/2 do
        WriteFileByte(h, strtoint(data[i*2 - 1] + data[i*2]));
      CloseFile(h);
    end;

    begin
      prog := readSource;
      ParseInstructions(prog);
      ConsolidateSpace(prog);
      writeFile(prog);
    end.

    SCAR Code:
    program Emulator;
    const
      datafile = 'C:\datfile.bxe';

    var
      RAM: Array [0..63] of Byte; //0-3 = EAX - EDX, 4 = ESP, 5-47 = Program, 48-63 = Stack
      Counter: Integer;
      PC: Integer;
      OpCode: Byte;
      CPUIsntRunning: Boolean;
      StackPointer: Byte;
    const
      //RAM locations
      EAX = 0;
      EBX = 1;
      ECX = 2;
      EDX = 3;
      ESP = 4;
      //Program is @ 5-47
      //Stack is @ 48-63

      INIT = $00;
      DIS1 = $01;
      JMP  = $02;
      TERM = $03;
      PUSH = $04;
      POP  = $05;
      JIE  = $06;
      INX  = $07;
      MOV  = $08;
     
     
    function nem: Byte;
    begin
      result := RAM[PC];
      inc(PC);
    end;

    procedure ReadFileIntoRAM(theFile: String);
    var
      hFile, i: integer;
      data: byte;
    begin
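      hFile := OpenFile(theFile, true);
      i := 0; //Start loading at the first program byte
      repeat
        ReadFileByte(hFile, data);
        RAM[i + 5] := data; //Program area starts at RAM[5]
        inc(i);
      until(EndOfFile(hFile));
      CloseFile(hFile);
      writeln('Program loaded');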
    end;
     
    begin
    { ASM Instructions
       INIT 0x00, InitialCounter
       DIS1 0x01, OneByte
       JMP  0x02, MemoryLocation
       TERM 0x03
       PUSH 0x04, Value
       POP  0x05, Register
       JIE  0x06, Register, Value, JumpLocation
       INX  0x07, Register
       MOV  0x08, Register, Value //Yes I know this is backwards
       INP  0x09, Register
       
    }

      readFileIntoRAM(datafile);
      PC := 5;
      StackPointer := 64;
     
      repeat
        OpCode := nem;

        case OpCode of
          INIT:
            CPUIsntRunning := false;
          DIS1:
            writeln(RAM[nem]);
          JMP:
            PC := nem;
          TERM:
            CPUIsntRunning := true;
          POP:
            begin
              RAM[ESP] := RAM[StackPointer];
              inc(StackPointer);
            end;
          PUSH:
            begin
              dec(StackPointer);
              RAM[StackPointer] := nem;
            end;
          JIE:
            if(RAM[nem] = nem)then
              PC := nem;
          INX:
            inc(RAM[nem]); //Increment the register named by the operand
          MOV:
            RAM[nem] := nem;
        else
          writeln('INVALID OPCODE: ' + inttostr(OPCODE));
        end;
       
        inc(Counter);
      until(CPUIsntRunning)
    end.

    Sample Script (C:\datfile.asm)
    Code:
    INIT
    PUSH 20
    PUSH 16
    POP
    DIS1 ESP
    POP
    DIS1 ESP
    TERM
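
    To make the sample easier to follow, here is a rough Python sketch (not the SCAR code itself) of the Rev 1 fetch/decode loop, limited to the opcodes the sample uses. The byte sequence in `sample` is what the assembler is assumed to emit for the script above, and the `run` and `fetch` names are made up for this illustration.

```python
# Opcodes as numbered in the Rev 1 emulator.
INIT, DIS1, JMP, TERM, PUSH, POP = 0, 1, 2, 3, 4, 5
ESP = 4          # RAM slot that Rev 1 uses as the target of POP
CODE_START = 5   # program bytes are loaded at RAM[5..47]


def run(program):
    """Run a Rev 1 style program and return the values DIS1 displayed."""
    ram = [0] * 64
    ram[CODE_START:CODE_START + len(program)] = program
    pc = CODE_START
    stack_pointer = 64   # one past the last RAM cell; PUSH decrements first
    shown = []

    def fetch():
        # Return the byte at PC and advance PC (the emulator's "nem").
        nonlocal pc
        value = ram[pc]
        pc += 1
        return value

    while True:
        opcode = fetch()
        if opcode == INIT:
            pass
        elif opcode == DIS1:
            shown.append(ram[fetch()])
        elif opcode == JMP:
            pc = fetch()
        elif opcode == TERM:
            break
        elif opcode == PUSH:
            stack_pointer -= 1
            ram[stack_pointer] = fetch()
        elif opcode == POP:
            ram[ESP] = ram[stack_pointer]
            stack_pointer += 1
        else:
            raise ValueError(f"invalid opcode {opcode}")
    return shown


# INIT / PUSH 20 / PUSH 16 / POP / DIS1 ESP / POP / DIS1 ESP / TERM
sample = [INIT, PUSH, 20, PUSH, 16, POP, DIS1, ESP, POP, DIS1, ESP, TERM]
print(run(sample))
```

    Under those assumptions the two DIS1 lines show 16 and then 20, i.e. the pushed values come back in reverse order.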
    Last edited by Smartzkid; 02-20-2010 at 04:36 AM. Reason: Added REV 4
    Interested in C# and Electrical Engineering? This might interest you.

  2. #2
    Join Date
    May 2009
    Posts
    799
    Mentioned
    2 Post(s)
    Quoted
    16 Post(s)

    Default

    So, this is supposed to be a basic compiler for assembler code?

    I'm not quite sure if I got it right, but it sounds interesting and raises some ideas in my head as well ;D.

    ~caused

  3. #3
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default

    It's not really a compiler, because it simply takes assembly instructions and converts them to machine instructions. It doesn't have the whole fancy 'compiler' layer to translate high-level commands (loops, arrays, classes, etc..) into machine code.

    I really need to rewrite the assembler because I didn't plan it out quite correctly..

  4. #4
    Join Date
    Dec 2006
    Location
    .̿̂̔͋͗̎̆ͥ̍̒ͤ͂̾̌̀̅
    Posts
    3,012
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default

    This is one of the nicest things I've seen made with SCAR in a while.

  5. #5
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default

    Thanks man! It really is ugly code though.. and pretty inefficient, I'd imagine. In addition, it's a bit skimpy on assembly instructions..

    My final goal is a RISC/Microchip PIC-like language

    I wish someone here who knew assembly would comment, there are a few things I could use some clearing up on!

  6. #6
    Join Date
    Dec 2007
    Location
    Somewhere in Idaho
    Posts
    480
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Sure, I know some assembly.

    The code you produced is OK. One thing you should realize is that registers are not memory locations; they are actually little storage units inside the processor. So saying something like push 16 and push 10 doesn't really make sense.

    For the x86 architecture (which is what your code looks to be targeting; I'm not familiar with ARM asm), saying pop alone doesn't make sense. Usually you pop into a register, so pop eax; or pop esi. A command like that moves the stack pointer as well as putting the data into the register that you are popping into.

  7. #7
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    I only skimmed through your code and didn’t test it, but I spotted a few problems. The first has already been mentioned: the way you treat registers as memory locations.

    I really don’t get your stack handling. You keep your stack pointer as an internal variable, but you also have a register that is named as if it were the stack pointer. ESP never holds a memory address: when you pop a value off the stack, you place the value in ESP. According to your “documentation”, POP takes one argument (a register), but it completely ignores it. In fact, if you give it a parameter, the vm tries to execute that parameter as the next instruction, which isn’t what you want. Also, you don’t check for stack overflows. If you push too many values onto the stack, you start to overwrite program code, and then you start to overwrite registers. Then you hit the bottom of memory and the whole thing will sigsegv, raise an exception or do whatever else SCAR does; I don’t know what it does when you run off the start of an array.
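
    A minimal Python sketch of the bounds checks being asked for here, assuming a stack that lives in a fixed slice of RAM and grows downwards as in the emulator. The `Stack` and `StackError` names are invented for the example; they are not part of the posted scripts.

```python
class StackError(Exception):
    """Raised when a push or pop would leave the stack region."""


class Stack:
    def __init__(self, ram, base, limit):
        # The stack occupies ram[base:limit] and grows downwards from limit.
        self.ram = ram
        self.base = base
        self.limit = limit
        self.sp = limit  # empty stack: pointer sits one past the last cell

    def push(self, value):
        if self.sp <= self.base:
            # Refuse to walk into whatever lies below the stack region.
            raise StackError("stack overflow")
        self.sp -= 1
        self.ram[self.sp] = value & 0xFF

    def pop(self):
        if self.sp >= self.limit:
            raise StackError("stack underflow")
        value = self.ram[self.sp]
        self.sp += 1
        return value


ram = [0] * 64
stack = Stack(ram, base=48, limit=64)   # same region as the Rev 1 layout
registers = [0] * 4

stack.push(20)
stack.push(16)
registers[1] = stack.pop()   # "POP BX": the operand picks the register
print(registers[1], stack.sp)
```

    With checks like these a runaway program fails with a clear error instead of silently overwriting code or registers, and POP actually uses its register operand.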

    You don’t allow any memory access other than the stack, and that isn’t implemented very well. In fact, you /could/ exploit your register handling and give MOV some other memory address, but this really shouldn’t be considered The Right Thing to Do.

    Because of the combination of your stack and memory handling, you can’t access any value on the stack. You can only access the value last popped from it. If the programmer could access memory in a sane way and could at least read the stack pointer (maybe not write to it, if you don’t want to let the programmer move the stack), the programmer could access values on the stack.

    This leads us to the problem of variables. With this version you can only have variables in registers and on the stack, because you can’t access memory to store global variables. And because you can’t access values on the stack, you can’t really keep variables there without some really deep black magic, which isn’t even possible once you have many values on the stack, because your vm has only four GPRs.

    Your assembler doesn’t support labels. This means the programmer has to calculate by hand the memory addresses given to JMP and its brothers. This is very bad: it is error prone, it is an unnecessary responsibility for the programmer, and programs that use JMP may not work with versions of your vm other than the one they were written for. If you wanted, you could move the code to the end of memory. You could even decide that an opcode is 14 bytes long, that the first 3 bits of every parameter follow the opcode and that the rest of them are stored in reversed order on the stack. That wouldn’t make any sense, but you could do it, and it would break every JMP.

    Your vm doesn’t have any support for function calls. You could do some trickery with the stack and registers, but it would be very hard with your stack and so few GPRs. Creating functions would also be hard with no label support in the assembler.

    I think your INIT is unnecessary. In my opinion you should just start executing code from the beginning of the code segment (I’ll call these segments here even though you don’t have segmentation) and stop when execution hits the end of the code segment or TERM. Right now you execute the first instruction even if it isn’t INIT, and you keep executing everything until you reach a TERM. This means that if the programmer forgets to add TERM, your vm tries to execute whatever is on the stack (!). This really isn’t The Right Thing to Do.

    Why isn’t your INX called something like INC? Why did you choose X? I think INC[rement] would be the most logical choice.

    Your instruction set is very limited right now. What kind of instructions do you plan to implement? I’m looking forward to seeing arithmetic operations (or just addition and subtraction if you plan to be very RISC), bitwise operations and better support for function calls. I would like additional registers, because I’m used to seeing and using up to 128 GPRs.

    What do you mean when you say that the parameters of MOV are backwards? If you’re going to implement something that resembles “Intel syntax”, that is the right order. If you want something like “AT&T syntax” (which /I/ prefer), they are backwards, but in other respects you’re very far away from AT&T syntax. I don’t like that you name your registers like the i386’s registers if you don’t want to implement i386. I think you should at least drop the E from them, because they aren’t [E]xtended from anything. According to your post you want this to be RISC, so I believe you don’t want to implement i386, because it definitely is CISC and its memory management maze is like hell.

    If you don’t want to implement a register machine, or you want to try something different, you could write a stack machine (a “0-operand instruction set”). Java’s vm is a stack machine, Perl 5 uses one, the .NET framework’s CLI virtual machine is one and many Forth implementations use some sort of stack machine. Not many real machines have been stack machines, though. I think they are more interesting than register machines, but this is your choice. If you plan to write a compiler for some high-level language that generates code for this, I think it will be easier for a beginner to generate code for a RISC architecture, but it really depends on the person.

    If you would someday like to turn this into a standalone program (in Delphi/Object Pascal, or maybe even a complete rewrite in something like C) and you’d like it to be portable, you really have to do something about your vm’s byte size and endianness. If you ported this to an architecture whose byte size is something other than 8 bits (rare nowadays) or whose endianness is different, programs might behave differently, so this couldn’t be used to create cross-platform programs.

    -
    I don’t know SCAR’s language and I have never really programmed in Pascal; I have just gained a basic understanding from one of my father’s books, Pascal-ohjelmointikieli by “Yucca” Korpela et alii from the early 80’s. So I didn’t really understand all that Replace() mess or how your assembler really works, but I know it won’t work very well if you want to implement labels and maybe some higher-level structures. The bytecode you produce seems to be reasonable. I think you should start by writing a simple tokenizer and maybe a little parser. You could get by at first without a parser and generate code on the fly as you get tokens, but I believe you would quickly run into problems when calculating labels’ addresses. If I were to implement a simple solution, I would create a list that holds the IR of the instructions, plus a symbol table. When the parser encounters a new label, it creates a new entry in the symbol table. Every entry in the symbol table has a pointer to a node in the “IR list”, and either every entry has a list of the instructions that refer to it and need resolving, or every instruction that needs an address resolved has a pointer to the entry it refers to. After the parser has traversed the whole source and generated its IR, the code generator walks through the IR. After it has generated bytecode for the whole program, it walks through the symbol table, calculates the address of every label and assigns it to every instruction that needs it.
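
    As an illustration of label handling, here is a small Python sketch of a two-pass assembler. It is a simpler variant than the linked symbol table described above: pass 1 builds the instruction list and records each label's address, and pass 2 emits bytes and looks labels up. The `label:` syntax, the `;` comment syntax and the opcode subset are assumptions made for the example, and addresses are counted from the start of the code.

```python
# Hypothetical subset of opcodes and registers, one byte per operand.
OPCODES = {"DIS1": 0x01, "JMP": 0x02, "TERM": 0x03, "INC": 0x07, "VAL": 0x0D}
REGISTERS = {"AX": 0, "BX": 1, "CX": 2, "DX": 3}


def assemble(source):
    # Pass 1: tokenize each line, record label addresses and build a
    # simple intermediate list of (mnemonic, operands).
    symbols = {}
    ir = []
    address = 0
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address      # label marks the next instruction
            continue
        mnemonic, *operands = line.replace(",", " ").split()
        ir.append((mnemonic.upper(), operands))
        address += 1 + len(operands)          # opcode byte plus operand bytes

    # Pass 2: emit bytes; every label is known by now, so forward
    # references resolve just like backward ones.
    code = []
    for mnemonic, operands in ir:
        code.append(OPCODES[mnemonic])
        for operand in operands:
            if operand.upper() in REGISTERS:
                code.append(REGISTERS[operand.upper()])
            elif operand in symbols:
                code.append(symbols[operand])
            else:
                code.append(int(operand) & 0xFF)
    return bytes(code)


program = """
    VAL AX, 0
loop:
    INC AX
    DIS1 AX
    JMP loop      ; the assembler fills in the address of 'loop'
"""
print(assemble(program).hex(" "))
```

    The point of the design is that the programmer never writes a raw address, so changing the memory layout only means changing how the assembler computes `address`.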

    -
    Good work, keep going with this!

  8. #8
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default

    Wow, that was a lot to read! Thanks so much for the feedback, boberman and fronty, your posts were a great help!

    I see what you both mean about the GPRs/stack being in 'RAM', whereas that is completely incorrect hardware-wise. How, though, would you suggest implementing them? The only other option I see would be to make a separate array for each, but would that not also be incorrect?

    I misunderstood the use of ESP. From the examples I saw, it seemed to me that it was a temporary register for popping data into. I rewrote this part, so it's now the stack pointer! One step closer to authentic

    INIT. Yes, I realize it serves no purpose - I would remove it, but I was hoping to add a 'data area' (string table?) at the beginning of the compiled code. The data would be preceded by an INIT, followed by the address of the first instruction. This of course would be set up by the assembler, and INIT would be removed from the documentation. Would that be at all logical? It feels like a dirty fix to me, but it's all I could think of that would allow the programmer to use whole strings.

    PUSH. So.. you generally push registers? AAH! That makes sense!

    How many GPRs would you recommend I have? What kind of processor is it that has 128 GPRs!? And how does their naming scheme go? Thanks for the bit about the Exx. I'll make sure to change my naming in the next revision.

    INX is only INX because 'inc' is a built-in SCAR function. Thanks for pointing it out though, because I just realized that I can simply tweak the assembler to recognize INC. Oops!

    Instructions todo:
    JNE, DEC, ADD, SUB[tract]?, SHL, SHR, AND, NAND, OR, XOR, NOR. Is there a need for SAL and SAR? And would you mind explaining what the purpose of the carry flag is? From my list of bitwise operators, is there anything missing?

    I haven't decided yet if I will implement any CISC style instructions like MULT, or if I will stick with LOAD, STORE, PROD, etc. Probably the latter, because I'd like to - should this ever become a mature enough project - be able to emulate PIC microprocessors. I guess I could do both, but that wouldn't be very wise, would it?

    I definitely do need to re-write the assembler; I've already run into some really solid roadblocks with the current setup. Thanks for the tips, they should give me a good starting point. What does IR mean, though?

    How do functions work in assembly? Are they simply segments of code cordoned off by labels, or is there a difference?

    Thanks again, I'm really excited to have found a project where I have something interesting to learn


    -- Notes for school tomorrow:
    POP into a register
    SP - 'Stack Pointer'
    Labels!
    Last edited by Smartzkid; 10-01-2009 at 10:58 AM.

  9. #9
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    I would just drop the registers from RAM, move the code area and stack a few steps down to cover the area previously occupied by the registers, and keep the registers in a separate array. Assign every register a number, use that number to refer to it in bytecode, and use it as the index into the register array. You could have them as separate variables, but then every time the program uses a register you would have to run through several conditional statements to check which register should be used. With an array, you can just check whether the index is negative or bigger than the biggest index of the array, and if it is valid, use it.
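
    A short Python sketch of that idea, assuming byte-wide registers and a register count of 16 chosen only for the example. `RegisterFile` is a made-up name.

```python
NUM_REGISTERS = 16  # assumed count for this sketch


class RegisterFile:
    """Registers kept in their own array, addressed by number."""

    def __init__(self, count=NUM_REGISTERS):
        self.values = [0] * count

    def _check(self, index):
        # One range check replaces a chain of per-register conditionals.
        if not 0 <= index < len(self.values):
            raise IndexError(f"no such register: {index}")

    def read(self, index):
        self._check(index)
        return self.values[index]

    def write(self, index, value):
        self._check(index)
        self.values[index] = value & 0xFF   # registers hold one byte


regs = RegisterFile()
regs.write(3, 300)      # the register number comes straight from the bytecode
print(regs.read(3))     # the value has wrapped to fit in a byte
```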

    If you want to enable usage of floating point numbers with floating point registers and some floating point instructions (or even enable mixed usage of floating point and integers with same instructions), you need another array for floating point registers, of course.

    Now I understand your INIT. Its usage would resemble the usage of JMP in some old DOS programs (or maybe even some newer programs). Because of the limitations of x86 real mode, some programs included both data and code in the same segment. These programs' structure was like this:
    Code:
    jmp start        ; jump to start of program code
    
    [program's data]
    
    start:
    [program's code]
    I would try something like this: I would create an executable format or use an existing one (like PE on Windows or ELF on Unix-likes on x86 and some other architectures). At the beginning of the executable file there would be a magic number to check that the file really is an executable. Then there would be the position in the file of the program's code, the position of the program's data and, if you want a string table, the position of that. Then I would change the layout of RAM to be something like this:
    Code:
    0 - size of code: Code
    end of code + 1 - size of data (+ string table?): Data (and string table?)
    end of data + 1 - end of RAM: Stack
    Your vm's loader would read the header, load the parts of the program into the right parts of memory and start running. The header could also include the entry point's address, which would be used when starting to run the program.
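
    To make the loader idea concrete, here is a Python sketch under an invented 4-byte header: a magic byte, the entry point, the code length and the data length, followed by the code and then the data. The field order, the magic value and the `load` function are assumptions for illustration only, not a format defined anywhere in this thread.

```python
MAGIC = 0x52  # placeholder magic number for this sketch


def load(image, ram_size=256):
    """Check the header, then place code, data and stack in RAM."""
    if len(image) < 4 or image[0] != MAGIC:
        raise ValueError("not an executable for this vm")
    entry, code_len, data_len = image[1], image[2], image[3]
    body = image[4:]
    if len(body) != code_len + data_len:
        raise ValueError("header does not match file size")
    if len(body) > ram_size:
        raise ValueError("program does not fit in RAM")
    if entry >= code_len:
        raise ValueError("entry point lies outside the code")

    ram = bytearray(ram_size)
    ram[:len(body)] = body   # code first, data directly after it
    layout = {
        "code": (0, code_len),
        "data": (code_len, code_len + data_len),
        "stack": (code_len + data_len, ram_size),   # the rest of RAM
        "entry": entry,
    }
    return ram, layout


# Header, then 3 code bytes, then 2 data bytes ("AB").
image = bytes([MAGIC, 1, 3, 2, 0x0D, 0x00, 0x07, 65, 66])
ram, layout = load(image)
print(layout)
```

    The emulator would then set its program counter to `layout["entry"]` and its stack pointer to the top of the `stack` range before running.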

    If you want, you can always allow both values directly and registers to be pushed to stack. It's your choice.

    Itanium has 128 integer registers, 128 floating point registers and a few other registers for other uses. Some of the integer registers have special uses. SPARC processors have something like 128 registers, but only 32 of them can be accessed by the program at once. The number of registers on these architectures is a result of the calling conventions they use: they pass parameters in some registers, receive parameters in some registers, return values in some registers... Itanium's registers are named like r0 .. r127, sp, ret0 .. ret3, etc.

    I would suggest you have at least 8 registers, maybe 16, or maybe even as many as 32 for some reason. More if you know you will need them.

    If you want this to be very deeply RISC, providing just the simplest things and letting the programmer do the rest, you can leave arithmetic shifts out. If you want to make programming easier, just implement them, and multiplication and division as well.

    Carry flag is a flag that tells the program that there was carry (or borrow) out of target's MSB. That can cause errors in calculations.
    Example 1:
    Code:
      0101 1010
    + 0100 1001
    -----------
      1010 0011
    Good, there was no carry out of MSB, everything worked fine.

    Example 2:
    Code:
        1101 1010
    +   0100 1001
    -------------
      1 0010 0011
    Now there was carry out of MSB, and result was 9 bits, which is too much for our 8 bit byte. When you leave out the ninth bit, you'll get 35, but the right answer would be 291. Carry flag is set to inform about this.
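
    The same two additions as a Python sketch, assuming an 8-bit register. `add8` is a name invented for the example; it returns the truncated result together with the carry flag.

```python
def add8(a, b):
    """Add two bytes; return (8-bit result, carry flag)."""
    total = a + b
    return total & 0xFF, total > 0xFF   # carry is set when a ninth bit is needed


print(add8(0b01011010, 0b01001001))   # example 1: fits in 8 bits
print(add8(0b11011010, 0b01001001))   # example 2: ninth bit is lost, carry set
```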

    I didn't spot anything missing from your bitwise operations.

    Continues in the next post, I hit the maximum length. D:
    Last edited by fronty; 10-01-2009 at 11:57 PM.

  10. #10
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    IR means intermediate representation. It is something in between the source code and the machine code to be generated. When writing compilers, you should design your IR so that it is not tied to any source language or any target language. However, in an assembler you can basically generate code almost identical to the target code, or even the target code straight away, and just leave calculating the addresses of labels and assigning them for later.

    Functions in assembly language can seem a bit tricky. Different architectures have different ways to call functions and handle local variables, and different operating systems and compilers can use different conventions even on same architecture. I will show one calling convention and way of using local variables here.

    In this convention, the parameters of a function are pushed onto the stack "from right to left" (so the first parameter is pushed last), then the function is called, and the caller removes the parameters from the stack. The return value is in eax.

    Example.

    Code:
    ; Function is defined in C as int func(int x, int y, int z)
    
    push z
    push y
    push x
    
    call func
    
    add esp, 12  ; Clears parameters from stack. Because stack grows downwards on x86, you add to the stack pointer
    
      ; Return value is now in eax
    Here comes the hard part, accessing parameters and creating and using local variables.

    When the program calls the function above, the top of the stack looks like this:
    Code:
    EIP         - First instruction to be executed after function, pushed by call    <------ ESP
    param 1  - first parameter (x)
    param 2  - second parameter (y)
    param 3  - third parameter (z)
    On i386, there are two pointers that are used when working with the stack, ESP and EBP: the stack pointer and the frame pointer. When a function starts, it pushes the old EBP to the stack and copies the current ESP to EBP. This creates a new stack frame.

    Stack now:
    Code:
    old EBP - pushed by function    <------ ESP    <------ EBP
    EIP       - Return address
    param 1
    param 2
    param 3
    Then you should subtract the space needed by your local variables from ESP (can you guess the reason for subtracting?). This is always done in 32-bit chunks (this is the reason why compilers rearrange variables and fields in structures, etc.).

    After setting up the stack frame and allocating local variables, you should save the registers used in your function on the stack. This is done with a series of pushes.

    Let's pretend that our function has three local variables and uses only ebx and ecx, so it pushes them. However, you should push everything. Now let's take another look at the stack.
    Code:
    old ECX    <------ ESP
    old EBX
    var 3
    var 2
    var 1
    old EBP    <------ EBP
    EIP
    param 1
    param 2
    param 3
    Now you can access parameters with EBP. How? The first parameter is at EBP + 8, the second at EBP + 12, etc. (+, because the stack grows downwards). Why do these start from 8? Directly at the address EBP points to is the old EBP (remember how we copied ESP to EBP?), and at EBP + 4 is the return address (the EIP pushed by call).

    You can also access local variables with EBP. The first local is at EBP - 4, the second at EBP - 8, etc.

    Of course you could access both parameters and locals with ESP, but it's the convention to use EBP and it's easier that way.

    After the function has done its work, it starts by popping the saved registers back. Then it destroys the local variables by adding to ESP the same number it subtracted when the locals were created. After that it restores the old EBP and executes RET, which pops the return address from the stack and continues execution from there.
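
    The whole sequence can be traced with a small Python simulation. This is only a sketch of the convention described above: memory is a dictionary of 4-byte slots, the `Machine` and `call_func` names are invented, the function body (`(x + y) * z`) is arbitrary, and saving ebx/ecx is left out to keep it short.

```python
WORD = 4  # every stack slot is 32 bits


class Machine:
    def __init__(self):
        self.mem = {}        # address -> value, standing in for RAM
        self.esp = 0x1000    # stack pointer; the stack grows downwards
        self.ebp = 0         # frame pointer

    def push(self, value):
        self.esp -= WORD
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += WORD
        return value


def call_func(m, x, y, z):
    # Caller: push the parameters right to left, then "call".
    m.push(z)
    m.push(y)
    m.push(x)
    m.push("return address")   # what the CALL instruction pushes

    # Callee prologue: save the old EBP and open a new stack frame.
    m.push(m.ebp)
    m.ebp = m.esp
    m.esp -= 3 * WORD          # room for three local variables

    # Body: parameters are at EBP+8, +12, +16; locals at EBP-4, -8, -12.
    m.mem[m.ebp - 4] = m.mem[m.ebp + 8] + m.mem[m.ebp + 12]   # var1 = x + y
    m.mem[m.ebp - 8] = m.mem[m.ebp - 4] * m.mem[m.ebp + 16]   # var2 = var1 * z
    eax = m.mem[m.ebp - 8]     # return value goes in eax

    # Callee epilogue: drop the locals, restore EBP, then "ret".
    m.esp += 3 * WORD
    m.ebp = m.pop()
    m.pop()                    # RET pops the return address

    # Caller: remove the three parameters (add esp, 12).
    m.esp += 3 * WORD
    return eax


machine = Machine()
print(call_func(machine, 2, 3, 4), hex(machine.esp))
```

    After the call both ESP and EBP are back at their starting values, which is the whole point of the prologue and epilogue pairing up.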

    This may look hard at first, and I think it will never feel like a piece of cake. However, you don't have to do things this way. You can use any existing calling convention or invent your own, if you don't like this.

    Remember how I said the number of registers on Itanium and SPARC is caused by the calling convention they use? One possibility is to have some global registers that can be used like global variables, some local registers that are accessible only while the function runs, some registers for passing parameters, some for receiving parameters and some for return values. When you call a function, the vm copies the output registers to the input registers, saves the local registers to some internal storage not accessible by the program and jumps into the function. When the function returns, the vm restores the local registers and the return value is kept in the right registers.

    Or you can use some mixture, like parameters on the stack and locals in registers. Or the other way round. Or anything else. You have a million possibilities. Use the stack if you want to, use registers if you want to, use both if it feels right, or don't use either of them but some other technique.

    </novel>

  11. #11
    Join Date
    Apr 2007
    Posts
    2,593
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Is it just me, or is this fronty guy a genius?

  12. #12
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    I'm not a genius, it's just programming knowledge gathered over a few years.

  13. #13
    Join Date
    Sep 2006
    Posts
    6,089
    Mentioned
    77 Post(s)
    Quoted
    43 Post(s)

    Default

    Looks like you're a promising newcomer

  14. #14
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default Rev 2

    Changes:

    • Moved registers to separate location in CPU
    • Fixed push/pop
    • Fixed stack and implemented a stack pointer register
    • Added general program memory
    • Deprecated INIT, will be removed during assembler rewrite
    • Renamed INX to INC
    • Added more registers
    • Partially implemented executable header
    • Added some instructions, see 'documentation' in emulator.scar


    Emulator.scar
    SCAR Code:
    program Emulator;
    const
      datafile = 'C:\program.bxe';
    var
      RAM: Array [0..255] of Byte; //0-63 = Program, 64-79 = Stack, 80-255 = general memory
      Register: Array [0..16] of Byte;
      Counter: Integer;
      PC: Integer; //Program counter. TODO: Replace with register PC

      OpCode: Byte;
      CPUIsntRunning: Boolean;
    const
      //Register array locations (all here for reference)
      AX = 0;
      BX = 1;
      CX = 2;
      DX = 3;
      EX = 4;
      FX = 5;
      GX = 6;
      HX = 7;
      IX = 8;
      JX = 9;
      KX = 10;
      LX = 11;
      MX = 12;
      NX = 13;
      OX = 14;
      PX = 15;
     
      SP = 16;
     
      //Program is @ RAM 0-63
      //Stack is @ RAM 64-79 (15 levels deep)

      INIT = $00;
      DIS1 = $01;
      JMP  = $02;
      TERM = $03;
      PUSH = $04;
      POP  = $05;
      JIE  = $06;
      INX  = $07;
      MOV  = $08;
     
      JNE  = $0A;
      LOAD = $0B;
      STOR = $0C;
      VAL  = $0D;


    function nem: Byte;
    begin
      try
        result := RAM[PC];
      except
        writeln('Debugtime!'); //PC ran outside RAM; pause so the report can be read
        wait(40000);
      end;
      inc(PC);
    end;

    procedure ReadFileIntoRAM(theFile: String);
    var
      hFile, i: integer;
      data: byte;
    begin
      try
        hFile := OpenFile(theFile, false);
        for i := 0 to 1 do
        begin
          ReadFileByte(hFile, data);
          case i of
            0: if(data <> $52) then
                 RaiseException(erUnexpectedEof, 'File is not an SXE executable');
            1: PC := data;
            //2: //Code end address := data
            //3: //Data start address
            //4: //Data end address
            //5: //Total size
          end;
        end;
        i := 0;
        repeat
          ReadFileByte(hFile, data);
          RAM[i] := data;
          inc(i);
        until((EndOfFile(hFile)) or (i > 63));
        writeln('Program loaded (' + inttostr(i) + '/64 bytes used)');
      except
        writeln('Error loading executable: ' + ExceptionParam);
        TerminateScript;
      finally
        CloseFile(hFile);
      end;
    end;

    begin
    { ASM Instructions    //TODO: Group similar instructions, ie inc/dec
       INIT 0x00, InitialCounter  //NOT YET IMPLEMENTED
       DIS1 0x01, Register
       JMP  0x02, JumpLocation
       TERM 0x03
       PUSH 0x04, Register
       POP  0x05, Register
       JIE  0x06, Register, Register, JumpLocation
       INC  0x07, Register
       MOV  0x08, Register, Register
       INP  0x09, Register //TODO: discard? [INPut]
       JNE  0x0A, Register, Register, JumpLocation
       LOAD 0x0B, Register, RAM
       STOR 0x0C, RAM, Register
       VAL  0x0D, Register, Value
       
       SHL  0xXX, Register    //TODO: Implement opcodes from here on down
       SHR  0xXX, Register
       DEC  0xXX, Register
       ADD  0xXX, Register, Register
       SUB  0xXX, Register, Register
       PROD 0xXX, Register, Register
       DIV  0xXX, Register, Register
       AND  0xXX, Register, Register
       NAND 0xXX, Register, Register
       OR   0xXX, Register, Register
       NOR  0xXX, Register, Register
       XOR  0xXX, Register, Register
    }

      ClearReport;
      readFileIntoRAM(datafile);
      Register[SP] := 79;

      repeat
        OpCode := nem;

        case OpCode of
          INIT:
            CPUIsntRunning := false;
          DIS1:
            AddToReport(inttostr(Register[nem]));
          JMP:
            PC := nem;
          TERM:
            CPUIsntRunning := true;
          POP:
            begin
              Register[nem] := RAM[Register[SP]];
              inc(Register[SP]);
            end;
          PUSH:
            begin
              dec(Register[SP]);
              RAM[Register[SP]] := Register[nem];
            end;
          JIE:
            if(Register[nem] = Register[nem])then
              PC := nem
            else
              nem;
          JNE:
            if(Register[nem] <> Register[nem])then
              PC := nem
            else
              nem;
          INX:
            inc(Register[nem]);
          MOV:
            Register[nem] := Register[nem];
          LOAD:
            Register[nem] := RAM[nem + 80];
          STOR:
            RAM[nem + 80] := Register[nem];
          VAL:
            Register[nem] := nem;

        else
          writeln('INVALID OPCODE: ' + inttostr(OPCODE));
        end;

        inc(Counter);
      //  writeln('Line ' + inttostr(counter));
      until(CPUIsntRunning)
    end.

    Assembler.scar
    SCAR Code:
    program Assembler; //TODO: Rewrite this.
    {
    [Planned] Executable layout

    [Code header]
    34
    Code Start Address
    Code End Address
    Data Start Address
    Data End Address
    Total Size

    [Data segment]
    todo..


    [Code segment]
    Instructions

    }


    const
      SourceFile = 'C:\program'; //Input file is .asm, output file is .bxe
    var
      prog: string;
      timer: integer;

    function ReadSource: string;
    var
      h: integer;
    begin
      h := OpenFile(SourceFile + '.asm', false);
      ReadFileString(h, result, FileSize(h));
      CloseFile(h);
    end;

    function Thick2(x: integer): string;
    begin
      result := IntToStr(x);
      if(length(result) mod 2 = 1) then
        result := '0' + result;
    end;

    procedure ParseInstructions(var data: string);
    var
      Instructions: Array of String; //Must be in order!!!
      Registers: Array of String; //Must be in order!!!
      i: integer;
    begin
      Instructions := ['INIT', 'DIS1', 'JMP', 'TERM', 'PUSH', 'POP',
                       'JIE', 'INC', 'MOV', 'INP', 'JNE', 'LOAD',
                       'STOR', 'VAL'];
      for i := 0 to high(Instructions)do
        data := Replace(data, Instructions[i], Thick2(i));

      Registers := ['AX', 'BX', 'CX', 'DX', 'EX', 'FX', 'GX', 'HX', 'IX',
                    'JX', 'KX', 'LX', 'MX', 'NX', 'OX', 'PX', 'SP'];
      for i := 0 to high(Registers)do
        data := Replace(data, Registers[i], Thick2(i));

      for i := 0 to 9 do
        data := Replace(data, ' ' + inttostr(i) + Chr(13), ' 0' + inttostr(i) + Chr(13));

      for i := 0 to 9 do
        data := Replace(data, ' ' + inttostr(i) + ' ', ' 0' + inttostr(i) + ' ');
    end;

    procedure ConsolidateSpace(var data: string);
    begin
      data := TrimOthers(data);
    end;

    procedure WriteFile(data: string);
    var
      h, i: integer;
    begin
    //  data := '00200800010601021507010142020503';
      h := RewriteFile(SourceFile + '.bxe', true);
      for i := 1 to length(data)/2 do
        WriteFileByte(h, strtoint(data[i*2 - 1] + data[i*2]));
      CloseFile(h);
    end;

    begin
      timer := GetSystemTime;
      prog := readSource;
      ParseInstructions(prog);
      ConsolidateSpace(prog);
      prog := inttostr($52) + '00' + prog;
      writeFile(prog);
      writeln('Assembled in ' + inttostr(GetSystemTime - timer) + ' ms');
    end.

    C:\program.asm
    Program Dumper
    Code:
    VAL AX 64
    STOR 0 AX
    VAL SP 0
    POP AX
    DIS1 AX
    LOAD AX 0
    JIE SP AX 22
    JMP 9
    TERM
    Last edited by Smartzkid; 10-07-2009 at 08:27 AM.
    Interested in C# and Electrical Engineering? This might interest you.

  15. #15
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Wow, I wouldn't have come up with implementing that kind of memory protection. Quite elegant, I think.

    Btw, what does nem mean? NExt value in Memory or something?

    Is the sample program meant to actually print something or do some other visible thing?

    I really couldn't get most of the instructions to work.
    TERM - works
    JMP - works
    JIE - works
    JNE - works
    PUSH - doesn't work
    POP - doesn't work
    INC - doesn't work
    MOV - doesn't work
    LOAD - doesn't work
    STOR - doesn't work
    VAL - doesn't work

    The ones that don't work just hit the infinite loop and start printing "Debugtime!". The example program outputs:
    Code:
    Program loaded (23/64 bytes used)
    INVALID OPCODE: 61
    INVALID OPCODE: 60
    INVALID OPCODE: 20
    INVALID OPCODE: 20
    INVALID OPCODE: 90
    INVALID OPCODE: 64
    Debugtime!
    ...

  16. #16
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default

    Ouch. I guess I hadn't realized how bad the assembler actually is. Even to get the sample program to run, I had to make some weird changes. I've found the emulator to be quite solid, but it's hard to get the assembler to output what it's supposed to. It got so bad, I actually considered distributing a hand-assembled executable!

    The sample program is a RAM dumper; technically, it shouldn't work, because it moves the stack pointer into program memory. The next version of the emulator should block this sort of interaction.

    How did you test those instructions? Chances are it's the assembler mucking something up. Let's just say I've gotten quite friendly with my hex viewer over the past couple of days!

    Yes, nem is the NExt Memory function. So far it has worked well, but I think it may need to be tweaked in the future, because there are some cases when it's inconvenient to increment to the next value in memory.
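For readers following along in C, the fetch-and-advance behaviour of nem can be sketched roughly as below. This is only an illustration: the `machine` struct and the `peek` helper are hypothetical names that do not appear in the emulator. `peek` shows one possible way to read the next byte without advancing, for the cases where the increment is inconvenient.

```c
#include <stdint.h>

/* Hypothetical machine state; the names are illustrative only. */
struct machine {
	uint8_t ram[256];
	uint8_t pc;
};

/* Return the byte at the program counter, then advance the counter. */
static uint8_t
nem(struct machine *m)
{
	return m->ram[m->pc++];
}

/* Return the byte at the program counter without advancing it. */
static uint8_t
peek(const struct machine *m)
{
	return m->ram[m->pc];
}
```

With the bytes for `VAL AX 64` in memory, three calls to `nem` would yield the opcode, the register index, and the value in turn, while `peek` leaves the counter where it is.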

  17. #17
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    I tried those instructions with extremely simple test programs, like "PUSH AX" or "VAL CX 0". With JMP I used a program something like this:
    Code:
    JMP 4
    PUSH AX
    TERM 0
    With J[IN]E, the program was like this:
    Code:
    J[IN]E AX AX 6
    PUSH AX
    TERM 0
    I had already checked that PUSH AX doesn't work but TERM 0 does, so I could use them to test the jumps: if the jump worked, the program would skip PUSH AX and terminate cleanly; if it didn't, it would start the Debugtime! loop.

    I will try to create an executable by hand with a hex editor and see if it works.

  18. #18
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Yes, the emulator seems to work now that I have hand-assembled some simple programs. I remember most of the opcode numbers now. D: You're right, the problem is in the assembler.

    I quickly wrote a very simple disassembler in C for this. I know it has some problems, including bad identifiers. I also think it should buffer its input before printing, and that its current behavior should be behind a flag. But this is good enough for now, and I will improve it as I update it to reflect changes in your code. [code at bottom of post]

    The disassembly for the example program was:
    Code:
    HEADER
    Magic number:   0x52
    Entry point:    0x0
    
    PROGRAM
    0x0:    VAL     AX      0x40
    0x3:    STOR    0x0     AX
    0x6:    VAL     SP      0x0
    0x9:    POP     AX
    0xB:    DIS1    KX
    0xD:    DIS1    KX
    0xF:    INIT
    0x10:   INIT
    ERROR: Invalid opcode at address 0x11
    The hand-assembled executable works correctly, if you don't count printing zeros after the program code as misbehavior.

    EDIT2: I noticed that I haven't mentioned your strange GPR and address lengths. The length of your GPRs is the size of Integer, which I guess is something like 16, 32 or 64 bits, depending on the system. If it isn't a constant length, programs written for this could behave differently on 32-bit and 64-bit systems.

    I think your memory addressing scheme may feel strange to some people once you add code and data "segments". If you keep addressing like this, the biggest accessible address is 255. Here is an example. Say your program's code is 200 bytes and you give the program a 55-byte stack; both of those lengths are quite small. Then your program has 128 bytes of data. You couldn't represent the address of any byte in the data segment. If you had only 150 bytes of code, you could address some parts of the data but not everything. This wouldn't really be a problem because reading and writing the stack is only allowed via PUSH and POP, and reading and writing program code will be forbidden if you forbid MOV to SP. I think the biggest problem will be the address length, because 255 bytes of code isn't really much. And if you choose to make addresses longer, I would also allow reading from all parts of memory, but still possibly allow writing only to the "data segment".

    -
    This disassembler is quite trivial and I think any programmer should understand at least its basic workings and flow. If someone doesn't understand some part(s) of the program, ask me and I will explain.

    Tested on Windows Vista with Visual Studio 2008, because I don't have Linux or any better operating system installed on any computer right now. It compiles on Visual Studio with one warning, which complains about the safety of fopen() and says to use fopen_s() instead. I know cpp is there to let you fix those kinds of warnings and problems, but I didn't want to clutter this code just because of something like this.

    EDIT: I just realized I should use uint8_t or at least unsigned char instead of char. Too much C# for me. I will fix that in the next release. Also, I'd like to know whether the total size field in the header will count the header or just code and data. Depending on that, the datatype of that field should be uint8_t or uint16_t, or it should be a bitfield of some arbitrary length.
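To illustrate why the unsigned type matters here: where plain char is signed (which is common), a byte such as 0x80 is stored as a negative number, and once promoted to int for printf it would typically print as FFFFFF80 rather than 80. The sketch below assumes C99's `<stdint.h>` and `snprintf` are available; `format_addr` is a hypothetical helper, not part of the posted disassembler.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format an address in the same "0x%X" style the disassembler prints.
 * Because the parameter is uint8_t, the value is never negative, so
 * 0x80..0xFF come out as two hex digits.
 */
static int
format_addr(char *buf, size_t len, uint8_t addr)
{
	return snprintf(buf, len, "0x%X", (unsigned int)addr);
}
```

The same reasoning would apply to the header fields: reading a magic number above 0x7F into a signed char and comparing it with a positive constant would fail.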

    main.c
    Code:
    /*-
     * Disassembler for Smartzkid's emulator.
     *
     * Joonas Hilska <joonas.hilska(at)kopteri.net>
     * This code is public domain (no copyright).
     * You can do whatever you want with it.
     */
    
    #include <stdio.h>
    #include <stdlib.h>
    
    #include "sxe.h"
    
    static void
    eof(char addr)
    {
    	fprintf(stderr, "ERROR: Unexpected end of file at address 0x%X\n", addr);
    	exit(1);
    }
    
    static void
    reg(char addr)
    {
    	fprintf(stderr, "ERROR: Invalid register at address 0x%X\n", addr);
    	exit(1);
    }
    
    int
    main(int argc, char *argv[])
    {
    	struct sxe_header header;
    	char opcodes[][5] = { "INIT", "DIS1", "JMP", "TERM", "PUSH", "POP", "JIE", "INC",
    		"MOV", "INP", "JNE", "LOAD", "STOR", "VAL" };
    	char registers[][3] = { "AX", "BX", "CX", "DX", "EX", "FX", "GX", "HX", "IX", "JX",
    		"KX", "LX", "MX", "NX", "OX", "PX", "SP" };
    	FILE *file;
    	char *filename;
    	int c, print_header, param1, param2, param3;
    	char addr;
    
    	if (argc < 2) {
    		fprintf(stderr, "Usage: disasm [-h] executable\n");
    		return 1;
    	}
    
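    	/* Fall back to the first argument unless -h is given, so filename
    	 * and print_header are always initialized. */
    	if (argc >= 3 && argv[1][0] == '-' && argv[1][1] == 'h') {
    		filename = argv[2];
    		print_header = 1;
    	} else {
    		filename = argv[1];
    		print_header = 0;
    	}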
    
    	/* Binary mode: text mode on Windows would translate or truncate bytes. */
    	if ((file = fopen(filename, "rb")) == NULL) {
    		perror("fopen");
    		return 1;
    	}
    
    	if (fread(&header, sizeof(header), 1, file) != 1) {
    		perror("fread");
    		return 1;
    	}
    
    	if (header.s_magic != SXE_MAGIC) {
    		fprintf(stderr, "ERROR: File isn't SXE executable\n");
    		return 1;
    	}
    
    	if (print_header) {
    		printf("HEADER\n"
    			"Magic number:\t0x%X\n"
    			"Entry point:\t0x%X\n", header.s_magic, header.s_entry);
    		printf("\nPROGRAM\n");
    	}
    
    	addr = 0;
    
    	while ((c = fgetc(file)) != EOF) {
    		if (c < INIT || c > VAL) {
    			fprintf(stderr, "ERROR: Invalid opcode at address 0x%X\n", addr);
    			return 1;
    		}
    
    		switch (c) {
    		case INIT:
    		case TERM:
    			printf("0x%X:\t%s\n", addr, opcodes[c]);
    			addr++;
    			break;
    		case DIS1:
    		case PUSH:
    		case POP:
    		case INC:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			if (param1 < AX || param1 > SP)
    				reg(addr + 1);
    
    			printf("0x%X:\t%s\t%s\n", addr, opcodes[c], registers[param1]);
    
    			addr += 2;
    			break;
    		case MOV:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			if (param1 < AX || param1 > SP)
    				reg(addr + 1);
    
    			if ((param2 = fgetc(file)) == EOF)
    				eof(addr + 2);
    
    			if (param2 < AX || param2 > SP)
    				reg(addr + 2);
    
    			printf("0x%X:\t%s\t%s\t%s\n", addr, opcodes[c], registers[param1], registers[param2]);
    
    			addr += 3;
    			break;
    		case JMP:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			printf("0x%X:\t%s\t0x%X\n", addr, opcodes[c], param1);
    
    			addr += 2;
    			break;
    		case JIE:
    		case JNE:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			if (param1 < AX || param1 > SP)
    				reg(addr + 1);
    
    			if ((param2 = fgetc(file)) == EOF)
    				eof(addr + 2);
    
    			if (param2 < AX || param2 > SP)
    				reg(addr + 2);
    
    			if ((param3 = fgetc(file)) == EOF)
    				eof(addr + 3);
    
    			printf("0x%X:\t%s\t%s\t%s\t0x%X\n", addr, opcodes[c], registers[param1], registers[param2], param3);
    
    			addr += 4;
    			break;
    		case VAL:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			if (param1 < AX || param1 > SP)
    				reg(addr + 1);
    
    			if ((param2 = fgetc(file)) == EOF)
    				eof(addr + 2);
    
    			printf("0x%X:\t%s\t%s\t0x%X\n", addr, opcodes[c], registers[param1], param2);
    
    			addr += 3;
    			break;
    		case INP:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			if (param1 < AX || param1 > SP)
    				reg(addr + 1);
    
    			printf("0x%X:\t%s\t%s\n", addr, opcodes[c], registers[param1]);
    
    			addr += 2;
    			break;
    		case LOAD:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			if (param1 < AX || param1 > SP)
    				reg(addr + 1);
    
    			if ((param2 = fgetc(file)) == EOF)
    				eof(addr + 2);
    
    			printf("0x%X:\t%s\t%s\t0x%X\n", addr, opcodes[c], registers[param1], param2);
    
    			addr += 3;
    			break;
    		case STOR:
    			if ((param1 = fgetc(file)) == EOF)
    				eof(addr + 1);
    
    			if ((param2 = fgetc(file)) == EOF)
    				eof(addr + 2);
    
    			if (param2 < AX || param2 > SP)
    				reg(addr + 2);
    
    			printf("0x%X:\t%s\t0x%X\t%s\n", addr, opcodes[c], param1, registers[param2]);
    
    			addr += 3;
    			break;
    		default:
    			fprintf(stderr, "ERROR: Invalid opcode at address 0x%X\n", addr);
    			return 1;
    		}
    	}
    
    	fclose(file);
    
    	return 0;
    }
    sxe.h
    Code:
    /*-
     * Smartzkid's emulator's hardware information and SXE executable info
     *
     * Joonas Hilska <joonas.hilska(at)kopteri.net>
     * This code is public domain (no copyright).
     * You can do whatever you want with it.
     */
    
    #ifndef _SXE_H_
    #define	_SXE_H_
    
    /* Instructions */
    #define	INIT	0
    #define	DIS1	1
    #define	JMP		2
    #define	TERM	3
    #define	PUSH	4
    #define	POP		5
    #define	JIE		6
    #define	INC		7
    #define	MOV		8
    #define	INP		9
    #define	JNE		10
    #define	LOAD	11
    #define	STOR	12
    #define	VAL		13
    
    /* Registers */
    #define	AX		0
    #define	BX		1
    #define	CX		2
    #define	DX		3
    #define	EX		4
    #define	FX		5
    #define	GX		6
    #define	HX		7
    #define	IX		8
    #define	JX		9
    #define	KX		10
    #define	LX		11
    #define	MX		12
    #define	NX		13
    #define	OX		14
    #define	PX		15
    #define	SP		16
    
    /* Executable's magic number */
    #define	SXE_MAGIC	0x52
    
    /*
     * Header of SXE executables
     */
    struct sxe_header {
    	char	s_magic;		/* Magic number (SXE_MAGIC) */
    	char	s_entry;		/* Entry point */
    
    #if 0 /* Proposed fields */
    	char	s_codestart;	/* Beginning address of program code */
    	char	s_codeend;		/* Ending address of program code */
    	char	s_datastart;	/* Beginning address of program data */
    	char	s_dataend;		/* Ending address of program data */
    	char	s_total;		/* Total size */		/* Does this include header? Should datatype be bigger? */
    #endif /* 0 */
    };
    
    #endif /* _SXE_H_ */
    EDIT2: Damn, it messed up my formatting.
    Last edited by fronty; 10-09-2009 at 04:37 PM.

  19. #19
    Join Date
    Feb 2006
    Location
    Belgium
    Posts
    3,137
    Mentioned
    3 Post(s)
    Quoted
    5 Post(s)

    Default

    Nice work

  20. #20
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default Rev 3

    Changes:
    • Rewrote assembler
    • Implemented labels
    • Added Opcodes below
    • Working away from INIT & TERM
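The label support in the list above works in two steps: label definitions and label uses are recorded while the code is emitted, and the placeholder bytes are patched afterwards. A rough C sketch of that patching step follows. It is an illustration only; `struct label` and `resolve_labels` are hypothetical names, and the 0xFF placeholder mirrors the value the assembler checks for.

```c
#include <stdint.h>
#include <string.h>

/* A label name paired with a code address: for a definition, the address
 * the label points to; for a use, the address of the byte to patch. */
struct label {
	const char *name;
	uint8_t address;
};

/*
 * Patch every use site with the address of the label it names.
 * Returns the number of uses that matched no definition.
 */
static int
resolve_labels(uint8_t *code,
    const struct label *defs, int ndefs,
    const struct label *uses, int nuses)
{
	int missing = 0;

	for (int i = 0; i < nuses; i++) {
		int found = 0;

		for (int j = 0; j < ndefs; j++) {
			if (strcmp(uses[i].name, defs[j].name) == 0) {
				code[uses[i].address] = defs[j].address;
				found = 1;
				break;
			}
		}
		if (!found)
			missing++; /* placeholder byte is left untouched */
	}
	return missing;
}
```

Returning the count of unresolved uses lets the caller report misspelled labels instead of silently emitting a jump to address 255.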


    Header changed to:
    Magic Number (0x53)
    Code Start
    Code End


    SHL 0x0E, Register
    SHR 0x0F, Register
    DEC 0x10, Register
    ADD 0x11, Register, Register, Register | A = B + C
    SUB 0x12, Register, Register, Register | A = B - C
    PROD 0x13, Register, Register, Register | A = B * C
    DIV 0x14, Register, Register, Register | A = B / C
    MOD 0x15, Register, Register, Register | A = B % C
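For anyone porting the three-register instructions to C, a hedged sketch of SUB is shown below. The operands are fetched into named variables first, because C leaves the order in which the calls in an expression like `reg[fetch()] = reg[fetch()] - reg[fetch()]` run unspecified. The `cpu` struct, `fetch`, and `exec_sub` are hypothetical names; the register count of 17 comes from the AX..PX plus SP list.

```c
#include <stdint.h>

#define NUM_REGS 17 /* AX..PX plus SP */

struct cpu {
	uint8_t ram[256];
	uint8_t reg[NUM_REGS];
	uint8_t pc;
};

/* Read the byte at pc and advance pc. */
static uint8_t
fetch(struct cpu *c)
{
	return c->ram[c->pc++];
}

/*
 * Execute SUB A, B, C (A = B - C). The opcode byte is assumed to have
 * been consumed already. Returns 0 on success, -1 on a bad register.
 */
static int
exec_sub(struct cpu *c)
{
	uint8_t dst = fetch(c); /* first operand byte: destination A */
	uint8_t lhs = fetch(c); /* second operand byte: B */
	uint8_t rhs = fetch(c); /* third operand byte: C */

	if (dst >= NUM_REGS || lhs >= NUM_REGS || rhs >= NUM_REGS)
		return -1;
	c->reg[dst] = (uint8_t)(c->reg[lhs] - c->reg[rhs]);
	return 0;
}
```

The result wraps modulo 256, which matches a byte-sized register.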



    What's next?
    • Comments
    • Remove INIT/TERM


    Emulator.scar
    SCAR Code:
    program Emulator;
    const
      datafile = 'C:\program.bxe';
    var
      RAM: Array [0..255] of Byte; //0-63 = Program, 64-79 = Stack, 80-255 = general memory
      Register: Array [0..16] of Byte;
      Counter: Integer;
      C_END, PC: Integer; //Program counter. TODO: Replace with register PC
      OpCode: Byte;
      CPUIsntRunning: Boolean;
    const
      //Register array locations (all here for reference)
      AX = 0;
      BX = 1;
      CX = 2;
      DX = 3;
      EX = 4;
      FX = 5;
      GX = 6;
      HX = 7;
      IX = 8;
      JX = 9;
      KX = 10;
      LX = 11;
      MX = 12;
      NX = 13;
      OX = 14;
      PX = 15;
     
      SP = 16;
     
      //Program is @ RAM 0-63
      //Stack is @ RAM 64-79 (15 levels deep)

      INIT = $00;
      DIS1 = $01;
      JMP  = $02;
      TERM = $03;
      PUSH = $04;
      POP  = $05;
      JIE  = $06;
      INCX = $07;
      MOV  = $08;

      JNE  = $0A;
      LOAD = $0B;
      STOR = $0C;
      VAL  = $0D;
      SHLX = $0E;
      SHRX = $0F;
      DECX = $10;
      ADD  = $11;
      SUB  = $12;
      PROD = $13;
      DIVX = $14;
      MODX = $15;

    function nem: Byte;
    begin
    try
      result := RAM[PC];
    except
      writeln('Debugtime!');
      wait(40000);
    end
      inc(PC);
    end;

    procedure ReadFileIntoRAM(theFile: String);
    var
      hFile, i: integer;
      data: byte;
    begin
      try
        hFile := OpenFile(theFile, false);
        for i := 0 to 2 do
        begin
          ReadFileByte(hFile, data);
          case i of
            0: if(data <> $53) then
                 RaiseException(erUnexpectedEof, 'File is not an SXE executable');
            1: PC := data;
            2: C_END := data;
          end;
        end;
        i := 0;
        repeat
          ReadFileByte(hFile, data);
          RAM[i] := data;
          inc(i);
        until((EndOfFile(hFile)) or (i > 63));
        writeln('Program loaded (' + inttostr(i) + '/64 bytes used)');
      except
        writeln('Error loading executable: ' + ExceptionParam);
        TerminateScript;
      finally
        CloseFile(hFile);
      end;
    end;

    begin
    { ASM Instructions    //TODO: Group similar instructions, ie inc/dec
       INIT 0x00, InitialCounter  //NOT YET IMPLEMENTED
       DIS1 0x01, Register
       JMP  0x02, JumpLocation
       TERM 0x03         //TODO: REMOVE
       PUSH 0x04, Register
       POP  0x05, Register
       JIE  0x06, Register, Register, JumpLocation
       INC  0x07, Register
       MOV  0x08, Register, Register
       INP  0x09, Register //TODO: discard? [INPut]
       JNE  0x0A, Register, Register, JumpLocation
       LOAD 0x0B, Register, RAM
       STOR 0x0C, RAM, Register
       VAL  0x0D, Register, Value
       SHL  0x0E, Register
       SHR  0x0F, Register
       DEC  0x10, Register
       ADD  0x11, Register, Register, Register | A = B + C
       SUB  0x12, Register, Register, Register | A = B - C
       PROD 0x13, Register, Register, Register | A = B * C
       DIV  0x14, Register, Register, Register | A = B / C
       MOD  0x15, Register, Register, Register | A = B % C
       
       
       AND  0xXX, Register, Register //TODO
       NAND 0xXX, Register, Register
       OR   0xXX, Register, Register
       NOR  0xXX, Register, Register
       XOR  0xXX, Register, Register
    }

      ClearReport;
      readFileIntoRAM(datafile);
      Register[SP] := 79;

      repeat
        OpCode := nem;

        case OpCode of
          INIT:
            CPUIsntRunning := false;
          DIS1:
            AddToReport(inttostr(Register[nem]));
          JMP:
            PC := nem;
          TERM:
            CPUIsntRunning := true;
          POP:
            begin
              Register[nem] := RAM[Register[SP]];
              inc(Register[SP]);
            end;
          PUSH:
            begin
              dec(Register[SP]);
              RAM[Register[SP]] := Register[nem];
            end;
          JIE:
            if(Register[nem] = Register[nem])then
              PC := nem
            else
              nem;
          JNE:
            if(Register[nem] <> Register[nem])then
              PC := nem
            else
              nem;
          INCX:
            inc(Register[nem]);
          MOV:
            Register[nem] := Register[nem];
          LOAD:
            Register[nem] := RAM[nem + 80];
          STOR:
            RAM[nem + 80] := Register[nem];
          VAL:
            Register[nem] := nem;
          SHLX:
            Register[nem] := Register[RAM[PC - 1]] shl 1;
          SHRX:
            Register[nem] := Register[RAM[PC - 1]] shr 1;
          DECX:
            dec(Register[nem]);
          ADD:
            Register[nem] := Register[nem] + Register[nem];
          SUB:
            Register[nem] := Register[nem] - Register[nem];
          PROD:
            Register[nem] := Register[nem] * Register[nem];
          DIVX:
            Register[nem] := Register[nem] div Register[nem]; //integer division
          MODX:
            Register[nem] := Register[nem] mod Register[nem];
        else
          writeln('INVALID OPCODE: ' + inttostr(OPCODE));
        end;

        inc(Counter);
      //  writeln('Line ' + inttostr(counter));
      until((CPUIsntRunning) or (PC = C_END))
    end.

    Assembler.scar
    SCAR Code:
    program Assembler; //TODO: Rewrite this.
    {
    [Planned] Executable layout

    [Code header]
    0x53
    Code Start Address
    Code End Address

    [Data segment]
    todo..


    [Code segment]
    Instructions

    }


    const
      SourceFile = 'C:\program'; //Input file is .asm, output file is .bxe
    var
      timer: integer;
      prog: string;
      progWords: TStringArray;
      initdata, code: Array of Byte;
      labelList: Array of record
        address: integer; name: string;
      end;
      labelUsages: Array of record
        address: integer; labelname: string;
      end;


    function ParseWords(data: string): TStringArray;
    var
      i, j, k: integer;
    begin
      data := Replace(data, chr(13) + chr(10), ' ');
      SetLength(result, length(data) / 8);
      k := 1;
      for i := 1 to length(data) do
      begin
        if(k > 0) then
        begin
          if((data[i] = ' ') or (i = length(data))) then
          begin
            if(j > high(result))then
              SetLength(result, j + 15);
            if(i = length(data)) then
              inc(i);
            result[j] := copy(data, k, i - k);
            inc(j);
            k := -1;
          end;
        end
          else if(data[i] <> ' ') then
            k := i;
      end;
      SetLength(result, j);
    end;

    procedure AddToLabelList(codeAddress: integer; labelName: String);
    begin
      SetLength(labelList, Length(LabelList) + 1);
      labelList[high(LabelList)].address := codeAddress;
      labelList[high(LabelList)].name := Copy(labelName, 1, length(labelName) - 1);
    end;

    procedure AddToLabelCalls(codeAddress: integer; labelName: String);
    begin
      SetLength(labelUsages, Length(labelUsages) + 1);
      labelUsages[high(labelUsages)].address := codeAddress;
      labelUsages[high(labelUsages)].labelName := labelName;
    end;

    function ParseInstructions(data: TStringArray): Array of Byte;
    var
      InstructionList: Array of String;
      Registers: Array of String;
      Code: Array of Byte;
      done: boolean;
      i, j, r: integer;
    begin
      Registers := ['AX', 'BX', 'CX', 'DX', 'EX', 'FX', 'GX', 'HX', 'IX',
                    'JX', 'KX', 'LX', 'MX', 'NX', 'OX', 'PX', 'SP'];
      InstructionList := ['INIT', 'DIS1', 'JMP', 'TERM', 'PUSH', 'POP',
                       'JIE', 'INC', 'MOV', 'INP', 'JNE', 'LOAD',
                       'STOR', 'VAL', 'SHL', 'SHR', 'DEC', 'ADD', 'SUB',
                       'PROD', 'DIV', 'MOD'];
      SetLength(Code, Length(data));

      for i := 0 to high(data) do
      begin
        done := false;

        if(TrimNumbers(data[i]) = '') then
        begin
          Code[r] := StrToInt(data[i]);
          inc(r);
          done := true;
        end;

        if(done) then
          Continue;
       
        for j := 0 to high(InstructionList) do  //Process instructions
          if(data[i] = InstructionList[j]) then
          begin
            Code[r] := j;
            inc(r);
            done := true;
            break;
          end;
         
        if(done) then
          Continue;

        for j := 0 to high(Registers) do
          if(data[i] = Registers[j]) then
          begin
            Code[r] := j;
            inc(r);
            done := true;
            break;
          end;

        if(done) then
          Continue;

        if(data[i][length(data[i])] = ':') then  //Define labels
          AddToLabelList(r, data[i])
        else
          begin
            AddToLabelCalls(r, data[i]);  //Process label calls
            Code[r] := -1;
            inc(r);
          end;
      end;

      SetLength(Code, r);
      result := Code;
    end;

    procedure ProcessLabels(var data: Array of Byte);
    var
      i, j: integer;
    begin
      for i := 0 to high(labelUsages) do
        for j := 0 to high(labelList) do
          if(LabelUsages[i].LabelName = labelList[j].name) then
          begin
            if(data[labelUsages[i].address] = 255) then
              data[labelUsages[i].address] := labelList[j].address //+ length(initdata)
            else
              writeln('debug: your assembler is broken ;)');
            break;
          end;
    end;

    function ReadSource: string;
    var
      h: integer;
    begin
      h := OpenFile(SourceFile + '.asm', false);
      ReadFileString(h, result, FileSize(h));
      CloseFile(h);
    end;

    procedure WriteFile(data: Array of Byte);
    var
      h, i: integer;
    begin
      h := RewriteFile(SourceFile + '.bxe', true);
      for i := 0 to high(initdata) do
        WriteFileByte(h, initdata[i]);
      for i := 0 to high(data) do
        WriteFileByte(h, data[i]);
      CloseFile(h);
    end;

    begin
      SetLength(InitData, 3);
      InitData[0] := $53;
      timer := GetSystemTime;
      prog := readSource;
      progWords := ParseWords(prog);
      code := ParseInstructions(progWords);
      processLabels(code);
      InitData[1] := 0;
      InitData[2] := length(code);
      writeFile(code);
      writeln('Assembled in ' + inttostr(GetSystemTime - timer) + ' ms');
    end.

    program.asm
    Code:
    begLabel:
    VAL AX 64
    STOR 0 AX
    VAL SP 0
    
    alabel:
    POP AX
    DIS1 AX
    LOAD AX 0
    JIE SP AX endlbl
    JMP alabel
    
    endlbl:
    INC BX
    VAL AX 2
    JNE BX AX begLabel
    VAL CX 99
    INC CX
    DIS1 CX
    Program.asm comments
    Code:
    begLabel:
    VAL AX 64 //Set AX to 64
    STOR 0 AX //Store AX (64) into first slot of general memory
    VAL SP 0 //Set the stack pointer to 0 (with proper stack-range checking, this won't work anymore, but that is a ways away)
    
    alabel:
    POP AX //Place the next byte into AX
    DIS1 AX //Display said byte
    LOAD AX 0 //Load the first slot of general memory (64) into AX
    JIE SP AX endlbl //If the stack is at location AX (64), break out of the loop (64 is currently the max length of a program)
    JMP alabel //Loop to the beginning of the program-dump code
    
    endlbl:
    INC BX //Increase BX
    VAL AX 2 //Set AX to 2
    JNE BX AX begLabel //If BX <> AX, jump to the initialization code (loop twice)
    VAL CX 99 //Set CX to 99
    INC CX //Increase CX
    DIS1 CX //Display CX (100)
    New Instructions
    Code:
    VAL AX 5
    SHL AX
    DIS1 AX
    
    VAL BX 10
    SHR BX
    DIS1 BX
    
    ADD CX AX BX
    DIS1 CX
    
    SUB CX AX BX
    DIS1 CX
    
    PROD CX AX BX
    DIS1 CX
    
    DIV CX AX BX
    DIS1 CX
    
    MOD CX BX AX
    DIS1 CX
    
    DEC CX
    DIS1 CX
    Last edited by Smartzkid; 10-10-2009 at 02:14 AM. Reason: Rev 3.3 + Another sample program

  21. #21
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    I have one suggestion: now that you can calculate the size of the program code from the values in the executable header, why not give the program code just that much space? If you have 20 bytes of code, you waste over 40 bytes. And if you have 70 bytes of code, your program code extends into the stack!
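A rough sketch of this suggestion in C, under some assumptions: code is loaded at address 0, the header's code end is exclusive (so size = end - start), general memory follows the code directly, and a 16-byte stack sits at the top of the 256-byte RAM. `struct layout` and `plan_layout` are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define RAM_SIZE   256
#define STACK_SIZE 16 /* assumed: same size as the current 64-79 region */

struct layout {
	unsigned code_size;  /* bytes occupied by program code */
	unsigned data_start; /* first address of general memory */
	unsigned data_size;  /* bytes available as general memory */
	unsigned stack_low;  /* lowest address belonging to the stack */
};

/* Fill *out from the header fields; returns -1 if the code cannot fit. */
static int
plan_layout(uint8_t code_start, uint8_t code_end, struct layout *out)
{
	unsigned size;

	if (code_end < code_start)
		return -1;
	size = (unsigned)code_end - code_start;
	if (size > RAM_SIZE - STACK_SIZE)
		return -1;

	out->code_size = size;
	out->data_start = size;
	out->stack_low = RAM_SIZE - STACK_SIZE;
	out->data_size = out->stack_low - size;
	return 0;
}
```

With 20 bytes of code this leaves 220 bytes of general memory, and with 70 bytes of code the program no longer runs into the stack.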

    And btw, your PUSH/POP don't check for overruns or underruns.
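A minimal sketch of what such checks might look like in C, assuming the posted layout: the stack lives at 64-79, SP starts at 79, PUSH decrements before storing, and POP increments after loading. `struct stack_vm`, `vm_push`, and `vm_pop` are hypothetical names.

```c
#include <stdint.h>

#define STACK_LOW 64 /* lowest stack address in the posted layout */
#define STACK_TOP 79 /* initial SP value in the posted emulator */

struct stack_vm {
	uint8_t ram[256];
	uint8_t sp;
};

/* Push one byte; returns 0 on success, -1 if the stack is full. */
static int
vm_push(struct stack_vm *m, uint8_t value)
{
	if (m->sp <= STACK_LOW)
		return -1; /* overrun: the write would land in program memory */
	m->ram[--m->sp] = value;
	return 0;
}

/* Pop one byte into *out; returns 0 on success, -1 if the stack is empty. */
static int
vm_pop(struct stack_vm *m, uint8_t *out)
{
	if (m->sp >= STACK_TOP)
		return -1; /* underrun: nothing has been pushed */
	*out = m->ram[m->sp++];
	return 0;
}
```

With these bounds the stack holds 15 values, and a program that sets SP outside 64-79 can no longer push or pop at all.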

    I will update my disassembler this evening or tomorrow, when I'm back home.

  22. #22
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default

    Hmm, I like it.

    Willdo:
    Move stack to upper memory
    Move ram to directly after program

  23. #23
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Btw, ram = whole thing, including code, data, stack etc.

    Updated the disassembler. It supports both revisions 2 and 3. I included an executable for those who don't have a C compiler installed.

  24. #24
    Join Date
    Sep 2006
    Location
    New Jersey, USA
    Posts
    5,347
    Mentioned
    1 Post(s)
    Quoted
    3 Post(s)

    Default

    Oops, meant general memory

  25. #25
    Join Date
    Sep 2009
    Posts
    66
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Played around with this a bit and it's quite usable already. I'm looking forward to seeing what this will become!
