Thread: The Background Magic

  1. #1
    Join Date
    Sep 2012
    Location
    Here.
    Posts
    2,007
    Mentioned
    88 Post(s)
    Quoted
    1014 Post(s)

    The Background Magic of Programming

    Foreword:
    This is not a tutorial in the sense that it teaches you how to do something. It should, however, teach you some of the background magic in programming and why stuff works like it does. And once you know why stuff is the way it is, this may help improve your efficiency!

    Table of Contents:
    1. Introduction
    2. A Brief History
    3. Pointers
    4. Stack and Heap
    5. FAQ
    6. Credits

    Introduction:
    Greetings, everyone! I have noticed that many people here have no background programming experience. While that isn't necessary to be a good, or even great programmer here, knowing some extra background can still help!

    A Brief History:
    1. In the *important* beginning, computers used assembly languages. Even now assembly languages exist, and they are how everything works underneath, even your computer. Windows, Linux, Apple, handheld devices, etc. all run on some machine language (most desktops and laptops use x86 or x86-64, and most phones and small devices use ARM, but exceptions exist and there are more assembly languages than even those).
      Every program (script) you ever write will somehow end up running as machine language instructions on your computer. All the 1s and 0s are actually machine language commands, and assembly is the human-readable form of them. It is still possible to program directly in an assembly language, but it's rarely done outside of tiny electronic devices.
    2. The next *important* big break in programming was the C language. Objects weren't around in it yet, but so many new things came along. C could be compiled for many different machines (unusual for the era), and it was significantly easier and faster to write than assembly because it started doing behind-the-scenes magic (although not nearly as much as most current languages). C is used less often for everyday programs nowadays because it's considered too basic and difficult to write most programs in, due to the lack of objects.
    3. After C came C++. C++ was another tremendous breakthrough, mainly because it added more magic, but it also brought objects to a much wider world of programmers. Objects are a wondrous creation that lets the programmer do a variety of things. Objects store a variety of fields (strings, integers, booleans, other objects, etc.) that are all owned as values inside them. Furthermore, objects have methods. What is a method, you ask? It's a function that exists entirely within an object. It is called on the object, and it typically exhibits behavior in relation to the containing object. (E.g., a 'Gun' (object) calls 'Shoot' (method) to fire a 'Bullet' (object created by the gun) in a given direction.)
    4. Many other important things happened before and since C++, but this short history should give a decent lead into some of the magic discussed in this tutorial.

    What Type of Sorcery is a Pointer?
    Pointers are some of the most infuriating things in all of programming. They easily cause the most anger and hatred in a programmer's life (if they have to use them). Thankfully, most languages nowadays throw them in the magic category and you never have to know they exist. However, that does not mean pointers stopped existing in programming; it just means the compiler (in our case SIMBA) does all the translation for you without you realizing it ever happened.

    So what is a pointer? It's an integer telling your computer where a variable is stored in memory. Your computer doesn't know where the values of variables are without the pointer telling it where. Think of it like a mailbox. I could send a letter to YoHoJo telling him he's awesome, but the mailman can't take it to him without an address. The address is the pointer.
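
    Simba hides pointers from you, so here is a minimal sketch of the mailbox idea in C, where they are visible. The names deliver, house and demo_pointer are made up for illustration and are not part of Simba or any library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: follows the address and stores a value there. */
void deliver(int *mailbox, int letter)
{
    *mailbox = letter;
}

/* Returns 1 if the pointer really holds the address of `house`
   and the delivered value ended up in `house`. */
int demo_pointer(void)
{
    int house = 0;          /* an ordinary variable somewhere in memory */
    int *address = &house;  /* the pointer stores where `house` lives */

    deliver(address, 42);   /* the letter reaches `house` via its address */
    printf("house lives at %p and now holds %d\n", (void *)address, house);
    return address == &house && house == 42;
}
```

    The printed address will differ from run to run; the point is only that the pointer is a number saying where the value is kept.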

    Using pointers and passing them along is referred to as "passing by reference" (google that to learn more than what I will cover). It is passing by reference because when you pass the pointer through to a method, the method receives only the reference (address) of the variable as opposed to the actual value (which is referred to as passing by value). How can we use this to our advantage? Chances are you already have, or have at least seen it happening in someone's script. Do you know the difference between:
    Simba Code:
    procedure Something(var a, b: string);
    and
    Simba Code:
    procedure Something(a, b: string);

    There is a single var before the names of the variables in the first procedure. That var declaration tells the compiler (SIMBA) that the variables after it are references to actual variables stored elsewhere. This is called passing by reference (the variable used in the called method refers to the original variable stored in the caller method). That means the procedure can change the value of the variables passed to it. Without the var, the procedure only gets its own copies, so any changes it makes never reach the caller's variables.

    Simba Code:
    // Looks up the bank for a street. The boolean result says whether it was
    // found; the bank name comes back through the var parameter a.
    function WhichBank(var a, b: string): boolean;
    begin
      Result:= False; // assume not found until a case matches
      case b of
        '1st': begin
          a:= 'Wells Fargo';
          Result:= True;
        end;
        '2nd': begin
          a:= 'Local bank';
          Result:= True;
        end;
        '3rd': begin
          a:= 'Union';
          Result:= True;
        end;
      end;
    end;

    var
      bank, street: string;
    begin
      bank:= '';
      street:= '2nd';
      if(WhichBank(bank, street))then
      begin
        WriteLn('My bank is ' + bank);
      end
      else begin
        WriteLn('My bank wasn''t found.');
      end;
    end.

    In the example above, the result will be "My bank is Local bank". That happened because, thanks to pointers, we could determine if the bank was found (the true/false return) and we could also return it in the bank variable! This can be used to essentially return as many variables as you want from one method!
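
    For comparison, here is roughly how the same idea might look in C, where the reference has to be written out by hand. This is only a sketch under my own assumptions: set_by_value, set_by_reference and which_bank are illustrative names, and which_bank is a loose C analogue of WhichBank, not a translation of how Simba does it internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* By value: the function gets its own copy, so the caller's variable is untouched. */
void set_by_value(int copy)
{
    copy = 99;   /* changes only the local copy */
    (void)copy;
}

/* By reference: the function gets the address and can change the caller's variable. */
void set_by_reference(int *original)
{
    *original = 99;
}

/* The bool is the normal return value; *bank acts as a second
   "return value" written through a pointer. */
bool which_bank(const char **bank, const char *street)
{
    if (strcmp(street, "1st") == 0) { *bank = "Wells Fargo"; return true; }
    if (strcmp(street, "2nd") == 0) { *bank = "Local bank";  return true; }
    if (strcmp(street, "3rd") == 0) { *bank = "Union";       return true; }
    return false;  /* unknown street: *bank is left unchanged */
}

/* Usage: the asserts state what each call is expected to do. */
void demo_passing(void)
{
    int v = 1;
    set_by_value(v);
    assert(v == 1);    /* caller's value survives */
    set_by_reference(&v);
    assert(v == 99);   /* caller's value was changed */

    const char *bank = "";
    assert(which_bank(&bank, "2nd") && strcmp(bank, "Local bank") == 0);
}
```
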
    What is the Stack and Heap?

    They are different types of memory for a program.
    Well that was a simple answer. Let's complicate it a little - OK, a lot.

    Stack: the memory of your main method and everything it calls (kind of). By default, the local variables in your program go on the stack. It is a LIFO (last in, first out) type of memory. For example, let's call a method at any given point in our program. When the method finally returns to its caller, the portion of the stack it used becomes free and usable again. So stack memory (from methods at least) is freed commonly and easily with no effort required from the programmer. (I should add the note that global variables remain around for the length of the program; they are set up before anything else runs, and only ending the program will free them.)

    Heap: dynamic memory that is allocated on the fly. That means special objects and explicit allocations of memory end up going on the heap. These allocations must be cleaned up by the scripter manually. A couple of examples in SIMBA you are familiar with would be bitmaps or DTMs, or anything else that should be "Free"d. This is very useful for keeping items around for a long time, but at the same time you have to be careful when using them. You can easily cause a memory leak in a commonly called procedure by creating these objects and then not freeing them!
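
    The difference can be sketched in C, where heap allocations are explicit. This is a hedged illustration, not how Simba manages bitmaps or DTMs internally; sum_to, make_label and demo_stack_and_heap are names I made up, while malloc, free, strlen and strcpy are the standard C library calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stack: `total` and `i` disappear automatically when the function returns. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;  /* the value is copied out; the locals are gone */
}

/* Heap: the buffer outlives this function, so the caller must free it. */
char *make_label(const char *text)
{
    char *label = malloc(strlen(text) + 1);  /* explicit allocation */
    if (label != NULL)
        strcpy(label, text);
    return label;  /* NULL if the allocation failed */
}

/* Usage: nothing to clean up for sum_to, but make_label needs a free. */
void demo_stack_and_heap(void)
{
    assert(sum_to(4) == 10);

    char *label = make_label("bitmap");
    if (label != NULL) {
        assert(strcmp(label, "bitmap") == 0);
        free(label);  /* skipping this in a frequently called function leaks memory */
    }
}
```

    The free call plays the same role here that FreeBitmap or FreeDTM plays in a script.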
    Is there any disadvantage to having hundreds of tiny procedures?
    Sadly, there is. Procedures use their magic to jump in code from whoever called them and then finally back when they exit. Two pieces of invisible magic actually happen. A tiny return pointer (integer size) is created, telling the program where it was when it called the method. An extra command is also issued to jump to the method, and another to jump back to the calling code (a jump is an incredibly quick command, taking roughly 1 CPU cycle, which is less than 1 ns on most computers).

    As small as a single integer and 2 CPU cycles may be for that extra method call, it simply isn't always needed, and sometimes too many methods can make your code too hard to read. But don't be discouraged! On the opposite side of the spectrum, methods that are too large can be far too difficult to read, and they can end up increasing the overall program memory by more than need be. It's all about balance.

    NOTE: It's hard to overstate how small a difference 2 CPU cycles and an integer make. I just don't want it to seem like every line of code deserves its own method. There is such a thing as overkill.
    Does this relate to pointers?
    Yup! The pointer is also useful for memory considerations! In my earlier example, 2 strings were 'passed' as pointers to the 'WhichBank' function. Without pointers, the called function would have to set aside additional local space for copies of those variables, so that you could edit them without changing the original values. That would mean 4 strings in existence, filling up more memory! Strings can also use up a lot of memory, so that gets inefficient as more and more are used (and as they get passed further down into more functions).

    Thankfully that can be improved! Using pointers actually means the local memory stored on the stack (for the called method) holds pointer-sized integers (the mailbox addresses pointing to the original values in memory). So instead of 4 strings using up memory in the program, we are only using 2 strings and 2 pointer-sized integers! And the best part about the situation is that practically no additional processing power is used to accomplish this!
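
    To put rough numbers on that, here is a small C sketch. It uses a fixed 1024-byte record as a stand-in for a big string, which is an assumption for illustration only and not how Simba stores strings; big_record, cost_by_value and cost_by_reference are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record standing in for a large value such as a long string. */
struct big_record {
    char text[1024];
};

/* By value: the whole record is copied into the function's local space. */
size_t cost_by_value(struct big_record r)
{
    return sizeof r;
}

/* By reference: only a pointer is copied, whatever the record's size. */
size_t cost_by_reference(const struct big_record *r)
{
    return sizeof r;
}

/* Usage: the pointer is far smaller than the record it points to. */
void demo_cost(void)
{
    static struct big_record rec;  /* zero-initialised */
    assert(cost_by_value(rec) >= 1024);
    assert(cost_by_reference(&rec) == sizeof(struct big_record *));  /* often 4 or 8 bytes */
    assert(cost_by_reference(&rec) < cost_by_value(rec));
}
```
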
    So why don't we always use pointers?
    Well, sometimes we do still need local copies. We don't always want our edits to reach the original values, so we'd have to create local variables for that anyway, which means we would end up using more memory than if we had ignored pointers.
    FAQ:
    Ask me some questions and I'll toss them up here.
    As a side note, there are tons more things happening in the background; this simply happens to be all that came to mind during a conversation yesterday.
    Credits
    I stole most of the style of this from Chris!'s Ultimate Guide to Making a Superb Guide!

    Also, years of schooling in exactly this low-level stuff.
    Last edited by Kevin; 01-15-2013 at 04:46 PM.

  3. #3
    Join Date
    Jan 2009
    Location
    Turlock/LA, California
    Posts
    1,494
    Mentioned
    3 Post(s)
    Quoted
    66 Post(s)

    Default

    this was a fun read, but i have one criticism: no where in your post did you say the name "pass by reference". Passing by reference is indeed what you are doing, and if someone wanted to further look this up, they wouldn't know where to start :/ (because u dont say the name).

    but overall, it was an interesting read. i think lape is going to integrate pointers, so maybe you should also talk about different data structures.

  4. #4
    Join Date
    Sep 2012
    Location
    Here.
    Posts
    2,007
    Mentioned
    88 Post(s)
    Quoted
    1014 Post(s)

    Default

    Ah, pass by reference. Probably a simpler way to explain the pointers and (like you said) for further research.

    Lape is directly integrating pointers? I may try looking into writing some examples then when that happens.

    Also, thanks!

  5. #5
    Join Date
    Feb 2011
    Location
    The Future.
    Posts
    5,600
    Mentioned
    396 Post(s)
    Quoted
    1598 Post(s)

    Default

    C++ has no "Methods".. Only "Members". C# and Java have "Methods".

    The difference is that C++ can have pointers to member functions and friends. Friends can access members. Methods are not the same as members even when using C#'s unsafe keyword and has no friend classes. I think C# has delegates to compensate but still no friend classes.

    Well at least, I like to think they're not the same.. Last time I mixed it up, I got laughed at :c


    it just means the compiler (in our case SIMBA) does all the translation for you without you realizing it ever happened.
    Interpreter != Compiler.

    The address is the pointer
    The address is not a pointer. A pointer holds a reference to a specific location in memory.. Aka the address. An address is a cell-block in memory that holds information.

    When a method finally returns back to whoever called it (another method or the stack), the heap of the returning method will be recycled and all that memory becomes free and usable again.
    This isn't true. The stack does that. The stack is where functions reside and the stack unwinds and cleans up when the function returns. In fact, functions can be thought of as Labels in Assembly. I don't think you allocate on the heap for functions.. I'm pretty sure functions reside on the stack due to stackoverflows when doing infinite recursions (Non-Tail Recursions).

    Also, any allocations within a function are NOT cleaned up or recycled when the function returns.. You have the job of doing the cleaning since you're the one that asked for the allocation in the first place. Stack variables on the other hand are cleaned up because the stack unwinds and returns to its previous state.

    Allocations on the heap must be cleaned up by the user.. Simba has Alloc and Free functions for these.. Ex: Bitmaps. CreateBitmap and FreeBitmap. DTMFromString and FreeDTM.. etc..


    Pointers are Longs not exactly Ints. It depends on the system as sometimes longs would be 4 bytes and sometimes 8.. It's usually safer to use size_t in C++ or a long in any other language (C# usually uses longs to represent pointers).


    Using pointers actually means the local memory stored on the heap (the local method) are actually integers
    Local memory = stack memory. It's not stored on the heap.

    If you do not allocate with something such as malloc or new or new[], then it is stack allocated unless you use an allocator. Likewise, if you don't free or delete or delete[], it will leak. The OS tends to claim the memory back regardless when the program terminates but it can seriously slow down or lag your application/system. Especially if doing graphics stuff like GDI, GDI+, OpenGL, DirectX, OCL, etc.. Also be careful about what pointers you delete because they can dangle and cause problems with objects.



    Thankfully, there's no way to do this in Simba scripts. Nice thread layout and content/topic though. +1 for that.
    Last edited by Brandon; 01-09-2013 at 01:16 AM.

  6. #6
    Join Date
    Sep 2012
    Location
    Here.
    Posts
    2,007
    Mentioned
    88 Post(s)
    Quoted
    1014 Post(s)

    Default

    Quote Originally Posted by Brandon View Post
    C++ has no "Methods".. Only "Members". C# and Java have "Methods".

    The difference is that C++ can have pointers to member functions and friends. Friends can access members. Methods are not the same as members even when using C#'s unsafe keyword and has no friend classes. I think C# has delegates to compensate but still no friend classes.

    Well at least, I like to think they're not the same.. Last time I mixed it up, I got laughed at :c
    Methods and functions are different. Methods and members are also different. I agree with those statements. But C++ does have methods. Methods are simply functions residing within objects that can be run on them, and they do exist. It is one of the biggest changes in C++ over C.

    Quote Originally Posted by Brandon View Post
    Interpreter != Compiler.
    Correct. Simba compiles the coded PascalScript into ByteCode commands, however. Interpreters don't *usually* have compiling errors as seen so often in Simba scripts. PascalScript as a language however, is a compiled language as seen here: Intro to PascalScript. The name really throws off the idea on this one. I sat on your side of the argument as well and I think I was corrected on it a couple of weeks ago.

    Quote Originally Posted by Brandon View Post
    The address is not a pointer. A pointer holds a reference to a specific location in memory.. Aka the address. An address is a cell-block in memory that holds information.
    As noted in many C tutorials, "pointers are just variables that store memory addresses". Every variable has an address and a value, with the value stored at the address. In the case of pointers, changing the "value" changes the address you look at, which can cause a segfault if handled incorrectly.

    Quote Originally Posted by Brandon View Post
    This isn't true. The stack does that. The stack is where functions reside and the stack unwinds and cleans up when the function returns. In fact, functions can be thought of as Labels in Assembly. I don't think you allocate on the heap for functions.. I'm pretty sure functions reside on the stack due to stackoverflows when doing infinite recursions (Non-Tail Recursions).

    Also, any allocations within a function are NOT cleaned up or recycled when the function returns.. You have the job of doing the cleaning since you're the one that asked for the allocation in the first place. Stack variables on the other hand are cleaned up because the stack unwinds and returns to its previous state.

    Allocations on the heap must be cleaned up by the user.. Simba has Alloc and Free functions for these.. Ex: Bitmaps. CreateBitmap and FreeBitmap. DTMFromString and FreeDTM.. etc..
    I did screw up here. You are correct in how you refer to the stack. It's LIFO memory and functions do reside on the stack. But we both screwed up on the heap description. The heap is dynamic memory that is allocated on the fly. And as such you are correct when you say those allocations must be cleaned up by the user (FreeDTM, etc). But when you say "any allocations within a function are NOT cleaned up or recycled when the function returns," that isn't technically true. Anything put on the heap will not be recycled. But anything added to the stack will be. That includes function pointers and any primitives (string, int, boolean, etc) that were created in or during the function.

    Quote Originally Posted by Brandon View Post
    Pointers are Longs not exactly Ints. It depends on the system as sometimes longs would be 4 bytes and sometimes 8.. It's usually safer to use size_t in C++ or a long in any other language (C# usually uses longs to represent pointers).
    Longs and shorts are also referred to as Long Ints and Short Ints. Any number that can be written without decimals and in a non-fractional manner is an integer. The confusion of calling something an 'Int' can cause a lot of issues, but for the point of this tutorial, simply referring to it as smaller than most strings is fairly valid. You're right, it is far safer to use size_t in C++ to get a correctly sized integer. And to get the safest size for holding a pointer, you would use uintptr_t.


    I will definitely adjust how I refer to the stack and the heap in this. And if you think I should, I can try and clarify the issues you and I just went into about the size of integers?

    All in all though, thanks for the review!

  7. #7
    Join Date
    Oct 2011
    Location
    Chicago
    Posts
    3,352
    Mentioned
    21 Post(s)
    Quoted
    437 Post(s)

    Default

    Rep, it takes time to write something like that out





  8. #8
    Join Date
    Sep 2014
    Posts
    10
    Mentioned
    0 Post(s)
    Quoted
    6 Post(s)

    Default

    Good read, thanks for the info!
