The Background Magic of Programming
Foreword:
This is not a tutorial in the sense that it teaches you how to do something. It should, however, teach you some of the background magic in programming and why stuff works the way it does. And once you know why stuff is the way it is, you may be able to write more efficient scripts!
Table of Contents:
- Introduction
- A Brief History
- Pointers
- Stack and Heap
- FAQ
- Credits
Introduction:
Greetings, everyone! I have noticed that many people here have no background programming experience. While that isn't necessary to be a good, or even great, programmer here, knowing some extra background can still help!
A Brief History:
- In the *important* beginning, computers were programmed in machine code and its human-readable form, assembly language. Even now assembly languages exist and they are how everything works underneath, even your computer. Windows, Linux, Apple, handheld devices, etc. all run on some assembly language (most desktops and laptops use x86, and most phones and other small devices use ARM, but exceptions exist and there are more assembly languages than those 2, MIPS being one example).
Every program (script) you ever write will somehow be converted to, or run by a program that was converted to, the machine language of your computer. All the 1s and 0s are actually machine language commands, and assembly is just a readable way of writing them. It is still possible to program directly in an assembly language, but it's rarely done outside of tiny electronic devices.
- The next *important* big break in programming was the C language. It had no objects, but so many new things came along. C could be compiled for nearly any machine's assembly language (this kind of portability was unusual for the era), and it was significantly easier and faster to write than assembly because it started doing behind-the-scenes magic (although not nearly as much as most current languages). C is used less for everyday programs nowadays because it's considered too basic and difficult to write most programs in due to the lack of objects, although it is still common in operating systems and small devices.
- After C came C++. C++ was another tremendous breakthrough, mainly because it added more magic, but also because it brought objects into mainstream programming (a few earlier languages had them, but C++ made them popular). Objects are a wondrous creation that lets the programmer do a variety of things. Objects store a variety of fields (strings, integers, booleans, other objects, etc.) that are all owned as values inside them. Furthermore, objects have methods. What is a method, you ask? It's a function that belongs to an object. It is called on the object, and it typically exhibits behavior in relation to that object. (E.g., a 'Gun' (object) calls 'Shoot' (method) to fire a 'Bullet' (object created by the gun) in a given direction.)
- Many other important things happened before and since C++, but this short history should give a decent lead into some of the magic discussed in this tutorial.
What Type of Sorcery is a Pointer?
Pointers are some of the most infuriating things in all of programming. They easily cause the most anger and hatred in a programmer's life (if they have to use them). Thankfully most languages nowadays throw them in the magic category and you never have to know they exist. However, that does not mean pointers stopped existing in programming; it just means the compiler (in our case Simba) does all the translation for you without you realizing it ever happened.
So what is a pointer? It's an integer telling your computer where a variable is stored in memory. Your computer doesn't know where the values of variables are without the pointer telling it where. Think of it like a mailbox. I could send a letter to YoHoJo telling him he's awesome, but the mailman can't take it to him without an address. The address is the pointer.
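To make the mailbox idea concrete, here is a minimal sketch in C, a language where pointers are out in the open instead of hidden by the compiler. The helper names read_mailbox and write_mailbox are made up for this example. In C, `&variable` gives you the address of a variable and `*address` follows that address to the value stored there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: follows the address and returns the value stored there. */
int read_mailbox(const int *address)
{
    assert(address != NULL);   /* an address of "nowhere" cannot be followed */
    return *address;
}

/* Hypothetical helper: follows the address and overwrites the value stored there. */
void write_mailbox(int *address, int new_value)
{
    assert(address != NULL);
    *address = new_value;
}
```

If a caller has `int letters = 3;` and hands `&letters` to write_mailbox along with 7, then `letters` itself becomes 7, because the function was given the address rather than a copy of the value.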
Using pointers and passing them along is referred to as "passing by reference" (google that to learn more than what I will cover). It is passing by reference because when you pass the pointer to a method, the method receives only the reference (address) of the variable as opposed to a copy of the actual value (which is referred to as passing by value). How can we use this to our advantage? Chances are you already have, or have at least seen it happening in someone's script. Do you know the difference between:
Simba Code:
procedure Something(var a, b: string);
and
Simba Code:
procedure Something(a, b: string);
There is a single var before the names of the variables in the first procedure. That var declaration tells the compiler (Simba) that the variables after it are references to actual variables stored elsewhere. This is called passing by reference (the variable used in the called method refers to the original variable stored in the caller method). That means a procedure can change the value of variables passed to it, while without the pointer it only gets a copy and can't change the caller's value.
Simba Code:
function WhichBank(var a, b: string): boolean;
begin
  Result := False;
  case b of
    '1st': begin
      a := 'Wells Fargo';
      Result := True;
    end;
    '2nd': begin
      a := 'Local bank';
      Result := True;
    end;
    '3rd': begin
      a := 'Union';
      Result := True;
    end;
  else
    Result := False; // unknown street: a is left untouched
  end;
end;

var
  bank, street: string;
begin
  bank := '';
  street := '2nd';
  // bank is passed by reference, so WhichBank can fill it in for us
  if WhichBank(bank, street) then
  begin
    WriteLn('My bank is ' + bank);
  end
  else begin
    WriteLn('My bank wasn''t found.');
  end;
end.
In the example above, the result will be "My bank is Local bank". That happened because, thanks to pointers, we could determine if the bank was found (the true/false return) and we could also return it in the bank variable! This can be used to essentially return as many variables as you want from one method!
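For the curious, this is roughly what the var magic looks like when you have to write the pointers yourself. It is only a sketch in C, not what Simba actually generates. The name which_bank and its parameters are made up to mirror WhichBank, and only the bank is passed as an out-parameter because the street is never changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative C counterpart of WhichBank. "bank" holds the ADDRESS of the
 * caller's variable, so assigning through it changes the caller's variable.
 * The return value reports whether the street was recognised.
 */
bool which_bank(const char **bank, const char *street)
{
    assert(bank != NULL && street != NULL);

    if (strcmp(street, "1st") == 0) {
        *bank = "Wells Fargo";
    } else if (strcmp(street, "2nd") == 0) {
        *bank = "Local bank";
    } else if (strcmp(street, "3rd") == 0) {
        *bank = "Union";
    } else {
        return false;   /* unknown street: the caller's variable is left untouched */
    }
    return true;
}
```

A caller would declare `const char *bank = "";` and call `which_bank(&bank, "2nd")`. The `&` is the part that var does for you behind the scenes in Simba.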
What is the Stack and Heap?
They are different types of memory for a program.
Well that was a simple answer. Let's complicate it a little - OK, a lot.
Stack: the memory used by your main method and every method it calls (kind of). By default, the local variables in your program go on the stack. It is a LIFO (last in, first out) type of memory. For example, let's call a method at any given point in our program. When the method finally returns to whoever called it, its portion of the stack becomes free and usable again. So stack memory (from method calls at least) is freed often and easily, with no effort required from the programmer. (I should add that global variables remain around for the length of the program. In most compiled languages they live in their own fixed area rather than on the stack, and only ending the program frees them.)
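The last in, first out behaviour can be watched with a small sketch in C. Everything here (the events log, record, nested_call) is made up for illustration. Each call writes a positive number when its piece of the stack is created and a negative number just before it is given back.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 16

/* Illustrative trace log: n means "call n started", -n means "call n is returning". */
static int events[MAX_EVENTS];
static size_t event_count = 0;

static void record(int event)
{
    assert(event_count < MAX_EVENTS);
    events[event_count++] = event;
}

/* Each call gets its own piece of the stack holding its own "depth" and "local". */
void nested_call(int depth, int max_depth)
{
    int local = depth * 10;          /* lives in this call's stack space only */
    record(depth);                   /* this call's stack space now exists */
    if (depth < max_depth) {
        nested_call(depth + 1, max_depth);
    }
    assert(local == depth * 10);     /* deeper calls never touched our local */
    record(-depth);                  /* about to return: stack space is released */
}
```

Calling `nested_call(1, 3)` fills the log with 1, 2, 3, -3, -2, -1. The call that started last is the first to finish and free its memory, and nobody had to free anything by hand.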
Heap: dynamic memory that is allocated on the fly. Special objects and explicit requests for memory end up on the heap. These allocations must be cleaned up manually by the scripter. A couple of examples in Simba you are familiar with would be bitmaps and DTMs, or anything else that should be "Free"d. The heap is very useful for keeping items around for a long time, but at the same time you have to be careful with it. You can easily cause a memory leak in a commonly called procedure by creating these objects and then not freeing them!
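Here is a rough sketch in C of what creating and freeing something like a bitmap involves. This is not how Simba implements its bitmaps. The Bitmap struct and the bitmap_create and bitmap_free functions are invented for the example, while malloc, calloc and free are the standard C calls for asking for heap memory and handing it back.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for a bitmap: a block of pixels that lives on the heap. */
typedef struct {
    int width;
    int height;
    unsigned char *pixels;   /* heap block of width * height bytes */
} Bitmap;

/* Allocates a zero-filled bitmap on the heap. Returns NULL if memory runs out. */
Bitmap *bitmap_create(int width, int height)
{
    assert(width > 0 && height > 0);

    Bitmap *bmp = malloc(sizeof *bmp);
    if (bmp == NULL) {
        return NULL;
    }
    bmp->pixels = calloc((size_t)width * (size_t)height, 1);
    if (bmp->pixels == NULL) {
        free(bmp);           /* do not leak the half-built bitmap */
        return NULL;
    }
    bmp->width = width;
    bmp->height = height;
    return bmp;
}

/* Hands the memory back. Forgetting this call is exactly what causes a leak. */
void bitmap_free(Bitmap *bmp)
{
    if (bmp != NULL) {
        free(bmp->pixels);
        free(bmp);
    }
}
```

Unlike stack memory, the bitmap survives after bitmap_create returns, which is the whole point. The price is that every bitmap_create needs a matching bitmap_free, and a procedure that creates one on every call without freeing it will slowly eat memory.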
Is there any disadvantage to having hundreds of tiny procedures?
Sadly, there is. Procedures use their magic to jump to their code from whoever called them, and then back again when they exit. Two pieces of invisible magic actually happen. First, a tiny return pointer (integer sized) is created, telling the program where it was when it called the method. Second, extra commands are issued to jump to the method and later back to the calling code. (The jump, however, is an incredibly quick command, typically taking about 1 CPU cycle, which is less than 1ns on most computers.)
As small as a single integer and 2 CPU cycles may be for that extra method call, the extra method isn't always needed, and too many methods can make your code hard to read. But don't be discouraged! On the opposite side of the spectrum, methods that are too large can also be difficult to read, and they can end up increasing the overall program memory more than necessary. It's all about balance.
NOTE: It's extremely hard to overstate how small a difference 2 CPU cycles and an integer make. I just don't want it to seem like every line of code deserves its own method - there is such a thing as overkill.
Does this relate to pointers?
Yup! Pointers are also useful for memory considerations! In my earlier example, 2 strings were 'passed' as pointers to the 'WhichBank' function. Without pointers, the called function has to set aside additional local space for copies of those variables so you can edit them without changing the original values. That means there would actually be 4 strings in existence, filling up more memory! Strings can also use up a lot of memory, so this gets inefficient as more and more are used (and as they get passed further down into more functions).
Thankfully that can be improved! Using pointers means the local memory stored on the stack (for the local method) holds only integers (the mailbox addresses pointing to the original values in memory). So instead of 4 strings using up memory in the program, we are only using 2 strings and 2 integers! And the best part is that almost no additional processing power is needed to accomplish this!
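The size difference can be measured with a small sketch in C. The BigText type (a fixed 256-byte text buffer) and both function names are invented for the example, and real Simba strings are stored differently, so treat the numbers as an illustration of the idea rather than of Simba itself.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative "large" value: a fixed 256-byte text buffer. */
typedef struct {
    char text[256];
} BigText;

/* By value: the whole 256-byte struct is copied for this call. */
size_t bytes_passed_by_value(BigText copy)
{
    return sizeof copy;
}

/* By reference: only the address is copied, however big the original is. */
size_t bytes_passed_by_reference(const BigText *original)
{
    assert(original != NULL);
    return sizeof original;
}
```

The first function reports 256 bytes, while the second reports the size of one pointer, which is 8 bytes on a typical 64-bit computer and 4 bytes on a 32-bit one.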
So why don't we always use pointers?
Well, sometimes we still need local variables. We don't always want a procedure to edit the original values. If we passed a pointer anyway, we'd have to create a local copy inside the procedure to work on, which means we'd end up using more memory (the copy plus the pointer) than if we had ignored pointers and passed by value.
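A small sketch in C shows the kind of situation where a private copy is what you want. Both functions are made up for the example and count the digits of a number by dividing it down to zero. The first works on its own copy, while the second does the same work through a pointer and wrecks the caller's value along the way.

```c
#include <assert.h>
#include <stddef.h>

/* By value: "n" is a private copy, so it can safely be used as scratch space. */
int count_digits(int n)
{
    int digits = 0;
    do {
        n /= 10;       /* only the local copy shrinks */
        digits++;
    } while (n != 0);
    return digits;
}

/* By reference: the same loop now destroys the caller's value. */
int count_digits_in_place(int *n)
{
    assert(n != NULL);
    int digits = 0;
    do {
        *n /= 10;      /* writes through the pointer into the caller's variable */
        digits++;
    } while (*n != 0);
    return digits;
}
```

With `int score = 4821;`, both functions return 4, but after count_digits the score is still 4821, while after count_digits_in_place it is 0. To keep the score with the pointer version you would have to copy it first, which costs more memory than passing by value in the first place.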
FAQ:
Ask me some questions and I'll toss them up here.
As a side note, there are tons more things happening in the background; this simply happens to be all that came to mind during a conversation yesterday.
Credits