Just as in any Pascal routine, you can use local variables inside your assembler code. Local variables are declared in the same way as for Pascal routines, in a var section. In this chapter, I will give a detailed overview of how local variables are implemented and used in asm blocks.
Storage space for local variables is allocated on the stack by the compiler-generated entry code (and freed again by the exit code). Please note that for some complex types, such as AnsiString, the space allocated is only for a pointer to the actual data (strings reside on the heap and the string variable is merely a pointer to that data); further action is needed to allocate and assign the actual data.
Continuing the narrative from the previous chapter, have a look at the following code:
procedure DoSomething(First, Second, Third: Integer); pascal;
var
  SomeTemp: Integer;
asm
  ...
end;
From the previous chapter, we already know that using the pascal calling convention results in the parameters being pushed onto the stack prior to invoking the procedure. The call instruction then pushes the return address onto the stack. Next, the entry code pushes the value of ebp onto the stack as well, and ebp is set up as the base pointer for accessing the data on the stack frame. At this point, the stack frame therefore looks as follows:
First = ebp + $10 (ebp + 16)
Second = ebp + $0C (ebp + 12)
Third = ebp + $08 (ebp + 8)
Return address = ebp + $04 (ebp + 4)
Saved ebp = ebp + $00 (ebp and esp both point here)
Because we have also declared a local variable, SomeTemp, the compiler will add code (for instance push ecx) to reserve space on the stack for that variable, directly below the saved value of ebp.
As stated before, ebp contains a base pointer for accessing data on the stack frame. Since the stack grows downwards, higher addresses contain parameters, while lower addresses contain local variables. In our particular example, the stack frame has the following slots allocated:
Parameters:
First = ebp + $10 (ebp + 16)
Second = ebp + $0C (ebp + 12)
Third = ebp + $08 (ebp + 8)
Local variables:
SomeTemp = ebp - $04 (ebp - 4)
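To make the slot assignment concrete, here is a minimal sketch that performs the same computation twice: once using the parameter and variable names, and once using the hand-written offsets from the list above. It assumes a 32-bit x86 Delphi compiler; the program and routine names are hypothetical, and the functions return their value in eax, which is an assumption about ordinal function results rather than something covered in this chapter. The second routine is for illustration only, since hard-coded offsets are exactly what you should avoid in real code.

```pascal
program StackFrameOffsets;
{$APPTYPE CONSOLE}

// Sketch only: assumes a 32-bit x86 Delphi compiler.
// DiffByName and DiffByOffset are hypothetical names.

// Uses names; the compiler substitutes the ebp-relative offsets.
function DiffByName(First, Second, Third: Integer): Integer; pascal;
var
  SomeTemp: Integer;
asm
  mov eax, First
  sub eax, Third
  mov SomeTemp, eax        // initialise the local before reading it
  mov eax, SomeTemp        // assumption: ordinal results go back in eax
end;

// Same computation with the offsets written out by hand.
function DiffByOffset(First, Second, Third: Integer): Integer; pascal;
var
  SomeTemp: Integer;
asm
  mov eax, [ebp+$10]       // First
  sub eax, [ebp+$08]       // Third
  mov [ebp-$04], eax       // SomeTemp
  mov eax, [ebp-$04]
end;

begin
  Writeln(DiffByName(10, 20, 3));    // First - Third
  Writeln(DiffByOffset(10, 20, 3));  // should match the line above
end.
```

Both calls should print the same value, which confirms that the names and the offsets refer to the same stack slots under these assumptions.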
A second local variable would be allocated at ebp - 8, and so on. Just as with parameters on the stack, you can (and should) use the variable name to refer to the actual location on the stack:
mov eax, SomeTemp
which will be translated by the compiler into:
mov eax, [ebp-4]
Please note that the content of these variables is generally not initialised, and you should treat it as being undefined. It is your task to initialise them when and if required.
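The following sketch shows why the initialisation matters. It uses the local variable as an accumulator, so the variable has to be cleared before the first addition. It assumes a 32-bit x86 Delphi compiler; SumThree is a hypothetical name, and returning the value in eax is again an assumption about ordinal function results.

```pascal
program LocalVarInit;
{$APPTYPE CONSOLE}

// Sketch only: assumes a 32-bit x86 Delphi compiler.
// SumThree is a hypothetical routine name.
function SumThree(First, Second, Third: Integer): Integer; pascal;
var
  SomeTemp: Integer;
asm
  mov DWORD PTR SomeTemp, 0   // the slot holds garbage until we set it
  mov eax, First
  add SomeTemp, eax
  mov eax, Second
  add SomeTemp, eax
  mov eax, Third
  add SomeTemp, eax
  mov eax, SomeTemp           // assumption: ordinal results go back in eax
end;

begin
  Writeln(SumThree(1, 2, 3));
end.
```

Without the first instruction, the result would depend on whatever happened to be on the stack at that address.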
Because using local variables causes overhead for creating and managing the stack frame, it is worth analysing your algorithm carefully to determine whether or not you need local storage. Clever use of the available registers and smart code design can often avoid the need for local variables altogether. Apart from avoiding that overhead, moving data between registers is significantly faster than accessing data in main memory (but beware of stalls and other performance hits, for instance when reading a register immediately after writing it).
When you are writing Object Pascal code, the Delphi compiler will perform optimisations by trying to use registers wherever feasible. Loop counter variables are a particular case in point and you, too, should favour registers for such usage. Of course, inside an asm...end block you are on your own and the compiler will not perform such optimisations for you. Well-structured code will therefore aim to use registers as much as possible, especially for the data that is used most often.
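As an illustration of keeping a loop counter in a register, here is a small sketch that needs no local variables at all. It assumes a 32-bit x86 Delphi compiler and the default register calling convention, under which the single parameter arrives in eax; SumTo is a hypothetical name.

```pascal
program RegisterLoop;
{$APPTYPE CONSOLE}

// Sketch only: assumes a 32-bit x86 Delphi compiler and the default
// register calling convention. SumTo is a hypothetical routine name.
function SumTo(N: Integer): Integer;
asm
  mov ecx, eax          // N arrives in eax; ecx becomes the loop counter
  xor eax, eax          // running total, also the function result
  test ecx, ecx
  jle @done             // nothing to add when N <= 0
@next:
  add eax, ecx
  dec ecx
  jnz @next
@done:
end;

begin
  Writeln(SumTo(10));   // 10 + 9 + ... + 1
end.
```

Both the counter and the running total live in registers, so no stack slot is read or written inside the loop.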
Quite a number of data types simply require space to be allocated on the stack frame when you declare a local variable. ShortInt, SmallInt, LongInt, Byte, Word, DWord, Boolean, ByteBool, WordBool, LongBool, Char, AnsiChar and WideChar all belong to this category.
While not all of these types are 32 bits wide, stack space is always reserved in chunks of 32 bits at a time. That means that if you use smaller types, for instance Byte or Word, the unused part of the allocated space is undefined. For instance, consider the following local variable declaration:
var AValue: ShortInt;
While AValue only requires one byte, a full dword is allocated on the stack frame. This behaviour ensures that data on the stack is always aligned on a dword boundary, which improves performance and makes the logic to calculate variable locations easier (and allows for easy use of scaling in indirect addressing). You should, however, not use the remaining part of the allocated space (the padding), since this is ultimately an implementation detail. Future compiler versions might behave differently. If you need additional storage space, simply use the appropriate, larger type.
Please note that this rule for dword allocation does not apply to local variables of type record, even though they are also stored on the stack frame. Their member fields' alignment depends on the state of the alignment switch ({$A} directive) and the use of the packed modifier. This is discussed in more detail in the next paragraph.
This alignment behaviour is another good reason to refer to local variables using their variable name, rather than manually calculating the offset yourself. The compiler will calculate the correct offset for you. In the example above, AValue occupies only one byte. Hence, only a single byte of the allocated dword is used. So, this instruction:
mov al, AValue
will result in the compiler generating the following code:
mov al, [ebp-$01]
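The sketch below stores a value in a one-byte local and reads it back, touching only the byte that belongs to the variable and never the padding. It assumes a 32-bit x86 Delphi compiler; RoundTrip is a hypothetical name, and returning the value in eax is an assumption about ordinal function results.

```pascal
program SmallLocal;
{$APPTYPE CONSOLE}

// Sketch only: assumes a 32-bit x86 Delphi compiler.
// RoundTrip is a hypothetical routine name.
function RoundTrip(Value: Integer): Integer; pascal;
var
  AValue: ShortInt;
asm
  mov eax, Value
  mov AValue, al                // writes exactly one byte
  movsx eax, BYTE PTR AValue    // reads that byte only, sign-extended
end;

begin
  Writeln(RoundTrip(-3));       // fits in a ShortInt, comes back unchanged
  Writeln(RoundTrip(300));      // only the low byte ($2C) survives
end.
```

Because the variable is referenced by name and with a byte-sized operand, the code does not depend on where within the allocated dword the compiler places it.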
In Pascal routines, outside asm...end blocks, the compiler might optimise a local variable into a register, in which case no space for it will be allocated on the stack frame. While such optimisation does not happen inside your own asm...end blocks, you should be aware of this behaviour when observing compiler-generated code through the CPU window. Similarly, the compiler will sometimes generate code that uses esp directly, rather than an offset from ebp, thus saving the need to initialise ebp. As argued before, it is generally not advisable to use esp directly in your own assembler code, as it makes the code extremely hard to read and maintain, and it is prone to introducing subtle errors that are hard to find and debug.
While studying compiler-generated code can be very instructive, remember that you are not a machine, but a human programmer. Machines are good at making sure they calculate the right offsets and the like; humans mostly are not. In most cases stack frame overhead is unlikely to constitute a bottleneck, especially if you design your code carefully. If you do find that stack frame overhead causes performance issues in your application, you should reconsider your algorithm. It might sound very obvious, but too many programmers end up optimising code that has no effect on the application's overall performance.
Just as in the case of simple types, local record variables are stored on the stack frame. In that respect, they are not fundamentally different in their usage from simple types (see previous paragraph). However, the compiler's record alignment mechanism is more complex. This can seriously complicate things for programmers who code offsets directly.
There are two key factors that define the compiler's record alignment behaviour: the alignment directive ({$A} or {$ALIGN}) and the packed modifier. Furthermore, the actual alignment of the record member fields is dependent on the field type. For example, let's consider the following record declaration:
TMyRecord = record
  FirstValue: DWord;
  SecondValue: Byte;
  ThirdValue: DWord;
  FourthValue: Byte;
end;
The alignment boundary for each member field of the record depends on its type and its size. In the example above, FirstValue and ThirdValue are of type DWord, which is a 32-bit type. With alignment on, they will be aligned to dword boundaries. Since there is a byte-sized field, SecondValue, in between those two members, the compiler will add three padding bytes, thus ensuring that ThirdValue is properly aligned. The memory allocation for this record in the aligned state is as follows:
FirstValue = offset 0 (4 bytes)
SecondValue = offset 4 (1 byte)
Padding = offset 5 (3 bytes)
ThirdValue = offset 8 (4 bytes)
FourthValue = offset 12 (1 byte)
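By adding the packed modifier to the record declaration, the record's member fields are no longer aligned. The padding bytes are no longer present, so the memory allocation becomes:
FirstValue = offset 0 (4 bytes)
SecondValue = offset 4 (1 byte)
ThirdValue = offset 5 (4 bytes)
FourthValue = offset 9 (1 byte)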
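The difference between the two layouts can be checked with SizeOf. The sketch below declares the record twice, once aligned and once packed. The name TMyPackedRecord and the local DWord alias are assumptions introduced for the example, and the aligned size noted in the comment assumes Delphi's default alignment setting.

```pascal
program RecordSizes;
{$APPTYPE CONSOLE}
{$A+}  // field alignment switched on (assumed to be the compiler default)

type
  DWord = Cardinal;  // assumption: local alias so that no extra unit is needed

  TMyRecord = record
    FirstValue: DWord;
    SecondValue: Byte;
    ThirdValue: DWord;
    FourthValue: Byte;
  end;

  // Hypothetical packed twin of the record above.
  TMyPackedRecord = packed record
    FirstValue: DWord;
    SecondValue: Byte;
    ThirdValue: DWord;
    FourthValue: Byte;
  end;

begin
  // Aligned: padding after SecondValue, and the total is rounded up
  // to a multiple of 4, giving 16 under the assumed default alignment.
  Writeln(SizeOf(TMyRecord));
  // Packed: 4 + 1 + 4 + 1 = 10 bytes, no padding at all.
  Writeln(SizeOf(TMyPackedRecord));
end.
```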
Similarly, when alignment is turned off by using the {$A-} directive, there will be no padding between record member fields, even without the packed modifier. Fortunately, just as for simple types, you can refer to record member fields by their names, and the compiler will calculate the correct offsets for you. However, always make sure you use operands of the proper size, i.e. specify the operand size explicitly. In that way, your code will continue to work correctly even when alignment is changed or the packed modifier is introduced at a later stage:
mov eax, DWORD PTR [ARecord.FirstValue]
mov al, BYTE PTR [ARecord.SecondValue]
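To show that field names with explicit operand sizes survive a layout change, the following sketch runs the same instructions against an aligned and a packed version of the record. It assumes a 32-bit x86 Delphi compiler; the routine names, TMyPackedRecord and the local DWord alias are hypothetical, and returning the value in eax is an assumption about ordinal function results.

```pascal
program RecordFields;
{$APPTYPE CONSOLE}

// Sketch only: assumes a 32-bit x86 Delphi compiler.
type
  DWord = Cardinal;  // assumption: local alias so that no extra unit is needed

  TMyRecord = record
    FirstValue: DWord;
    SecondValue: Byte;
    ThirdValue: DWord;
    FourthValue: Byte;
  end;

  TMyPackedRecord = packed record  // hypothetical packed twin
    FirstValue: DWord;
    SecondValue: Byte;
    ThirdValue: DWord;
    FourthValue: Byte;
  end;

function UseAligned: Integer;
var
  ARecord: TMyRecord;
asm
  mov DWORD PTR ARecord.ThirdValue, 1000
  mov BYTE PTR ARecord.FourthValue, 7
  mov eax, DWORD PTR ARecord.ThirdValue
  movzx edx, BYTE PTR ARecord.FourthValue
  add eax, edx
end;

function UsePacked: Integer;
var
  ARecord: TMyPackedRecord;
asm
  mov DWORD PTR ARecord.ThirdValue, 1000   // same code, different offsets
  mov BYTE PTR ARecord.FourthValue, 7
  mov eax, DWORD PTR ARecord.ThirdValue
  movzx edx, BYTE PTR ARecord.FourthValue
  add eax, edx
end;

begin
  Writeln(UseAligned);
  Writeln(UsePacked);
end.
```

The two routines are textually identical apart from the record type, and both should give the same result, because the compiler recalculates the field offsets for each layout.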
Dynamic variables, long strings, wide strings, dynamic arrays, variants, and interfaces in Delphi are all variable types that are stored in heap memory. In order to use them, you use a reference variable, i.e. a pointer to the actual variable data. In assembler, you will be responsible for the allocation and management of the memory and its contents.
In other words, if you use heap-allocated types as local variables, memory will be allocated on the stack frame for the reference (the pointer) to that variable, but you are responsible for the actual allocation and deallocation of the memory and for initialising its contents. In Pascal, most of these types are largely managed automatically, so allocation and deallocation happen behind the scenes. In assembler blocks, that is obviously not the case.
You can call GetMem to allocate memory and obtain a pointer to the newly allocated block. You need to pass the amount of memory needed in eax, and upon return from GetMem the eax register will contain the pointer, which you can then store in the appropriate slot on the stack frame.
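A minimal sketch of that sequence is shown below. It assumes a 32-bit x86 Delphi compiler and that the memory manager's entry points can be reached from an asm block as System.@GetMem and System.@FreeMem, with the size or the pointer passed in eax; treat those names as an assumption and check them against your compiler version. AllocAndUse is a hypothetical name.

```pascal
program HeapLocal;
{$APPTYPE CONSOLE}

// Sketch only: assumes a 32-bit x86 Delphi compiler.
// AllocAndUse is a hypothetical routine name.
function AllocAndUse: Integer;
var
  Buffer: Pointer;               // only the pointer lives on the stack frame
asm
  mov eax, 16                    // number of bytes requested
  call System.@GetMem            // assumption: size in eax, pointer out in eax
  mov Buffer, eax                // store the pointer in its stack slot
  mov DWORD PTR [eax], 42        // initialise the block ourselves
  mov edx, [eax]
  push edx                       // edx is not preserved across calls
  mov eax, Buffer
  call System.@FreeMem           // assumption: pointer to release in eax
  pop eax                        // return the value read from the block
end;

begin
  Writeln(AllocAndUse);
end.
```

Note that the block is released before the routine returns; nothing will free it for you once the stack frame, and with it the pointer, is gone.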