Using Assembler in Delphi

Chapter 3: Local Variables

Just as in any Pascal routine, you can use local variables inside your assembler code. Local variables are declared in the same way as for Pascal routines, in a var section. In this chapter, I will give a detailed overview of how local variables are implemented and used in asm blocks.

3.1. Local variables and the stack frame

Storage space for local variables is allocated on the stack by the compiler-generated entry code (and freed again in the exit code). Please note that for some complex types, like AnsiStrings, the space allocated is only for a pointer to the actual data (strings reside on the heap and the string variable is a mere pointer to the data), and further action will be needed to allocate and assign the actual data. Continuing the narrative from the previous chapter, have a look at the following code:

procedure DoSomething(First, Second, Third: Integer); pascal;
var
  SomeTemp: Integer;
asm
  ...
end;

From the previous chapter, we already know that using the pascal calling convention will result in parameters being pushed onto the stack prior to invoking the procedure. The call instruction will push the return address onto the stack. Next, the entry code will cause the value of ebp to be pushed onto the stack as well. Then, ebp is set up as the base pointer for accessing the data on the stack frame. At this point, the stack frame therefore looks as follows:

Picture showing stack with parameters, return address and saved ebp value and esp and ebp pointing to the latter.

Because we have also declared a local variable, SomeTemp, the compiler will add code (for instance push ecx) to reserve space on the stack for said variable:

Picture showing stack with parameters, return address, saved ebp and space for SomeTemp

As stated before, ebp contains a base pointer for accessing data on the stack frame. Since the stack grows downwards, higher addresses contain parameters, while lower addresses contain local variables. In our particular example, the stack frame has the following slots allocated:

Parameters:

First  = ebp + $10 (ebp + 16)
Second = ebp + $0C (ebp + 12)
Third  = ebp + $08 (ebp + 8)

Local Variables:

SomeTemp = ebp - $04 (ebp - 4)

The next local variable would be allocated at ebp - 8, and so on. Just as with parameters on the stack, you can (and should) use the variable name to refer to the actual location on the stack:

mov eax, SomeTemp

which will be translated by the compiler into:

mov eax, [ebp-4]

Please note that the content of these variables is generally not initialised, and you should treat it as being undefined. It is your task to initialise them when and if required.
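
To tie this section together, here is a minimal sketch of a complete routine that refers to its parameters and its local variable by name. The function name SumOfThree is made up for illustration. The entry and exit code shown in the comments is only roughly what the compiler emits and may differ between Delphi versions. Returning the Integer result in eax is a convention that anticipates the next chapter.

```pascal
function SumOfThree(First, Second, Third: Integer): Integer; pascal;
var
  SomeTemp: Integer;
asm
  // Entry code added by the compiler (approximately):
  //   push ebp
  //   mov  ebp, esp
  //   push ecx              reserves the dword for SomeTemp
  mov eax, First          // [ebp+$10]
  add eax, Second         // [ebp+$0C]
  mov SomeTemp, eax       // [ebp-$04]: initialise the local before reading it
  mov eax, Third          // [ebp+$08]
  add eax, SomeTemp       // eax now holds First + Second + Third
  // Exit code added by the compiler (approximately):
  //   pop  ecx              releases the space for SomeTemp
  //   pop  ebp
  //   ret  $0C              pascal convention: the routine removes its parameters
end;
```

Called from Pascal code as SumOfThree(1, 2, 3), this should yield 6. Note that SomeTemp is written before it is ever read, since its initial content is undefined.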

Because using local variables causes overhead for creating and managing the stack frame, it is worth analysing your algorithm carefully to determine whether or not you need local storage. Clever use of available registers and smart code design can often avoid the need for local variables altogether. Apart from avoiding the overhead of allocating and managing local variables, moving data between registers is significantly faster than accessing data in main memory (but beware of stalls and other performance hits, for instance when reading a register immediately after writing it). When you are writing Object Pascal code, the Delphi compiler will perform optimisations by trying to use registers wherever feasible. Loop counter variables are a particular case in point and you, too, should favour registers for such usage. Of course, inside an asm..end block, you are on your own and the compiler will not perform such optimisations for you. Well-structured code will therefore aim to use registers as much as possible, especially for the data that is used most often.
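
As an illustration of keeping everything in registers, the following sketch sums an array of integers without a single local variable. The function name SumIntegers and its parameters are hypothetical, and the code assumes the default register calling convention (first parameter in eax, second in edx) on a 32-bit target.

```pascal
function SumIntegers(Values: PInteger; Count: Integer): Integer; register;
asm
  // On entry (register convention): eax = Values, edx = Count
  mov  ecx, edx          // ecx becomes the loop counter
  mov  edx, eax          // edx walks through the array
  xor  eax, eax          // eax accumulates the total
  test ecx, ecx
  jle  @@Done            // nothing to add when Count <= 0
@@Next:
  add  eax, [edx]        // add the current element
  add  edx, 4            // advance to the next Integer
  dec  ecx
  jnz  @@Next
@@Done:
end;
```

Only eax, ecx and edx are used, which are the registers a routine is free to modify, so nothing has to be saved and no stack frame is needed.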

3.2. Simple types as local variables

Quite a number of data types simply require the allocation of space on the stack frame when you declare a local variable. ShortInt, SmallInt, LongInt, Byte, Word, DWord, Boolean, ByteBool, WordBool, LongBool, Char, AnsiChar and WideChar all belong to this category.

While not all of these types are 32 bits wide, stack space is always reserved in chunks of 32 bits at a time. That means that if you use smaller types, such as Byte or Word, the unused part of the allocated space is undefined. For instance, suppose you declare a local variable as follows:

var

  AValue: ShortInt;

While AValue only requires one byte, a full dword is allocated on the stack frame. This behaviour ensures that data on the stack is always aligned on a dword boundary, which improves performance and makes the logic to calculate variable locations easier (and allows for easy use of scaling in indirect addressing). You should, however, not use the remaining part of the allocated space (the padding), since this is ultimately an implementation detail. Future compiler versions might behave differently. If you need additional storage space, simply use the appropriate, larger type.

Please note that this rule for dword allocation does not apply to local variables of type record, even though they are also stored on the stack frame. The alignment of their member fields depends on the state of the alignment switch ({$A} directive) and the use of the packed modifier. This is discussed in more detail in section 3.3.

This alignment behaviour is another good reason to refer to local variables by their variable name, rather than calculating the offset manually. The compiler will calculate the correct offset for you. In the example above, AValue occupies only one byte of the allocated dword, and it is the compiler that decides which one. So, this instruction:

mov al, AValue

will result in the compiler generating the following code:

mov al, [ebp-$01]
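
The same principle applies when you need the value in a full register: read the variable with its own size and widen it, rather than loading the whole dword. The following is a minimal sketch; the function name ReadShortInt is hypothetical, and returning the Integer result in eax anticipates the next chapter.

```pascal
function ReadShortInt: Integer;
var
  AValue: ShortInt;
asm
  mov   BYTE PTR AValue, -5     // byte-sized store; the padding is left alone
  movsx eax, BYTE PTR AValue    // sign-extend the single byte into eax
end;
```

Loading AValue as a dword instead would pull in three bytes that do not belong to the variable, and the result would depend on whatever happens to be stored next to it.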

In Pascal routines, outside asm...end blocks, the compiler might optimise a local variable into a register, in which case no space for it will be allocated on the stack frame. While such optimisation does not happen inside your own asm...end blocks, you should be aware of this behaviour when observing compiler-generated code through the CPU window. Similarly, the compiler will sometimes generate code that uses esp directly, rather than an offset from ebp, thus avoiding the need to initialise ebp.

As argued before, it is generally not advisable to use esp directly in your own assembler code, as it makes the code extremely hard to read and maintain and is prone to introducing subtle errors that will be hard to find and debug. While studying compiler-generated code can be very instructive, remember that you are not a machine, but a human programmer. Machines are good at making sure they calculate the right offsets and the like; humans mostly are not. It is likely that in most cases stack frame overhead will not constitute a bottleneck, especially if you design your code carefully. If you find that stack frame overhead causes performance issues in your application, you should reconsider your algorithm. It might sound very obvious, but too many programmers end up at some point optimising code that has no effect on the overall performance of the application.

3.3. Records as local variables

Just as in the case of simple types, local record variables are stored on the stack frame. In that respect, they are not fundamentally different in their usage from simple types (see the previous section). However, the compiler's record alignment mechanism is more complex. This can seriously complicate things for programmers who code offsets directly.

There are two key factors that define the compiler's record alignment behaviour: the alignment directive ({$A} or {$ALIGN}) and the packed modifier. Furthermore, the actual alignment of the record member fields is dependent on the field type. For example, let's consider the following record declaration:

TMyRecord = record
  FirstValue:  DWord;
  SecondValue: Byte;
  ThirdValue:  DWord;
  FourthValue: Byte;
end;

The alignment boundary for each member field of the record depends on its type and its size. In the example above, FirstValue and ThirdValue are of type DWord, which is a 32-bit type. With alignment on, they will be aligned to dword boundaries. Since there is a byte-sized field, SecondValue, in between those two members, the compiler will add three padding bytes, thus ensuring that ThirdValue is properly aligned. The following picture shows the memory allocation for this record in the aligned state:

Picture showing stack usage for aligned record, showing three unused bytes between SecondValue and ThirdValue

By adding the packed modifier to the record declaration, the record's member fields are no longer aligned. You can see the result in the following illustration, as the padding bytes are no longer present:

Picture showing stack usage for non-aligned record, without any padding bytes
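
If you want to check the effect of the packed modifier on your own compiler, a small console program along the following lines can help. This is only a sketch: the type names TAlignedRecord and TPackedRecord are made up for illustration, and Cardinal stands in for DWord so that no additional unit is required (both are 32-bit unsigned types).

```pascal
program RecordSizes;
{$APPTYPE CONSOLE}

type
  // Same field layout as TMyRecord in the text, with alignment left on
  TAlignedRecord = record
    FirstValue:  Cardinal;
    SecondValue: Byte;
    ThirdValue:  Cardinal;
    FourthValue: Byte;
  end;

  // Identical fields, but packed: no padding bytes between the members
  TPackedRecord = packed record
    FirstValue:  Cardinal;
    SecondValue: Byte;
    ThirdValue:  Cardinal;
    FourthValue: Byte;
  end;

begin
  WriteLn('Aligned: ', SizeOf(TAlignedRecord), ' bytes');
  WriteLn('Packed:  ', SizeOf(TPackedRecord), ' bytes');
end.
```

The packed record needs exactly 4 + 1 + 4 + 1 = 10 bytes. The aligned variant is larger because of the padding; with the default alignment settings, it will typically also be padded at the end, so expect a size above 13 bytes.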

Similarly, when alignment is turned off by using the {$A-} directive, even without the packed modifier there will be no padding between record member fields. Fortunately, just as for simple types, you can refer to record member fields by their names, and the compiler will calculate the correct offsets for you. However, always make sure you use operands of the proper size, i.e. specify the operand size explicitly. In that way, your code will continue to work correctly even when alignment is changed or the packed modifier is introduced at a later stage:

mov eax, DWORD PTR [ARecord.FirstValue]

mov  al, BYTE  PTR [ARecord.SecondValue]
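
Put into a complete routine, access by field name might look like the sketch below. The function name SumFields is hypothetical, the record is redeclared with Cardinal in place of DWord to keep the example free of unit dependencies, and returning the result in eax anticipates the next chapter.

```pascal
type
  TMyRecord = record
    FirstValue:  Cardinal;
    SecondValue: Byte;
    ThirdValue:  Cardinal;
    FourthValue: Byte;
  end;

function SumFields: Cardinal;
var
  ARecord: TMyRecord;
asm
  // Initialise every field first: local variables start out undefined
  mov   DWORD PTR [ARecord.FirstValue], 1000
  mov   BYTE  PTR [ARecord.SecondValue], 20
  mov   DWORD PTR [ARecord.ThirdValue], 300
  mov   BYTE  PTR [ARecord.FourthValue], 4
  // Read the fields back, each with its own operand size
  mov   eax, DWORD PTR [ARecord.FirstValue]
  add   eax, DWORD PTR [ARecord.ThirdValue]
  movzx edx, BYTE PTR [ARecord.SecondValue]   // zero-extend the byte fields
  add   eax, edx
  movzx edx, BYTE PTR [ARecord.FourthValue]
  add   eax, edx                              // eax = 1000 + 300 + 20 + 4
end;
```

Because no offset is hard-coded, the routine keeps working when TMyRecord is later declared as packed or when the {$A} setting changes.
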

3.4. Heap allocated types as local variables

Dynamic variables, long strings, wide strings, dynamic arrays, variants, and interfaces in Delphi are all variable types that are stored in heap memory. In order to use them, you use a reference variable, i.e. a pointer to the actual variable data. In assembler, you will be responsible for the allocation and management of the memory and its contents.

In other words, if you use heap allocated types as local variables, memory will be allocated for the reference (the pointer) to that variable on the stack frame, but you are responsible for the actual allocation and deallocation of the memory and for initialising the contents. In Pascal, most of these types are largely automatically managed, so allocation and deallocation happens behind the scenes. In assembler blocks, that is obviously not the case.

You can call GetMem to allocate memory and return a pointer to the newly allocated memory. You need to pass the amount of memory needed in eax and upon return from GetMem the eax register will contain the pointer, which you can then store in the appropriate slot on the stack frame.
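
A minimal sketch of this pattern follows. Please treat the details as assumptions rather than as the definitive approach: the procedure name UseTempBuffer is hypothetical, and System.@GetMem and System.@FreeMem are the names under which the run-time library's allocation helpers are commonly reached from asm blocks in 32-bit Delphi, so check them against your compiler version. Error handling is left out.

```pascal
procedure UseTempBuffer;
var
  Buffer: Pointer;
asm
  mov  eax, 256             // requested size in bytes goes in eax
  call System.@GetMem       // on return, eax holds the pointer to the new block
  mov  Buffer, eax          // keep the pointer in the local's slot on the stack frame
  mov  BYTE PTR [eax], 0    // the block is not initialised; write before reading
  // ... work with the memory; remember that calls may change eax, ecx and edx
  mov  eax, Buffer          // reload the pointer from the stack frame
  call System.@FreeMem      // release the block again
end;
```

Storing the pointer in a local variable rather than keeping it in a register is the safer choice here, since any routine you call in between is free to overwrite eax, ecx and edx.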
