Mega Code Archive

Inline Assembler in Delphi (IV) Records

Title: Inline Assembler in Delphi (IV) - Records

Question: How to work with records in inline assembler

Answer:

By Ernesto De Spirito edspirito@latiumsoftware.com

Passing records as parameters

Like static arrays, records are internally passed as pointers to the data, regardless of whether the parameter is passed by value or by reference (either as "var" or as "const"). Given the following declarations...

    type
      TRecord = record
        Id: integer;
        Name: string;
      end;

    var
      a, b: TRecord;

    procedure InitRecord(var r: TRecord; Id: integer; const Name: string);
    begin
      r.Id := Id;
      r.Name := Name;
    end;

...a call to the procedure InitRecord in assembler would look like this:

    // In Object Pascal:
    //   InitRecord(a, n, s);
    // In Inline Assembler:
    asm
      lea eax, a          // EAX := @a;  // 1st parameter in EAX
      mov edx, n          // EDX := n;   // 2nd parameter in EDX
      mov ecx, s          // ECX := s;   // 3rd parameter in ECX
      call InitRecord     // InitRecord;
    end;

Accessing the fields of a record

Record fields are located at a certain offset from the address of the record (the address of the first field). In the example, assuming we have the address of a record of type TRecord in the EAX register, the field Id is located at [EAX+0] (or simply [EAX]), and the field Name is located at [EAX+4], but normally we don't write code using hardwired numbers. Instead, to produce self-explanatory and maintainable code, we have five alternative syntaxes:

    mov edx, [eax + TRecord.Name]
    mov edx, (TRecord PTR [eax]).Name
    mov edx, (TRecord [eax]).Name
    mov edx, TRecord[eax].Name
    mov edx, [eax].TRecord.Name

All five statements assemble as:

    mov edx, [eax + 4]

Instead of a register (like EAX), the syntaxes also apply to local variable names. You can infer from the first syntax that in inline assembler the expression RecordType.Field is evaluated at compile time as a constant representing the offset at which the Field is located in the RecordType.
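As a quick sanity check, the offsets discussed above can also be printed from Pascal code. The following is only a minimal sketch, assuming a 32-bit Delphi compiler with default record alignment; the program name and the PRecord type are hypothetical names introduced for illustration. With the layout described above it should report 0 for Id and 4 for Name, but other targets or alignment settings may give different numbers.

```pascal
program FieldOffsets;  // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

type
  TRecord = record
    Id: Integer;
    Name: string;
  end;
  PRecord = ^TRecord;  // hypothetical helper type

begin
  // Taking the address of a field through a nil record pointer yields
  // the offset of the field; nothing is dereferenced at run time.
  // Assumption: pointers are 32 bits wide, so the cast to Integer is safe.
  Writeln('Id   offset: ', Integer(@PRecord(nil)^.Id));
  Writeln('Name offset: ', Integer(@PRecord(nil)^.Name));
end.
```

In inline assembler none of this is necessary, because TRecord.Id and TRecord.Name already stand for those constants.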
Since such an expression is just a constant, the following statement is valid:

    mov ecx, TRecord.Name     // mov ecx, 4

Returning to the topic, the procedure InitRecord (introduced above) can be implemented in assembler like this:

    procedure InitRecord(var r: TRecord; Id: integer; const Name: string);
    asm
      // EAX = @r; EDX = Id; ECX = @Name[1]
      mov (TRecord PTR [eax]).Id, edx     // EAX^.Id := EDX;  // Id
      // _LStrAsg(@EAX^.Name, @Name[1]), i.e. EAX^.Name := Name
      lea eax, (TRecord PTR [eax]).Name   // EAX := @(EAX^.Name);
      mov edx, ecx                        // EDX := @Name[1];
      call System.@LStrAsg                // _LStrAsg(EAX, EDX)
    end;

Upon entry to the procedure, EAX points to the record (first parameter), EDX contains the Id (second parameter), and ECX points to the data of the Name string (third parameter). Assigning an integer is quite simple, but assigning a string is a little more complicated:

    If the destination string is not the empty string then
    begin
      Decrement the reference count of the destination string;
      If the reference count of the destination string reaches zero then
        Release the destination string;
    end;
    If the source string is not the empty string then
      Increment the reference count of the source string;
    Assign source to destination;

The _LStrAsg procedure (in the System unit) implements this logic for us. The procedure receives two parameters: the first (in EAX) is the destination string passed by reference, and the second (in EDX) is the source string passed by value (what is actually passed is the pointer, since strings are pointers to the actual characters). Therefore, in our case, EAX should be the address of the string variable that will be assigned (i.e. EAX should contain the address of r.Name), while EDX should be the value to be assigned:

    EAX --> r.Name --> r.Name[1]     (EAX = @r.Name)
    EDX --> Name[1]                  (EDX = @Name[1])

    Ref.: "-->" means "points to" (or "contains the address of")

So, we set EAX and EDX and then we call _LStrAsg:

    lea eax, (TRecord PTR [eax]).Name   // EAX := @(EAX^.Name);
    mov edx, ecx                        // EDX := @Name[1];
    call System.@LStrAsg                // _LStrAsg(EAX, EDX)

Low level functions to work with records

Like with static arrays, if the record is passed by value, it is the responsibility of the called function to preserve the record. When a function needs to change the values of one or more fields of a record passed by value, it normally creates a local copy and works on the copy. The compiler creates a copy for us in the "begin" of Pascal functions, but in full assembler functions we have to do it ourselves. One way of doing this was shown in part III with static arrays. Here is another way:

    procedure OperateOnRecordPassedByValue(r: TRecord);
    var
      _r: TRecord;
    asm
      // Copy the elements of "r" (parameter) in "_r" (local copy)
      // Move(r, _r, sizeof(TRecord));
      lea edx, _r                  // EDX := @_r;
      mov ecx, type TRecord        // ECX := sizeof(TRecord);
      call Move                    // Move(EAX^, EDX^, ECX);
      // _AddRefRecord and TRecord_TypeInfo are explained further below
      lea eax, _r                  // EAX := @_r;
      mov edx, TRecord_TypeInfo    // EDX := TRecord_TypeInfo;
      call System.@AddRefRecord    // System._AddRefRecord(EAX, EDX);
      lea eax, _r                  // EAX := @_r;  // optional
      // Here goes the rest of the function. We'll work on the
      // record "_r" (the local copy), now pointed by EAX.
    end;

This time we called the Move procedure instead of copying the data with REP MOVSB right there. This way we write less code.

IMPORTANT: Copying the memory values alone only works with records that don't contain fields of reference-counted types such as strings, dynamic arrays, or variants of type string or dynamic array.
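To see why the raw copy is not enough on its own, here is a small sketch in plain Pascal. It is only an illustration under stated assumptions: a 32-bit Delphi compiler, and a long-string layout in which the reference count is a 32-bit integer stored 8 bytes before the first character. The names RawCopyDemo, PRefCount, RefCountOf and Raw are hypothetical and introduced just for this example.

```pascal
program RawCopyDemo;  // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TRecord = record
    Id: Integer;
    Name: string;
  end;
  PRefCount = ^Integer;  // hypothetical helper type

// Assumption: the reference count of a heap-allocated long string is a
// 32-bit integer located 8 bytes before the first character, and
// pointers fit in an Integer (32-bit target).
function RefCountOf(const S: string): Integer;
begin
  if Pointer(S) = nil then
    Result := 0
  else
    Result := PRefCount(Integer(Pointer(S)) - 8)^;
end;

var
  a: TRecord;
  // A raw buffer is used instead of a second TRecord so that the
  // compiler never finalizes the unaccounted copy.
  Raw: array[0..SizeOf(TRecord) - 1] of Byte;
begin
  a.Id := 1;
  a.Name := IntToStr(12345);        // heap-allocated, reference-counted
  Writeln('Before Move: ', RefCountOf(a.Name));
  Move(a, Raw, SizeOf(TRecord));    // bitwise copy, as in the asm code
  Writeln('After Move:  ', RefCountOf(a.Name));
  // Both lines show the same count, although Raw now holds a second
  // pointer to the same string data.
end.
```

If the copy were a real TRecord variable that gets finalized later, the count would be decremented once more than it was incremented, which is exactly the imbalance that has to be corrected after a raw copy.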
If we have one or more string fields, or fields of any other reference-counted type, then after copying the memory values we have to increment their respective reference counts. The procedure _AddRefRecord (in the System unit) does that. It takes two parameters: a pointer to the record (in EAX) and a pointer to the type information data for the record generated by the compiler (in EDX). The type information for a record is basically a data structure which contains the positions and types of the reference-counted fields of the record.

The procedures to work with records declared in the System unit (_InitializeRecord, _AddRefRecord, _CopyRecord, and _FinalizeRecord) require a pointer to the type information data as their last parameter. But where is that data? Unfortunately, there is no symbol to access its location directly. We have to get its address by a call to the TypeInfo function, but we can't call it from assembler code because it's not a true function, but a built-in that the compiler resolves at compile time. One possible workaround is to initialize a global variable by calling the TypeInfo function from our Pascal code:

    var
      TRecord_TypeInfo: pointer;
      :
    initialization
      TRecord_TypeInfo := TypeInfo(TRecord);

And then we can use it like this:

    procedure OperateOnRecordPassedByValue(r: TRecord);
    var
      _r: TRecord;
    asm
      // Copy the elements of "r" (parameter) in "_r" (local copy)
      // Move(r, _r, sizeof(TRecord));
      lea edx, _r                  // EDX := @_r;
      mov ecx, TYPE TRecord        // ECX := sizeof(TRecord);
      call Move                    // Move(EAX^, EDX^, ECX);
      // System._AddRefRecord(@_r, TypeInfo(TRecord));
      lea eax, _r                  // EAX := @_r;
      mov edx, TRecord_TypeInfo    // EDX := TypeInfo(TRecord);
      call System.@AddRefRecord    // System._AddRefRecord(EAX, EDX);
      lea eax, _r                  // EAX := @_r;  // optional
      // Here goes the rest of the function. We'll work on the
      // record "_r" (the local copy), now pointed by EAX.
      // We have to finalize the local copy before returning
      // System._FinalizeRecord(@_r, TypeInfo(TRecord));
      lea eax, _r                  // EAX := @_r;
      mov edx, TRecord_TypeInfo    // EDX := TypeInfo(TRecord);
      call System.@FinalizeRecord  // System._FinalizeRecord(EAX, EDX);
    end;

Notice that before the function returns we have to make a call to _FinalizeRecord to destroy the local record (for instance, this will decrement the reference count of the strings pointed to by string fields).

Calling Move and then _AddRefRecord is a valid way to copy records if and only if the destination record hasn't been initialized (after calling _AddRefRecord, the record is initialized). If the destination record is already initialized, then we have to call _CopyRecord instead. For example:

    procedure proc(const r: TRecord);
    var
      _r: TRecord;
    begin
      // _r := r;
      asm
        mov edx, eax                 // EDX := @r;
        lea eax, _r                  // EAX := @_r;
        mov ecx, TRecord_TypeInfo    // ECX := TypeInfo(TRecord);
        call System.@CopyRecord      // System._CopyRecord(EAX, EDX, ECX);
      end;
    end;

Note that since this is a normal Pascal function (not a full assembler function), the compiler automatically generates code to initialize and finalize the local record variable (in the "begin" and "end" of the procedure respectively).

The combination Move plus _AddRefRecord is identical in effect to _InitializeRecord plus _CopyRecord:

    procedure OperateOnRecordPassedByValue(r: TRecord);
    var
      _r: TRecord;
    asm
      // Copy the elements of "r" (parameter) in "_r" (local copy)
      // System._InitializeRecord(@_r, TypeInfo(TRecord));
      push eax                       // Push(EAX);  // @r
      lea eax, _r                    // EAX := @_r;
      mov edx, TRecord_TypeInfo      // EDX := TypeInfo(TRecord);
      call System.@InitializeRecord  // System._InitializeRecord(EAX, EDX);
      // _r := r;
      lea eax, _r                    // EAX := @_r;
      pop edx                        // EDX := Pop();  // @r
      mov ecx, TRecord_TypeInfo      // ECX := TypeInfo(TRecord);
      call System.@CopyRecord        // System._CopyRecord(EAX, EDX, ECX);
      lea eax, _r                    // EAX := @_r;  // optional
      // Here goes the rest of the function. We'll work on the
      // record "_r" (the local copy), now pointed by EAX.
      // We have to finalize the local copy before returning
      // System._FinalizeRecord(@_r, TypeInfo(TRecord));
      lea eax, _r                    // EAX := @_r;
      mov edx, TRecord_TypeInfo      // EDX := TypeInfo(TRecord);
      call System.@FinalizeRecord    // System._FinalizeRecord(EAX, EDX);
    end;

Like _AddRefRecord, the procedure _InitializeRecord is only meant to be used with uninitialized records.

Returning record values

Returning record values is exactly the same as returning static array values. Functions returning records receive an additional last parameter which is the pointer to the memory location where they should place their return value, i.e., the value of the last parameter is @Result. The memory for the result record should be allocated, initialized and freed by the caller (it is not the responsibility of the called function). For example, let's consider the following function:

    function MakeRecord(Id: integer; const Name: string): TRecord;
    begin
      Result.Id := Id;
      Result.Name := Name;
    end;

The function is declared to receive two parameters and to return a record, but internally it is like a procedure that gets three parameters:

    EAX = the Id for the new record
    EDX = the Name for the new record
    ECX = the address of the Result record (@Result)

The function can be rewritten in assembler as follows:

    function MakeRecord(Id: integer; const Name: string): TRecord;
    asm
      // EAX = Id; EDX = @Name[1]; ECX = @Result
      mov (TRecord PTR [ecx]).Id, eax     // ECX^.Id := EAX;  // Id
                                          // (@Result)^.Id := EAX;
                                          // Result.Id := EAX;
      // Result.Name := Name;
      // System.@LStrAsg(@(Result.Name), @Name[1])
      // System.@LStrAsg(@(ECX^.Name), @Name[1])
      lea eax, (TRecord PTR [ecx]).Name   // EAX := @(ECX^.Name);
      call System.@LStrAsg                // _LStrAsg(EAX, EDX)
    end;

NOTE: We don't assign the value of EDX before calling _LStrAsg because EDX already contains the desired value (passed as parameter).
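The hidden parameter described in this section can be pictured in plain Pascal by writing the same routine as a procedure whose last parameter is passed by reference. The sketch below is only an illustration: MakeRecordProc, its Res parameter and the program name are hypothetical, and the register assignment in the comments assumes the default register calling convention on a 32-bit compiler.

```pascal
program HiddenResultDemo;  // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

type
  TRecord = record
    Id: Integer;
    Name: string;
  end;

// The function as declared in the article.
function MakeRecord(Id: Integer; const Name: string): TRecord;
begin
  Result.Id := Id;
  Result.Name := Name;
end;

// Hypothetical procedure form with the same register usage:
// EAX = Id, EDX = Name, ECX = @Res (playing the role of @Result).
procedure MakeRecordProc(Id: Integer; const Name: string; var Res: TRecord);
begin
  Res.Id := Id;
  Res.Name := Name;
end;

var
  a, b: TRecord;
begin
  a := MakeRecord(1, 'one');
  MakeRecordProc(1, 'one', b);
  Writeln(a.Id, ' ', a.Name);   // prints: 1 one
  Writeln(b.Id, ' ', b.Name);   // prints: 1 one
end.
```

Both calls fill a record with the same values; the difference is only in who supplies the address of the destination, the compiler in the function form and the caller explicitly in the procedure form.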
Calling functions that return records

Consider the following code:

    a := MakeRecord(n, s);

One would be tempted to think that the compiler translates it to:

    asm
      mov eax, n
      mov edx, s
      lea ecx, a          // ECX := @a;  // @Result
      call MakeRecord
    end;

But things don't happen that way, at least not in Delphi 5. The compiler allocates and initializes a local variable to hold the result, and then copies the result record to the destination record. Not only do we pay for a copy that would be unnecessary with code like the above, but, as we have seen, the copy itself is not as innocent as a call to the Move procedure (_CopyRecord checks the type information data at runtime to locate the fields that require special treatment). Of course, the invisible local variable is first initialized and eventually gets finalized.

This way of doing things is terribly inefficient. If you need speed, call record-returning functions using assembler as I showed above, passing directly the address of the variable that will hold the result as the last parameter (@Result).

Well, this is it for now. In the next part we'll see some basics of working with objects.

Previous: Inline Assembler in Delphi (III) - Static Arrays
Next: Inline Assembler in Delphi (V) - Objects