Mega Code Archive

Performanceone

Program Performance
*******************************************************************
[my stuff]

Re 'performance', or more specifically 'performance monitoring', here's a tip
using vital info from Bob Swart's excellent page on 'Delphi Optimising' at:

http://www.tietovayla.fi/BORLAND/techlib/dl390/dl390.html

Let's say you're worried about the performance of your app and, in the
absence of a 'profiler', you want to track down performance bottlenecks
if possible. Here's a quick way:

include MMSystem in your uses clause

then find the TIME taken to execute particular blocks of code by enclosing
those blocks with calls to the timeGetTime function (declared in the
MMSystem unit and exported by WINMM.DLL on 32-bit Windows), which returns
the number of milliseconds since Windows started as a DWORD.

- note that a DWORD can hold about 49.7 days' worth of milliseconds before
  it wraps back to zero (a signed LongInt would go negative after about
  24.8 days, so DWORD is the safer type for these variables)

like this...

var
  startTime: DWORD;
  endTime: DWORD;
  selectTime: DWORD;
  redrawTime: DWORD;
  tempString: String;
begin
  startTime := timeGetTime;
  //select the step that we want to select
  ListsManager.SetStepSelection(clickedElem);
  endTime := timeGetTime;
  selectTime := (endTime - startTime);

  startTime := timeGetTime;
  //redraw everything
  DrawObj.ReDrawAll(DrawArea.Canvas, DrawArea.Width, DrawArea.Height,
                    ListsManager.stepObjList, ListsManager.linkObjList);
  endTime := timeGetTime;
  redrawTime := (endTime - startTime);

  //report both timings in a message box
  tempString := 'The selection process takes ' + IntToStr(selectTime) + ' milliseconds' + Chr(13) +
                'The redraw process takes ' + IntToStr(redrawTime) + ' milliseconds';
  Application.MessageBox(PChar(tempString), ' Programmers Debug Window', mb_OK);
end;

for instance...

*******************************************************************************
[this next is similar to the above but more in-depth]...

Here is a contribution to the Optimization debate. Comments will be welcome.
Irwin Scollar:

{TestTime, a program to test the timing of a routine, using RDTSC
 (Read Time Stamp Counter) for the number of cycles since a Pentium
 processor was started or reset, based on:
 Jon Shemitz, Using RDTSC for Pentium Benchmarking, in Don Taylor et al,
 Delphi 3 Programming, Coriolis Group Press, ISBN 1-57610-179-7, 1997}

{Run it repeatedly and take the lowest observed value, to account for
 Windows doing other things with higher priority.
 Under WinNT/2000 try:
   start /high testtime
 to force high running priority, or even perhaps
   start /realtime testtime
 to force still higher priority}

{$APPTYPE CONSOLE}
{try various compiler options for dcc32.exe}

Program TestTime;

uses SysUtils;

const
  D32 = $66;   {operand-size prefix, used only in the non-386 branch}

Function RDTSC : Comp;
Var
  TimeStamp : Record
    case byte of
      1 : (Whole : comp);
      2 : (Lo, Hi : Longint);
  end;
begin
  asm
    db $0F; db $31;             {opcode bytes for RDTSC; result in EDX:EAX}
  {$ifdef Cpu386}
    mov [TimeStamp.Lo],eax
    mov [TimeStamp.Hi],edx
  {$else}
    db D32
    mov word ptr TimeStamp.Lo,AX
    db D32
    mov word ptr TimeStamp.Hi,DX
  {$endif}
  end;
  Result := TimeStamp.Whole;
end;

function Comp2Str(N : Comp): string;
begin
  Result := Format('%.0n',[N]);
end;

procedure TimeIt;
var
  StartTime, StopTime : Comp;
  i : integer;
  Overhead : Comp;
begin
  i := 0;
  {get Timer overhead}
  StartTime := RDTSC;
  Overhead := RDTSC - StartTime;
  StartTime := RDTSC;
  {code to time goes here, e.g.}
  While (i < 10000) do inc(i);
  StopTime := RDTSC - StartTime - Overhead;
  {display result in processor cycles}
  Write('Code takes ',Comp2Str(StopTime), ' cycles');
  ReadLn;
end;

begin
  TimeIt;
end.

_______________________________________________
Delphi mailing list -> Delphi@elists.org
http://elists.org/mailman/listinfo/delphi

********************************************************************
SEE BOB SWART'S DELPHI OPTIMISATION PAGE AT:

http://www.tietovayla.fi/BORLAND/techlib/dl390/dl390.html

IT'S VERY GOOD

To pinpoint your performance problems, get yourself a decent profiler.
I used SpeedDaemon and GProfiler. GProfiler is free.
Get it here: www.eccentrica.org/gabr/gpprofile/gpprofile.htm (includes source)

To improve your application's design, have a look here 8)
http://oop.com/white_papers/delphi/business-objects.htm

**********************************************************************
I would have to agree that the use of profiling tools can help a programmer
to drastically improve the performance of his/her code, and that many of the
major bottlenecks in modern applications come from unnecessary recursion or
looping, network traffic or poorly designed SQL queries.

In my experience, many programmers today (myself included to some extent)
consistently write code that greatly underperforms. We continue to use the
same badly thought-out, inefficient solutions to the same problems over and
over again.

How many of you have seen or written code that scans a list for the data you
are looking for and, once it has found the data, continues to search the rest
of the list? The Continue, Break and Exit commands are always handy to bear
in mind in such circumstances....

Many programmers remain ignorant of good algorithm design and, as such,
rarely consider the use of indexing, binary chop functions, hash tables and
the like (me included). Even when they do, they often use the least effective
solution for the problem in question.

All too often I have seen programmers "fix" their performance problems by
simply upping the specification of the machines that their users will use.
That is fine if you are in a position to control that, but many of us do not
have such a luxury.
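As a rough sketch of the two points above (stop scanning as soon as the item is found, and use a binary chop when the data is sorted), here is a small console program. It assumes a plain array of Integer; the names FindLinear, FindBinary and Data are made up for illustration and do not come from the thread.

```pascal
program SearchSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Linear scan that stops at the first match; returns -1 if absent. }
function FindLinear(const Items: array of Integer; Wanted: Integer): Integer;
var
  i: Integer;
begin
  Result := -1;
  for i := Low(Items) to High(Items) do
    if Items[i] = Wanted then
    begin
      Result := i;
      Break;   { found it, so do not scan the rest of the list }
    end;
end;

{ Binary chop; assumes Items is already sorted in ascending order. }
function FindBinary(const Items: array of Integer; Wanted: Integer): Integer;
var
  Lo, Hi, Mid: Integer;
begin
  Result := -1;
  Lo := Low(Items);
  Hi := High(Items);
  while Lo <= Hi do
  begin
    Mid := Lo + (Hi - Lo) div 2;
    if Items[Mid] = Wanted then
    begin
      Result := Mid;
      Exit;    { leave the function as soon as the item is found }
    end
    else if Items[Mid] < Wanted then
      Lo := Mid + 1
    else
      Hi := Mid - 1;
  end;
end;

const
  { Hypothetical sorted test data }
  Data: array[0..7] of Integer = (2, 3, 5, 7, 11, 13, 17, 19);

begin
  WriteLn(FindLinear(Data, 11));
  WriteLn(FindBinary(Data, 11));
  WriteLn(FindBinary(Data, 4));
end.
```

The linear version still costs up to one comparison per element, whereas the binary chop halves the remaining range at every step, which is why it only pays off when the list is kept sorted.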
I often think back to the days of my good old C64 (for those new to such
stuff, an old 8-bit 64KB god of a machine you plugged into your TV), and
wonder what the programmers of such machines (who knew exactly how to squeeze
every last pound of flesh from those beasties) could do with the monolithic
specification of today's desktop PCs. There is a definite connection between
the speed of the target machine and the laziness of the programmer.

Call it unrealistic in today's marketplace if you must, but the modern
programmer is seriously underperforming, and the specification of modern
machines means that the laziest and least talented programmers can still
compete...

Many of you have suggested ensuring that optimization is turned on, but my
experience, and the experience of many others, is that turning optimisation
on can change the meaning of code and give undesired and often unexplainable
behaviour, and as such I never use this feature...

What is lacking is knowledge of how best to optimise algorithms and queries
and reduce network traffic. I think this thread could be both useful and
interesting to all who read it if we stop debating over David's initial tips
and start to submit new ideas and sources of information to each other
regarding the above.

Remember, many of us do not trust optimisation, it has bitten us too many
times :-(

regards,
Paul Jackson

******************************************************************************************
> > This last one is an old trick for swapping two elements without using a
> > temporary variable.
> > Swap(a,b):
> > a = 0010
> > b = 0101
> > b := a xor b (=0111)
> > a := a xor b (=0101)
> > b := a xor b (=0010)
>I've seen this before, and I recall that it isn't really any faster.
>Firstly, you have to declare a and b with the var modifier. This means
>that each reference to a or b requires a memory access.
>Your version requires three MOVs and an XOR for each line, and they're
>all on memory locations instead of registers. Unoptimized, the standard
>temporary variable method produces similar code, all with memory access
>since the Intel instruction set has no way of transferring directly
>between two memory addresses.
>
>Optimized, apparently, neither routine changes. In this case, I'd still
>choose the standard method. It is easier to read--more obvious that it
>swaps the contents of two variables. It also yields ten MOVs rather than
>nine MOVs and three XORs. When put into the context of another
>subroutine, I suspect the benefits would be even greater. Take Swap out
>of its own procedure--inline it--and it should work even better since the
>compiler can better utilize the registers.

If this really needs to be optimized, the following will do it in 4 MOVs,
a PUSH, and a POP:

procedure swap (var a, b: integer);
asm
  push edi
  mov ecx, [a]   // pointer to "a" is in EAX, so this is really mov ecx, [eax]
  mov edi, [b]   // pointer to "b" is in EDX, so this is really mov edi, [edx]
  mov [a], edi
  mov [b], ecx
  pop edi
end;

If you're absolutely certain the caller is not using EDI, then the PUSH/POP
could be removed, but as a general-purpose procedure EDI needs to be
preserved.
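To check the arithmetic of the XOR trick quoted above against the ordinary temporary-variable swap, here is a minimal plain-Pascal sketch. SwapTemp and SwapXor are hypothetical names, not routines from the thread, and nothing here measures speed. The guard in SwapXor is an addition: without it, passing the same variable for both parameters would set it to zero.

```pascal
program SwapSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ The standard method: swap through a temporary variable. }
procedure SwapTemp(var a, b: Integer);
var
  t: Integer;
begin
  t := a;
  a := b;
  b := t;
end;

{ The XOR trick from the quoted post, in the same order of steps. }
procedure SwapXor(var a, b: Integer);
begin
  { If a and b are the same variable, XOR would zero it, so do nothing. }
  if @a = @b then
    Exit;
  b := a xor b;
  a := a xor b;
  b := a xor b;
end;

var
  x, y: Integer;
begin
  x := 2;   { binary 0010, as in the quoted example }
  y := 5;   { binary 0101 }
  SwapXor(x, y);
  WriteLn(x, ' ', y);
  SwapTemp(x, y);
  WriteLn(x, ' ', y);
end.
```

Both routines give the same result for distinct variables, so the choice between them comes down to readability and the generated code, as discussed in the replies above.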