Mega Code Archive

 

Word Counter and some secrets

Question: How to make a fast word counter.

Answer:

In this article my purpose is not only to show you how to make a word counter, because that is a trivial task. I want to show you that knowing what Delphi is doing behind the scenes is important. Here is some code I wrote a long, long time ago. It is the Execute method of a word-counting thread:

    procedure TCounter.Execute;
    var
      WEnd: Boolean;
      Ch: PChar;
      A: Integer;
    begin
      if not WordsInfo.Calculated then
      begin
        WordsInfo.Words := 0;
        WordsInfo.Lett := 0;
        WEnd := True;
        for A := 1 to Length(UniMain.UniRE.Text) do
        begin
          if Terminated then Exit;
          Ch := PChar(Copy(UniMain.UniRE.Text, A, 1));
          if ((Ch >= 'A') and (Ch <= 'Z')) or ((Ch >= 'a') and (Ch <= 'z')) then
          begin
            Inc(WordsInfo.Lett);
            if WEnd = True then
            begin
              WEnd := False;
              Inc(WordsInfo.Words)
            end;
          end
          else
            WEnd := True;
        end;
        WordsInfo.Calculated := True;
      end;
    end;

It is VERY BADLY WRITTEN. Why? I just sat down and began writing. The result was an extremely slow program. UniMain is my form's name and UniRE is the name of the rich edit control. UniMain.UniRE.Text is a string representation of the control's contents, and the way I was using it was one of the bottlenecks. Text is a property, not a stored string: every time you read it, Delphi has to fetch the control's contents and convert them to a string. So I added this:

    var
      ...
      TheText: string;
    begin
      TheText := UniMain.UniRE.Text;
      ...

and replaced every other UniMain.UniRE.Text with TheText. Making the conversion just once instead of on every pass through the loop sped up the counting more than a hundredfold! You can try it if you don't believe me.

But that is not the only optimization I made. Some people may wonder why I used the line

    Ch := PChar(Copy(UniMain.UniRE.Text, A, 1));

but this is the way I used to work with strings in Turbo Pascal. Copy allocates a new one-character string on every iteration, while indexing reads the character directly. I replaced this line with

    Ch := TheText[A];

and changed Ch from PChar to Char. This also sped up the program more than a hundredfold! This was enough, but there is still room for optimization.
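The cost of reading a property inside a loop can be made visible with a small console sketch. TFakeEdit below is a hypothetical stand-in for the rich edit control, not a VCL class: its getter only counts how often it is called, where a real control would fetch and convert its contents each time. The names TFakeEdit, Reads and Before are assumptions made for this illustration.

```pascal
program PropertyReadDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for a control whose Text is produced on demand.
  TFakeEdit = class
  private
    FData: string;
    FReads: Integer;
    function GetText: string;
  public
    constructor Create(const AData: string);
    property Text: string read GetText;
    property Reads: Integer read FReads;
  end;

constructor TFakeEdit.Create(const AData: string);
begin
  inherited Create;
  FData := AData;
  FReads := 0;
end;

function TFakeEdit.GetText: string;
begin
  Inc(FReads);      // count every read of the property
  Result := FData;  // a real control would fetch and convert its contents here
end;

var
  Edit: TFakeEdit;
  TheText: string;
  A, Letters, Before: Integer;
begin
  Edit := TFakeEdit.Create('Hello world');
  try
    // Slow pattern: one getter call for the loop bound plus one per character.
    Letters := 0;
    for A := 1 to Length(Edit.Text) do
      if Edit.Text[A] <> ' ' then
        Inc(Letters);
    WriteLn('Property read in the loop: ', Edit.Reads, ' getter calls, ',
      Letters, ' letters');

    // Fast pattern: a single getter call, then work on the local copy.
    Before := Edit.Reads;
    TheText := Edit.Text;
    Letters := 0;
    for A := 1 to Length(TheText) do
      if TheText[A] <> ' ' then
        Inc(Letters);
    WriteLn('Cached in a local:         ', Edit.Reads - Before,
      ' getter calls, ', Letters, ' letters');
  finally
    Edit.Free;
  end;
end.
```

Both loops count the same letters, but the number of getter calls in the first one grows with the length of the text, which is where the time goes when each call is expensive.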
I replaced

    if ((Ch >= 'A') and (Ch <= 'Z')) or ((Ch >= 'a') and (Ch <= 'z')) then

with

    if ((Ch in ['a'..'z']) or (Ch in ['A'..'Z'])) then

Using "Ch in ['a'..'z']" is theoretically faster than the previous comparisons. I also swapped the lowercase and uppercase checks: lowercase letters are more common, and because Delphi short-circuits Boolean expressions by default, the uppercase check is skipped whenever the lowercase check succeeds. This is a small optimization compared to the previous two, but it is worth doing. There may be other optimizations I could make, but this code is quick enough: on my computer it counts a 1 MB text file in about a second. Here is the final code:

    procedure TCounter.Execute;
    var
      WEnd: Boolean;
      Ch: Char;
      A: Integer;
      TheText: string;
    begin
      // Read the Text property once and work on the local copy.
      TheText := UniMain.UniRE.Text;
      if not WordsInfo.Calculated then
      begin
        WordsInfo.Words := 0;
        WordsInfo.Lett := 0;
        WEnd := True;  // True while we are between words
        for A := 1 to Length(TheText) do
        begin
          if Terminated then Exit;
          Ch := TheText[A];
          if ((Ch in ['a'..'z']) or (Ch in ['A'..'Z'])) then
          begin
            Inc(WordsInfo.Lett);
            if WEnd then
            begin
              // First letter after a separator starts a new word.
              WEnd := False;
              Inc(WordsInfo.Words)
            end;
          end
          else
            WEnd := True;
        end;
        WordsInfo.Calculated := True;
      end;
    end;
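For experimenting outside a form and a thread, the counting loop can be pulled out into a standalone routine. The following is a minimal sketch, not the author's code: CountWords and its parameters are hypothetical names, and it swaps the set test for a case statement over the same two letter ranges, which should avoid the set-of-WideChar warning that Unicode versions of Delphi give for "Ch in [...]". The rule for what counts as a word is unchanged: an unbroken run of the letters A..Z and a..z.

```pascal
program WordCountSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: counts words and letters in S, where a word is an
// unbroken run of the ASCII letters A..Z and a..z (the article's rule).
procedure CountWords(const S: string; out Words, Letters: Integer);
var
  A: Integer;
  InWord: Boolean;
begin
  Words := 0;
  Letters := 0;
  InWord := False;
  for A := 1 to Length(S) do
    case S[A] of
      'a'..'z', 'A'..'Z':
        begin
          Inc(Letters);
          if not InWord then
          begin
            // First letter after a separator starts a new word.
            InWord := True;
            Inc(Words);
          end;
        end;
    else
      InWord := False;  // any other character ends the current word
    end;
end;

var
  W, L: Integer;
begin
  CountWords('Hello, world! 42 times', W, L);
  WriteLn('Words: ', W, '  Letters: ', L);
end.
```

For the sample string this reports 3 words and 15 letters, since the digits in "42" are treated as separators, exactly as in the thread version.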