Mega Code Archive

 
Categories / Delphi / Examples
 

Understanding FindFirst and FindNext

Title: Understanding FindFirst and FindNext Understanding How FindFirst and FindNext Work Reference Threads: thread102-1418012: FindFirst/FindNext thread102-1512569: findfirst - findnext This document will be a discussion of how FindFirst/FindNext functions. It will include a description of the behavior of the Sysutils variant and how it functions. As well, it will describe some ideas to utilize what is going on under the hood to provide some simplicity and some performance in the code. The basics of FindFirst/FindNext Most of us should know the basics of how FindFirst and FindNext work. They are located in the Sysutils unit. What follows is an example that will form a basis for argument in the reference thread that I started. Since only minor changes will be required, I will just post the program that I used. CODE {$APPTYPE CONSOLE} program test1; uses sysutils; { test program #1 for FindFirst/FindNext FAQ by Glenn9999, demonstrates testing for attributes } procedure writeattr(inattr: longint); { tests file attributes and writes out a corresponding letter for it } begin { list of file attributes } if (inattr and faReadOnly) = faReadOnly then write('R'); if (inattr and faHidden) = faHidden then write('H'); if (inattr and faSysFile) = faSysFile then write('S'); if (inattr and faVolumeID) = faVolumeID then write('V'); if (inattr and faDirectory) = faDirectory then write('D'); if (inattr and faArchive) = faArchive then write('A'); end; var SR: TSearchRec; retcode: integer; begin retcode := FindFirst('C:\*.*', faAnyFile, SR); while retcode = 0 do begin write(SR.Name, '(':30-Length(SR.Name)); writeattr(Sr.Attr); writeln(')'); retcode := FindNext(SR); end; FindClose(SR); readln; end. As we can see, this program lists all files in "C:\*.*" along with the attributes attached to them. To distinguish between the attributes you test using bitmasks. And is the operator which is used. How this works in a technical level is a subject for another FAQ, if necessary, but know that this is how it is done. I selected C:\*.* because most of us can relate to what we will see (the contents of a relatively basic XP install). The output of the above program on my system is as follows: CODE AUTOEXEC.BAT (A) BJPrinter (HD) boot.ini (RHSA) CONFIG.SYS (A) Documents and Settings (D) error.log (A) FSC (D) IO.SYS (RHSA) MSDOS.SYS (RHSA) NTDETECT.COM (RHSA) ntldr (RHSA) pagefile.sys (HSA) Program Files (RD) RECYCLER (HSD) sqmdata00.sqm (HA) sqmnoopt00.sqm (HA) System Volume Information (HSD) WINDOWS (D) The issue of the first thread To continue on to the idea that was mentioned in the first thread I started, I'll change the first program by putting faDirectory in the Attr spot of FindFirst. We can then observe the issue which prompted my post: CODE AUTOEXEC.BAT (A) CONFIG.SYS (A) Documents and Settings (D) error.log (A) FSC (D) Program Files (RD) WINDOWS (D) We are told in the manual for FindFirst: Quote: FindFirst searches the directory specified by Path for the first file that matches the file name implied by Path and the attributes specified by the Attr parameter. .. Attributes can be combined by adding their constants or values. For example, to search for read-only and hidden files in addition to normal files , pass (faReadOnly + faHidden) the Attr parameter. I didn't look at the manual too closely until just now, but the bolded part is what puts some confusion into things - and why the question was asked about in the thread. The specified attribute is included IN ADDITION to files with regular attributes. Hence, there are 4 directories and 3 files specified. Admittedly it seems to work according to documentation, but it is still definitely broken in functionality, since it doesn't behave in what I would call a predictable way. We have 3 "normal files" listed, which would be true, but only 4 of the 7 directories that we expected back returned. This is no doubt because the other directories had hidden attributes attached to them. SFFFN After I posted that, I experimented with writing something with a more predictable filter. In having a look into it, I found that the filtering function related to Attr is written in the Sysutils implementation of FindFirst as opposed to anything to do with a Windows API call. So I ended up writing a version which will do strict filtering, more out of interest. I shared it because I thought it would be something interesting to share. You will see it at the end of that thread, but I will reproduce it here. CODE unit sfffn; { unit to encapsulate new FindFirst/FindNext/FindClose. Certain optimizations were made compared to the sysutil versions - most important being the fact that the functions in this unit will *STRICTLY* filter attributes presented to it. So don't expect this version to act like the Sysutils version. Example: faDirectory in Attr only returns directories. faHidden+faSysFile only returns files /directories with BOTH attributes This means there shouldn't be a need for post-filtering unless you want to "OR" the attributes - like if you want to return either faHidden or faSysFile. To that end, this should present an improvement in that realm. The relevant definitions from Sysutils were copied into this source file so the added weight of Sysutils is not necessary if all you do is search files. Certain helps were also gotten from the Sysutils source. Functions in this unit: SFindFirst; Like the FindFirst SFindNext; Like the FindNext SFindClose; Like the FindClose } interface uses windows; const faReadOnly = $00000001; faHidden = $00000002; faSysFile = $00000004; { faVolumeID = $00000008; faVolumeID seems not used in Windows anyway } faDirectory = $00000010; faArchive = $00000020; faAnyFile = $0000003F; type TFileName = string; LongRec = packed record Lo, Hi: Word; end; TSearchRec = record Time: Integer; Size: Integer; Attr: Integer; Name: TFileName; ExcludeAttr: Integer; FindHandle: THandle; FindData: TWin32FindData; end; function SFindFirst(const Path: string; Attr: Integer; var F: TSearchRec): Integer; function SFindNext(var F: TSearchRec): Integer; procedure SFindClose(F: TSearchRec); implementation procedure move_rec(var F: TSearchRec); var LocalFileTime: TFileTime; begin with F do begin Size := FindData.nFileSizeLow; Attr := FindData.dwFileAttributes; Name := FindData.cFileName; FileTimeToLocalFileTime(FindData.ftLastWriteTime, LocalFileTime); FileTimeToDosDateTime(LocalFileTime, LongRec(F.Time).Hi, LongRec(F.Time).Lo); end; end; function SFindFirst(const Path: string; Attr: Integer; var F: TSearchRec): Integer; const faSpecial = faHidden or faSysFile or faDirectory; var ret: boolean; begin if Attr = faAnyFile then F.ExcludeAttr := not Attr and faSpecial else F.ExcludeAttr := Attr; F.FindHandle := FindFirstFile(PChar(Path), F.FindData); if F.FindHandle INVALID_HANDLE_VALUE then begin while (F.FindData.dwFileAttributes and F.ExcludeAttr) F.ExcludeAttr do begin ret := FindNextFile(F.FindHandle, F.FindData); if (not ret) then begin Result := GetLastError; exit; end; end; move_rec(F); Result := 0; end else Result := GetLastError; end; function SFindNext(var F: TSearchRec): Integer; var ret: boolean; begin repeat ret := FindNextFile(F.FindHandle, F.FindData); if (not ret) then break; until (F.FindData.dwFileAttributes and F.ExcludeAttr) = F.ExcludeAttr; move_rec(F); if ret then Result := 0 else Result := GetLastError; end; procedure SFindClose(F: TSearchRec); { copied from SysUtils source - nothing changes for this } begin if F.FindHandle INVALID_HANDLE_VALUE then Windows.FindClose(F.FindHandle); end; end. We can see that the FindFirst and FindNext call a Windows API function which does not do any filtering. You will also see that this is true of the Sysutils FindFirst and FindNext if you load up the copy of SYSUTILS.PAS that is likely with your Delphi install (not reproducing it here for the obvious copyright concerns). The main difference between this version and the Sysutils version is that it does not return files with the faArchive attribute, and will return anything with an attribute (the two main logic changes). We will see that in recoding the program that was given to use SFFFN.PAS CODE BJPrinter (HD) Documents and Settings (D) FSC (D) Program Files (RD) RECYCLER (HSD) System Volume Information (HSD) WINDOWS (D) We now only have directories and we have all of them. Post-Filtering Now as it was pointed out, it's pretty hard to come up with a filter that would suit all circumstances. As was pointed out in the second thread, the only way to return things that are NOT faDirectory is to test for that (again an issue of bitmasks and how they work). So we're back to doing the post-filtering, as most people seem to generally do, and what seems best. Now the question out of this comes whether we can improve upon FindFirst/FindNext in this respect and come up with something that is at the very least a little more functionally useful, and most certainly efficient. An attempt at that is posted below: CODE unit bfffn; { unit to encapsulate new FindFirst/FindNext/FindClose. This is a bare version which does not take attribute in any form and will return all files - ideal for post-filtering algorithms. Performance improvements can be made which departs from the TSearchRec structure to TWin32FindData, but those will not be made within this unit. The relevant definitions from Sysutils were copied into this source file so the added weight of Sysutils is not necessary if all you do is search files. Certain helps were also gotten from the Sysutils source. Functions in this unit: BFindFirst; Like the FindFirst BFindNext; Like the FindNext BFindClose; Like the FindClose } interface uses windows; const faReadOnly = $00000001; faHidden = $00000002; faSysFile = $00000004; { faVolumeID = $00000008; faVolumeID seems not used in Windows anyway } faDirectory = $00000010; faArchive = $00000020; faAnyFile = $0000003F; type TFileName = string; LongRec = packed record Lo, Hi: Word; end; TSearchRec = record Time: Integer; Size: Integer; Attr: Integer; Name: TFileName; ExcludeAttr: Integer; FindHandle: THandle; FindData: TWin32FindData; end; { attribute not specified in BFindFirst! } function BFindFirst(const Path: string; var F: TSearchRec): Integer; function BFindNext(var F: TSearchRec): Integer; procedure BFindClose(F: TSearchRec); implementation procedure move_rec(var F: TSearchRec); var LocalFileTime: TFileTime; begin with F do begin Size := FindData.nFileSizeLow; Attr := FindData.dwFileAttributes; Name := FindData.cFileName; FileTimeToLocalFileTime(FindData.ftLastWriteTime, LocalFileTime); FileTimeToDosDateTime(LocalFileTime, LongRec(F.Time).Hi, LongRec(F.Time).Lo); end; end; function BFindFirst(const Path: string; var F: TSearchRec): Integer; begin F.FindHandle := Windows.FindFirstFile(PChar(Path), F.FindData); if F.FindHandle INVALID_HANDLE_VALUE then begin move_rec(F); Result := 0; end else Result := GetLastError; end; function BFindNext(var F: TSearchRec): Integer; var ret: boolean; begin ret := Windows.FindNextFile(F.FindHandle, F.FindData); if ret then begin move_rec(F); Result := 0; end else Result := GetLastError; end; procedure BFindClose(F: TSearchRec); { copied from SysUtils source - nothing changes for this } begin if F.FindHandle INVALID_HANDLE_VALUE then Windows.FindClose(F.FindHandle); end; end. As you can see a lot of code was removed, which relates to finding files by a specific bitmask specified within Attr. This also affords some control over the code, and actually allows us to be specific about what is moved or not moved. Some Testing Programs I ran some trials against a testing program using Sysutils and this BFFFN unit, using this program. CODE {$APPTYPE CONSOLE} {$DEFINE NODEBUG} program test4; uses sysutils; { test program #4 for FindFirst/FindNext FAQ by Glenn9999, allows for benchmarking of file seeking performance. parses through all directories, optionally writing all the results} type DWord = Longint; { a timing routine } function TimeMS: DWord; stdcall; external 'winmm.dll' name 'timeGetTime'; procedure listdirectory(indir: string); var SR: TSearchRec; retcode: integer; begin {$IFDEF DEBUG } writeln(indir); {$ENDIF} retcode := FindFirst(indir + '\*.*', faAnyFile, SR); while retcode = 0 do begin {$IFDEF DEBUG } writeln(SR.Name); {$ENDIF} if (SR.Attr and FaDirectory) = faDirectory then begin if (SR.Name '.') and (SR.Name '..') then ListDirectory(indir + '\' + SR.Name); end; retcode := FindNext(SR); end; Findclose(SR); end; var STime: Longint; numtimes, i: longint; begin write('Number of times to run: '); readln(numtimes); STime := TimeMS; for i := 1 to numtimes do listdirectory('C:'); STime := TimeMS - STime; write('Program completed in ', STime, ' ms. Press Enter to Exit.'); readln; end. All were run against the drive five times(51,040 files; 6,555 folders). Using Sysutils: 3485ms Using BFFN as posted: 3469ms Using BFFN moving only file name and attr: 3250ms Using the Windows variants in the main program: 3157ms Another benchmark - five times (15300 files, 1670 folders on WinME) Using Sysutils: 1737ms Using BFFN as posted: 1798ms Using BFFN moving only file name and attr: 1560ms Using the Windows variants in the main program: 1509ms In profiling these programs, the Windows calls, specifically NTDLL.DLL and KERNEL32.DLL take up 98% of the execution time, there is not incredibly much to tune (of this time, FindClose calls took 11%, FindNext took 3%, and FindFirst 0.2%). Though, as you will notice, there is some room for improvement, especially if you scan a large number of files. This difference is from the existence of the code layer on top of the Windows calls in Sysutils. For those interested in what the program looked like with the Windows variants in it, here that is: CODE {$APPTYPE CONSOLE} {$DEFINE NODEBUG} program test7; uses windows; { test program #7 for FindFirst/FindNext FAQ by Glenn9999, allows for benchmarking of file seeking performance. parses through all directories, optionally writing all the results change define to DEBUG in order to write the directory paths and names TWin32FindDataA = record dwFileAttributes: DWORD; ftCreationTime: TFileTime; ftLastAccessTime: TFileTime; ftLastWriteTime: TFileTime; nFileSizeHigh: DWORD; nFileSizeLow: DWORD; dwReserved0: DWORD; dwReserved1: DWORD; cFileName: array[0..MAX_PATH - 1] of AnsiChar; cAlternateFileName: array[0..13] of AnsiChar; end; } const faDirectory = $00000010; type DWord = Longint; { a timing routine } function TimeMS: DWord; stdcall; external 'winmm.dll' name 'timeGetTime'; procedure listdirectory(indir: string); var SR: TWin32FindData; retcode, FindHandle: integer; begin {$IFDEF DEBUG } writeln(indir); {$ENDIF} FindHandle := Windows.FindFirstFile(PChar(Indir + '\*.*'), SR); if FindHandle INVALID_HANDLE_VALUE then Retcode := 0 else Retcode := GetLastError; while retcode = 0 do begin {$IFDEF DEBUG } writeln(SR.cFileName); {$ENDIF} if (SR.dwFileAttributes and FaDirectory) = faDirectory then begin if (String(SR.cFileName) '.') and (String(SR.cFileName) '..') then ListDirectory(indir + '\' + SR.cFileName); end; if Windows.FindNextFile(FindHandle, SR) then Retcode := 0 else Retcode := GetLastError; end; if FindHandle INVALID_HANDLE_VALUE then Windows.FindClose(FindHandle); end; var STime: Longint; numtimes, i: longint; begin write('How many times to run: '); readln(numtimes); STime := TimeMS; for i := 1 to numtimes do listdirectory('C:'); STime := TimeMS - STime; write('Program completed in ', STime, ' ms. Press Enter to Exit.'); readln; end.