Mega Code Archive

 
Towards a more accurate sort order

Question:

Sorting addresses is a pain at the best of times, especially when a client supplies bad data. (You may define clear fields in your DB, but when the data comes in, does it fit them easily?) This unit attempts to resolve that issue.

Answer:

unit AddrSortOrder;

{
  The custom sort order deals with the fact that house and flat numbers
  are sorted as strings. They are stored as strings to allow things like
  '150-175' as a house number, or '3a', or perhaps even simply a flat 'A'.

  A custom sort order is needed because with an ordinary ASCII sort order
  '4' will appear after '30'. This is not desirable behaviour.

  The approach taken here is to look for the first number in the string
  (if there is one) and use it as the primary sort key. The rest of the
  sorting is then done on the remaining characters (with leading and
  trailing spaces stripped out), based on the ASCII values of their
  upper-case variants.

  Potential problems caused by this approach include (but are not limited
  to) the following:

  - Accented characters may cause strange orderings.
  - If there is a block of flats with three floors A, B, C, and the flats
    on those floors are A1, A2, A3, B1, B2, B3, then the ordering of
    records will not be ideal: this approach sorts them as
    A1, B1, A2, B2, A3, B3. This behaviour is regrettable but acceptable,
    since we cannot tell that it is not flat A on floor 1, for example.

  It is unlikely that we will be able to find a sort order that always
  produces ideal results.

  Some examples of sorted lists (not all ideal):

    EXAMPLE 1    EXAMPLE 2    EXAMPLE 3
    Flat 1       1            A
    Flat 2       -2           B
    3            2-4          C
    3B           3a           1
    Flat 3A      5            2
}

interface

uses
  SysUtils;

function CalcSortIndex(NumStr: string): double;

implementation

function CalcSortIndex(NumStr: string): double;
var
  strlength, i, j, tmp: integer;
  found: boolean;
  numpart, strpart, divisor: double;
  choppedstr: string;
begin
  //This function returns the sort index value for the string passed
  strlength := length(NumStr);
  if strlength = 0 then
  begin
    result := 0;
    exit;
  end;

  //split the string into a 'number' part and a 'string' part..
  //initialise
  found := false;
  choppedstr := NumStr;
  numpart := 0;

  //Locate the first digit (if there is one)
  for i := 1 to strlength do
  begin
    if NumStr[i] in ['0'..'9'] then
    begin
      found := true; //First digit found!!
      break;
    end;
  end; //for i..

  if found then
  begin
    //now find the end of the run of digits..
    found := false;
    for j := i to strlength do
    begin
      if not (NumStr[j] in ['0'..'9']) then
      begin
        found := true; //end of digits found
        break;
      end;
    end; //for j..

    //Separate out the number and string parts
    if found then
    begin
      //Number was embedded: digits occupy positions i..j-1
      val(copy(NumStr, i, j - i), numpart, tmp);
      Delete(choppedstr, i, j - i);
    end
    else
    begin
      //Number went to the end of the string
      val(copy(NumStr, i, strlength), numpart, tmp);
      Delete(choppedstr, i, strlength);
    end;
  end;

  choppedstr := Uppercase(trim(choppedstr));
  strlength := length(choppedstr);

  //evaluate a fractional number for the remaining part of the string:
  //the i-th character contributes ord(char) / 256^i
  strpart := 0;
  divisor := 1;
  for i := 1 to strlength do
  begin
    divisor := divisor / 256;
    strpart := strpart + (ord(choppedstr[i]) * divisor);
  end;

  //All done, return the value
  result := numpart + strpart;
end;

end.

NB: a version of this algorithm for MSSQL7 is also posted (Title 'Towards a more accurate sort order in MSSQL7', ArticleID 2983).
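The article gives the key function but not a caller, so here is a minimal sketch of how it might be plugged into a sort. It assumes the AddrSortOrder unit above is on the search path. The program name and the CompareBySortIndex callback are hypothetical names introduced for illustration and are not part of the original unit. The callback simply compares the two doubles returned by CalcSortIndex and is passed to TStringList.CustomSort.

```pascal
program SortHouseNumbers; // hypothetical demo program, not from the article
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes,
  AddrSortOrder; // the unit listed above

// Hypothetical helper: comparison callback with the signature that
// TStringList.CustomSort expects. It orders items by their sort index.
function CompareBySortIndex(List: TStringList; Index1, Index2: Integer): Integer;
var
  a, b: Double;
begin
  a := CalcSortIndex(List[Index1]);
  b := CalcSortIndex(List[Index2]);
  if a < b then
    Result := -1
  else if a > b then
    Result := 1
  else
    Result := 0;
end;

var
  Houses: TStringList;
  i: Integer;
begin
  Houses := TStringList.Create;
  try
    Houses.Add('30');
    Houses.Add('3a');
    Houses.Add('4');
    Houses.Add('Flat 2');

    // Sort by the numeric index instead of plain string order
    Houses.CustomSort(CompareBySortIndex);

    // Prints: Flat 2, 3a, 4, 30 (one per line)
    for i := 0 to Houses.Count - 1 do
      WriteLn(Houses[i]);
  finally
    Houses.Free;
  end;
end.
```

As a worked check of the arithmetic, '3a' yields 3 + 65/256 = 3.25390625 (the 'a' is upper-cased to 'A', ordinal 65), which is why it lands between 'Flat 2' and '4'. For large lists it may be worth computing each index once and storing it (for example in a sortable column) rather than recalculating it on every comparison.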