Mega Code Archive

 
Removing HTML elements from text

Question: A situation arose where I had to develop a set of procedures to remove HTML elements, such as unwanted links, from a text file and at the same time convert carriage returns to HTML paragraph markers and tabs to spaces, so that the result could form a new web document.

Answer: The following two procedures were implemented:
____________________________________________________
Code:

procedure TMainForm.LoadFileIntoList(TextFileName: String;
  AWebPage: TStringList; WithFilter: Boolean);
var
  CurrentFile: TStringList;
begin
  CurrentFile := TStringList.Create;
  try
    CurrentFile.LoadFromFile(TextFileName);
    if WithFilter then
      FilterHTML(CurrentFile, AWebPage)
    else
      AWebPage.AddStrings(CurrentFile);
  finally
    CurrentFile.Free;  // released even if loading or filtering raises
  end;
end;

procedure TMainForm.FilterHTML(FilterInput, AWebPage: TStringList);
var
  i, j: LongInt;
  Source, S: String;
begin
  FilterMemo.Lines.Assign(FilterInput);
  Source := FilterMemo.Text;  // copy the text once; strings are 1-based
  j := Length(Source);
  S := '';
  if j > 0 then
  begin
    i := 1;
    while i <= j do
    begin
      if Source[i] = Char(VK_RETURN) then        // detect cr
        S := S + '<P>'
      else if Source[i] = '<' then               // skip everything up to '>'
        repeat
          Inc(i);
        until (i > j) or (Source[i] = '>')
      else if Source[i] = Char(VK_TAB) then      // detect tab
        S := S + ' '
      else
        S := S + Source[i];                      // just add text
      Inc(i);
    end;
    AWebPage.Add(S);                             // add string to WebPage
  end
  else
    AWebPage.Add('No data entered into field.'); // no data in text file
end;
___________________________________________________
Implementation:

All you have to do is call:

LoadFileIntoList('filename.txt', WebPage, True);

where 'filename.txt' is the name of the file you want to process and WebPage is a TStringList. Passing True as the Boolean value filters the file; passing False adds its lines unfiltered.

NB: In this example a TMemo object called "FilterMemo" was placed on the form (not visible).
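The same character-by-character filtering can also be sketched without the hidden TMemo, as a plain function that takes a string and returns the filtered string. This is only a minimal illustration under stated assumptions: the function name FilterHTMLText and the program name are hypothetical and not part of the original code, and, like the procedure above, it treats everything between '<' and '>' as a tag and leaves line feeds untouched.

```pascal
program StripTagsDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper (not from the original tip): removes anything
// between '<' and '>', turns each carriage return into '<P>' and each
// tab into a single space. All other characters are copied unchanged.
function FilterHTMLText(const Source: string): string;
var
  i: Integer;
  InTag: Boolean;
begin
  Result := '';
  InTag := False;
  for i := 1 to Length(Source) do
  begin
    if InTag then
    begin
      if Source[i] = '>' then
        InTag := False;                // end of the tag being skipped
    end
    else
      case Source[i] of
        '<': InTag := True;            // start skipping a tag
        #13: Result := Result + '<P>'; // carriage return -> paragraph marker
        #9:  Result := Result + ' ';   // tab -> space
      else
        Result := Result + Source[i];  // ordinary text
      end;
  end;
end;

begin
  // prints: See this page<P>Next line
  WriteLn(FilterHTMLText('See <a href="x.htm">this</a> page'#13'Next'#9'line'));
end.
```

Keeping the filter free of form controls means it can be tried in a console program, and LoadFileIntoList could pass it CurrentFile.Text if the memo is not wanted.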
___________________________________________________
Example:

WebPage := TStringList.Create;
try
  Screen.Cursor := crHourGlass;
  AddHeader(WebPage);
  with WebPage do
  begin
    Add('Personal Details');
    LoadFileIntoList('filename.txt', WebPage, True);
  end;
  AddFooter(WebPage);
  WebPage.SaveToFile(HTMLFileName);  // save only if the page was built without errors
finally
  WebPage.Free;
  Screen.Cursor := crDefault;
end;
___________________________________________________
Improvements:

If anybody has any suggested improvements or modifications, please contact me.

Thanks,
Pete Davies, 14th August 2000