URL Parsing class

Question: When working with URLs, we sometimes need to break one down into its elements, such as protocol, host and document.

Answer: Splitting a URL into its elements isn't a complicated task. When I faced this problem I was sure I wouldn't have to write a single line of code, because I expected to find a good article here that would complete the task. What I needed was to get each element of a URL: the protocol, host, port number, document and each parameter. My search was not successful, so I decided to write this article.

The first thing to do is to extract the protocol from the URL. That's very simple; I do it like this:

    function parseProtocol(url: string): string;
    var
      i: integer;
    begin
      { Parse out the protocol part of the URL }
      { Locate the protocol separator }
      i := Pos('://', url);
      if i > 0 then
      begin
        { The protocol is everything up to the separator }
        Result := Copy(url, 1, i - 1);
      end
      else
      begin
        { If no protocol is found, fall back to the default protocol }
        Result := 'http';
      end;
    end;

The next thing to do is to extract the host part of the URL:

    function parseHost(url: string): string;
    var
      i, pathSep, portSep: integer;
      tmpstr: string;
    begin
      { Parse out the host part of the URL }
      tmpstr := url;
      { Locate the protocol separator and skip past all three of its characters }
      i := Pos('://', tmpstr);
      if i > 0 then
      begin
        Inc(i, 2);
      end;
      Delete(tmpstr, 1, i);
      { Locate the path separator }
      pathSep := Pos('/', tmpstr);
      if pathSep > 0 then
      begin
        { Delete everything from the path separator onwards }
        Delete(tmpstr, pathSep, Length(tmpstr));
      end;
      { Locate the port number separator }
      portSep := Pos(':', tmpstr);
      if portSep > 0 then
      begin
        { If a port number is found, delete it }
        Delete(tmpstr, portSep, Length(tmpstr));
      end;
      { If we find a parameter separator now, the URL is invalid,
        but let's still clear it out if it's there }
      i := Pos('?', tmpstr);
      if i > 0 then
      begin
        Delete(tmpstr, i, Length(tmpstr));
      end;
      { The same goes for the bookmark separator }
      i := Pos('#', tmpstr);
      if i > 0 then
      begin
        Delete(tmpstr, i, Length(tmpstr));
      end;
      { Now we should hold only the host part }
      Result := tmpstr;
    end;

With those functions you get the idea of how I'm doing this. This is not the most efficient code, but it gets the job done, so I'm happy with it. I attached the whole code as a class that does all of this. It includes a function

    function RelateUrl(RootUrl, RelativeUrl: string): string;

which accepts a root URL like "http://www.delphi3000.com/folder/test.html" and a relative URL like "../index.html", and returns "http://www.delphi3000.com/index.html".

Here is an example of how to use this class. I put one TEdit, one TMemo and one TButton on a form; the TEdit contains the text "http://www.delphi3000.com/articles/article_3145.asp". I've added abhUrlClass to the uses clause. The code for the Button1Click event looks like this:

    procedure TForm1.Button1Click(Sender: TObject);
    var
      urlParser: TABHUrl;
      i: integer;
    begin
      urlParser := TABHUrl.Create(Edit1.Text);
      try
        Memo1.Lines.Add('Protocol: ' + urlParser.Protocol);
        Memo1.Lines.Add('Host: ' + urlParser.Host);
        Memo1.Lines.Add('Document: ' + urlParser.Document);
        { List every parameter found in the URL }
        for i := 0 to urlParser.ParamCount - 1 do
        begin
          Memo1.Lines.Add('Param name: ' + urlParser.Param[i].Name);
          Memo1.Lines.Add('Param value: ' + urlParser.Param[i].Value);
        end;
      finally
        { Release the parser even if one of the calls above raises }
        FreeAndNil(urlParser);
      end;
    end;

Two things are missing from this code: parsing out the bookmark element (#top) and URL-decoding the parameter values. I didn't need those functions in my project, so I skipped them like every good programmer does :) Never do too much :) If anyone decides to add those functions, I would be glad to receive them.

Best regards
Arni B. Halldorsson
abh@hugvit.is
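
Addendum: the article names two missing pieces, the bookmark element and URL decoding of parameter values. The following is only a minimal sketch of how they might be added, in the same Pos/Copy style as the functions above. The names ParseBookmark, UrlDecode and IsHexDigit are hypothetical and are not part of abhUrlClass. The decoder treats each %XX escape as a single character, so multi-byte UTF-8 sequences are not reassembled.

```pascal
program UrlExtras;

{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Hypothetical helper: returns everything after the first '#',
  or an empty string when the URL has no bookmark. }
function ParseBookmark(const Url: string): string;
var
  i: Integer;
begin
  i := Pos('#', Url);
  if i > 0 then
    Result := Copy(Url, i + 1, Length(Url))
  else
    Result := '';
end;

{ Hypothetical helper: True when C is a hexadecimal digit. }
function IsHexDigit(C: Char): Boolean;
begin
  Result := ((C >= '0') and (C <= '9')) or
            ((C >= 'A') and (C <= 'F')) or
            ((C >= 'a') and (C <= 'f'));
end;

{ Hypothetical helper: turns '+' into a space and each %XX escape into
  the character with that code. Malformed escapes are copied unchanged. }
function UrlDecode(const Value: string): string;
var
  i: Integer;
begin
  Result := '';
  i := 1;
  while i <= Length(Value) do
  begin
    if Value[i] = '+' then
      Result := Result + ' '
    else if (Value[i] = '%') and (i + 2 <= Length(Value)) and
            IsHexDigit(Value[i + 1]) and IsHexDigit(Value[i + 2]) then
    begin
      { '$' marks a hexadecimal literal for StrToInt }
      Result := Result + Chr(StrToInt('$' + Copy(Value, i + 1, 2)));
      Inc(i, 2); { skip the two hex digits just consumed }
    end
    else
      Result := Result + Value[i];
    Inc(i);
  end;
end;

begin
  WriteLn(ParseBookmark('http://www.delphi3000.com/index.html#top')); { top }
  WriteLn(UrlDecode('Hello+World%21'));                               { Hello World! }
end.
```

If these were folded into the class, the bookmark would presumably need to be cut off before the parameters are split, so that "#top" does not end up inside the last parameter value.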