Mega Code Archive

 

Export source code in any language in HTML

Question: Someone asked for an easy way to export all types of source code to HTML.

Answer: First of all, I know that someone has already posted a solution to this problem, but I tried a different approach. My goal was to get the maximum flexibility from the tool, so writing my own parser was a must.

A little background to explain what I have in mind. Syntax highlighting means splitting the language's "tags" (by "tag" I mean a word, a symbol, or a number) into "categories", and each category is rendered with its own font.

Now focus on the tags. Different languages have different tags, so the "begin" tag in Delphi is "{" in C-like languages. There are three main types of tags:

- Simple tags, such as a keyword
- Enclosing tags, such as comments delimited by "{" and "}"
- Line tags, such as comments introduced by "//", which close automatically at the end of the line

My first step was to encapsulate these concepts in three classes:

- TCodeDelimiter: defines Enclosing tags and Line tags
- TCodeCategory: essentially a collection of TCodeDelimiter objects or a collection of Simple tags (a TStringList), together with the font definition for that category
- TCodeLanguage: a collection of TCodeCategory objects with the default font for the language

The second step is the parser. I split the problem into two phases: first, identify the next tag, which is done with a simple loop; second, check whether the tag found is defined as one to be highlighted.

In source code you meet four kinds of characters: a letter, a number, a symbol, or a space. The loop uses this grouping to identify tags: it treats each continuous group of characters of the same kind as one tag. An example:

TMyClass.MyMethod

Tag1: "TMyClass" all characters of the same kind (letters)
Tag2: "." (symbol)
Tag3: "MyMethod" all characters of the same kind (letters)

To check whether a tag needs to be highlighted, I used a tree in which each character is a node whose children are all the characters that can follow it. So the two keywords (or tags) "inherited" and "interface" generate the following tree:

      I
      |
      N
     / \
    H   T
    |   |
    E   E
    |   |
    R   R
    |   |
    I   F
    |   |
    T   A
    |   |
    E   C
    |   |
    D   E

Note that one tag can be nested in another; for example, "in" is nested in "INherited". To resolve this problem, each node has a Boolean property that says whether the node can be the final one of a tag. A tag is highlighted only if, navigating through the tree, the last node reached is a "final" node.

Why do it this way? Because once you can determine whether a given tag is a wanted one, you can do anything you like with it; rendering it in HTML is easy. You also end up with a routine you can reuse any time you have to deal with similar problems.

I hope this is useful to you.
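
The two parser phases described above (grouping characters into tags, then looking each tag up in a character tree with "final" nodes) can be sketched roughly as follows. This is only an illustration in Python, not the original Delphi code: the names kind, split_tags, Node, TagTree and to_html are invented here, matching is assumed to be case-insensitive as in Delphi, a plain <b> element stands in for a category's font, and Enclosing tags and Line tags are not handled.

```python
import html


def kind(ch):
    """Classify a character as one of the four kinds: letter, number, space, symbol."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "number"
    if ch.isspace():
        return "space"
    return "symbol"


def split_tags(source):
    """Phase 1: group continuous runs of same-kind characters into tags."""
    tags = []
    for ch in source:
        if tags and kind(tags[-1][0]) == kind(ch):
            tags[-1] += ch          # same kind as the current run: extend it
        else:
            tags.append(ch)         # kind changed: start a new tag
    return tags


class Node:
    def __init__(self):
        self.children = {}          # next character -> Node
        self.final = False          # True if a tag may end on this node


class TagTree:
    """Phase 2: character tree holding the tags that should be highlighted."""

    def __init__(self, words=()):
        self.root = Node()
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word.upper():     # assumption: case-insensitive, as in Delphi
            node = node.children.setdefault(ch, Node())
        node.final = True

    def contains(self, tag):
        node = self.root
        for ch in tag.upper():
            node = node.children.get(ch)
            if node is None:
                return False        # fell off the tree: not a known tag
        return node.final           # a nested prefix only counts if marked final


def to_html(source, keywords):
    """Escape every tag for HTML and wrap the highlighted ones in <b>."""
    parts = []
    for tag in split_tags(source):
        text = html.escape(tag)
        parts.append(f"<b>{text}</b>" if keywords.contains(tag) else text)
    return "".join(parts)


tree = TagTree(["inherited", "interface"])
print(split_tags("TMyClass.MyMethod"))   # ['TMyClass', '.', 'MyMethod']
print(tree.contains("in"))               # False: the "N" node is not final
print(to_html("inherited Create(x < 1);", tree))
```

With only "inherited" and "interface" loaded, "in" is rejected even though its path exists in the tree; calling tree.add("in") marks the "N" node as final, and the same lookup then succeeds. In a fuller version each category would presumably own its own tree and font, which is what the TCodeCategory class in the text is for.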