Converting Text for different Code Pages

Question: Recently I ran into the problem of converting text to the Shift-JIS (Japanese) code page while creating an i-mode interface for my company's content management system. Before writing everything myself, I looked into the tools Microsoft already provides.

Answer: All systems (Win 95+ and WinNT4+) with MS Internet Explorer 4 or newer have a library named mlang.dll in the system directory (for example Winnt\System32). Usually you can tell Delphi to simply import such COM libraries. This one, however, Delphi did not import, so I translated the "most wanted" interface by hand. I present the results here.

First comes the code for the conversion unit, which lets you convert any text from one code page interpretation into another. After that I briefly discuss the code and give a sample of how to use it.

uCodePageConverter
==================

{* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 * Unit Name : uCodePageConverter
 * Author    : Daniel Wischnewski
 * Copyright : Copyright 2002 by gate(n)etwork. All Rights Reserved.
 * Originator: Daniel Wischnewski
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *}

unit uCodePageConverter;

interface

uses
  Windows;

const
  // interface and class IDs of the MLang charset converter
  IID_MLangConvertCharset: TGUID = '{D66D6F98-CDAA-11D0-B822-00C04FC9B31F}';
  CLASS_MLangConvertCharset: TGUID = '{D66D6F99-CDAA-11D0-B822-00C04FC9B31F}';

type
  tagMLCONVCHARF = DWORD;

const
  // conversion flags for Initialize
  MLCONVCHARF_AUTODETECT: tagMLCONVCHARF = 1;
  MLCONVCHARF_ENTITIZE: tagMLCONVCHARF = 2;

type
  tagCODEPAGE = UINT;

const
  // the most commonly needed code page numbers
  CODEPAGE_Thai: tagCODEPAGE = 0874;
  CODEPAGE_Japanese: tagCODEPAGE = 0932;
  CODEPAGE_Chinese_PRC: tagCODEPAGE = 0936;
  CODEPAGE_Korean: tagCODEPAGE = 0949;
  CODEPAGE_Chinese_Taiwan: tagCODEPAGE = 0950;
  CODEPAGE_UniCode: tagCODEPAGE = 1200;
  CODEPAGE_Windows_31_EastEurope: tagCODEPAGE = 1250;
  CODEPAGE_Windows_31_Cyrillic: tagCODEPAGE = 1251;
  CODEPAGE_Windows_31_Latin1: tagCODEPAGE = 1252;
  CODEPAGE_Windows_31_Greek: tagCODEPAGE = 1253;
  CODEPAGE_Windows_31_Turkish: tagCODEPAGE = 1254;
  CODEPAGE_Hebrew: tagCODEPAGE = 1255;
  CODEPAGE_Arabic: tagCODEPAGE = 1256;
  CODEPAGE_Baltic: tagCODEPAGE = 1257;

type
  // Byte-oriented buffers are declared as PAnsiChar rather than PChar,
  // because PChar means PWideChar in Unicode-enabled Delphi versions.
  IMLangConvertCharset = interface
    ['{D66D6F98-CDAA-11D0-B822-00C04FC9B31F}']
    function Initialize(
      uiSrcCodePage: tagCODEPAGE; uiDstCodePage: tagCODEPAGE;
      dwProperty: tagMLCONVCHARF
    ): HResult; stdcall;
    function GetSourceCodePage(
      out puiSrcCodePage: tagCODEPAGE
    ): HResult; stdcall;
    function GetDestinationCodePage(
      out puiDstCodePage: tagCODEPAGE
    ): HResult; stdcall;
    function GetProperty(out pdwProperty: tagMLCONVCHARF): HResult; stdcall;
    function DoConversion(
      pSrcStr: PAnsiChar; pcSrcSize: PUINT; pDstStr: PAnsiChar; pcDstSize: PUINT
    ): HResult; stdcall;
    function DoConversionToUnicode(
      pSrcStr: PAnsiChar; pcSrcSize: PUINT; pDstStr: PWChar; pcDstSize: PUINT
    ): HResult; stdcall;
    function DoConversionFromUnicode(
      pSrcStr: PWChar; pcSrcSize: PUINT; pDstStr: PAnsiChar; pcDstSize: PUINT
    ): HResult; stdcall;
  end;

  CoMLangConvertCharset = class
    class function Create: IMLangConvertCharset;
    class function CreateRemote(const MachineName: string): IMLangConvertCharset;
  end;

implementation

uses
  ComObj;

{ CoMLangConvertCharset }

class function CoMLangConvertCharset.Create: IMLangConvertCharset;
begin
  Result := CreateComObject(CLASS_MLangConvertCharset) as IMLangConvertCharset;
end;

class function CoMLangConvertCharset.CreateRemote(
  const MachineName: string
): IMLangConvertCharset;
begin
  Result := CreateRemoteComObject(
    MachineName, CLASS_MLangConvertCharset
  ) as IMLangConvertCharset;
end;

end.

As you can see, I translated only one of the library's many interfaces. However, this one is the most efficient (according to Microsoft) and will do the job. I also added some constants to make the most important values easier to find.

When using this unit for code page conversions, keep in mind that both code pages (source and destination) must be installed and supported on the computer that does the translation. On the computer that displays the result, only the destination code page must be installed and supported.

To test the unit, simply create a form with a memo and a button. Add the following code to the button's OnClick event. (Do not forget to add the conversion unit to the uses clause!)
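SAMPLE
======

procedure TForm1.Button1Click(Sender: TObject);
var
  Conv: IMLangConvertCharset;
  Source: WideString;
  Dest: PAnsiChar;
  SourceSize, DestSize: UINT;
begin
  // connect to MS multi-language lib
  Conv := CoMLangConvertCharset.Create;
  // initialize Unicode translation to Japanese
  // (OleCheck is declared in ComObj, so add ComObj to the uses clause;
  // it raises an exception when a call returns a failure HRESULT)
  OleCheck(Conv.Initialize(CODEPAGE_UniCode, CODEPAGE_Japanese, MLCONVCHARF_ENTITIZE));
  // load source (from memo) into a variable so the pointer stays valid
  Source := Memo1.Text;
  // size in characters, including the terminating #0
  SourceSize := Succ(Length(Source));
  // prepare destination
  DestSize := 0;
  // let's calculate the size needed
  OleCheck(Conv.DoConversionFromUnicode(PWChar(Source), @SourceSize, nil, @DestSize));
  // reserve memory
  GetMem(Dest, DestSize);
  try
    // the first call may have changed SourceSize, so set it again
    SourceSize := Succ(Length(Source));
    // convert
    OleCheck(Conv.DoConversionFromUnicode(PWChar(Source), @SourceSize, Dest, @DestSize));
    // show the converted bytes (they read correctly only where the
    // destination code page is the one used for display)
    Memo1.Text := string(AnsiString(Dest));
  finally
    // free memory
    FreeMem(Dest);
  end;
end;

You will find further information on code page translation on MSDN under IMLangConvertCharset.

Best regards,
Daniel Wischnewski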
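The two-pass pattern from the sample (ask for the required size with a nil destination, then convert) could also be wrapped in a reusable function. The following is only a minimal sketch under some assumptions: the unit name uCodePageHelper and the function UnicodeToCodePage are hypothetical and not part of the original unit, COM is assumed to be initialized already (as it normally is in a VCL application that uses ComObj), and OleCheck only raises on failure codes, so a success code such as S_FALSE passes through unnoticed.

```delphi
unit uCodePageHelper; // hypothetical unit name, for illustration only

interface

uses
  Windows, uCodePageConverter;

// Hypothetical helper: converts UTF-16 text to the given code page and
// returns the raw bytes in an AnsiString (no terminating #0 is requested).
function UnicodeToCodePage(const Text: WideString;
  DestCodePage: tagCODEPAGE): AnsiString;

implementation

uses
  ComObj;

function UnicodeToCodePage(const Text: WideString;
  DestCodePage: tagCODEPAGE): AnsiString;
var
  Conv: IMLangConvertCharset;
  SourceSize, DestSize: UINT;
begin
  Result := '';
  if Text = '' then
    Exit;

  Conv := CoMLangConvertCharset.Create;
  // no special flags; OleCheck raises EOleSysError on a failure HRESULT
  OleCheck(Conv.Initialize(CODEPAGE_UniCode, DestCodePage, 0));

  // first pass: nil destination, ask for the number of bytes required
  SourceSize := Length(Text);
  DestSize := 0;
  OleCheck(Conv.DoConversionFromUnicode(PWideChar(Text), @SourceSize,
    nil, @DestSize));
  if DestSize = 0 then
    Exit;

  // second pass: convert straight into the result buffer
  SetLength(Result, DestSize);
  SourceSize := Length(Text); // reset, the first pass may have changed it
  OleCheck(Conv.DoConversionFromUnicode(PWideChar(Text), @SourceSize,
    PAnsiChar(Result), @DestSize));
  // trim to the number of bytes actually written
  SetLength(Result, DestSize);
end;

end.
```

A call such as UnicodeToCodePage(Memo1.Text, CODEPAGE_Japanese) would then return the Shift-JIS bytes, which can be written to a file or stream without the manual GetMem/FreeMem handling.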