Archive for the ‘CodeGear’ Category

Just released: Castalia 2009.2

Posted on June 24th, 2009 in Business, Castalia, CodeGear, Delphi, TwoDesk | 2 Comments »

I’m very excited to announce the latest version of Castalia, the ultimate tool for Delphi developers.

The major focus of Castalia 2009.2 has been improving the parser and adding support for many language features that have been added to Delphi in recent years. I’m very happy to say that the Castalia Delphi parser is now fully up-to-date.

In addition, Castalia 2009.2 includes the following improvements over the previous version:

* Fixed: “Index Out of Bounds” error during some context switches
* Fixed: Some Castalia features not available after line 30768 in the code editor
* Fixed: Access violation when firing a code template with an empty scope

Castalia users with a current maintenance subscription can download version 2009.2 today at http://subscribe.twodesk.com.

Everyone else can grab the free trial at http://www.twodesk.com/castalia.

Castalia 2009.1 is now available.

Posted on April 23rd, 2009 in Castalia, CodeGear, Delphi, TwoDesk | No Comments »

If you have a current subscription, you can get Castalia 2009.1 at http://subscribe.twodesk.com.  Everyone else should give the free trial a run (http://www.twodesk.com/castalia).

Here’s what’s new:

* New: “Modeless” “Add Parameter” refactoring
* Improved code formatting for “Remove Unused Variables” refactoring
* New: “CurrentLine” function in code template scripting
* Fixed: Unicode characters in identifiers can cause memory leaks in Delphi 2009.
* Fixed: Improper cursor movement after some text searches
* Fixed: <Esc> key closes Code Insight *AND* pops a bookmark off the stack if Code Insight is invoked when there are bookmarks on the stack.

Enjoy!

–Jacob

Released: Castalia 2008.4.1

Posted on December 30th, 2008 in Castalia, CodeGear, Delphi, TwoDesk | 2 Comments »

Castalia 2008.4.1 is now available.

Current customers can download from http://subscribe.twodesk.com

Everyone else can get the free trial at http://www.twodesk.com/castalia/

This release is entirely a bugfix release.  The following issues have been addressed:

  •  Fixed: Access violations when pressing F3 after using modeless text search
  •  Fixed: ‘Clipboard’ function not recognized in code templates
  •  Fixed: Clipboard insertions incorrectly translated to new code template format
  •  Fixed: Some hotkey pickers don’t display properly in D2005, D2006, D2007, or D2009
  •  Fixed: Code template indenting stuck at 2 spaces
  •  Fixed: Library precompiler activates at the wrong time under some circumstances
  •  Fixed: Delphi 2009 hangs at the splash screen under some circumstances
  •  Fixed: Incorrectly identified syntax error when using parameterized types with multiple parameters

Released: Castalia 2008.4

Posted on December 10th, 2008 in Castalia, CodeGear, Delphi, TwoDesk | 7 Comments »

I just uploaded release and trial binaries for Castalia 2008.4.  This is the biggest Castalia release in quite a while, with the huge changes to the code templates and other things.  Here’s what’s new:

  • New code template engine
  • Improved template activation strings
  • New code template scripting
  • New code template context-sensitivity with “scope expressions”
  • New “sidebar”
  • And other little tweaks and fixes….

I’ve talked about the templates a little in this blog, but haven’t mentioned the “sidebar” at all.  Here you go: http://www.twodesk.com/castalia/sidebar.html.

Castalia 2008.4 is available immediately for current subscribers at http://subscribe.twodesk.com, or you can download the free trial at http://www.twodesk.com/castalia/download.html.

More on the new code templates

Posted on November 25th, 2008 in Castalia, CodeGear, Delphi, TwoDesk | 2 Comments »

Here’s another interesting tidbit about the upcoming new code templates.  Again, this is all pre-release, and is subject to change or withdrawal until it actually ships.  Comments and suggestions are, as always, welcome.

Castalia’s code templates have always been context-sensitive, to a degree.  In the template editor, there have always been checkboxes that allow you to specify whether a template should fire in strings or in comments.  This is important because when you type “if” in a comment, you probably don’t mean to create an if..then statement.

This sensitivity is good, but it could be better.  Take, for example, the “r=” template, which expands to “Result :=”.  This is only really useful in a function, and wouldn’t be good to do in a procedure (in fact, if you typed this in a procedure, having it NOT fire would serve as a good reminder that you don’t have a return value for your procedure – catching the error before it gets out of control).  With Castalia’s current templates, there’s no way to specify that a template fire only in a function, not in a procedure.

With the new templates, this is easy.  In the template editor, the checkboxes are gone, and have been replaced with a simple “scope” edit control.  Each template has a scope string, which is sort of like a demented CSS selector.  Here’s what the scope string looks like for the “r=” template:

 function - string & function - comment

This is reasonably self-explanatory, but let’s explore a bit.  This string is made of two individual scope selectors, joined with an &, meaning AND.  Each individual selector is in the form “positive - negative” and will match any location in code where the positive part is true, and the negative part is false.  So “function - string” means “in a function, but not in a string.”  “function - comment” means “in a function, but not in a comment.”  The & means that both of those selectors must be true in order for the template to fire.

The individual parts of a selector can contain nested words, for greater precision.  For example, if you wanted a template to fire only in a function that is a member of a class, you could do it:

class > function

The positive or negative part of a selector can be blank.  In the example above, the negative selector is blank (so there is no “-” sign).  If the positive selector is blank, the selector begins with the “-” operator.  This selector will fire everywhere except in a comment or a string:

 - comment & - string

Note that the “-” operator is always binary – it is part of the selector, not a modifier of a selector.  That is, this is invalid:

method - (comment & string)

What is intended here should be written like this:

method - comment & method - string

Or like this:

method & - comment & - string

If you want to use an OR operator, you would use a comma.  For example, the following scope would match any class except in a comment, or any string:

class - comment, string

Operator precedence between & and , is simply left to right.  You can change that with parentheses:

class - comment, (string & method)

This would fire in any class except in a comment, or in any string that’s in a method.  There are better ways to write that, but it demonstrates the concept.  One more example wouldn’t hurt, so here’s how I would write it better:

class - comment, method > string

Scope selectors in the new Castalia templates allow for powerful and completely flexible control over where templates fire, and are one of the major improvements over the old template system.  The more I work on this, the more excited I am to get it finished and into your hands.
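
To make the “positive - negative” idea concrete, here is a minimal sketch of how a single selector could be evaluated.  This is a hypothetical model for illustration only, not Castalia’s implementation: it assumes a location in code is described by a plain list of the scope names that enclose it, and the SelectorMatches helper is a name made up for this example.  It ignores nesting (“>”), commas, and parentheses.

```pascal
program ScopeDemo;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

// Hypothetical helper: a selector matches when its positive scope (if any)
// is active and its negative scope (if any) is not.
function SelectorMatches(Active: TStrings;
  const Positive, Negative: string): Boolean;
begin
  Result := ((Positive = '') or (Active.IndexOf(Positive) >= 0)) and
            ((Negative = '') or (Active.IndexOf(Negative) < 0));
end;

var
  Active: TStringList;
  Fires: Boolean;
begin
  Active := TStringList.Create;
  try
    // Assumed cursor position: inside a comment that is inside a function.
    Active.Add('function');
    Active.Add('comment');

    // Equivalent of "function - string & function - comment".
    Fires := SelectorMatches(Active, 'function', 'string') and
             SelectorMatches(Active, 'function', 'comment');

    // The second selector fails because 'comment' is active.
    Writeln('Template fires: ', Fires);
  finally
    Active.Free;
  end;
end.
```

Joining the two calls with “and” mirrors the & operator; a comma would become “or” in the same way.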

Preparing for Delphi 2009: Part 4

Posted on September 9th, 2008 in Castalia, CodeGear, Delphi | 4 Comments »

Over the last few days, I’ve written about things to look for in your code to be prepared for Delphi 2009.  In today’s installment, I’m going to discuss a few Windows API calls that gave me a little trouble when porting Castalia to Delphi 2009.

GetProcAddress

The GetProcAddress API call is used to find the address of an exported function in a DLL.  If you’re using it, it probably looks something like this:

procAddr := GetProcAddress(Handle, PChar('SomeFunctionName'));

We run into trouble with that last parameter.  GetProcAddress expects an ASCII string.  That is, an array of bytes, not words.  In Delphi 2009, the above code will fail, because you’ll get a pointer to a UTF-16 string.  Here’s the way it should look now:

procAddr := GetProcAddress(Handle, PAnsiChar(AnsiString('SomeFunctionName')));

The typecast to AnsiString ensures that the string is a string of bytes, not words.  Then the PChar cast is changed to PAnsiChar for completeness.
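
Here is the same pattern as a small console program, as a sketch assuming Delphi 2009 or later.  kernel32.dll and GetTickCount are only convenient stand-ins for your own DLL and export, and the TGetTickCount type name is made up for this example.

```pascal
program GetProcAddressDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

type
  // Hypothetical name for the signature of kernel32's GetTickCount.
  TGetTickCount = function: DWORD; stdcall;

var
  Module: HMODULE;
  Proc: TGetTickCount;
begin
  Module := GetModuleHandle('kernel32.dll');
  if Module = 0 then
    RaiseLastOSError;

  // The name is converted to a string of bytes before a pointer to it
  // is handed to the API.
  @Proc := GetProcAddress(Module, PAnsiChar(AnsiString('GetTickCount')));

  if Assigned(Proc) then
    Writeln('GetTickCount returned ', Proc())
  else
    Writeln('Export not found');
end.
```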

ToAscii

ToAscii is used to convert a keyboard state into an ASCII character.  It’s usually called from a KeyDown or KeyUp event handler.  Here’s a typical use:

GetKeyboardState(KS);

I := ToAscii(KeyCode, 0, KS, @TempChar, 0);

Most of the parameters of ToAscii are beyond the scope of this post, but the point is that it takes the current keyboard state (represented by KS and KeyCode) and turns it into the ASCII character which should be displayed.

Of course, now that a Char is no longer limited to ASCII (and the size of the Char data type has changed), ToAscii isn’t really going to work any more.  I have to admit I was a bit surprised to discover the solution – the ToUnicode API call:

 I := ToUnicode(KeyCode, 0, KS, TempChar, 1, 0);

There are a few more parameters involved here (and note that TempChar is no longer referenced by pointer).  Again, most of them don’t matter, but the usage above will work for most cases.  If you’re using ToAscii in your code, you’re going to want to replace it with ToUnicode.
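
For context, here is a sketch of the ToUnicode call wrapped in a function.  KeyToChar is a hypothetical helper name, and passing 0 for the scan code and flags is the same simplification used above; it assumes Delphi 2009, where Char is a WideChar.

```pascal
program ToUnicodeDemo;

{$APPTYPE CONSOLE}

uses
  Windows;

// Hypothetical helper: returns the character a virtual key produces under
// the current keyboard state, or #0 if it does not yield exactly one.
function KeyToChar(KeyCode: Word): Char;
var
  KS: TKeyboardState;
  Buffer: array[0..1] of WideChar;
begin
  Result := #0;
  if not GetKeyboardState(KS) then
    Exit;

  // The buffer is passed by reference, and its size is given in
  // WCHARs, not bytes.
  if ToUnicode(KeyCode, 0, KS, Buffer, Length(Buffer), 0) = 1 then
    Result := Buffer[0];
end;

begin
  // The virtual-key code for the A key is the same as Ord('A').
  // The result depends on the keyboard state when this runs.
  Writeln('Code unit produced: ', Ord(KeyToChar(Ord('A'))));
end.
```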

MultiByteToWideChar

The MultiByteToWideChar function maps a string (often an ASCII or UTF-8 string) to a WideString.  Now that the default string type is unicode, and all string types are compatible by assignment, calls to MultiByteToWideChar can be replaced by simple assignment:

//Old code

MultiByteToWideChar(0, 0, pchar(sourceString), Length(sourceString), PWideChar(targetString), Length(targetString));

//New code

targetString := sourceString;

Any API call that works with strings is probably worth examining for correctness, since most API calls that require a string also require the length of that string.  You should check whether it expects the length of the string in bytes or in WCHARs, and even if it might expect one form of string or another.

One note though: Delphi’s included Windows API units include overloaded versions of most of the API routines that involve strings, so you can call them with either a string or an AnsiString, and they’ll still work.  As I said before, 99.9% of your code is likely to simply compile and work without any changes.  That’s still true.
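
As an example of checking the unit of a size parameter, here is a sketch using GetModuleFileName, whose buffer size is given in characters.  The choice of API is mine, purely for illustration.

```pascal
program ApiBufferDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

var
  Buffer: array[0..MAX_PATH - 1] of Char;
  Len: DWORD;
  FileName: string;
begin
  // The size parameter counts characters, so Length(Buffer) is the right
  // measure.  SizeOf(Buffer) would claim twice the real capacity when a
  // Char is two bytes wide.
  Len := GetModuleFileName(0, Buffer, Length(Buffer));
  if Len = 0 then
    RaiseLastOSError;

  // SetString also counts Char elements, not bytes.
  SetString(FileName, Buffer, Len);
  Writeln(FileName);
end.
```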

Delphi 2009 and Unicode

Posted on September 8th, 2008 in Castalia, CodeGear, Delphi | 8 Comments »

Alright, I keep getting questions about what the new 16-bit Char type (and its associated UnicodeString) really means.  Let’s take a couple of minutes off from the “Preparing for Delphi 2009” series and discuss exactly how this works.

The core issue here is that really, a character according to the unicode standard isn’t 16 bits, it’s 32 (ok, actually it’s 21, but that’s not a normal data size, so we use 32).  Since you obviously can’t fit 32 bits (or even 21 bits) into a 16-bit data type, what is going on with this 16-bit Char type?

In the unicode world, there are two possibilities here.  The first is called UCS-2, which basically means that only a subset of the entire character set is allowed.  That is, you can only use the characters that will fit into 16 bits.

The other possibility is UTF-16, which uses a 16-bit data type, and has a mechanism for splitting a larger character into two of these 16-bit chunks.  When this happens, each chunk is called a Surrogate, and the two of them together are called a Surrogate Pair.

So, let’s settle this once and for all: Delphi 2009 uses UTF-16.  It allows the entire unicode character set, using surrogate pairs for the characters that take more than 16 bits to represent.

Before I get to some specifics, there are a couple of terms we should make sure we have straight:

In unicode, a character is called a Code Point.  The letter ‘A’, an exclamation mark, a space, a line feed, and any other “thing” that is represented as part of the unicode “character set” (called the Code Space) is a code point.

On the other hand, that “chunk” of data – 16-bits in UTF-16 – is called a Code Unit.  If you have a code point that won’t fit into 16 bits, you’ll need to combine two code units to form a surrogate pair.

In Delphi terms, the new 16-bit Char data type represents a code unit, not a code point.  So, when I wrote last week that Length(myString) returns the number of printable characters in myString, that could have been a little misleading.  Length(myString) returns the number of code units in the string.  If some of those code units are surrogates, the number of code points you see on screen may not be the same as the number of code units in the string.

Now for a couple of frequently asked (and anticipated) questions:

Q: Is it really safe to assume that the size of a string is Length(myString) * 2?

A: If by “size,” you mean “size in bytes,” yes.  That will give you the size of the string in bytes, because Length(myString) tells you the number of code units, which are always 2 bytes.  I would, however, suggest that you not use the magic number 2, but rather SizeOf(Char), because 1. it’s more readable, and 2. it’s not inconceivable that the size of the Char data type could change again in the future – use SizeOf(Char) and you’re already ready for it.

If by “size” you mean something else, keep reading…

Q:  How do I get these characters into my strings?

A: The easiest way is just to include them in your source code.  Since the compiler and code editor are fully unicode enabled, you can just put the character into your source code.  However, that isn’t always easy if you’re using a keyboard that isn’t really designed for the characters you’re using, and isn’t always the most readable solution either.  There is another way:

Delphi 2009 includes a new unit in the RTL called Character.pas.  It has a bunch of utility functions (and a utility class) to help with these conversions.  Let’s say you want a string that has the codepoint $20086 in it.  You could do the math to figure out the surrogate pair and do S := Chr($D840) + Chr($DC86); or you could use the ConvertFromUtf32 function from Character.pas: S := ConvertFromUTF32($20086);

Both will give you the same result, but ConvertFromUtf32 is certainly easier to use.

Note that if you do ShowMessage(S), you’ll see only one character on the screen, but that Length(S) will return 2, since there are two code units used to represent the one character.
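
If you’re curious about “the math,” here is a sketch of it.  EncodeSurrogatePair is a hypothetical helper written for this example; the arithmetic is the standard UTF-16 encoding (subtract $10000, then split the remaining 20 bits into two 10-bit halves).

```pascal
program SurrogateDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Character;

// Hypothetical helper: encodes a code point above $FFFF as a UTF-16
// surrogate pair.
function EncodeSurrogatePair(CodePoint: Cardinal): string;
var
  Offset: Cardinal;
begin
  Offset := CodePoint - $10000;
  // High surrogate carries the top 10 bits, low surrogate the bottom 10.
  Result := WideChar($D800 + (Offset shr 10));
  Result := Result + WideChar($DC00 + (Offset and $3FF));
end;

var
  Manual, Converted: string;
begin
  Manual := EncodeSurrogatePair($20086);
  Converted := ConvertFromUtf32($20086);

  Writeln(Format('High: $%x  Low: $%x', [Ord(Manual[1]), Ord(Manual[2])]));
  Writeln('Same as ConvertFromUtf32: ', Manual = Converted);
  Writeln('Length in code units: ', Length(Manual));
end.
```

For $20086 the two halves work out to the $D840 and $DC86 used above.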

Q: How do I determine the number of code points in a string with surrogate pairs, instead of the number of code units?

A: SysUtils.pas has some helper functions for things like this.  In this case, we could do I := ElementToCharLen(myString, Length(myString)); and I would contain the number of code points in the string.
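
If you’d rather see what such a count involves, here is a hand-rolled sketch that walks the string and treats each well-formed surrogate pair as one code point.  CountCodePoints is a hypothetical helper for illustration, not an RTL routine.

```pascal
program CodePointCount;

{$APPTYPE CONSOLE}

uses
  SysUtils, Character;

// Hypothetical helper: counts code points by stepping over both halves
// of every well-formed surrogate pair in one go.
function CountCodePoints(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  I := 1;
  while I <= Length(S) do
  begin
    if (S[I] >= #$D800) and (S[I] <= #$DBFF) and (I < Length(S)) and
       (S[I + 1] >= #$DC00) and (S[I + 1] <= #$DFFF) then
      Inc(I, 2)   // high surrogate followed by low surrogate
    else
      Inc(I);     // ordinary code unit (or an unpaired surrogate)
    Inc(Result);
  end;
end;

var
  S: string;
begin
  S := 'A' + ConvertFromUtf32($20086) + 'B';
  Writeln(Length(S));           // 4 code units
  Writeln(CountCodePoints(S));  // 3 code points
end.
```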

Hopefully this will answer some of the questions that have come up.  If there are things that still aren’t clear, feel free to comment, and I’ll do my best to clear things up.

Preparing for Delphi 2009: Part 3

Posted on September 8th, 2008 in Castalia, CodeGear, Delphi | 1 Comment »

Last week, I blogged about the changes to the Delphi string type and how it might affect some of your memory management code.  Those were just warmups for today.

Today I’m writing about TStream and Delphi 2009.  Reading and writing streams was the biggest source of issues in porting my code to Delphi 2009.  I suspect that it will be the same for most of you.  Hopefully today’s post will help you prepare your code ahead of time.

Again, the root of the problem lies in the fact that most of the time when we write code to read or write streams, we assume that a Char is one byte, and a string’s length in Chars is the same as its size in bytes.  Since this isn’t true any more, our code that assumes that it is may be broken.

Note that the following is true for ALL stream classes, whether it’s TMemoryStream, TFileStream, or some other TStream implementation.

TStream.Write

TStream.Write expects the number of BYTES to write to the stream, not the number of Chars. The following code is very common, but incorrect:

Stream.Write(Pointer(myString)^, Length(myString));

This code will compile without complaint in Delphi 2009, but it won’t do what you probably wanted it to, which is write the whole string to the stream.  Your first instinct might be to replace Length(myString) with SizeOf(myString) but that won’t work either, since the SizeOf(someString) is always 4 (it’s just a pointer, remember?).  Generally, we should use the Length * SizeOf construct that I disliked a couple days ago:

Stream.Write(Pointer(myString)^, Length(myString) * SizeOf(Char));

I’d like to point out that this code is 100% backwards compatible with prior versions of Delphi.  It behaves correctly in Delphi 5, when SizeOf(Char) was 1, and it behaves correctly in Delphi 2009 when SizeOf(Char) is 2.  This is particularly important in Castalia, which uses the same code base to compile for the last six versions of Delphi.

I said that this solution works generally, but there are instances when you may want to do something else… Specifically, if you want to write in some encoding other than UTF-16.  We’ll leave that for later though (hint: the answer involves a new class called TEncoding).

TStream.Read

Of course, if you change your stream write code, you’ll probably need to change your stream read code. It all depends, of course, on how the stream is written. A typical pattern is to write the length of the string, then the string:

L := Length(myString);

Stream.Write(L, SizeOf(Integer));

Stream.Write(Pointer(myString)^, Length(myString) * SizeOf(Char));

Now, the old way to read this will look like this:

Stream.Read(L, SizeOf(Integer));

SetLength(myString, L);

Stream.Read(pointer(myString)^, L);

But this won’t work, as it will only read L bytes, which is going to be half the string when SizeOf(Char) is 2.  I’m sure by now you’re already a step ahead of me on the solution:

Stream.Read(L, SizeOf(Integer));

SetLength(myString, L);

Stream.Read(pointer(myString)^, L * SizeOf(Char));

The first two lines were fine – reading an Integer isn’t affected by the change in the size of a Char, and SetLength takes the number of Char elements in the string, just as it always has.  Reading the string from the stream, however, we need to make sure we’re telling the stream object how many BYTES to read, not how many CHARS.

Once again, the general rule holds: If the routine deals specifically with strings, it expects the number of CHARS to use.  If it works with general memory buffers, it expects the number of BYTES to use.  The trick here is just translating between CHARS and BYTES in appropriate places.
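
Putting the write and read halves together, here is a sketch of a round trip through a TMemoryStream.  The two helper names are made up for this example, and I’ve used WriteBuffer and ReadBuffer instead of Write and Read so that a short read or write raises an exception rather than passing silently.

```pascal
program StreamStringDemo;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

// Hypothetical helper: writes the length in Chars, then the raw characters.
procedure WriteStringToStream(Stream: TStream; const S: string);
var
  L: Integer;
begin
  L := Length(S);
  Stream.WriteBuffer(L, SizeOf(L));
  if L > 0 then
    Stream.WriteBuffer(Pointer(S)^, L * SizeOf(Char)); // count in BYTES
end;

// Hypothetical helper: reads back what WriteStringToStream wrote.
function ReadStringFromStream(Stream: TStream): string;
var
  L: Integer;
begin
  Stream.ReadBuffer(L, SizeOf(L));
  SetLength(Result, L);                                 // count in CHARS
  if L > 0 then
    Stream.ReadBuffer(Pointer(Result)^, L * SizeOf(Char)); // count in BYTES
end;

var
  Stream: TMemoryStream;
  S: string;
begin
  Stream := TMemoryStream.Create;
  try
    WriteStringToStream(Stream, 'hello');
    Stream.Position := 0;
    S := ReadStringFromStream(Stream);
    Writeln(S);
    Writeln('Bytes in stream: ', Stream.Size);
  finally
    Stream.Free;
  end;
end.
```

Keep in mind that a stream written this way by a pre-2009 build holds one byte per Char, so the same code compiled in Delphi 2009 won’t read old files correctly.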

As I said at the beginning, I’m predicting that stream reading and writing is going to be the single most common cause of issues when porting code to Delphi 2009.  The simplest way to go about it is to do what I’ve noted in this post.  We’ll come back to a couple of other solutions in a few days, but tomorrow I’m going to talk about a couple of Windows API calls that you might be using that will need a bit of work.

Preparing for Delphi 2009: Part 2

Posted on September 6th, 2008 in Castalia, CodeGear, Delphi | No Comments »

Yesterday, I wrote about the new UnicodeString type in Delphi 2009, and the fact that the Char type is now a 2-byte Char instead of a 1-byte Char.  I wrote about how this can affect calls to SizeOf and Length, among other things.

While porting my code to Delphi 2009, I found a few instances where the changes introduced some memory management issues.  Of course, the root cause of the problems was that my code assumed that SizeOf(Char) = 1, which is no longer true.

When that change was made, some decisions had to be made about certain parts of the VCL (or, more correctly, the RTL) and what parameters they would expect.  Most of Delphi’s memory management routines expect parameters in numbers of Bytes, not Chars:

FillChar

A common way to use FillChar has long been FillChar(memoryBuffer, Length(memoryBuffer), 0).  This should fill memoryBuffer with zeroes.  However, if memoryBuffer is an array of Char, this won’t work any more.  It will only fill half of the array, because Length() returns the number of elements in the array, not the size of the array in bytes.  The solution is to use SizeOf() instead of Length():

FillChar(memoryBuffer, SizeOf(memoryBuffer), 0);

Also, note that FillChar fills memoryBuffer with BYTES, not CHARS, even if the buffer is an array of Char.  If your code reads FillChar(memoryBuffer, SizeOf(memoryBuffer), #36), your chars will be $2424, not $24 (AKA $0024) as you might have intended, because #36 is decimal 36, or $24.  To get a UnicodeString full of #36, you’ll need to use StringOfChar().  The analogous code to the above call of FillChar() is as follows (note the - 1, which leaves room for the terminating #0 that StrPCopy appends):

StrPCopy(memoryBuffer, StringOfChar(#36, Length(memoryBuffer) - 1));

Note that StringOfChar() takes the number of Char elements, not the size in bytes (hint for remembering: if the routine is specifically for strings, it probably takes the number of Char elements.  If it’s a generic memory-management routine, it probably takes the number of bytes).
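
Here is a small sketch contrasting the two kinds of fill.  The buffer size is arbitrary, and the plain loop is my own alternative for the case where you want every element set and no terminating #0.

```pascal
program FillCharDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Buffer: array[0..7] of Char;
  I: Integer;
begin
  // Zeroing works byte by byte, so the count must be the size in BYTES.
  FillChar(Buffer, SizeOf(Buffer), 0);

  // Filling with a character is done one Char element at a time, so each
  // element holds the character itself rather than a repeated byte.
  for I := Low(Buffer) to High(Buffer) do
    Buffer[I] := #36;

  Writeln(Format('First element: $%.4x', [Ord(Buffer[0])]));
end.
```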

Move

Move() can have the same problems as FillChar(). If you’re using Move() with Char arrays, Length() won’t work the way it used to:

Move(charArray1, charArray2, Length(charArray1));

Move is a generic memory management routine, so it expects the number of BYTES to move, not the number of Chars.  Here’s the right way:

Move(charArray1, charArray2, SizeOf(charArray1));

Alternatively, you could do Move(charArray1, charArray2, Length(charArray1) * SizeOf(Char)), but I think that’s unnecessary.  Of course, if you feel it’s more readable, it will work just as well.
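
A complete sketch of the Move call, with arbitrary array sizes and contents chosen for the example:

```pascal
program MoveDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  charArray1, charArray2: array[0..15] of Char;
begin
  FillChar(charArray1, SizeOf(charArray1), 0);
  FillChar(charArray2, SizeOf(charArray2), 0);
  StrPCopy(charArray1, 'hello');

  // Move counts BYTES, so SizeOf copies the whole array no matter
  // how wide a Char is.
  Move(charArray1, charArray2, SizeOf(charArray1));

  Writeln(PChar(@charArray2[0]));
  Writeln(StrLen(charArray2));  // 5
end.
```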

Copy

Copy() is related to Move(), though it’s aimed specifically at strings and arrays.  This means that when using Copy(), you should pass in the INDICES of the elements (probably CHAR elements, if they’re potentially strings), and NOT the byte offset of the elements you want to copy.  This isn’t as likely to be an issue, but it did bite me once while porting Castalia to Delphi 2009.
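
A quick sketch of Copy() with Char indices, using a string I picked for the example:

```pascal
program CopyDemo;

{$APPTYPE CONSOLE}

var
  S, Year: string;
begin
  S := 'Delphi 2009';

  // Both the index (8) and the count (4) are measured in Char elements.
  // Doubling them to account for 2-byte Chars would be a mistake.
  Year := Copy(S, 8, 4);
  Writeln(Year);  // 2009
end.
```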

I hope that’s enough for one day.  On Monday, we’ll look at how TStream descendants might present some problems.

Preparing for Delphi 2009: Part 1

Posted on September 5th, 2008 in Castalia, CodeGear, Delphi | 7 Comments »

As Delphi 2009, with its big unicode changes, is soon to be upon us (It was announced and made available for sale on Aug. 25), I think it’s a good idea to talk about some of the issues that might come up.

My experience with Delphi 2009 (which I’ve been field testing) is that the vast majority of code will actually behave just the way it did before, with no changes. The only trouble will come with code that assumes that the size of a Char is 1 byte, or that the length of a string is equal to its size in bytes. Now that a Char is a 2-byte data type, and the length of a string is different than the number of bytes it takes up, we have to re-examine some old code and change some old habits.

Over the next few days, I’ll look at a few of the things that I encountered in porting my code, which I expect will be the same issues most people will face.  Here goes…

string, WideString and AnsiString

The under-the-hood changes to the string type are interesting, but aren’t really relevant here, except to say that WideString is more or less deprecated.  The new string type UnicodeString sort of replaces WideString, with a lot more capability (code that uses WideString will still compile and run just fine).

String is mapped to UnicodeString, so all of your code that uses the string type will automatically be unicode, where before it was ASCII (ok, it wasn’t necessarily ASCII, because you could use codepages and things, but it WAS restricted to 8-bit characters).  Char is mapped to WideChar, so a Char is now 16 bits by default, instead of 8. 99.9% of the time, this is what you’ll want.

However, for the rare times when you’ll specifically want 8-bit characters, the types AnsiChar and AnsiString still behave like “old” Delphi strings, with 8-bit characters.  AnsiChar and AnsiString are assignment-compatible with UnicodeString and WideChar, so you can do myUnicodeString := myAnsiString and it will just work.  Keep in mind though, if you assign a UnicodeString to an AnsiString, there is potential data loss as 16-bit characters are “compressed” into 8 bits.
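
Here is a sketch of that round trip.  The sample character ($4E2D) is my own choice, and whether it survives depends on the system’s ANSI code page, so the program reports the result rather than assuming it.

```pascal
program AnsiLossDemo;

{$APPTYPE CONSOLE}

var
  Original, Restored: UnicodeString;
  Narrow: AnsiString;
begin
  // 'A' fits in 8 bits everywhere; $4E2D does not on most Western systems.
  Original := 'A';
  Original := Original + WideChar($4E2D);

  // The explicit casts make the potentially lossy conversion visible
  // in the source.
  Narrow := AnsiString(Original);
  Restored := UnicodeString(Narrow);

  Writeln('Chars before: ', Length(Original));
  Writeln('Ansi length:  ', Length(Narrow));
  Writeln('Round trip preserved the text: ', Restored = Original);
end.
```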

We’re going to ignore AnsiChar and AnsiString for most of this discussion.  The times when you’ll need them will be fairly obvious, and most of the time you should use the standard string and Char types, which are unicode.

Length and SizeOf

Any call to Length(string) is a potential problem.  The Length of a string is no longer its size in bytes, but rather the number of printable characters in the string.  The size in bytes is best determined by the expression Length(myString) * SizeOf(Char).

With Char arrays (often encountered when using Windows API calls directly), Length(charArray) will return the number of Char elements in the array, which again is not its size in bytes.  If you want the size in bytes, call SizeOf(charArray).  When using API calls like FormatMessage, GetClassName, GetWindowText, etc., which take a Char buffer and its size, make sure you’re passing the right size – be it the size in bytes or the length.  MOST Windows API calls want the number of chars in the array, so you should pass Length(charArray), not SizeOf(charArray).

If you have a null-terminated Char array, you can get the number of printable characters in the array with StrLen(charArray).  For example, if you declare charArray: array[0..99] of Char, and assign charArray := 'hello', you’ll find that Length(charArray) returns 100, SizeOf(charArray) returns 200, and StrLen(charArray) returns 5.
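
The same example as a runnable sketch, assuming Delphi 2009 where SizeOf(Char) is 2.  I’ve used StrPCopy to fill the array here.

```pascal
program LengthDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  charArray: array[0..99] of Char;
begin
  StrPCopy(charArray, 'hello');

  Writeln(Length(charArray));  // 100: number of Char elements
  Writeln(SizeOf(charArray));  // 200: size in bytes when SizeOf(Char) = 2
  Writeln(StrLen(charArray));  // 5: characters before the terminating #0
end.
```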

Oh, and one more note… SizeOf with strings isn’t very helpful.  SizeOf(myString) is always going to be 4, because myString is just a pointer.

That’s enough for today.  Tomorrow, I’ll talk about some possible memory management issues you might encounter, and how to easily fix them.