ActiveX Components and Arrays
Paul Milenkovic
Copyright 2002
Scalar data values can be
passed in and out of COM and ActiveX components through function calls defined
on the interfaces to these components.
Arrays can be passed as well, but the choice of the computer language
using the component as well as the language used to implement the component
places restrictions on how arrays are handled.
One could skirt the issue by calling a COM interface to pass one scalar
element at a time, but passing an entire array in a single function call gives
a significant boost to runtime efficiency.
The C and C++ languages
treat an array variable as a pointer to the initial element in a list of
elements. Even if an array variable is
passed by value, what gets copied to the stack frame used to invoke a
function is that pointer, not the elements, so the function can both read and
write elements of the caller’s array using that pointer value.
COM and ActiveX are
protocols for v-table object interfaces for invoking functions of software
components across module, process, and even machine boundaries. COM is the interface specification and
ActiveX is a specification for implementing Visual Basic controls using
COM. An ActiveX object is a COM object
at a low level, and I will use the terms ActiveX and COM interchangeably. The C and C++ concept of an array variable
as a pointer to read/write memory is not workable without special handling in a
COM interface. The array variable is a
pointer to what memory? If COM were
used only to cross a module boundary in the same process address space, the
array variable pointer could point to memory shared between modules. COM needs to operate transparently when
crossing process and machine boundaries where memory cannot be shared; memory
must be replicated on each side of the boundary and the contents of memory have
to be physically copied from one side of the boundary to the other. This physical transfer of data across
boundaries is called marshaling.
The ability to marshal data
across boundaries makes COM and ActiveX compatible with the .NET
environment. The .NET environment is a
gated community where the world is divided between managed code, the byte-code
compiled, type checked, array-bounds checked, garbage-collected, and
neighborhood-association regulated environment inside the gates and unmanaged
code, the unregulated chaos of cars up on cement blocks in the front yard believed
to exist outside the gates. COM and
ActiveX components are content to stay outside and pass data through the gates
in a manner designated by the security guard, and as such, are considered good service
providers to a gated community.
COM interfaces can be
accessed from more than one computer language.
While COM interfaces can be defined in the computer language used to
write the underlying component, it is convenient to describe these interfaces
in a common interface description language (IDL) and to compile that IDL to a
type library (TLB). Tools associated
with specific computer languages such as Visual Basic, Visual C++, and Delphi
generate IDL automatically as part of the process of specifying COM interfaces
in that language; it is helpful to know about IDL to have control over this
process. The IDL not only specifies the
variables making up function interfaces, it also specifies the marshaling. A function variable can be decorated with
the [in], [out], [in, out], or [out, retval] attributes. The keywords in and out mean what they say –
data needs to be marshaled (copied) in, out, or first in and then out of an
interface boundary to write, read, or write/read data to that interface.
The [out, retval] attribute
specifies an extra return value for a COM interface function that returns an
HRESULT status code. While not strictly
required, it is recommended that a COM function should return a status code,
especially if it is marshaled between machines over a network where a transport
failure is a common occurrence, and such a function needs a way to designate a “real”
return value in addition to the status code.
Computer languages such as Delphi, Visual Basic, and C# have a way of hiding
the status code and making the [out,retval] parameter the return value seen in the
normal function syntax. The HRESULT
status code is set and checked behind the scenes using exception handling. The Delphi language specifies this approach
using the safecall attribute; this approach is standard in Visual Basic; the C#
language uses this approach by default and disables this style of HRESULT
handling using the [PreserveSig] attribute.
In limited circumstances,
it is possible to write IDL to specify a C-style array along with marshaling
attributes to specify how the array data gets copied when crossing the
interface boundary. Even though an
array could in theory be referenced with a pointer for an in-process COM
control, the protocol must work with situations where the array must be copied;
a .NET application interacting with a COM or ActiveX control is such a
situation. We can pass an array of
single-precision floating-point numbers into an interface with the IDL
statement
HRESULT _stdcall CopyIn([in, size_is=1] float* Buf, [in] long nBuf);
We can return an array of
floating-point numbers with the IDL statement
HRESULT _stdcall CopyOut([out, size_is=1, length_is=2] float* Buf,
[in] long nBuf, [out] long* nCopy);
The first statement states
that an array is to be copied in to the interface, and variable nBuf, a 32-bit
signed integer, which is argument 1 (starting at 0), contains the number of
array elements. The second statement
states that the array is to be copied out of the interface. The size_is parameter tells us that variable
nBuf contains the number of elements of the buffer while length_is tells us
that variable nCopy, a return value, contains the actual number of elements to
copy. You see, a marshaling buffer must
be allocated when the function is called yet before the function returns any
values, otherwise the return values have no place to go. The value of nCopy can specify a number of
elements less than nBuf to actually go ahead and copy across the marshaled
interface, but that value is only known after the function returns. We are stuck with allocating a buffer of
nBuf elements, but we are only required to copy nCopy elements, providing some
efficiency gain. We also avoid [in,
out] unless we really need to pass data in both directions – we do this for
efficiency.
“Limited circumstances” is the catch: this
is probably how you want to handle arrays, but in most circumstances this
seemingly sensible way of passing arrays is not allowed. One circumstance is Visual Basic 6.0, the
last Visual Basic before .NET. What is
the point of developing an ActiveX control in the first place if you can’t use
it with Visual Basic 6.0? Visual Basic
6.0 won’t support such arrays. C# .NET
will support this kind of array, and developing an ActiveX control for use with
C# is not entirely without merit – it may be an effective way of using your
existing unmanaged code base from the managed .NET environment without
extensive rewriting. Most tools for
compiling IDL into a type library, however, will disregard size_is and
length_is. Without giving any error
message, these tools will in a most frustrating fashion simply disregard those
attributes, and C# code linking to your ActiveX through a type library will
only pass length 1 arrays. You can get
around this restriction by writing out interfaces with IDL-like attribute decorations
in C# by hand, but that defeats the multi-language nature of having a type
library in the first place.
There are two approaches to
this problem. One is to curse Microsoft
for designing in all these restrictions and coming up with half-measures. The other approach is to try to figure out
how they meant their stuff to be used and live with it. And how things are meant to be is largely
controlled by Visual Basic.
The plain truth is that
Visual Basic wants an array as a Windows safe array data type packaged inside a
Windows variant data type. Don’t
believe anything anyone else tells you.
An array is a variant that references a safe array. There is no other way to do it. Can you pass Visual Basic a safe array? Nope.
You will be misled by Web sites and books telling you that Visual Basic
uses safe arrays and showing you safe-array IDL. Your array needs to be in two layers of packaging – variant
containing a safe array containing the actual array – and you need to accept
this.
A variant is a 16-byte data
structure containing either a scalar data element or a pointer to a composite
data element along with a tag saying what the data element is. A variant is the fundamental runtime-typed
data element in Visual Basic; Delphi Pascal also supports variants as an
intrinsic data type. A safe array is
another data structure, one containing a reference to an array along with data
fields for array bounds. A collection
of Windows functions set those array bounds, allocate and free memory, set the
array reference, and lock and unlock the array reference for low-level access
to the array data. In Visual Basic,
those Windows functions are invoked behind the scenes. Most high-level languages that allow for
runtime typing – Lisp, MATLAB, Visual Basic, and scripting languages – have
such types that require special handling when exposed to C++ but are fairly
transparent when used in the high-level language.
You can go ahead and
declare a typed array as
Dim my_array() As Single
That array is implicitly
packaged in a variant. You will find it
to be type-compatible with variant function arguments, and the IDL needs to
specify a variant or a variant pointer to take such an array.
Potential ways to write the
IDL to pass a Visual Basic array include
HRESULT _stdcall CopyIn([in] VARIANT Buf);
HRESULT _stdcall CopyOut([out] VARIANT Buf);
HRESULT _stdcall CopyOut([out, retval] VARIANT* Buf);
Type VARIANT specifies a
variant, not to be confused with VARIANT_BOOL, which is simply a 16-bit Visual
Basic-compatible Boolean scalar variable.
The first IDL statement is the way to pass an array from Visual Basic in
to a COM component – Visual Basic takes care of allocating and freeing the
memory for the array, so the COM component receiving the array does not have to
do anything with memory. The second IDL
statement is how one could pass data out of a COM component back into a Visual
Basic array – if it were allowed! The
second statement is legal IDL: the variant is passed by value, but it contains
a reference to the array data, and the [out] parameter instructs COM to copy
array data from the COM component back out to Visual Basic – COM knows about
data structures such as variant and safe array and does the correct
action. Trouble is that Visual Basic
won’t work this way, and if Delphi is used to develop the component, the Delphi
Type Library Editor used to generate the IDL to describe COM interfaces won’t
let you do it either. The Delphi Type
Library Editor, however, will allow IDL that Visual Basic won’t recognize – it
is a strange mixture of restrictions that are mindful of Visual Basic but are
not completely conforming to Visual Basic.
The third statement is the
only way permitted by Visual Basic to pull an array from the COM component back
into Visual Basic – the Visual Basic function call generated from the type
library has the variant containing the array as its return value and the HRESULT
status code is hidden from the Visual Basic syntax. Visual Basic does not permit an [out] without the [out, retval]
in this instance. The length of the
array is returned along with the array as a field of the safe array structure;
this is a good thing because a function is allowed only one [out,retval]. The COM object has to allocate a fresh array
for each function call and return a reference to a variant containing that
array to Visual Basic. Both Delphi and
Visual Basic are high-level languages with respect to variants – once allocated
and assigned to a variant, the memory for a safe array is managed behind the
scenes.
It is possible to push an
array from an ActiveX component back to Visual Basic by way of an ActiveX
event, a callback from the component back to Visual Basic. For example, an ActiveX component gets a
request to return a list of waveform samples from a waveform display or from an
A/D, and it returns an array of samples through the event. The caller is the Delphi ActiveX object, the
called module is the Visual Basic form (the ActiveX container), and the
direction of data flow is from the Delphi object to the Visual Basic program.
The Delphi Type Library
Editor is used to add an OnTransmit() method to the Events object of an ActiveX
control wrapper for a Delphi VCL component – ActiveX controls have to be
developed first as VCL components as Delphi cannot create them from
scratch. The Text tab of the Type
Library Editor display for the OnTransmit() method shows
[
id(0x00000007)
]
HRESULT OnTransmit([in] VARIANT wave_samp, [in] long nSamp, [in] VARIANT_BOOL end_of_src );
I wrote a new VCL component
called TWaveMonitor, which is exported as an ActiveX control by wrapper class
TWaveMonitorX. The bulk of that wrapper
is written automatically by the wizard dialog that generates the ActiveX
control; wrapper methods that don’t get generated automatically need to be
added by hand. A good way for automatic
method generation to ignore something is to use a non-Automation type
(non-ActiveX type); you need to add an ActiveX event with a proper
Automation-compatible type using the Type Library Editor as in the above
example, and you need to write code to transcribe the non-Automation to the
Automation-compatible type.
The following method of
TWaveMonitorX receives the OnTransmit event from the VCL component TWaveMonitor
and forwards it to the IWaveMonitorXEvents interface in the Visual Basic
program receiving the event. I don’t
mean to get into the mechanics of Delphi events and ActiveX events and how
events get transmitted and forwarded – I want to show how arrays are allocated
and passed.
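procedure TWaveMonitorX.TransmitEvent(Sender: TObject;
  var wave_samp: array of Single; nsamp: Integer; end_of_src: Boolean);
type
  // Nominal bounds only: the array is indexed past 1 through a pointer,
  // so this relies on range checking being off ({$R-}).
  Single_listT = array [0 .. 1] of Single;
var
  wave_sampV: Variant;
  wave_sampP: ^Single_listT;
  i: Integer;
begin
  // Allocate a safe array of nsamp Singles inside the variant.
  wave_sampV := VarArrayCreate([0, nsamp-1], varSingle);
  // Lock the safe array to get a raw pointer to its element storage.
  wave_sampP := VarArrayLock(wave_sampV);
  try
    for i := 0 to nsamp-1 do
      wave_sampP[i] := wave_samp[i];
  finally
    // Release the lock; wave_sampP must not be used after this point.
    VarArrayUnlock(wave_sampV);
  end;
  if FEvents <> nil then FEvents.OnTransmit(wave_sampV, nsamp, end_of_src);
end;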
The declaration var
wave_samp: array of Single is an array style internal to Delphi while the
variant variable wave_sampV will contain an array in a style exportable with
ActiveX to Visual Basic. The function
call VarArrayCreate([0,nsamp-1],varSingle) dynamically allocates a safe array
inside a variant of nsamp elements with bounds ranging from 0 to nsamp-1, all
in one step. The array elements are
specified as Single (single-precision floating point -- float in C). The function VarArrayLock exposes a raw
pointer to the actual array data while VarArrayUnlock releases that lock,
invalidating the pointer. In playing
fast and loose with the Pascal type checking mechanism by assigning that
pointer to a generic pointer-to-array-of-Single type, we can reference that
array with the syntax wave_sampP[] (which is a Delphi shorthand for
wave_sampP^[] in more standard Pascal – Delphi adopts the C convention of
ambiguity between array pointers and array variables). The for loop copies array elements from the
Pascal-style array to the array-element memory of the safe array contained
inside the variant. The method call
FEvents.OnTransmit() forwards the repackaged array to the ActiveX event
interface. While copying array elements
costs some computer cycles, copying in this fashion is much more efficient than
having to invoke an ActiveX interface method call for each array element we
want to bring across.
We are dynamically
allocating a safe array with each event that transmits a buffer of
samples. If the array size is static,
we could allocate that safe array once and reuse it, but we still need some way
to free that dynamically allocated array.
From what I gather, Delphi is working as a high-level language where the
array is freed behind the scenes.
Information on the precise details is hard to come by, but my guess is
that the safe array in wave_sampV is automatically freed by the Delphi runtime
library when local variable wave_sampV goes out of scope. The call to FEvents.OnTransmit happens
before wave_sampV goes out of scope and destroys the array. If you are concerned, you could invoke
VarClear(wave_sampV) after you are done using wave_sampV. What Visual Basic decides to do with the
array is its own business, but it is fair to assume that wave_sampV is only
valid within the scope of FEvents.OnTransmit() and if you assign wave_sampV to
another variant variable in Visual Basic, it allocates a fresh safe array and
copies array elements, and applies similar rules to free that copy.
So the memory allocation is
explicit but the de-allocation should take place behind the scenes. One way to check is to run the Visual Basic
app under Windows 2000 and to press Ctrl-Alt-Del to activate the task manager
and use it to monitor memory consumption.
I did just that and found out the memory allocation kept growing and
growing whenever I generated OnTransmit() events – it leaked memory!
More Google searching
turned up the fact that Delphi 6 Update 2 contained a patch to the IDE and the
runtime library (RTL) meant to fix some unspecified problems with
variants. Guess what, I installed the patch
and the memory leak went away. It is
rare that one’s application bug is a compiler, runtime environment, or
operating system problem. With Windows, it is usually parameters passed from
your application that cause Windows to bomb where your debugger
won’t trace: in a legalistic sense a bug in your application and not in
Windows, though the brittleness of Windows to application faults is known to be
the source of much wailing and gnashing of teeth. Here, it was an actual bug in the development tools.
In conclusion, the proper
way for pushing ([in]) or pulling ([out,retval]) arrays between Visual Basic
and a COM or ActiveX component implemented in Delphi is to use a variant
containing a safe array. An array
memory allocation is required for each [out,retval] transfer, and one just
lives with it. The allocation,
assignment, and freeing of such arrays is implicit in the Visual Basic
syntax. While somewhat less automatic,
Delphi provides a variant data type, the VarArrayCreate, VarArrayLock, and
VarArrayUnlock functions for allocating and accessing a safe array contained in
a variant, and it provides for automatic freeing of the safe array contained in
a variant if one installs all of the Delphi bug updates.
Borland has tweaked the
Delphi language to work with Visual Basic, especially in the area of the
variant data type. C++ has to follow a
standard, and vendors are not allowed to extend C++ at will to deal with such
issues. Microsoft Visual C++ fires the
full artillery barrage of inheritance classes, template classes, and macros to
try and work with COM, and when the shelling stops, I still don’t know if there
is some streamlined C++ way of working with Visual Basic arrays apart from
pulling the safe array out of the variant and then using the safe array API
functions. Delphi is somewhat higher-level
in that it has the VarArray functions for operating directly on the variant
variable.
I haven’t completely
chronicled all the blind alleys in getting to this point. I am completely amazed that something as
basic as transferring an array can be such a tangled mess of capabilities and
poorly documented restrictions of those capabilities. As I mentioned earlier, all dynamic-typed languages maintain data
structures containing both data and data-type tags internally, Visual Basic
being no exception, and if you write interfaces to C++, those internal data structures
need to be exposed. There really is
only one way of passing arrays from Visual Basic to COM, but the COM
specification tantalizingly offers other ways of passing arrays one would like
to use, and the process of discovering the things that don’t work is
frustrating. It is quite simple: there
is no Southern Continent, only you have spent the last few months freezing your
backside in a converted coal ship with a tyrant captain finding that out.
Even Microsoft has thrown
in the towel in coming out with .NET and its managed runtime environment. In .NET, an array is an object type
recognized by the runtime system and is the same in all the .NET
languages. You can pass an array by
value and in effect receive a reference that allows both reading and writing
the array elements – functionally the same as a C++ array. The one reason not to completely abandon COM
is that it is an effective way of connecting .NET to all of the unmanaged
Windows code one has written over the years, and for that reason, I expect COM
and ActiveX to be around for quite a while.