COM in a Nutshell
Introduction
This is where I sum up what I (think I) understood about Microsoft's Component
Object Model.
Notes from Appleman's "Developing ActiveX Components with VB"
- GUIDs are 16-byte numeric identifiers, usually written in hexadecimal, used
to uniquely identify items. They go by different names like CLSID, IID, LIBID,
etc., depending on the type of item the GUID is used for
- An ActiveX object is uniquely identified with a CLSID ("class
ID"), but can also be handled through a text-based ProgID ("programming
ID", which, ideally, should be unique, but doesn't have to be)
- To be found, an object must be registered. Visual Basic DLLs contain
the exported functions DllRegisterServer and DllUnregisterServer, which
can be called to perform the registration (or clear the registration) for
the objects supported by the DLL. These functions are called by the command-line
utility regsvr32.exe or the installation program as needed. This results
in a new entry being created in the Registry, under HKEY_CLASSES_ROOT\CLSID
- An object offers one or more interfaces, where an interface is
a set of functions that an object exposes
- All COM objects implement at least one interface, IUnknown, which
exposes three methods: AddRef(), Release(), and QueryInterface().
QueryInterface() is used to ask a COM object whether it supports a particular
interface; if it does, QueryInterface() returns a pointer to that interface
- An interface is given a human-readable name in source code (and, by
convention, starts with an I for "interface"), but Windows actually
uses IIDs ("interface IDs") to locate a particular interface
within an object
- The list of interfaces registered on a system can be found in the
Registry under HKEY_CLASSES_ROOT\Interface
- Once you have released an interface, you cannot change it to add
functions, since that would break all applications that already use it (an
interface is a contract between an object and a calling application). Instead,
you must create a new interface. Newer applications, which are aware of
this new interface, will be able to take advantage of the new features,
while older applications will ignore this new interface and keep calling
the older interface
- If an object handles data, that data can only be accessed indirectly through an
interface, i.e. there's no such thing as a Public data member in COM objects
- Should an object need data persistence, it can use OLE structured
storage, where storages are the equivalent of directories, and streams
are similar to files
- Marshaling is the process of collecting the necessary information
for an EXE to call a COM object that lives outside of the memory space of
the calling EXE, since 32-bit versions of Windows keep the two processes
separate
- Marshaling isn't required if the COM server lives in the same address
space as the calling EXE, but is required if the COM server is itself an
EXE, and, thus, lives in a different address space. The COM EXE can run
on either the same computer as the caller EXE, or on a remote computer,
in which case the technology called Distributed Component Object Model (DCOM)
is used
- Interfaces are described in a type library, which is registered
under HKEY_CLASSES_ROOT\TypeLib
- For each object, type libraries provide the necessary information for
marshaling to occur, i.e. they list all the interfaces, and within each interface,
all the functions and their format. A type library includes the CLSID of
the object whose interfaces it describes. The type library GUID is compiled
into the caller EXE, letting it locate the actual object by looking up its
CLSID through the type library
- Objects can be accessed in two modes: early bound (the EXE can
compile direct function calls by looking up the address of the method
in the vtable, an array of pointers to functions), or late bound (the
EXE goes through the IDispatch interface to locate functions). Early binding
is recommended for performance reasons, since calls are made directly to
the specific interface, while with late binding, calls are made through the
IDispatch interface, which can determine the methods and properties
of the object at runtime
- An interface that does double duty as both an IDispatch Interface and
a standard interface is called a dual interface. It gives applications a
choice. When an application uses the direct interface for an object, it
is called early binding
- In Visual Basic you can implement early binding by adding a reference
to an object via the references dialog box, then declaring a variable using
the specific object type instead of As Object. This can substantially reduce
the time it takes to access methods and properties in the object
- Late binding occurs in Visual Basic when you dimension an object variable
to be As Object. An object variable can hold any type of object. Since the
variable can reference any object and must support whatever methods or properties
that object may implement, it clearly can have no way of knowing until runtime
what those methods and properties may be. Without late binding and the IDispatch
Interface, the As Object type of variable would not be possible
- OLE Automation ("Automation" for short) is the late-bound
way of calling an object, and thus relies on the object implementing
the IDispatch interface
- The IDispatch interface consists of the following functions: GetTypeInfoCount(),
GetTypeInfo(), GetIDsOfNames(), Invoke()
- Each method or property supported by an interface has a dispatch ID
(DISPID), a number that identifies that method or property. Thus, while each method
has its own dispatch ID, the function to set a property and the function
to retrieve that same property will share the same dispatch ID
- In VB, each interface has a corresponding IDispatch interface. This
is called a dual interface, something that COM objects built with
VB have, but that is not necessarily available in COM objects built with other
tools, like VC++, i.e. some objects only offer the IDispatch interface, making
them late-bound only
- When an object is created in memory, COM returns a pointer to its IUnknown
interface, whose QueryInterface() function VB then uses to locate
a given interface. All objects include the IUnknown interface, while only
objects capable of late-binding support the IDispatch interface.
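The IUnknown notes above (interfaces located by IID, QueryInterface(), AddRef() and Release()) can be pictured with a small model. This is only a sketch in Python, not real COM: the Greeter class, its Greet() method and IID_IGreeter are made up for illustration, and QueryInterface() returns a (result, interface) pair here instead of filling in a pointer argument as the real function does. IID_IUnknown and E_NOINTERFACE carry their real values.

```python
import uuid

# IID_IUnknown is the real, well-known IID; IID_IGreeter is hypothetical.
IID_IUnknown = uuid.UUID("00000000-0000-0000-C000-000000000046")
IID_IGreeter = uuid.UUID("11111111-2222-3333-4444-555555555555")

S_OK = 0
E_NOINTERFACE = 0x80004002  # real HRESULT for "interface not supported"


class Greeter:
    """Toy COM-like object: reference counted, interfaces looked up by IID."""

    def __init__(self):
        self._refs = 1          # the creator holds the first reference
        self.destroyed = False
        self._supported = {IID_IUnknown, IID_IGreeter}

    def AddRef(self):
        self._refs += 1
        return self._refs

    def Release(self):
        self._refs -= 1
        if self._refs == 0:
            self.destroyed = True   # a real object would free itself here
        return self._refs

    def QueryInterface(self, iid):
        """Return (hresult, interface); a successful query adds a reference."""
        if iid in self._supported:
            self.AddRef()
            return S_OK, self
        return E_NOINTERFACE, None

    def Greet(self, name):          # only method of the hypothetical IGreeter
        return "Hello, " + name


obj = Greeter()
hr, greeter = obj.QueryInterface(IID_IGreeter)
if hr == S_OK:
    print(greeter.Greet("COM"))
    greeter.Release()   # drop the reference QueryInterface() added
obj.Release()           # drop the creator's reference; the object is now gone
```

The point of the model is that the caller never names a class member directly: it asks for an interface by its 16-byte IID, and it balances every successful QueryInterface() or AddRef() with a Release().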
Stuff
- CLSID = Class ID = Object a.k.a. component class (used in CoCreateInstance());
A DLL or EXE can have multiple classes, i.e. multiple CLSIDs pointing to
the same .DLL/.EXE
- IDL = Interface Definition Language (MIDL.EXE)
- IClassFactory = Class factory = class object
- DISPID = Dispatch ID = number identifying a method or property (used in Invoke())
- Three interfaces = vtable (custom), dispatch (Automation), and dual
(?)
- OLE is based on COM; About 150 API's and 120 interfaces
- HRESULT can be a positive or a negative number
- Binary compatibility
- DLL Hell
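To make the dispatch (Automation) kind of interface more concrete, here is a rough model of what a late-bound caller does: it first turns a member name into a dispatch ID, then calls Invoke() with that number. This is a Python sketch under assumptions, not the real IDispatch: the Counter class, its member names and its dispatch IDs are invented, and the real GetIDsOfNames() and Invoke() take more arguments and return HRESULTs. The three DISPATCH_* flag values are the real ones.

```python
# Real flag values passed to IDispatch::Invoke.
DISPATCH_METHOD = 0x1
DISPATCH_PROPERTYGET = 0x2
DISPATCH_PROPERTYPUT = 0x4


class Counter:
    """Toy late-bound object; member names and dispatch IDs are made up."""

    _dispids = {"Value": 1, "Increment": 2}

    def __init__(self):
        self._value = 0

    def GetIDsOfNames(self, name):
        # Map a member name to its dispatch ID (KeyError if unknown).
        return self._dispids[name]

    def Invoke(self, dispid, flags, *args):
        # Property get and property put share dispatch ID 1;
        # the flag tells them apart.
        if dispid == 1 and flags == DISPATCH_PROPERTYGET:
            return self._value
        if dispid == 1 and flags == DISPATCH_PROPERTYPUT:
            self._value = args[0]
            return None
        if dispid == 2 and flags == DISPATCH_METHOD:
            self._value += 1
            return self._value
        raise LookupError("member not found")


def late_bound_call(obj, name, flags, *args):
    """Resolve the name at run time, then invoke by number."""
    return obj.Invoke(obj.GetIDsOfNames(name), flags, *args)


c = Counter()
late_bound_call(c, "Value", DISPATCH_PROPERTYPUT, 41)
late_bound_call(c, "Increment", DISPATCH_METHOD)
print(late_bound_call(c, "Value", DISPATCH_PROPERTYGET))  # prints 42
```

The extra name lookup on each call is the cost of late binding. An early-bound caller skips it because the position of each function in the vtable is known at compile time.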
Questions
- HResult positive = good news, negative = bad news?
- If multiple ProgIDs in Registry, which CLSID to choose in e.g. CreateObject("project1.userctl")?
First entry?
- In VB, how to create an interface, so as to have an OCX pointing to
the object's CLSID, which contains multiple IID's? Just add a .CTL file
to a project, and voilà! New interface?
- In VB, what does binary compatibility do?
- What is a ProgID? .CTL and .OCX says "MSWinsockLib.Winsock"
while Registry says "VersionIndependentProgID=MSWinsock.Winsock"
- Does a COM object need a type library? Is it stored in an independent
file, or inside the OCX?
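On the HRESULT question in the list above: an HRESULT is a 32-bit value whose top bit marks failure, so when it is read as a signed number, zero and positive values mean success and negative values mean failure. The Python sketch below mirrors what the SUCCEEDED and FAILED macros test. The helper names are mine; the constants are the real values of S_OK, S_FALSE, E_NOINTERFACE and E_FAIL.

```python
S_OK = 0x00000000           # success
S_FALSE = 0x00000001        # success, but "false" or "nothing to do"
E_NOINTERFACE = 0x80004002  # failure: interface not supported
E_FAIL = 0x80004005         # failure: unspecified error


def to_signed32(value):
    """Interpret a 32-bit pattern as the signed number VB would display."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def succeeded(hr):
    """Same test as the SUCCEEDED macro: signed value is zero or positive."""
    return to_signed32(hr) >= 0


def failed(hr):
    """Same test as the FAILED macro: signed value is negative."""
    return to_signed32(hr) < 0


for name, hr in [("S_OK", S_OK), ("S_FALSE", S_FALSE),
                 ("E_NOINTERFACE", E_NOINTERFACE), ("E_FAIL", E_FAIL)]:
    print(name, to_signed32(hr), "ok" if succeeded(hr) else "error")
```

Note that S_FALSE is positive and still counts as success, so comparing a result against S_OK alone is not the same as checking for failure.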
Resources
Tools
- OLEView
("This administration and testing tool browses in a structured way,
configures, activates, and tests all Microsoft Component Object Model (COM)
classes installed on your computer.")
Sites
Books