See the question and my original answer on StackOverflow

At its lowest level, COM is really only a binary-level standard that describes how two pieces of software can communicate. It's binary because it's 100% language independent: it does not rely on source code, only on a specific layout of structures in memory.

In my opinion, the best article to start with is The COM Programmer's Cookbook. This famous binary standard is explained at the beginning of that document, which I quote here:

The separation between service user and implementation is done by indirect function calls. A COM interface is nothing more than a named table of function pointers (methods), each of which has documented behavior. The behavior is documented in terms of the interface function's parameters and a model of the state within the object instance. The description of the model within the instance will say no more than is required to make the behavior of the other methods in the interface understandable. The table of functions is referred to as a vtable.

An interface is actually a pointer to a vtable. The vtables are usually shared by multiple instances, so the methods need a different pointer to be able to find the object that the interface is attached to. This is the interface pointer, and the vtable pointer is the only thing that is accessible from it by clients of the interface. By design, this arrangement matches the virtual method-calling convention of C++ classes, so a COM interface is binary-compatible with a C++ abstract class.
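To make the quoted description more concrete, here is a minimal sketch in plain C of an object whose first field is a pointer to a shared table of function pointers. Everything in it (`ICounter`, `ICounterVtbl`, `Counter`, `CallNextTwice`) is a hypothetical name I made up for illustration. It is not a real COM interface: real ones derive from IUnknown, include QueryInterface, return HRESULT and use a specific calling convention on Windows, all of which I leave out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface for illustration only; not a real COM interface. */
typedef struct ICounter ICounter;

/* The vtable: a table of function pointers. Each method receives the
   interface pointer explicitly so it can find the object it belongs to. */
typedef struct ICounterVtbl {
    unsigned long (*AddRef)(ICounter *self);
    unsigned long (*Release)(ICounter *self);
    int (*Next)(ICounter *self);
} ICounterVtbl;

/* What a client sees: memory whose first (and only visible) field
   is a pointer to the vtable. */
struct ICounter {
    const ICounterVtbl *lpVtbl;
};

/* What the implementation sees: the same first field, then private state. */
typedef struct Counter {
    ICounter iface;       /* must stay first so ICounter* and Counter* coincide */
    unsigned long refs;
    int value;
} Counter;

static unsigned long Counter_AddRef(ICounter *self)  { return ++((Counter *)self)->refs; }
static unsigned long Counter_Release(ICounter *self) { return --((Counter *)self)->refs; }
static int Counter_Next(ICounter *self)              { return ++((Counter *)self)->value; }

/* One vtable shared by every Counter instance. */
static const ICounterVtbl g_counterVtbl = {
    Counter_AddRef, Counter_Release, Counter_Next
};

void Counter_Init(Counter *c, int start)
{
    /* The cast from ICounter* back to Counter* relies on this layout. */
    assert(offsetof(Counter, iface) == 0);
    c->iface.lpVtbl = &g_counterVtbl;
    c->refs = 1;
    c->value = start;
}

/* Client code: knows only the interface pointer and the vtable layout. */
int CallNextTwice(ICounter *p)
{
    p->lpVtbl->Next(p);
    return p->lpVtbl->Next(p);
}
```

A client that holds an `ICounter *` never sees `refs` or `value`. It can only go through `p->lpVtbl` and pass `p` back as the first argument, which is the indirect call the quote describes. Two `Counter` objects initialized with `Counter_Init` end up with the same `lpVtbl` value but separate state.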

The diagram that accompanies the quoted passage in the Cookbook shows the standard binary layout in memory: