COM Interop in Mono (part 1)
For the past few months I have been looking into supporting COM Interop in Mono. Not just for Microsoft COM on Windows, but hopefully XPCOM on Windows and Linux, as well as COM ported to Linux by third parties (Mainsoft, for example). I initially tried to do everything in managed code and did come up with a solution, but it was ugly and lacked some of the functionality of MS's implementation. So I took the plunge and began looking into the mono runtime.
COM Interop is a large topic. (The definitive book on COM Interop, .NET and COM: The Complete Interoperability Guide, is a solid 1500+ pages.) So I'll skip any general explanation and instead focus on the implementation in mono. COM Interop is bidirectional. It provides a mechanism to expose unmanaged COM objects to managed clients. The managed wrapper is called a runtime callable wrapper (RCW). It also provides a way to expose managed objects as COM objects to unmanaged clients. The wrapper exposed to unmanaged code is called a COM callable wrapper (CCW). My initial focus has been on the former, that is, on RCWs.
Currently, I am using the same interop assemblies that MS does. An interop assembly is generated from a COM type library, and essentially converts COM type information into metadata. This interop assembly is what is referenced by managed code. This assembly contains managed interfaces that correspond to COM interfaces, as well as managed classes that correspond to COM coclasses. The methods on each interface are in the same order as they are in the COM interface. Thus, we can determine the layout of the COM interface vtable. The usage of COM objects in managed code can be divided into two cases.
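To make the vtable point concrete, here is a minimal sketch in C of how a slot index could be derived from declaration order. It is not code from the mono runtime: InterfaceInfo, vtable_slot and the IGreeter interface are hypothetical names made up for illustration, and the sketch assumes the interface derives directly from IUnknown (an interface deriving from IDispatch would have seven inherited slots rather than three).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Every COM interface begins with the three IUnknown methods:
   QueryInterface, AddRef and Release. */
enum { IUNKNOWN_SLOTS = 3 };

/* Hypothetical stand-in for what the interop assembly's metadata tells us
   about one managed interface: its methods, in declaration order. */
typedef struct {
    const char *name;
    const char *const *methods;
    size_t method_count;
} InterfaceInfo;

/* Return the vtable slot of `method`, or -1 if the interface does not
   declare it. Assumes the interface derives directly from IUnknown. */
static int vtable_slot(const InterfaceInfo *iface, const char *method)
{
    assert(iface != NULL && method != NULL);
    for (size_t i = 0; i < iface->method_count; i++) {
        if (strcmp(iface->methods[i], method) == 0)
            return (int)(IUNKNOWN_SLOTS + i);
    }
    return -1;
}

/* Illustrative interface with two methods, declared in this order. */
static const char *const greeter_methods[] = { "SetName", "Greet" };
static const InterfaceInfo greeter_info = { "IGreeter", greeter_methods, 2 };
```

With this layout, vtable_slot(&greeter_info, "Greet") yields slot 4: three inherited IUnknown slots, then SetName, then Greet.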
The first case is when a class in the interop assembly is created in managed code. When the user creates a class that is marked with the ComImportAttribute (all classes and interfaces in the interop assembly are marked with this attribute), extra space is reserved in the class for a pointer to the unmanaged COM object.
All methods on the RCW are marked as internal calls. When the runtime tries to resolve the internal call in mono, instead of looking it up in the internal call tables, a trampoline (forgive me if I'm using this term wrong) is emitted that will call the method on the underlying COM object. That trampoline first calls a helper method and passes it the MonoMethod and the MonoObject that the call is for. The pointer to the COM object is obtained from the MonoObject (stored in the extra space reserved for the pointer). Next, the interface that the method was defined on is determined. The GuidAttribute on this interface is used to call QueryInterface on the COM object. The returned interface pointer is the correct 'this' pointer for the unmanaged function. Then, the offset of the method in the interface is determined. This offset is used to obtain the function pointer via the vtable of the COM object. That function pointer is called using a stdcall calling convention with the 'this' pointer pushed as the first argument on the stack. Most COM methods return an HRESULT (int) that indicates success/failure. A failing HRESULT will be translated into a managed exception.
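The sequence above (QueryInterface by GUID, look up the slot, call through the vtable with 'this' first, check the HRESULT) can be sketched in plain C against a fake COM object. This is only an illustration under stated assumptions, not the trampoline mono emits: ComInterface, IID_ICalc, calc_object and call_add are hypothetical, the method signature is hard-coded where the real trampoline would handle any signature, and the stdcall annotation is left out so the sketch compiles on any platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t HRESULT;
#define S_OK          ((HRESULT)0)
#define E_NOINTERFACE ((HRESULT)0x80004002)

typedef struct { unsigned char bytes[16]; } Guid;

/* A COM interface pointer points at a pointer to a table of functions. */
typedef void (*Slot)(void);
typedef struct ComInterface { const Slot *vtable; } ComInterface;

typedef HRESULT (*QueryInterfaceFn)(ComInterface *self, const Guid *iid,
                                    ComInterface **out);
typedef HRESULT (*AddFn)(ComInterface *self, int32_t a, int32_t b,
                         int32_t *result);

/* --- A fake COM object, standing in for a real unmanaged one. --- */
static const Guid IID_ICalc = {{0xCA, 0x1C}};   /* made-up interface id */

static HRESULT calc_query_interface(ComInterface *self, const Guid *iid,
                                    ComInterface **out)
{
    if (memcmp(iid, &IID_ICalc, sizeof *iid) != 0) {
        *out = NULL;
        return E_NOINTERFACE;
    }
    *out = self;        /* a real object would also AddRef here */
    return S_OK;
}

static HRESULT calc_add(ComInterface *self, int32_t a, int32_t b,
                        int32_t *result)
{
    (void)self;
    *result = a + b;
    return S_OK;
}

static const Slot calc_vtable[] = {
    (Slot)calc_query_interface,   /* slot 0: QueryInterface */
    NULL, NULL,                   /* slots 1-2: AddRef, Release (omitted) */
    (Slot)calc_add,               /* slot 3: first method of the interface */
};
static ComInterface calc_object = { calc_vtable };

/* --- The steps described in the text, for one fixed signature. --- */
static HRESULT call_add(ComInterface *unknown, const Guid *iid, int slot,
                        int32_t a, int32_t b, int32_t *result)
{
    assert(unknown != NULL && iid != NULL && result != NULL);

    /* 1. QueryInterface for the interface that declares the method. */
    ComInterface *itf = NULL;
    QueryInterfaceFn qi = (QueryInterfaceFn)unknown->vtable[0];
    HRESULT hr = qi(unknown, iid, &itf);
    if (hr < 0)
        return hr;      /* the runtime would raise a managed exception */

    /* 2. Fetch the function pointer from the vtable at the method's slot. */
    AddFn fn = (AddFn)itf->vtable[slot];

    /* 3. Call it with the interface pointer as 'this'. A real caller
          would also Release the pointer QueryInterface handed back. */
    return fn(itf, a, b, result);
}
```

Calling call_add(&calc_object, &IID_ICalc, 3, 2, 5, &r) goes through all three steps and leaves the sum in r, while passing an unknown GUID returns the failing HRESULT that the runtime would turn into an exception.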
The second case occurs when an interface pointer is returned from a method/property, or when an RCW is cast to an interface that is not listed in the metadata. RCWs are special in that a cast can succeed to an interface other than those listed in the class's metadata. The runtime calls QueryInterface on the underlying COM object, and if that call succeeds the runtime allows the cast to occur. At this point, the runtime knows nothing of the COM object's identity except that it supports the given interface. The managed object's type becomes a generic type that wraps COM objects, System.__ComObject.
To handle this in mono, a solution was built on top of the remoting infrastructure. First, a new internal class was defined, System.ComProxy, that derives from System.Runtime.Remoting.Proxies.RealProxy. The constructor for the ComProxy class takes an IntPtr argument, the pointer to the IUnknown interface of the COM object. System.ComProxy also implements System.Runtime.Remoting.IRemotingTypeInfo, which allows for special handling of casting via the CanCastTo method. When a COM interface pointer is returned, a new instance of ComProxy is created for the type System.__ComObject. Then a call to GetTransparentProxy is made. The transparent proxy (tp) object that is returned is cast to the expected managed interface. This in turn causes a call to CanCastTo, which calls QueryInterface on the COM object to determine whether the target interface is supported. If the interface is supported, the remoting infrastructure dynamically adds the interface and its methods to the proxy's vtable. Any method calls on this interface are now handled as in the previous case.
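The cast check that CanCastTo performs boils down to "does QueryInterface succeed for this interface's GUID?". Below is a rough C sketch of just that check against a fake object. It is not the ComProxy code: FakeUnknown, the three IIDs and can_cast_to are hypothetical names, and the reference counting is simplified to a plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef int32_t HRESULT;
#define S_OK          ((HRESULT)0)
#define E_NOINTERFACE ((HRESULT)0x80004002)

typedef struct { unsigned char bytes[16]; } Guid;

typedef struct FakeUnknown FakeUnknown;
typedef struct {
    HRESULT  (*QueryInterface)(FakeUnknown *self, const Guid *iid, void **out);
    uint32_t (*AddRef)(FakeUnknown *self);
    uint32_t (*Release)(FakeUnknown *self);
} FakeUnknownVtbl;
struct FakeUnknown { const FakeUnknownVtbl *vtbl; uint32_t refs; };

/* Made-up interface ids: the fake object supports the first two only. */
static const Guid IID_IShape       = {{0x01}};
static const Guid IID_IPrintable   = {{0x02}};
static const Guid IID_IUnsupported = {{0x03}};

static uint32_t fake_add_ref(FakeUnknown *self) { return ++self->refs; }
static uint32_t fake_release(FakeUnknown *self) { return --self->refs; }

static HRESULT fake_query_interface(FakeUnknown *self, const Guid *iid,
                                    void **out)
{
    if (memcmp(iid, &IID_IShape, sizeof *iid) == 0 ||
        memcmp(iid, &IID_IPrintable, sizeof *iid) == 0) {
        *out = self;
        fake_add_ref(self);   /* QueryInterface returns an owned reference */
        return S_OK;
    }
    *out = NULL;
    return E_NOINTERFACE;
}

static const FakeUnknownVtbl fake_vtbl = {
    fake_query_interface, fake_add_ref, fake_release
};
static FakeUnknown fake_object = { &fake_vtbl, 1 };

/* The question a CanCastTo-style check has to answer: is the target
   interface supported by the underlying COM object? */
static bool can_cast_to(FakeUnknown *unknown, const Guid *iid)
{
    assert(unknown != NULL && iid != NULL);
    void *itf = NULL;
    HRESULT hr = unknown->vtbl->QueryInterface(unknown, iid, &itf);
    if (hr < 0)
        return false;
    /* Only a yes/no answer is needed here, so give the reference back. */
    ((FakeUnknown *)itf)->vtbl->Release((FakeUnknown *)itf);
    return true;
}
```

Here can_cast_to(&fake_object, &IID_IShape) is true and can_cast_to(&fake_object, &IID_IUnsupported) is false, and the object's reference count is unchanged afterwards because each successful QueryInterface is balanced by a Release.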
And there it is. Too much information for the casual reader, and not enough for those of you who care. Code will hopefully follow shortly, as I have time.


2 Comments:
Could you give us an example (like a hello world) of using COM in Linux.
Yes, that would be helpful.