The performance of my implementation seems about as good as can be. There is no unnecessary copying of memory and no data manipulation more advanced than quick table lookups, and with the Host Controller Interface and the Link Manager in one symbol we avoid the extra traffic between them and the need for memory buffers in both stages.
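To illustrate what table-lookup dispatch without copying can look like, here is a minimal sketch in C. It is not taken from the implementation itself: the names `pdu_handler_t`, `handler_table` and `dispatch_pdu`, the two handlers and the opcode values are all hypothetical, and the layout (opcode in the first byte) is an assumption made only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler signature: the payload is passed by pointer,
   so the PDU is processed in place rather than copied. */
typedef int (*pdu_handler_t)(const uint8_t *payload, size_t len);

/* Illustrative handlers; real ones would carry out LMP or HCI procedures. */
static int handle_len(const uint8_t *payload, size_t len)
{
    (void)payload;              /* payload not needed for this handler */
    return (int)len;            /* report the payload length */
}

static int handle_first_byte(const uint8_t *payload, size_t len)
{
    return len > 0 ? payload[0] : -1;   /* first payload byte, or -1 if empty */
}

/* Dispatch table indexed by opcode. The opcode values are made up
   for this sketch and do not correspond to the specification. */
static const pdu_handler_t handler_table[] = {
    handle_len,         /* opcode 0 */
    handle_first_byte,  /* opcode 1 */
};

#define HANDLER_COUNT (sizeof handler_table / sizeof handler_table[0])

/* Look up the handler for a PDU whose first byte is assumed to be the
   opcode. Returns -1 for a missing, empty or unknown PDU. */
int dispatch_pdu(const uint8_t *pdu, size_t len)
{
    if (pdu == NULL || len == 0 || pdu[0] >= HANDLER_COUNT)
        return -1;
    /* One bounds check and one indexed lookup; the handler reads the
       caller's buffer directly, so no intermediate buffer is needed. */
    return handler_table[pdu[0]](pdu + 1, len - 1);
}
```

In such a design the cost per message is a bounds check and an array index, and because the handler receives a pointer into the existing buffer, no second buffer is needed between the receiving and the processing stage.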
Altogether, the results of the LMP and HCI implementation are very satisfactory. The implementation lies very close to my interpretation of the protocol description, and I cannot imagine the speed and size being much better. In our implementation of the protocol stack, the Link Manager and Host Controller are fully sufficient and even offer functionality that is as yet completely unused but might come in handy for future users of the protocol stack.