I have written quite a lot of multithreaded code in my time, and I have a sneaking suspicion that the reason NT cannot get much past 30% CPU usage on anyone's computer is that it has too many cross-thread shared resources (mutexes, semaphores, etc.: anything that can be locked to create atomic access). Even when it's just for a millisecond here and there, it adds up, and the cores can never get up to full load.
----
My guess is that instead of saving the results inside each thread in its own private results list and joining them up at the end, NinjaTrader is using a single global result list shared across all the threads, with access locked by a mutex. Each thread stores the results of each run there, blocking the others while it holds the lock. That creates a scenario where nobody can get above 30% usage, because the threads are constantly stuck waiting for each other to read or write the result list. Results should instead be stored in the thread and only joined up into a global structure at the end.
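To make the private-list idea concrete, here is a minimal C++ sketch. It is not NinjaTrader's code (that is not available here); `run_case` and `run_all` are made-up names, and the "run" is a placeholder calculation standing in for a real backtest. Each worker fills its own vector with no lock held, and the vectors are joined once after every thread has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one optimizer run; real work would be a backtest.
double run_case(int param) {
    return static_cast<double>(param) * 0.5;
}

// Runs every case in `params` on `num_workers` threads without any mutex.
// Results come back in the same order as `params`.
std::vector<double> run_all(const std::vector<int>& params, std::size_t num_workers) {
    assert(num_workers > 0);
    std::vector<std::vector<double>> local(num_workers);  // one slot per worker
    std::vector<std::thread> workers;

    for (std::size_t w = 0; w < num_workers; ++w) {
        workers.emplace_back([&params, &local, w, num_workers] {
            // Worker w handles one contiguous slice of the parameter list.
            const std::size_t begin = params.size() * w / num_workers;
            const std::size_t end = params.size() * (w + 1) / num_workers;

            // Private list: no other thread ever touches `mine`.
            std::vector<double> mine;
            mine.reserve(end - begin);
            for (std::size_t i = begin; i < end; ++i) {
                mine.push_back(run_case(params[i]));
            }
            // Hand the finished list over once; each worker owns its own slot.
            local[w] = std::move(mine);
        });
    }
    for (auto& t : workers) t.join();

    // Single-threaded join-up at the very end.
    std::vector<double> merged;
    merged.reserve(params.size());
    for (const auto& part : local) {
        merged.insert(merged.end(), part.begin(), part.end());
    }
    return merged;
}
```

Calling `run_all(params, 16)` on a 16-core machine would start 16 workers. The only synchronization is the `join()` at the end, so no thread ever waits on another while it is computing.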
Man, I kind of wish I had the NT code. I feel like with a week of cracking at it I'd have it burning down my 16-core machine.
----
This really should not be the case. If I were architecting NT's multi-core support, I would simply count the cores, divide up all the possible cases (when using the brute-force method) by the number of cores, and blast through with only occasional cross-communication (I'm talking maybe once every 30 or 60 seconds) to rebalance the load if one core finished early. This is how I implemented some computer vision pattern detection that churned through hundreds of hours of video and had zero problems keeping any machine fully loaded.
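The "count the cores and divide up the cases" step might look something like the C++ sketch below. The function names `worker_count` and `split_cases` are invented for illustration, and the occasional rebalancing pass is left out. It only shows the up-front split, with each worker getting a contiguous range of case indices.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// One worker per hardware thread. hardware_concurrency() may return 0 when
// the count cannot be determined, so fall back to a single worker.
std::size_t worker_count() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}

// Splits case indices [0, total) into `workers` contiguous half-open ranges
// whose sizes differ by at most one.
std::vector<std::pair<std::size_t, std::size_t>>
split_cases(std::size_t total, std::size_t workers) {
    assert(workers > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    ranges.reserve(workers);
    const std::size_t base = total / workers;
    const std::size_t extra = total % workers;  // first `extra` ranges get one more
    std::size_t begin = 0;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t len = base + (w < extra ? 1 : 0);
        ranges.emplace_back(begin, begin + len);
        begin += len;
    }
    return ranges;
}
```

With 10 cases and 4 workers this gives the ranges [0, 3), [3, 6), [6, 8) and [8, 10). Each thread would then loop over its own range and never need to ask another thread what to do next.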
It's all about properly preparing your data so you don't need to go fetch anything mid-computation, and dividing it so that 99.99999% of the time the threads are completely independent of each other and require no cross-communication. And using mutexes only very, very rarely.
I must repeat: to write great multicore code you must NEVER USE MUTEXES UNLESS ABSOLUTELY NECESSARY. If it can be done without a mutex, it must be done that way. (By "mutex" I mean any thread-locking structure.)