I love using Python for playing with data, sketching solutions, or just prototyping. If I need to come up with some tricky algorithm, I often prototype it in Python. Python is great for that, especially with Jupyter added: no compilation step, easy scripting, and lots of libraries, especially those backed by native code written in C/C++. Using numpy and similar libraries makes things pretty fast compared to raw Python.
But any time you need to do a lot of processing in Python itself, especially looping over large amounts of data, you get hit by performance issues that make Python code inefficient to use in production. Just recently, I needed to do some math and processing on 50-100 million elements in a 2D array, and without numpy that would have taken many hours if not days. Numpy helped get it down to 10-20 minutes. A significant reduction, but still too slow for me if I want to run similar processing tens of thousands of times.
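The actual math from my workload isn't shown here, so as a toy illustration of why numpy makes such a difference, here is a made-up elementwise computation (the `process_loop`/`process_numpy` names and the sum-of-squares formula are just placeholders) done both ways:

```python
import numpy as np

def process_loop(rows):
    # Pure-Python nested loop over a 2D array of numbers:
    # every element goes through the interpreter, which is what
    # makes this approach take hours on tens of millions of elements.
    total = 0.0
    for row in rows:
        for x in row:
            total += x * x + 1.0
    return total

def process_numpy(arr):
    # The same math expressed as vectorized numpy operations:
    # the loop runs in compiled native code instead of the interpreter.
    return float(np.sum(arr * arr + 1.0))

# Both versions compute the same result on the same data.
arr = np.arange(12, dtype=np.float64).reshape(3, 4)
assert process_loop(arr.tolist()) == process_numpy(arr)
```

On a small array the difference is negligible, but as the element count grows into the millions, the vectorized version typically wins by one to two orders of magnitude.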
I tried to re-implement this in Rust. It took me some time, given I'm pretty new to Rust, but it was hugely satisfying to see the processing time drop to 3-4 minutes, and after a few basic optimizations, to 2-2.5 minutes. That sounded much better. Then I realized I was running it in debug mode. I switched to release mode, which adds a bunch of its own optimizations, and the time dropped to 20-25 seconds. Wow!
But I think I can still do better. Can I use CUDA?..