Part 2 – Handling the data

Note: all the notebooks for this course were updated on 4.4.2018. The material has been restructured into a more book-like form with its own index, and is now available as a single-file download. I will be updating the material on the site in the coming weeks.

Introducing Pandas

I love researching strategies, but usually before I can even start, I need to spend quite some time filtering and aligning data. A couple of years ago, when I was working in Matlab, about 80% of my time was spent on this mind-numbing work (no fun!). There had to be a better way than hacking all the filtering code myself, and there is!

The data analysis toolkit pandas is especially suited for working with time series such as financial data. The package is developed by Wes McKinney, whose ambition is to create the most powerful and flexible open source data analysis/manipulation tool available. Well, I think he has done a terrific job! The availability of the pandas library alone made me abandon Matlab in favor of Python, and I have never had any regrets. The time needed for data filtering is now cut in half, freeing more time for strategy research. This video from Wes himself will give you a 10-minute overview of pandas. It goes much deeper than you really need at this moment, so don’t worry if you don’t understand every bit of it.
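
To give a taste of the filtering and aligning mentioned above, here is a minimal sketch of how it might look in pandas. The symbols `AAA` and `BBB`, the dates and the prices are all made up for illustration, and this is just one of several ways to do it, not a recipe taken from the course notebooks.

```python
import pandas as pd

# Hypothetical closing prices for two made-up symbols.
# Note that the two series cover partly different dates.
aaa = pd.Series(
    [10.0, 10.5, 11.0],
    index=pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"]),
    name="AAA",
)
bbb = pd.Series(
    [20.0, 21.0, 22.0],
    index=pd.to_datetime(["2018-01-03", "2018-01-04", "2018-01-05"]),
    name="BBB",
)

# Aligning: concat matches rows on the date index.
# Dates missing in one of the series are filled with NaN.
prices = pd.concat([aaa, bbb], axis=1)

# Keep only the dates on which both symbols have a price.
aligned = prices.dropna()

# Filtering: keep only the rows where AAA closed above 10.5.
filtered = aligned[aligned["AAA"] > 10.5]

print(prices)
print(aligned)
print(filtered)
```

The point is that the date matching happens through the index, so there is no hand-written loop comparing timestamps.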