
Wind Prediction with WindML/Python

I share my opinion on scientific programming and WindML.

Over the last few months at work, I have learned a lot about scientific programming. The challenges I have in mind when I think about scientific programming are: writing experiments, parallelizing them to run on multiple cores, saving results in a manageable way, analyzing the results statistically, and visualizing them with plots. I have come across different tools that help me accomplish these tasks. At work, we use Python with the SciPy toolchain, including Matplotlib and NumPy. In my opinion, Python is a great programming language (with some quirks, like any other language), especially with those killer libraries, for solving conceptual problems such as improving an algorithm, or as a tool to transform or analyze data. It is widely adopted in the scientific community; see the PyVideo collection of talks. For a good overview of free Python learning resources, check out this collection of links.
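To make the workflow above a bit more concrete, here is a minimal sketch of how such an experiment loop might look with NumPy and the standard library. The experiment itself (`run_experiment`), the helper names, and the output file `results.npy` are made up for illustration; this is not how we organize our actual experiments.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor


def run_experiment(seed):
    """Hypothetical experiment: mean squared error of a noisy estimate of a sine curve."""
    rng = np.random.default_rng(seed)  # seeded, so each run is reproducible
    truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 200))
    estimate = truth + rng.normal(scale=0.1, size=truth.shape)
    return float(np.mean((estimate - truth) ** 2))


def run_all(seeds, workers=4):
    """Run one experiment per seed, distributed over worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return np.array(list(pool.map(run_experiment, seeds)))


def summarize(errors):
    """Return the mean and the sample standard deviation of the errors."""
    return float(np.mean(errors)), float(np.std(errors, ddof=1))


if __name__ == "__main__":
    errors = run_all(range(8))
    np.save("results.npy", errors)  # hypothetical output file name
    mean, std = summarize(errors)
    print(f"MSE over {len(errors)} runs: {mean:.4f} +/- {std:.4f}")
```

Seeding each run separately keeps the results reproducible regardless of which process picks up which experiment.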

Nevertheless, in this post I would like to write about our wind forecasting machine learning library WindML (on GitHub), which is based on the amazing libraries mentioned above. WindML provides versatile tools for various learning tasks such as time-series prediction, classification, clustering, dimensionality reduction and related tasks. In addition, accessing the publicly available data sources NREL and AEMO is a piece of cake. The data sources are automatically parsed and stored in NumPy format and can be processed as seen in the examples. If you want to familiarize yourself with our statistical short-term forecasting method using different regressors, or with further techniques like latent embeddings, I would like to refer you to the techniques page, which gives a brief overview. For a detailed description, have a look at the associated publications. The project is relatively new; the first release was about a month ago. We are actively using it for research and continuously extend its functionality, mainly towards our research questions. Even though WindML isn't available on PyPI yet, we have already received some feedback, which is great! I'm happy that I can support this open source framework, and I hope others can use it as a tool for their research as well.
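To give a rough idea of what time-series prediction with a regressor means, here is a small sketch in plain NumPy. It deliberately does not use the WindML API; the synthetic wind speed series, the function names and the choice of an ordinary least-squares model are all assumptions made for illustration. Past values in a sliding window serve as features, and the value a few steps ahead is the target.

```python
import numpy as np


def make_windows(series, window, horizon):
    """Build features from `window` past values; the target lies `horizon` steps after the window."""
    n = len(series) - window - horizon + 1
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window + horizon - 1:window + horizon - 1 + n]
    return X, y


def fit_linear(X, y):
    """Ordinary least squares with an intercept column appended."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted linear model to new feature windows."""
    return np.column_stack([X, np.ones(len(X))]) @ coef


# Synthetic stand-in for a wind speed series (not real NREL or AEMO data).
rng = np.random.default_rng(0)
t = np.arange(1000)
speed = 8.0 + 2.0 * np.sin(2.0 * np.pi * t / 144) + rng.normal(scale=0.3, size=t.size)

X, y = make_windows(speed, window=6, horizon=3)
split = int(0.8 * len(X))  # train on the first 80 %, test on the rest
coef = fit_linear(X[:split], y[:split])

mse = np.mean((predict(coef, X[split:]) - y[split:]) ** 2)
# Persistence baseline: predict that the last observed value stays unchanged.
persistence_mse = np.mean((X[split:, -1] - y[split:]) ** 2)
print(f"linear model MSE: {mse:.3f}, persistence MSE: {persistence_mse:.3f}")
```

Comparing against the persistence baseline is a common sanity check in short-term forecasting: a model that cannot beat "nothing changes" is not worth much.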
