Ideas and Future Directions

Feat is currently in alpha status (no pun intended) and what comes next for the project is still being determined. If there’s something you’d like to see, give us a shout!

Here are some ideas we are evaluating for what to work on next.

Realtime/Performance

Generating data for training and experimentation is necessary but not sufficient. Models deployed to production need the most recent data possible to be useful, and generating samples and features more quickly in production opens up more possibilities for trading.

To that end, one direction we’d like to eventually pursue with Feat is to enable a streaming mode that will ingest and process data continuously. For instance, it could use Kafka brokers as intermediaries instead of files, emitting new samples and features constantly.

Likewise, CSV is an inefficient format, but we started with it because it is common, easy to inspect, and the format in which many data providers distribute their historical data. Feat could be updated to support more formats, including binary formats such as msgpack, for better performance. This could also speed up loading data into, say, Pandas.
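To give a rough sense of the difference, here is a minimal sketch that encodes the same trades as CSV text and as fixed-width binary records. It uses Python's standard `struct` module as a stand-in, not msgpack, and the record layout (timestamp, price, size) is a hypothetical one chosen for illustration, not a format Feat reads or writes.

```python
import struct

# Hypothetical trade records: (timestamp in ns, price, size).
trades = [
    (1_600_000_000_000_000_000, 101.25, 300),
    (1_600_000_000_000_500_000, 101.50, 150),
]

# Assumed layout: little-endian int64, float64, uint32 (20 bytes per record).
RECORD = struct.Struct("<qdI")


def to_csv(rows):
    """Encode rows as CSV text, one trade per line."""
    return "".join(f"{ts},{price},{size}\n" for ts, price, size in rows).encode("ascii")


def to_binary(rows):
    """Encode rows as back-to-back fixed-width binary records."""
    return b"".join(RECORD.pack(ts, price, size) for ts, price, size in rows)


def from_binary(blob):
    """Decode fixed-width records without any text parsing."""
    return [RECORD.unpack_from(blob, offset) for offset in range(0, len(blob), RECORD.size)]


csv_bytes = to_csv(trades)
bin_bytes = to_binary(trades)
print(len(csv_bytes), len(bin_bytes))
```

Beyond the smaller size, the binary form avoids parsing numbers from text on every load, which is usually where most of the time goes.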

Python Bindings

While a command line is a good first step, we’d like folks to be able to use Feat as transparently as possible within their existing code and infrastructure. To that end, we’d like to explore adding Python bindings so that the high-performance Rust code can be leveraged without needing to change existing pipelines or shell out to the command line.

More Sample Types

Currently, Feat only supports generating dollar bars from the underlying data. We reasoned that this was a good first step, since they are straightforward to produce while still being more desirable than good old-fashioned time bars.
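For readers unfamiliar with the idea, a dollar bar closes once the dollar value traded (price times size) reaches a threshold, rather than after a fixed amount of time. The following is a minimal sketch of that sampling rule, not Feat's actual Rust implementation; the `Bar` fields, the `dollar_bars` function, and the `(price, size)` trade tuples are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float
    dollars: float


def dollar_bars(trades: Iterable[Tuple[float, float]], threshold: float) -> Iterator[Bar]:
    """Yield a bar each time the accumulated traded dollar value reaches `threshold`.

    Assumption: a trade that crosses the threshold is counted entirely in the
    bar it closes; a trailing partial bar is discarded.
    """
    bar: Optional[Bar] = None
    for price, size in trades:
        if bar is None:
            bar = Bar(price, price, price, price, 0.0, 0.0)
        bar.high = max(bar.high, price)
        bar.low = min(bar.low, price)
        bar.close = price
        bar.volume += size
        bar.dollars += price * size
        if bar.dollars >= threshold:
            yield bar
            bar = None


trades = [(100.0, 5), (101.0, 5), (99.0, 10), (100.0, 2)]
bars = list(dollar_bars(trades, threshold=1000.0))
for b in bars:
    print(b)
```

Because the function is a generator over an iterable of trades, the same logic would work on a file read line by line or on a continuous feed.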

However, dollar bars are only the beginning. López de Prado outlines other bar types such as tick and volume bars, and more excitingly, imbalance bars that try to detect large sweeps of the order book and other meaningful divergences. Being able to generate these other types of bars, and maybe novel sampling techniques too, is a direction we’re looking into.
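As a rough illustration of the imbalance idea, the sketch below signs each trade with the tick rule (+1 if the price rose, -1 if it fell, previous sign if unchanged) and closes a bar when the running sum of signs gets large in either direction. This is a deliberate simplification: it uses a fixed threshold, whereas López de Prado's formulation updates the expected imbalance dynamically. The function names and the choice to treat the first tick as a buy are assumptions made for this example.

```python
from typing import Iterator, List, Sequence


def tick_signs(prices: Sequence[float]) -> List[int]:
    """Tick rule: +1 on an uptick, -1 on a downtick, carry the last sign otherwise."""
    signs = []
    prev_price = None
    prev_sign = 1  # assumption: the very first tick is treated as a buy
    for price in prices:
        if prev_price is not None and price != prev_price:
            prev_sign = 1 if price > prev_price else -1
        signs.append(prev_sign)
        prev_price = price
    return signs


def tick_imbalance_bars(prices: Sequence[float], threshold: int) -> Iterator[List[float]]:
    """Yield the prices in each bar; a bar closes when |sum of signs| >= threshold.

    Simplification: `threshold` is fixed rather than estimated from past bars.
    A trailing partial bar is discarded.
    """
    current: List[float] = []
    imbalance = 0
    for price, sign in zip(prices, tick_signs(prices)):
        current.append(price)
        imbalance += sign
        if abs(imbalance) >= threshold:
            yield current
            current = []
            imbalance = 0


prices = [10, 11, 12, 13, 12, 13, 12, 11, 10, 9]
imbalance_bars = list(tick_imbalance_bars(prices, threshold=3))
print(imbalance_bars)
```

Note how a one-directional run closes a bar after only three ticks, while a choppy stretch keeps the bar open for longer; that is the behaviour that makes imbalance bars sample more often when order flow is lopsided.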

More Input Data

There are endless data sources available, and smooth integration for ingesting and processing that data is valuable. We’re eyeballing tardis.dev, a crypto data provider, for the next source after IQFeed.

Feat also focuses primarily on processing filled trades at the moment, but there’s no reason it couldn’t also generate samples from orders that were submitted and then cancelled or never hit (L2-style data). Similarly, we have mused about adding support for “ETF trick”-ing options data to make it more approachable to conduct research on common options strategies such as buying or selling straddles, performing covered calls, etc.

More Features

We could add support for computing structural break tests, computationally expensive features such as autocorrelation, or features based on the market microstructure that are not otherwise readily available, such as the volume filled on the bid or ask.
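To make the autocorrelation example concrete, here is a minimal sketch of the standard lag-k sample autocorrelation over a series of values (for instance, bar returns). The function name and the handling of a constant series are assumptions for this example; it says nothing about how Feat would expose such a feature.

```python
from typing import Sequence


def autocorrelation(values: Sequence[float], lag: int) -> float:
    """Sample autocorrelation at `lag`, normalised by the full-series variance."""
    n = len(values)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must satisfy 0 < lag < len(values)")
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0  # assumption: report 0 for a constant series
    num = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return num / denom


# A perfectly alternating series is negatively correlated at lag 1
# and positively correlated at lag 2.
series = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(autocorrelation(series, 1), autocorrelation(series, 2))
```

Each lag requires a full pass over the series, so computing many lags over rolling windows quickly adds up; that cost is why it makes sense to push this kind of feature into compiled code.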