Specifically, I need an open-source platform that can run my model on terabytes of data.

I am exploring different open-source platforms to support my custom ML models, each of which simply takes an input and emits an output.

I came across Spark's rdd.pipe(my_model), but it doesn't look suited to building pipelines with scheduling options.
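To make the setup concrete, here is a minimal sketch of the kind of model wrapper I mean, assuming the model is driven over stdin/stdout the way rdd.pipe drives an external process (one record per line in, one result per line out). The `predict` function is a hypothetical stand-in for my real model, and `run` is just an illustrative name.

```python
import sys


def predict(record: str) -> str:
    # Hypothetical placeholder for the custom model: one input in, one output out.
    return record.upper()


def run(stdin=sys.stdin, stdout=sys.stdout) -> None:
    # A piped process receives the elements of a partition as lines on stdin;
    # each line it writes to stdout becomes one element of the resulting RDD.
    for line in stdin:
        stdout.write(predict(line.rstrip("\n")) + "\n")
```

Saved as a script that ends with a call to `run()` (say, a hypothetical my_model.py), this could be invoked with something like rdd.pipe("python my_model.py"). That covers the per-record transformation, but not the pipeline and scheduling side I am asking about.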

I am looking for recommendations for any open-source tool or technology that fits.
