Extend model training engines

Hi Datagrok team! I’ve recently started looking into the platform by building a module for advanced regression of continuous variables. I’ve learned that it is possible today to train predictive models using any of the H2O, Chemprop, and Caret engines, compare their quality, and use them in grok interactive applications. It is also possible to extend the grok ecosystem with your own packages written in Python, R, Julia, or JS.

That said, I didn’t find a way to add a custom predictive modeling engine for use in the “Predictive modeling” Datagrok facility. Are there plans on the roadmap to provide such a means? Or maybe there is demo code that I missed. The reason I am asking is that, essentially, what I am building with Python scripts is a kind of predictive modeling engine, which could generalise in the future to something like a “scikit-learn engine”.

Without being able to add custom engines, I’d have to implement model comparison dialogs and forms from scratch as a separate package. Then there would be the built-in grok “Predictive modeling” dialog, but also third-party packages for the same task. Instead, it feels like there should be a way forward that keeps this uniform across platform facilities.

E.g., introduce a “Predictive modeling” kind of app and a “Predictive models comparer” kind of app, so that new modeling engines could be discovered as packages and embedded by the platform into its built-in facilities. This would avoid multiplying different types of entities across the platform that serve the same purpose.


Hi Dan, welcome to the community!

Currently, there is no way to implement a predictive engine as an extension. While the mechanism itself is extensible, changes have to be made to the core of the platform.

Having said that, conceptually there is nothing wrong with having an engine implemented in a package, or even as a set of functions (train, infer, performance) with the appropriate signatures and metadata - this way, the functions could be implemented in any language. I actually like this idea a lot, and I do agree that it might be a very powerful addition to the platform. Let’s take a closer look at it in the near future and see how feasible it is.
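To make the "set of functions" idea a bit more concrete, here is a rough Python sketch of what such a trio might look like. Everything in it is an assumption for illustration only: the names `train`, `infer`, and `performance` come from the sentence above, but the signatures, the `Table` type (a plain dict of columns standing in for a dataframe), and the JSON model blob are hypothetical and are not an existing Datagrok API. The toy engine is a one-feature least-squares fit in pure Python, so it runs without any extra packages.

```python
import json
import math
from typing import Dict, List, Sequence

# Hypothetical stand-in for a platform dataframe: column name -> values.
Table = Dict[str, List[float]]


def train(table: Table, predict: str, features: Sequence[str]) -> bytes:
    """Fit a one-feature least-squares line and return it as an opaque blob."""
    if len(features) != 1:
        raise ValueError("this toy engine supports exactly one feature")
    x, y = table[features[0]], table[predict]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("feature column is constant; cannot fit a slope")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    model = {
        "feature": features[0],
        "target": predict,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }
    # Serialized so a host could store the model without knowing its internals.
    return json.dumps(model).encode("utf-8")


def infer(model_blob: bytes, table: Table) -> List[float]:
    """Apply a trained model blob to new rows and return predictions."""
    model = json.loads(model_blob.decode("utf-8"))
    return [model["intercept"] + model["slope"] * xi
            for xi in table[model["feature"]]]


def performance(model_blob: bytes, table: Table) -> Dict[str, float]:
    """Report RMSE and R^2 of the model on a labelled table."""
    model = json.loads(model_blob.decode("utf-8"))
    actual = table[model["target"]]
    predicted = infer(model_blob, table)
    n = len(actual)
    mean_y = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"rmse": math.sqrt(ss_res / n), "r2": r2}


if __name__ == "__main__":
    data = {"x": [0.0, 1.0, 2.0, 3.0], "y": [1.0, 3.0, 5.0, 7.0]}
    blob = train(data, predict="y", features=["x"])
    print(infer(blob, {"x": [4.0]}))
    print(performance(blob, data))
```

Returning the model as bytes is one possible design choice here: the host would only need to store and pass the blob around, while each engine stays free to pick its own internal format. A scikit-learn-based engine could presumably follow the same three-function shape with a different serialization.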

