The perennial battle between IT vendors that invest in proprietary software and open source advocates is starting to play out among providers of tools and platforms for managing machine learning operations (MLOps).
maiot, a small startup based in Munich, Germany, has launched ZenML, an open source MLOps framework for creating the machine learning pipelines used to construct AI models. The goal is to democratize the building of AI models by making it simpler for data science teams to reuse pipelines across multiple AI projects. Specifically, ZenML provides tools for caching pipelines to make them easier to find and reuse, manages version control for pipelines, provides visualization tools, and can be employed to pre-process large data sets.
“ZenML allows data science teams to manage pipelines at a higher level of abstraction,” said Hamza Tahir, co-creator of ZenML.
At the same time, a nascent open source community is building around Feast, an open source feature store that provides a registry that makes it easier for a data science team to reuse the components of an AI model, known as features. In fact, entire AI models can be stored in a Feast repository that a data science team can modify and extend for different use cases as they see fit. Contributors to Feast include Google Cloud, Zulily, Agoda, Gojek, Farfetch, and Cimpress.
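To make the registry idea concrete, the sketch below shows in plain Python what a feature store's registry fundamentally does: one team defines how a feature is computed, and any other team can look it up and apply it without reimplementing the logic. The class and method names here are illustrative assumptions, not Feast's actual API.

```python
# Hypothetical sketch of a feature registry; illustrative only,
# not the Feast API.
class FeatureRegistry:
    """Shared catalog mapping feature names to how they are computed."""
    def __init__(self):
        self._features = {}

    def register(self, name, transform):
        self._features[name] = transform

    def materialize(self, name, raw_rows):
        """Compute a registered feature for a batch of raw records."""
        return [self._features[name](row) for row in raw_rows]

# One team registers a feature once...
registry = FeatureRegistry()
registry.register("bmi", lambda r: r["weight_kg"] / r["height_m"] ** 2)

# ...and any other team reuses it by name instead of rewriting it.
rows = [{"weight_kg": 70.0, "height_m": 1.75}]
bmi_values = registry.materialize("bmi", rows)
```

The value of centralizing this is consistency: every model that consumes "bmi" gets the same definition, which is what makes features safely shareable across projects.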
Automating AI Models
Most data science teams today employ scripts, notebooks such as Jupyter, and custom code to construct the data pipelines they use to train AI models. It’s a painstaking effort that in part accounts for why most data science teams are fortunate to roll out two AI models a year.
It’s still early days as far as platforms for automating the construction of AI models are concerned, but the area is now a major focus. Amazon Web Services (AWS) is expanding the capabilities of its Amazon SageMaker platform to automate the construction of AI models, while vendors such as Splice Machine are moving forward with platforms that aim to automate the building and deployment of AI models on an end-to-end basis.
“Data science teams need to be able to move quickly from an experimentation phase to deploying an AI model in a way that is repeatable, transparent and, when necessary, easily rolled back,” says Splice Machine CEO Monte Zweben. “If, God forbid, something goes wrong with that AI model, there needs to be a way to roll it back.”
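The rollback capability Zweben describes usually rests on a model registry that keeps every deployed version, so serving can be pointed back at a known-good one. The minimal sketch below illustrates that pattern in plain Python; the names are assumptions for illustration, not any vendor's API.

```python
# Hypothetical sketch of model-version rollback; illustrative only,
# not a specific platform's API.
class ModelRegistry:
    def __init__(self):
        self._versions = []   # append-only history of deployed models
        self._current = -1    # index of the version currently serving

    def deploy(self, model):
        self._versions.append(model)
        self._current = len(self._versions) - 1

    def rollback(self):
        """Point serving back at the previous deployed version."""
        if self._current > 0:
            self._current -= 1

    @property
    def serving(self):
        return self._versions[self._current]

registry = ModelRegistry()
registry.deploy("model-v1")
registry.deploy("model-v2")   # v2 misbehaves in production...
registry.rollback()           # ...so serving reverts to v1
```

Keeping the history append-only is what makes the operation transparent: nothing is deleted on rollback, so teams can audit exactly which version served when.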
Vendors that build these platforms are, of course, trying to recoup their investment. However, builders of AI models already rely heavily on open source libraries such as TensorFlow to build and deploy the machine learning algorithms that make up an AI model. As such, there’s already a significant bias toward open source tools in the AI community. The issue IT organizations will once again need to come to terms with is to what degree to rely on open source tools and platforms they must integrate and manage themselves versus an integrated platform provided by an IT vendor.
Speeding Up Innovation
Regardless of how AI models are constructed, there’s a general consensus that the current process is too slow. Organizations are looking to expand the number of AI models they can build concurrently as part of their overall efforts to accelerate digital business transformation initiatives. That’s not going to occur unless the process for building and deploying AI models becomes a lot more automated than it is today.