I'm working on BentoML, a Python framework for ML model serving.
It makes it easy for data scientists to ship their trained machine learning models into prediction services for production use.
Key Features:
- Model packaging and dependency management
- Distribute your model as a docker image, CLI tool, PyPI package
- Adaptive micro-batching in the online API model server, which gives an average 10-20x increase in throughput compared to a regular Flask-based API server implementation
- Model Management for teams
- Automated model deployment to AWS Lambda, AWS SageMaker and more
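To make the micro-batching point above concrete, here is a minimal, self-contained sketch of the technique in plain Python: per-request calls are queued, and a background worker groups them into batches (bounded by a max batch size and a latency deadline) before invoking a batch-capable model function. This is an illustration of the general idea, not BentoML's actual implementation; the `MicroBatcher` class and its parameters are hypothetical names chosen for this example.

```python
import queue
import threading
import time

class MicroBatcher:
    """Sketch of micro-batching: queue individual requests, then have a
    worker thread group them into batches before calling the model once.
    Batching amortizes per-call overhead and exploits vectorized inference."""

    def __init__(self, predict_batch, max_batch_size=8, max_latency_ms=10):
        self.predict_batch = predict_batch  # hypothetical batch-capable model fn
        self.max_batch_size = max_batch_size
        self.max_latency = max_latency_ms / 1000.0
        self.q = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def predict(self, x):
        """Called once per request; blocks until the batched result is ready."""
        done = threading.Event()
        item = {"input": x, "done": done, "result": None}
        self.q.put(item)
        done.wait()
        return item["result"]

    def _worker(self):
        while True:
            batch = [self.q.get()]  # block until the first request arrives
            deadline = time.monotonic() + self.max_latency
            # Collect more requests until the batch fills or the deadline passes.
            while len(batch) < self.max_batch_size:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    batch.append(self.q.get(timeout=timeout))
                except queue.Empty:
                    break
            # One model call serves the whole batch; fan results back out.
            outputs = self.predict_batch([item["input"] for item in batch])
            for item, out in zip(batch, outputs):
                item["result"] = out
                item["done"].set()
```

Under concurrent load, many callers of `predict` end up sharing one `predict_batch` call, which is where the throughput gain comes from; the "adaptive" part in a real server would additionally tune batch size and wait time based on observed traffic.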
It is very different from the SageMaker SDK: BentoML is a flexible, all-in-one solution for model serving. You package the model once and can easily test it, run batch/offline serving, and serve online APIs. And even if you just package your model with BentoML and deploy it to SageMaker, you get a 10-20x performance improvement out of the box compared to doing it with the SageMaker SDK.
https://github.com/bentoml/BentoML