una-ai-mlops-agency/ML-Batch-Serving

Batch serving

Batch inference uses distributed data processing infrastructure to run inference asynchronously on a large number of instances at once.

What to optimize: throughput. Batch jobs are not latency-sensitive.

End user: usually has no direct interaction with the model. The user works with the predictions that the batch jobs write to a data store.

Validation: offline
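
As a rough illustration of the points above, the sketch below scores instances in fixed-size batches and writes the predictions to a store that a user would read later. It is a minimal single-process example using only the Python standard library, not a distributed or Azure pipeline. The names `chunked`, `toy_model` and `run_batch_job` are hypothetical, and `toy_model` is only a stand-in for a real trained model.

```python
import csv
import io
from typing import Callable, Iterable, Iterator, List, Sequence, TextIO


def chunked(rows: Iterable, size: int) -> Iterator[List]:
    """Yield consecutive lists of at most `size` items from `rows`."""
    batch: List = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def toy_model(batch: List[Sequence[float]]) -> List[float]:
    """Hypothetical stand-in for a real model: the score is the sum of features."""
    return [sum(features) for features in batch]


def run_batch_job(
    rows: Iterable[Sequence[float]],
    predict: Callable[[List[Sequence[float]]], List[float]],
    sink: TextIO,
    batch_size: int = 1000,
) -> int:
    """Score all rows batch by batch and write predictions to `sink` as CSV.

    Returns the number of predictions written.
    """
    writer = csv.writer(sink)
    writer.writerow(["row_id", "prediction"])
    row_id = 0
    for batch in chunked(rows, batch_size):
        # One model call per batch favours throughput over per-row latency.
        for prediction in predict(batch):
            writer.writerow([row_id, prediction])
            row_id += 1
    return row_id


if __name__ == "__main__":
    instances = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
    store = io.StringIO()  # stands in for a file or table in real storage
    written = run_batch_job(instances, toy_model, store, batch_size=2)
    print(f"wrote {written} predictions")
    print(store.getvalue())
```

In a real pipeline the in-memory `store` would be replaced by persistent storage, and the loop over batches would be spread across workers by the processing framework.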


Where to start

Learn MLOps general concepts:

Next learn how to build and run pipelines for batch serving on Azure cloud:

or overall:


Next step: Advanced workshop: Azure Batch Serving Pipelines

This workshop is a work in progress.

It will cover a real-life use case: building, publishing, scheduling and troubleshooting batch serving pipelines on Azure with a Python runtime.