If you want to automate infrastructure provisioning for AI model development, training and deployment across multiple clouds, take a look at dstack. dstack is an open-source engine that automates AI workload management through concepts such as dev environments, tasks, services and pools. It runs on a variety of cloud services, including AWS, GCP, Azure, OCI, Lambda, TensorDock, Vast.ai, RunPod and CUDO, as well as on your own servers, so you can concentrate on your data and research while it finds cost-effective compute for each workload.
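To make that concrete, here is a minimal sketch of submitting a training task through dstack's Python API. The names used (`Client.from_config()`, `Task`, `Resources`, `GPU`, `runs.submit()`) reflect the API at the time of writing and should be treated as assumptions; check the dstack documentation for your version.

```python
# Sketch only: API names are assumptions; verify against the dstack docs.
from dstack.api import GPU, Client, Resources, Task

# Reads credentials and configured backends (AWS, GCP, RunPod, ...)
# from the local dstack configuration.
client = Client.from_config()

# A task bundles a container image, the commands to run and the
# resources the workload needs.
task = Task(
    image="pytorch/pytorch:latest",
    commands=["python train.py --epochs 10"],
    resources=Resources(gpu=GPU(memory="24GB")),
)

# dstack provisions a matching instance in one of the configured backends.
run = client.runs.submit(configuration=task, run_name="train-demo")
run.attach()  # stream logs until the task finishes
```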
Another good option is Anyscale, a managed service built on the open-source Ray framework. It can schedule workloads, run on multiple clouds, manage instances automatically and allocate fractional GPUs and CPUs so compute is used efficiently. Anyscale supports many AI models and can save you money; it integrates directly with popular integrated development environments and offers a free tier alongside flexible pricing.
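Because Anyscale runs ordinary Ray code, a small open-source Ray example shows the programming model it scales: functions decorated with `@ray.remote` become tasks that the scheduler spreads across a cluster's CPUs and GPUs. This is plain Ray; Anyscale-specific cluster configuration is omitted.

```python
import ray

ray.init()  # starts a local cluster, or connects to an existing one

# Fractional resource requests (e.g. num_gpus=0.5) let Ray share one
# GPU between several tasks; that is the CPU/GPU splitting noted above.
@ray.remote(num_cpus=1)
def preprocess(shard):
    return [x * 2 for x in shard]

# Fan the work out across the cluster, then gather the results.
futures = [preprocess.remote(list(range(i, i + 4))) for i in range(0, 16, 4)]
print(ray.get(futures))
```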
Another option is RunPod, a globally distributed GPU cloud that lets you run any GPU workload. You can spin up GPU pods instantly, run ML inference on autoscaling serverless compute, and use frameworks such as PyTorch and TensorFlow. RunPod charges by GPU instance type and usage, with prices ranging from $0.39 to $4.89 per hour.
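RunPod's serverless workers are plain Python handlers registered through its `runpod` SDK. The sketch below follows that handler pattern (`runpod.serverless.start`); the handler body itself is illustrative, standing in for real model inference.

```python
import runpod

# Each serverless request arrives as a job dict; the "input" key
# carries the caller's payload.
def handler(job):
    prompt = job["input"].get("prompt", "")
    # A real worker would run model inference here (PyTorch, TensorFlow, ...).
    return {"echo": prompt.upper()}

# Registers the handler; RunPod scales workers with request volume.
runpod.serverless.start({"handler": handler})
```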
If you prefer a more code-centric approach, Pulumi offers an infrastructure as code (IaC) SDK that lets developers create, deploy and manage infrastructure across multiple clouds using languages they're already familiar with. Pulumi supports AWS, Azure, Google Cloud and Kubernetes, and can be integrated with existing software delivery pipelines, making it a good option for boosting productivity and scaling infrastructure operations.
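To show what the code-centric approach looks like, here is a minimal Pulumi program in Python. The S3 bucket is an arbitrary example resource; the same pattern of declaring resources as objects and exporting outputs applies to GPU instances or Kubernetes clusters.

```python
import pulumi
import pulumi_aws as aws

# Declaring the resource is the provisioning step: running `pulumi up`
# diffs this program against the stack's current state and applies it.
bucket = aws.s3.Bucket("training-artifacts")

# Exported outputs are shown after deployment and consumable by other stacks.
pulumi.export("bucket_name", bucket.id)
```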