If you're looking for a DEKUBE alternative, Anyscale is worth a look. It's a platform for building, deploying and scaling AI workloads, with capabilities like workload scheduling, cloud flexibility, automated instance management, and GPU and CPU fractioning for better resource utilization. Anyscale, which is built on the open-source Ray framework, can run a variety of AI models, including LLMs, and promises cost savings of up to 50% on spot instances.
Another top contender is Together, which is geared toward fast, efficient training and deployment of generative AI models. It incorporates optimizations like Cocktail SGD, FlashAttention and sub-quadratic model architectures, and supports a variety of models for different AI tasks. Together offers scalable inference and collaboration tools, and promises significant cost savings compared with AWS and other providers.
Salad is another contender. This cloud-based service lets you deploy and manage AI/ML production models at scale, with a focus on keeping costs down. It offers a fully managed container service, a global edge network, on-demand elasticity and multicloud support. Salad handles a range of GPU-hungry workloads, and pricing starts at $0.02/hour for GTX 1650 GPUs, with discounts available at larger scale.
If you need on-demand and reserved NVIDIA GPU instances and clusters, Lambda is a good option. It offers a range of GPUs, including NVIDIA H100 and GH200 Tensor Core GPUs, along with preconfigured ML environments featuring popular frameworks like TensorFlow and PyTorch. The service is designed for an ML-first user experience, letting developers quickly provision and manage GPU instances for their projects.
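Because those environments ship with PyTorch preinstalled, a sanity-check script is often the first thing you run on a freshly provisioned instance. This generic sketch (not Lambda-specific) picks the GPU when one is visible and falls back to CPU, so it also runs on your laptop:

```python
import torch

# Use the GPU if one is visible (e.g. an H100 on a provisioned instance),
# otherwise fall back to CPU so the same script runs locally too.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(1024, 1024, device=device)
y = x @ x.T  # a matmul that lands on the GPU when one is available
print(y.shape, device)
```

If `device` prints `cuda`, the preconfigured drivers and CUDA toolkit are working and you're ready to launch real training jobs.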