If you're looking for something to replace Klu, Humanloop is a good option. It's a collaborative environment for managing and optimizing the development of large language model (LLM) applications. Its features include a prompt management system, version control, an evaluation and monitoring suite, and tools for customization and optimization. It integrates with popular LLM providers and offers SDKs in Python and TypeScript. It's aimed at developers, product managers and domain experts.
Another good option is HoneyHive, which offers a full environment for AI evaluation, testing and observability. It covers automated CI testing, prompt management and monitoring for production pipeline failures. HoneyHive supports more than 100 models and has a variety of tools, including automated evaluators, human feedback collection and custom charting. Its flexible pricing makes it a fit for both individual developers and large enterprise teams.
If you're looking for a lifecycle management tool, Freeplay could be a good choice. It automates the development process with features like prompt management, automated batch testing, AI auto-evaluations and human labeling. Freeplay offers a single pane of glass for teams and can be deployed in a variety of ways for compliance. It's geared toward enterprise teams trying to increase efficiency and development velocity.
Last, you could look at LastMile AI, which is geared toward helping engineers productionize generative AI applications. It offers tools for debugging and evaluating RAG pipelines, optimizing prompts and managing models. With tools like Auto-Eval, RAG Debugger and a notebook-inspired environment for prototyping, LastMile AI is designed to make it easier to deploy reliable AI applications.