If you need a utility that can extract features and normalize data from a wide variety of inputs, Dataloop could be the ticket. It covers data curation, model training, pipeline orchestration and human feedback to accelerate AI application development. The platform handles a range of unstructured data, including images, videos and text, and is designed to improve collaboration and speed up development while maintaining strong security.
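Feature normalization of the kind such platforms automate is straightforward to sketch. The snippet below is a minimal illustration in plain Python, not Dataloop's actual API; `min_max_normalize` is a hypothetical helper.

```python
# Minimal sketch of min-max feature normalization.
# Illustrative only; min_max_normalize is not a Dataloop API.

def min_max_normalize(values):
    """Scale a list of numbers into the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

pixel_intensities = [12, 64, 200, 255]
print(min_max_normalize(pixel_intensities))  # smallest maps to 0.0, largest to 1.0
```

In a real pipeline this step would run per feature column before model training, so features on different scales contribute comparably.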
Another contender is Lume, an AI-powered data mapping automation tool that integrates directly into existing systems to eliminate manual data processing. It adapts to schema changes and includes features for reviewing and editing mapping logic. Lume is geared toward industries where data mapping and normalization are critical, like financial services, e-commerce and manufacturing, where it can cut manual labor and errors.
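The core idea behind automated data mapping, translating records from a source schema into a target schema with per-field normalizers, can be sketched in a few lines. This is a hypothetical illustration, not Lume's API; the `mapping` structure and `apply_mapping` helper are assumptions for the example.

```python
# Hedged sketch of schema mapping; not Lume's actual API.
source_record = {"cust_name": "  Ada Lovelace ", "amt_usd": "1,250.00"}

# Each target field pairs a source key with a normalizer function.
mapping = {
    "customer_name": ("cust_name", str.strip),
    "amount": ("amt_usd", lambda s: float(s.replace(",", ""))),
}

def apply_mapping(record, mapping):
    """Produce a target-schema record from a source record."""
    return {target: normalize(record[source])
            for target, (source, normalize) in mapping.items()}

print(apply_mapping(source_record, mapping))
# {'customer_name': 'Ada Lovelace', 'amount': 1250.0}
```

A tool like Lume would generate and maintain the `mapping` table itself, which is exactly the piece that breaks when upstream schemas change.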
Neum AI is an open-source framework for building and managing data infrastructure for Retrieval-Augmented Generation (RAG) and semantic search. It offers scalable pipelines that can process millions of vectors and handles real-time data embedding and indexing. The framework targets large-scale, real-time use cases, making it a good fit for applications where data changes constantly and search results must stay fresh and accurate.
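The embed-and-index loop at the heart of such pipelines can be sketched end to end. This is a toy illustration of the pattern, not Neum AI's API: `embed` is a stand-in for a real embedding model, and the in-memory list stands in for a vector database.

```python
# Toy sketch of real-time embedding + indexing for semantic search.
# Not Neum AI's API; embed() stands in for a real embedding model.
import math

def embed(text):
    """Toy embedding: normalized character-frequency vector over a-z."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

index = []  # (text, vector) pairs; a real system would use a vector DB

def upsert(text):
    """Embed new data as it arrives and add it to the index."""
    index.append((text, embed(text)))

def search(query, k=1):
    """Return the k indexed texts most similar to the query (cosine similarity)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, v)), t) for t, v in index]
    return [t for _, t in sorted(scored, reverse=True)[:k]]

upsert("shipping and logistics update")
upsert("quarterly earnings report")
print(search("earnings"))  # → ['quarterly earnings report']
```

The framework's job is to run `upsert` continuously against changing sources at scale, so that `search` always reflects the latest data.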