DataStax, a leading contributor to the open-source Apache Cassandra database, has brought vector search to multiple clouds with its Astra DB offering. Astra DB is a commercially supported cloud Database-as-a-Service (DBaaS) that brings the benefits of Cassandra to organizations. With the latest update, Astra DB supports vector search, expanding DataStax’s support for AI and machine learning (ML) workloads.
The vector capability, which was first previewed on Google Cloud Platform in June, is now generally available on Amazon Web Services (AWS) and Microsoft Azure as well. This update allows organizations to leverage DataStax’s widely deployed and trusted database platform for both traditional workloads and AI workloads.
Vector databases are essential for AI and ML operations, as they enable content to be stored as a vector embedding, which is a numerical representation of data. Vectors are particularly useful for representing the semantic meaning of content and have broad applicability in large language models (LLMs) and content retrieval.
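To make the idea of an embedding concrete, here is a minimal Python sketch of similarity search over vectors. It is an illustration only: the three-dimensional vectors are made up for the example (real embedding models produce hundreds or thousands of dimensions), the function names are hypothetical, and the exhaustive comparison shown here is not how Astra DB or any production vector index is implemented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, items, k=2):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up toy "embeddings" (assumption for illustration, not model output).
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

# A query vector that sits close to "cat" and "kitten" and far from "invoice".
results = top_k([0.85, 0.15, 0.05], TOY_EMBEDDINGS, k=2)
```

Items whose vectors point in nearly the same direction as the query score close to 1, which is how semantically related content ends up ranked together.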
DataStax’s vector search uses vector columns as a native data type in Astra DB. This means that Astra DB users can query and search vector data just like any other type of data. The availability of vector database capabilities in Astra DB precedes its availability in the open-source Cassandra project. However, the feature has been added to the open-source project and will be part of the upcoming Cassandra 5.0 release later this year.
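As a rough sketch of what a native vector column looks like in practice, the Python snippet below assembles CQL statements in the style documented for vector search in Cassandra 5.0. The keyspace, table, and column names are hypothetical, the three-dimensional vector is chosen only for readability, and the exact syntax accepted by a given Astra DB or Cassandra version should be checked against its documentation. The snippet only builds strings and does not connect to a database.

```python
# Hypothetical schema: a table with an ordinary text column and a vector column.
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS demo.documents ("
    "id int PRIMARY KEY, body text, embedding vector<float, 3>);"
)

# Vector search needs an index on the vector column (storage-attached index).
CREATE_INDEX = (
    "CREATE CUSTOM INDEX IF NOT EXISTS documents_embedding_idx "
    "ON demo.documents (embedding) USING 'StorageAttachedIndex';"
)


def ann_query(table, column, query_vector, limit):
    """Render an approximate-nearest-neighbor SELECT for display purposes.

    String formatting is used here only to show the shape of the query;
    real application code should use the driver's bound parameters instead.
    """
    literal = "[" + ", ".join(repr(float(x)) for x in query_vector) + "]"
    return (
        f"SELECT id, body FROM {table} "
        f"ORDER BY {column} ANN OF {literal} LIMIT {int(limit)};"
    )


query = ann_query("demo.documents", "embedding", [0.1, 0.2, 0.3], 2)
```

The point of the example is that the vector column is declared and queried alongside regular columns such as `body`, rather than living in a separate system.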
Cassandra’s extensible data type architecture allows for the incorporation of additional native data types over time. As a native data type, vectors (and other data types) are integrated with Cassandra’s distributed index system. Because vector indexes are distributed in the same way as the rest of the data, organizations can scale to large datasets with millions or even trillions of vectors.
Astra DB now also supports LangChain, an open-source framework for building applications on top of LLMs. With this integration, Astra DB’s vector search results can be passed through LangChain to a model, which then generates responses and recommendations grounded in those results.
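The pattern of feeding search results into a model can be sketched in a few lines of plain Python. This is not LangChain's API and not DataStax's integration: `build_prompt`, `answer`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the retriever and model are stand-in stubs so the example runs without any external service.

```python
def build_prompt(question, passages):
    """Assemble a prompt that asks a model to answer from retrieved passages."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, retrieve, generate, k=3):
    """Retrieve up to k passages for the question, then ask the model to respond."""
    passages = retrieve(question, k)
    return generate(build_prompt(question, passages))


def fake_retrieve(question, k):
    """Stub standing in for a vector search; returns fixed passages."""
    corpus = [
        "Astra DB is a managed Cassandra service.",
        "Vector search runs on AWS, Azure, and Google Cloud.",
    ]
    return corpus[:k]


def fake_generate(prompt):
    """Stub standing in for an LLM call; reports the prompt size only."""
    return f"(model output for a {len(prompt)}-character prompt)"


reply = answer("Where does vector search run?", fake_retrieve, fake_generate, k=2)
```

In a real application the retriever would run a vector search against the database and the generator would call an LLM, with a framework such as LangChain wiring the two together.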
The availability of vector capabilities in Astra DB is a significant step toward making generative AI a reality for enterprise users. DataStax is excited about this development and ready to support customers looking to incorporate generative AI into their production environments this year.
With its commitment to AI and ML, DataStax continues to enhance its platform to meet the evolving needs of organizations. By providing a trusted and scalable database platform that supports both traditional and AI workloads, DataStax empowers businesses to unlock the full potential of their data.
The vector database market includes several approaches and vendors. Purpose-built vendors like Pinecone and open-source projects like Milvus offer vector database solutions. Additionally, some existing database platforms, including MongoDB and PostgreSQL, have added support for vector search.
As the demand for AI and ML continues to grow in enterprises, the availability of vector capabilities in Astra DB gives organizations a powerful toolset to leverage their data and enhance their AI initiatives. With DataStax’s ongoing commitment to innovation and the integration of advanced technologies, the future of AI and ML looks promising for enterprises using Astra DB.