assistants api Agency Swarm with Third Party and Open Source Models I have recently been answering a bunch of questions from Agency Swarm users looking to leverage Astra Assistants. Agency Swarm is built on top of OpenAI's Assistants API, and folks have been voicing a desire to use it with other model providers and with open source models.
openai What is Astra Assistants Astra Assistants is a drop-in replacement for OpenAI's Assistants API that supports third-party LLMs and embedding models and uses AstraDB / Apache Cassandra for persistence and ANN. You can use our managed service on Astra, or you can host it yourself since it's open source.
astra Connecting to Astra from DataGrip via JDBC A few users have been asking me lately about connecting to DataStax Astra from different developer tools. As a result I am planning to do a series of quick posts around these starting
akka-persistence-cassandra Astra and Akka-Persistence I've been chatting with a few folks that run akka-persistence backed by Cassandra using either the akka-persistence-cassandra project directly or via the Lagom microservices framework. These folks are often big fans of
DataStax Proxy for DynamoDB™ and Apache Cassandra™ - Preview Yesterday at ApacheCon, our very own Patrick McFadin announced the public preview of an open source tool that enables developers to run their AWS DynamoDB™ workloads on Apache Cassandra. With the DataStax Proxy
Integrate Spark Metrics using DSE Insights Metrics Collector Metrics and visibility are critical when dealing with distributed systems. In the case of DSE Analytics we are interested in monitoring the state of the various Spark processes (master, worker, driver, executor) in
DSE Gremlin Queries: Good, Better, Best Why Some Gremlin Queries Run Faster than Others Intro With great power comes great responsibility. --Spiderman's Uncle The Gremlin language gives users great power in the form of traversal expressivity. With the dozens
Large Graph Loading Best Practices: Tactics (Part 2) The previous post introduced DSE Graph and summarized some key considerations related to dealing with large graphs. This post aims to: describe the tooling available to load large data sets to DSE Graph
Large Graph Loading Best Practices: Strategies (Part 1) This post is an intro to DSE Graph with a focus on the strategies that should be used to load large graphs with billions of vertices and edges. For those familiar with DSE
Cluster Migration - Keeping simple things simple I often get asked about the different ways to move data across DSE clusters (prod to qa, old cluster to new cluster, multi-cluster ETL). There are different options for these ranging from custom
C* schema changes and compatible types All the schema operations that can be done in c* are done without downtime. You should limit these actions as a best practice to 1 client (not multiple concurrent clients) to avoid schema
On Cassandra Collections, Updates, and Tombstones update I was chatting with a user today who referenced this old post. Most of it is still relevant but sstable2json is no longer supported in modern c*. The new tool is sstabledump.
Tuning DSE Search - Indexing latency and query latency Introduction DSE offers out-of-the-box search indexing for your Cassandra data. The days of double writes or ETLs between separate DBMS and Search clusters are gone. I have my cql table,
Things you didn't think you could do with DSE Search and CQL Intro CQL and DSE Search promise to make access to a Lucene-backed index scalable, highly available, operationally simple, and user friendly. There have been a couple of developments in DSE 4.8
Minimizing DSE Search (solr) Indexes Intro / why? Search query performance depends on our ability to utilize the OS page cache effectively to keep search indexes hot. The smaller the size of your indexes, the easier it will be
Interpreting Cassandra Repair logs and leveraging the OpsCenter repair service Introduction to repairs and the Repair Service Cassandra repairs consist of comparing data between replica nodes, identifying inconsistencies, and streaming the latest value for mismatched data. We can't compare an entire cassandra
cassandra Cassandra Deletes - Understanding Range Tombstones Cassandra Deletes : An Introduction In a distributed database, replication is key for ensuring high availability and performance. Once data gets deleted from a node in Cassandra, having a good understanding of c* deletes
cassandra Using the Cassandra Data Modeler to Stress and Size C*/DSE Instances Summary The main drivers behind Cassandra performance are: hardware, data model, and application-specific design and configuration (quick link to the modeler Cassandra Data Modeler for those that are just looking for the tool)
Using Brian's cassandra-loader/unloader to migrate C* Maps for DSE Search compatibility Intro Using map collections in DSE Search takes advantage of dynamic fields in Solr for indexing. For this to work, every key in your map has to be prefixed with the name of
How to use tobert's effio Summary The most critical OS subsystem for the performance and stability of a database like Cassandra is disk. Tobert wrote an excellent Go utility called effio that harnesses the power of fio and
Welcome Here are some things you may be interested in: Cassandra Data Modeler Matches OpsCenter Import/Export - The tool OpsCenter Import/Export - The Blog Post DataStax Startup Program