Datos IO: Reinventing Data Protection for 3rd Platform Era of Next-Gen Applications and Cloud Databases
Tarun Thakur, Co-founder & CEO

With the introduction of cloud, mobile, and Big Data, CIOs are adopting next generation applications, such as analytics and IoT, that gather large amounts of data at a high ingestion rate and process that data in real time to deliver actionable insights. These applications are deployed either on on-premise distributed databases, such as Apache Cassandra (DataStax) and MongoDB, or on cloud-native databases, such as Amazon DynamoDB and Google BigTable. The goal of these databases is to run mission-critical applications without introducing latency. The challenge is that the non-relational databases supporting these applications lack enterprise-class data protection solutions, putting enterprises at risk of data loss.
Silicon Valley-based Datos IO has developed the industry's first distributed data protection software built from the ground up for next generation distributed databases. With scalable versioning, enterprises can protect their data at any backup interval and at the granularity of a column family (Cassandra) or a collection (MongoDB). Reliable, orchestrated recovery at scale restores database tables from a point-in-time version to a running database, which can have a different topology than the original source cluster. The product's industry-first semantic deduplication capability disrupts the traditional notion of block-level deduplication by providing space-efficient backups for scale-out databases. Developed as a scale-out software platform, Datos IO provides high availability of the data protection infrastructure, as well as the high-throughput performance required to meet low recovery point objectives (RPO) for large-scale clusters (30 nodes and above). “Together, these features reduce capital storage costs and improve operational efficiency, while empowering enterprises to confidently deploy and scale next generation applications without having to worry about data loss,” says Tarun Thakur, Co-founder and CEO of Datos IO.
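Conceptually, semantic deduplication operates on database rows rather than opaque storage blocks: a distributed database keeps several replica copies of each row, and a semantically aware backup can collapse them to one logical copy per primary key. The sketch below is a hypothetical illustration of that idea only, assuming last-write-wins by timestamp; it is not Datos IO's actual implementation.

```python
from collections import namedtuple

# A replica's copy of a row: primary key, value, and write timestamp.
Row = namedtuple("Row", ["key", "value", "ts"])

def semantic_dedup(replica_rows):
    """Keep one logical copy per primary key (the latest write),
    instead of deduplicating raw storage blocks."""
    latest = {}
    for row in replica_rows:
        kept = latest.get(row.key)
        if kept is None or row.ts > kept.ts:
            latest[row.key] = row
    return list(latest.values())

# Example: three replicas of a 2-row table (replication factor 3).
rows = [
    Row("user:1", "alice", 100), Row("user:2", "bob", 90),   # replica A
    Row("user:1", "alice", 100), Row("user:2", "bob", 90),   # replica B
    Row("user:1", "alice2", 120), Row("user:2", "bob", 90),  # replica C, newer write
]
backup = semantic_dedup(rows)
# The backup stores 2 logical rows instead of 6 replica copies.
```

A block-level deduplicator would see six differently laid-out SSTable regions; a row-level view knows they are the same two logical records.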
According to Thakur, database administrators (DBAs) responsible for managing Apache Cassandra clusters and MongoDB databases are forced to maintain a series of manual scripts to provide backup and recovery services. “The consequence is an operationally heavy, error-prone process that is insufficient for enterprise-grade needs,” he adds.
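For context, such hand-rolled tooling is typically built around Cassandra's `nodetool snapshot` command, run node by node. The dry-run sketch below (host names, keyspace, and paths are hypothetical) only prints the commands it would run:

```shell
#!/bin/sh
# Sketch of the kind of manual backup script a DBA might maintain for a
# three-node Cassandra cluster. All hosts and paths here are hypothetical.
KEYSPACE="events"
SNAPSHOT_TAG="backup_$(date +%Y%m%d_%H%M%S)"
for HOST in cass-node-1 cass-node-2 cass-node-3; do
    # `nodetool snapshot` creates hard-linked SSTable snapshots on each node.
    echo ssh "$HOST" nodetool snapshot -t "$SNAPSHOT_TAG" "$KEYSPACE"
    # Each node's snapshot must then be shipped off-box by hand.
    echo ssh "$HOST" tar czf "/backups/${HOST}_${SNAPSHOT_TAG}.tgz" \
        "/var/lib/cassandra/data/${KEYSPACE}"
done
# Restore is harder still: snapshots are per-node and not cluster-consistent,
# so recovery requires careful manual coordination across all nodes.
```

The operational burden Thakur describes comes from exactly this per-node, uncoordinated structure: there is no single point-in-time version of the cluster to restore from.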
Using Datos IO, application admins and database administrators can manage and recover data at scale for their next generation databases
“Using Datos IO, application and database administrators can manage all backup and recovery policies for their next generation databases,” says Thakur. The ability to reliably recover data at scale enables DBAs and DevOps teams to use their storage and compute resources efficiently, while meeting stringent business and application uptime requirements.
One of the largest consumer financial institutions in the U.S., and a key early adopter of Datos IO, tackled its database issues using the company's software. The customer built an event hub: a collection of all touchpoints with its customers, gathered from various devices. All of this data was stored in its on-premise Cassandra cluster in large batches and updated multiple times a day. The customer observed that some batches could contain corrupted data, and identifying the bad batches was difficult. It turned to Datos IO to create a backup every time a new batch was about to be ingested into the Cassandra cluster. The solution also helped perform checks after the ingest, so that in case of any errors the data could be recovered quickly without impacting response time for the service. Similar examples exist across a broad range of verticals, such as financial services, retail, e-commerce, security, and education.
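The customer's workflow can be sketched generically: take a point-in-time version before each batch load, validate after the ingest, and roll back to the pre-batch version if the batch proves corrupted. The helpers below are toy in-memory stand-ins for illustration only, not a real Datos IO API:

```python
# Toy stand-ins: the "table" is a list, "versions" is the backup store.
table, versions = [], []

def backup():
    """Save a point-in-time copy of the table; return its version id."""
    versions.append(list(table))
    return len(versions) - 1

def restore(version_id):
    """Roll the table back to a previously saved version."""
    table[:] = versions[version_id]

def validate(batch):
    """Stand-in corruption check: reject batches containing None."""
    return all(x is not None for x in batch)

def guarded_ingest(batch):
    """Version before ingest, validate after, roll back on failure."""
    version = backup()
    table.extend(batch)
    if not validate(batch):
        restore(version)
        return False
    return True

guarded_ingest([1, 2, 3])        # clean batch: kept
ok = guarded_ingest([4, None])   # corrupted batch: rolled back
# ok is False and the table is back to its pre-batch state [1, 2, 3].
```

The point of the pattern is that a bad batch costs one fast restore rather than a forensic hunt through the cluster for corrupted rows.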
Datos IO aims to ease every enterprise's transition to the new data-centric world of high-volume, high-ingestion-rate, real-time applications and distributed databases. “We want to continue defining and innovating the way data is protected and managed at scale, enabling application developers and IT admins to confidently deploy and scale their applications without worrying about data loss,” concludes Thakur.