A farewell to Dune v1

Performance bottlenecks, manual reprovisioning, and the other challenges that led us to sunset Dune v1

Now that Dune v1 is fully gone, it's time to reflect on its legacy.

When I joined Dune, the infrastructure powering Dune v1 was not called “Dune v1”; it was just, well, the databases. Every blockchain had its own PostgreSQL database. We did not use Amazon RDS, as it would have been extremely expensive. Instead, we created individual EC2 instances for each database and provisioned Postgres on them via Ansible.

After a while, we ran into some interesting problems!

One of those problems was performance. We had one EC2 instance per blockchain, but Ethereum was by far the most queried. We were hitting a wall in terms of input/output operations per second (IOPS): our EBS disks just couldn't keep up with the volume of data being read and written at the same time. So we built a primary/replica setup, where data would be written into the primary database, backed by EBS for long-term storage, and replicated to a bunch of read-only replicas. These read replicas needed performance but could be replaced easily, which made them a great fit for EC2 instances with extremely fast local NVMe disks (EC2 Instance Store).

We created the primary/replica setup via PostgreSQL physical replication + WAL-shipping to S3 with wal-g on the write side, and a combination of pgbouncer and haproxy to fan out read-only queries to several replicas on the query side. Adding several new moving pieces at once seemed a bit dangerous at first. With the right amount of monitoring & alerting though, we were confident we could quickly find and debug issues. And it was actually rock-solid until the end! At Dune v1’s peak, we were running over a million IOPS across our Ethereum read replicas without breaking a sweat.

For comparison, the maximum IOPS a standard EBS volume can sustain is 16k, which is less than 2% of this number.
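To make that comparison concrete, here is a minimal sketch of the arithmetic in Python. It only uses the two figures quoted above; "over a million" is rounded down to exactly one million as an assumption, and the function names are made up for illustration.

```python
import math

# Figures quoted in the post. The peak is rounded down to exactly one
# million (the post says "over a million"), which is an assumption.
PEAK_REPLICA_IOPS = 1_000_000
EBS_MAX_IOPS = 16_000


def ebs_share_of_peak(ebs_iops: int, peak_iops: int) -> float:
    """Return one volume's IOPS ceiling as a percentage of the peak load."""
    return 100 * ebs_iops / peak_iops


def volumes_needed(ebs_iops: int, peak_iops: int) -> int:
    """Return how many volumes, each at its ceiling, would match the peak."""
    return math.ceil(peak_iops / ebs_iops)


if __name__ == "__main__":
    share = ebs_share_of_peak(EBS_MAX_IOPS, PEAK_REPLICA_IOPS)
    count = volumes_needed(EBS_MAX_IOPS, PEAK_REPLICA_IOPS)
    print(f"One volume covers {share:.1f}% of the peak")
    print(f"{count} volumes at their ceiling would be needed to match it")
```

With these inputs, a single volume covers 1.6% of the peak, and it would take 63 volumes running flat out to match it.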

What we lost sleep over at night was that all of these boxes were precious little snowflakes, and the recovery setup was "rerun Ansible and hope the instance restarts" 🤞. It was okay – we had backups and EBS snapshots, and in the worst case we would just rerun ingestion from the start of the blockchain – but it was not a great situation to be in.

We also tried using spot instances for read replicas, as they were much cheaper than on-demand instances. The AWS Spot Instance Advisor showed the risk of that instance type being reclaimed was "less than 5%" over a month. Turns out, we had our first instance reclaimed after five days. The next one got reclaimed the following day (during an offsite in Lisbon, no less! I believe Fredrik still has the photo of me restoring from backups during dinner). The next one got reclaimed two days later. Re-provisioning a read replica was a manual process that took 2 hours, which would normally be okay once a year but not twice a week. So we quickly gave up on spot instances.

Then, the database storing data for the Binance Smart Chain (BSC) reached the maximum size we could use for an EBS volume, 16 TB, so we investigated using ZFS to compress the data on disk. It was a huge failure! The disk savings were real, but so were the performance hits. We spent some time trying to optimise it all and eventually gave up, as it was a big time sink and we had already planned on moving to Dune v2. So we migrated the BSC database from EBS to instance store as well, where any EC2 issue would have meant losing the data and restoring from S3 backups, as we knew we could no longer rely on EBS snapshots.

These were exciting times, which is an insult in SRE-land. Glad that Dune v1 lives on in tales of yore rather than on our AWS accounts!
