Creating 0.3-second returns per request

How Onspring maintains high performance across the globe

We’ve been asked, on many occasions, how Onspring manages to be so insanely fast and responsive as a product. The most recent question came from a customer with users spread throughout the world. They wanted to know why their users in Asia were seeing none of the performance impact one would expect when hitting our data center in the heart of the United States. Their average response time when making requests from their Asian offices to Onspring was less than 1/3rd of a second, or roughly 0.3 seconds per request. How is that possible? Certainly the latency alone, the time it takes a signal to travel from one hemisphere to the other, would have a major impact. So, how does Onspring do it?

To explain how we achieve an overall average of roughly 1/10th of a second per request (0.108 seconds, to be exact), we have to go back to the beginning:

One of the two primary focuses we’ve had for Onspring, even before our first line of code, was the performance and responsiveness of every user interaction. I’ll admit that my personal, somewhat intense focus on “performance as a feature” wasn’t initially driven by some altruistic love for soon-to-be customers. It was all about personal frustrations with software, and the learned ability to solve those problems. As an added bonus, we had time to channel that frustration with software performance while working on an outside project with a group of friends. We had a very limited budget and had to keep up with millions of requests per hour during our live coverage. We spent hours and days and weeks researching and honing our performance-tuning skills. We even developed some really cool technology to handle our specific use case. Sadly, that tech didn’t have a ton of use for Onspring, but the skills we learned did.

It’s no mistake that some of the most well-used and well-liked products on the internet are known and loved for their performance.

Onspring Results

More important than the average response time are the percentiles of response time. These results cover the time between Onspring receiving the request and sending the last byte of the response.

99th Percentile: < 1.010 seconds
95th Percentile: < 0.421 seconds
90th Percentile: < 0.275 seconds
50th Percentile: < 0.025 seconds
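
If you want to produce the same kind of report for your own system, here is a minimal sketch of how percentiles can be calculated from a list of response times. The sample timings are made up for illustration (they are not Onspring data), and the nearest-rank method shown is just one common way to define a percentile, not necessarily the one used for the figures above.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, which avoids float rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]

# Hypothetical response times in seconds, invented for this example.
response_times = [0.012, 0.018, 0.021, 0.024, 0.030,
                  0.095, 0.210, 0.260, 0.400, 1.000]

mean = sum(response_times) / len(response_times)

for pct in (50, 90, 95, 99):
    print(f"p{pct}: {percentile(response_times, pct):.3f}s")
print(f"mean: {mean:.3f}s")
```

With these made-up samples the mean is 0.207 seconds while the median is 0.030 seconds. A few slow requests drag the average up, which is why the percentiles tell you more about what users actually experience.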

“Did I read that right?”

Yes, 99% of our requests are served in under 1.01 seconds… and half of them are served in under 0.025 seconds. That’s 1/40th of a second!

“Does that include data to the entire world, even mainland China, India, and Australia?”

Absolutely. The only requests we’ve excluded are the ones which are intentionally slow. That’s basically anything related to passwords and other proprietary security functions which must prevent brute-force attempts.

No secret rules for software performance

Build performance into your design

From the moment we begin to write a design specification, or create a bug entry, all of our employees consider the performance and usability implications of each request. These are further expanded upon as we progress through the stages of development. Every design element is created in a way that uses as few performance-hogging resources as possible. Back-end changes are reviewed for possible caching implications. Performance, as a part of usability and functionality, is a requirement of all features.

When everything you build starts with usability and performance in mind, you rarely create poorly performing features.

Test early, test often

When our development team encounters a decision point with more than one option for implementation, we immediately test the options against known scenarios in order to pick the most effective solution. Many times, both solutions are equally performant. However, in our time, we’ve encountered some absolutely massive performance improvements from things as simple as a single-line change to how we handle cached arrays. One specific change involved how we enumerated over a list. That change alone saved about twenty percent of the time for any page request that contained a grid of items.
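
As a rough illustration of that kind of side-by-side test, the sketch below times two ways of walking a list with Python’s standard timeit module. The two functions are hypothetical stand-ins, not the actual change we made, and the numbers you see will depend on your machine and runtime.

```python
import timeit

# A known scenario: a list of 10,000 items, like rows in a grid.
items = list(range(10_000))

def sum_by_index(values):
    """Option A: walk the list by position."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total

def sum_by_iteration(values):
    """Option B: iterate over the items directly."""
    total = 0
    for value in values:
        total += value
    return total

# Both options must give the same answer before timing means anything.
assert sum_by_index(items) == sum_by_iteration(items)

CALLS = 50
for candidate in (sum_by_index, sum_by_iteration):
    # Take the best of several repeats to reduce noise from other processes.
    best = min(timeit.repeat(lambda: candidate(items), number=CALLS, repeat=3))
    print(f"{candidate.__name__}: {best / CALLS * 1e6:.1f} microseconds per call")
```

The important habits are checking that both options produce identical results and measuring against data that looks like production, rather than guessing which one is faster.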

Performance testing tools

We run all of our code through performance-testing tools, and we use several of them. We’re concerned with line-level timings in code, latency in HTTPS requests, narrow caching, deep caching and database latency. You may have a method that takes half a second to run, but only runs once an hour. Spending time on it may be a waste. You’ll gain a ton more by fixing the line of code that’s called 4 million times an hour, even if you only shave 0.00001 seconds off each call.
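
The arithmetic behind that comparison is worth spelling out. The short sketch below uses the figures from the paragraph above; the function name is just a helper for this example.

```python
def seconds_per_hour(seconds_per_call, calls_per_hour):
    """Total time attributable to a piece of code over one hour."""
    return seconds_per_call * calls_per_hour

# A slow method that runs once an hour: at most half a second to win back.
slow_but_rare = seconds_per_hour(0.5, 1)

# A tiny saving on a line that runs 4 million times an hour.
hot_path_saving = seconds_per_hour(0.00001, 4_000_000)

print(f"slow method, total cost: {slow_but_rare:.1f} s per hour")
print(f"hot line, total saving:  {hot_path_saving:.1f} s per hour")
print(f"ratio: {hot_path_saving / slow_but_rare:.0f}x")
```

Removing the slow method entirely would save half a second per hour. The tiny improvement on the hot line saves about 40 seconds per hour, roughly 80 times as much.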

Compression and minification

When transferring data via the web, it’s always best to ensure that you’re doing everything possible to compress the output to the browser. You can validate how well you’re doing this with some simple online tools, such as the HTTP Compression Test. In most web servers, compression is relatively easy to configure. We have a different way of doing it that’s a bit faster. However, we’ve found the web-server-based solutions are nearly as fast as you can get. Just remember to make sure you’re compressing everything, even your dynamic content.
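
To make the idea of compressing dynamic content concrete, here is a minimal sketch using Python’s standard gzip module. This is not how Onspring does it. The compress_body helper, its min_size threshold, and the sample records are all assumptions made for this example, and the Accept-Encoding handling is simplified (it ignores quality values such as q=0).

```python
import gzip
import json

def compress_body(body, accept_encoding, min_size=256):
    """Return (payload, headers) for a response body given as bytes.

    Hypothetical helper: gzips the body only when the client advertises
    gzip support and the body is large enough to be worth compressing.
    """
    headers = {"Vary": "Accept-Encoding"}
    # Simplified parsing: strip any ";q=..." parameter from each token.
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered and len(body) >= min_size:
        payload = gzip.compress(body, compresslevel=6)
        if len(payload) < len(body):
            headers["Content-Encoding"] = "gzip"
            return payload, headers
    return body, headers

# Dynamic content: a JSON grid of made-up records generated per request.
rows = [{"id": i, "name": f"Record {i}", "status": "Open"} for i in range(500)]
body = json.dumps(rows).encode("utf-8")

payload, headers = compress_body(body, "gzip, deflate, br")
print(f"original: {len(body)} bytes, sent: {len(payload)} bytes")
print(headers)
```

Repetitive output such as JSON or HTML grids tends to compress very well, which is why dynamic responses are worth including and not just static files.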

Of course, before you bother with compression, you should also do your best to minify your content. Minification of static resources should be done prior to deploying them into production. For dynamic resources, there are a number of methods you can use. In our case, there was already a wonderful library in the open-source community that was the best performing solution we could find (or create).

Latency busting

Often the most difficult issue to solve is that of latency. TLS negotiation, network latency and slow systems (old laptops, phones) can create major latency that’s nearly impossible to solve. Each of these has its own solutions, which depend entirely on your user base. If you’ve got tons of people who are potentially on old hardware, you’ll want to ensure you’re offloading most of the processing to your systems, and not theirs.

You can reduce network latency by using CDNs. However, in Onspring’s case, those were out of the question for information security reasons. You can also reduce it by simply having the user make fewer requests to your system. Consolidate the data you’re sending into fewer requests for a big win.
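
A back-of-the-envelope model shows why fewer requests is such a big win for distant users. Everything in the sketch below is an assumption for illustration: the round-trip time, the server times, the request counts and the limit of six parallel connections are invented, and real browsers and protocols behave in more complicated ways.

```python
import math

def page_load_seconds(request_count, round_trip_s, server_time_s,
                      concurrent_connections=6):
    """Rough model: requests go out in rounds of parallel connections,
    and each round costs one network round trip plus server time."""
    rounds = math.ceil(request_count / concurrent_connections)
    return rounds * (round_trip_s + server_time_s)

# Hypothetical user far from the data center: 200 ms round trip.
many_small = page_load_seconds(request_count=12, round_trip_s=0.200,
                               server_time_s=0.025)

# Same data consolidated into two slightly heavier requests.
consolidated = page_load_seconds(request_count=2, round_trip_s=0.200,
                                 server_time_s=0.040)

print(f"12 small requests: {many_small:.2f} s")
print(f"2 consolidated:    {consolidated:.2f} s")
```

In this model the consolidated page finishes in about half the time, even though each request does more work on the server, because the user pays the long round trip once and not twice.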

For TLS and others, we created a proprietary mix of solutions that got as much of the gunk out of the system as possible without having any impact on the level of security. It wasn’t easy, but one of our security vendors found a solution that worked wonders.

Sampling of global response times from Kansas City

With the items from the prior section, you can get about 80% of the way there with performance. In fact, nothing in there should be a surprise for anyone who’s spent months doing performance research. All of that information is on the internet, in much greater detail than I’ve provided. The rest will require a lot of dedication, prototyping and time.

In our case, dedication helped us to create a unique set of additional “features” that separate us from all the rest. As you can imagine, we’re not going to share those on our public blog.