After 8 years of working on high-volume web sites, I’ve seen a lot of interesting scaling techniques. While it’s true that very specific strategies can eke out enormous efficiency gains, complication doesn’t come without a cost. In my experience, the scaling problems that a moderately-sized site will encounter can be solved with:

- Asynchronous work queue
- Proper database use
- Appropriate caching

The examples below are written in the context of Ruby on Rails, but should translate well to other languages and frameworks.

Asynchronous Work Queue

As a web site grows, controllers can easily become bloated and slow down with all the additional tasks they might have to perform, such as:

- Analytics tracking
- Sending emails
- Creating additional database records
- Inadvertent N+1 query side effects

For example, imagine a user wants to delete their profile and the thousands of records associated with it; this can take a while. And it really doesn’t have to happen immediately; the site could instead acknowledge that the request has been received and tell the user they’ll be sent an email when it’s finished. You could then perform the actual delete in an asynchronous work queue. An async work queue consists of:

- A queue of performable jobs and job parameters (often in Redis, MongoDB, MySQL, etc).
- A pool of workers that pluck jobs from the queue and perform them.

Basically, anything that can be deferred from the controller should be:

- API requests to 3rd party services. These are a must-defer because their response time can be variable or slow, and the web request shouldn’t block rendering the response on them. Even worse, an API provider that is failing could bring down your site if you don’t properly detect and abort slow responses. Sidestep the problem and perform as many API calls as possible in async workers.
- Email. Basically an API request.
- Database record creation. While often orders of magnitude faster than an API request, inserts can still take 10–100 milliseconds because of factors like database load and the number of indexes and foreign key constraints.

The only reason not to defer is if the result is important to the response; e.g. a payment request might block so that the customer is immediately notified that payment has failed.

Outside of the web request cycle, async work queues are very useful for parallelizing large workloads. For example, imagine a daily scheduled script that announces new products to a 10k email distribution list. If this script sends all the emails in sequence, it could take upwards of an hour. With a dozen async workers it might take closer to 5 minutes. It’s also simple to temporarily scale the worker count if you need to handle job or traffic spikes.

Asynchronous workers are tremendously effective for a variety of problems and have a low implementation cost, making them a pragmatic solution in a variety of situations.

Proper Database Use

The programming language your web site is written in is probably very general purpose; it’s optimized for flexibility at the cost of efficiency. A database is a highly optimized computation engine for relational algebra. Literally hundreds (if not thousands) of books have been written about databases, so I’m only going to mention a few common scenarios I run across.

N+1 Query Problems

One of the biggest complaints I’ve heard about Object Relational Mappers (ORMs), especially ActiveRecord in Rails, is that they make it really easy to write bad queries or sets of queries without knowing it. Consider for example:

    User.all.each { |u| puts u.address }

Seems pretty innocuous. But what’s not obvious is that Rails is doing one query to load the User relation, and then in each iteration of the loop querying the Address table:

    SELECT * FROM users;
    SELECT * FROM addresses WHERE user_id = 1;
    SELECT * FROM addresses WHERE user_id = 2;
    ...

This class of performance problem is called an N+1 query problem, and it is easy to spot in your SQL log: one query for the parent records, followed by N nearly identical queries that differ only in the ID.
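As a rough illustration of that kind of log analysis, here is a minimal sketch in plain Ruby. The helper names (`query_shape`, `repeated_shapes`) and the threshold of 3 are assumptions made for this example, not part of Rails or any logging tool, and `Enumerable#tally` needs Ruby 2.7 or newer. It replaces literal numbers with `?` so that queries differing only by ID collapse into one shape, then reports shapes that repeat.

```ruby
# Replace literal numbers with "?" so queries that differ only by id
# collapse into the same shape.
def query_shape(sql)
  sql.gsub(/\b\d+\b/, "?")
end

# Return a hash of { shape => count } for shapes seen at least `threshold`
# times. The threshold is an arbitrary choice for this sketch.
def repeated_shapes(log_lines, threshold: 3)
  log_lines.map { |line| query_shape(line) }
           .tally
           .select { |_shape, count| count >= threshold }
end

# A hand-written log excerpt in the style of the queries shown above.
log = [
  "SELECT * FROM users;",
  "SELECT * FROM addresses WHERE user_id = 1;",
  "SELECT * FROM addresses WHERE user_id = 2;",
  "SELECT * FROM addresses WHERE user_id = 3;"
]

repeated_shapes(log).each { |shape, count| puts "#{count}x #{shape}" }
```

A real log would need extra normalization (timestamps, quoted strings, bind parameters), so treat this as a starting point rather than a finished tool.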
The fix is just as straightforward: instead of querying in a loop, collect the IDs of your users and query all of them in one statement, using either a subquery or a join:

    SELECT * FROM addresses WHERE user_id IN (SELECT id FROM users);
    SELECT * FROM users INNER JOIN addresses ON (users.id = addresses.user_id);

Ruby on Rails provides a few “eager loading” mechanisms to make it easy to avoid N+1 query patterns:

    User.includes(:address).each { |u| puts u.address }

Under the hood, this instructs Rails to use the SELECT..FROM..WHERE id IN (?) pattern to select against addresses with the user_ids from users.

Missing Indexes

When querying a table by any column, the database engine has two options: look at each record in the table, or use an index, a highly-optimized data structure that can be used to select or eliminate records before looking at the table. On tables smaller than 10k records, it can be easy to not realize you’re missing an index because the query is fast enough. Beyond 10k though, queries will become noticeably slow.

Foreign key relationships (other_table_id) are an obvious place to put an index. Covering indexes, where all the information you need to return is in the index, can speed up queries by obviating the need to access the table itself at all. If you’re not sure whether you’re using an index, you can use your database engine’s EXPLAIN command:

    EXPLAIN SELECT * FROM foo;

                           QUERY PLAN
    ---------------------------------------------------------
     Seq Scan on foo  (cost=0.00..155.00 rows=10000 width=4)
    (1 row)

    EXPLAIN SELECT sum(i) FROM foo WHERE i < 10;

                                 QUERY PLAN
    --------------------------------------------------------------------
     Aggregate  (cost=23.93..23.93 rows=1 width=4)
       ->  Index Scan using fi on foo  (cost=0.00..23.92 rows=6 width=4)
             Index Cond: (i < 10)
    (3 rows)

This tool will quickly show you the strategy your database engine will use. The documentation for Postgres’ EXPLAIN should be followable even if you’re using another database. The excellent database administrator tool PgHero will also automatically surface index add/drop recommendations based on usage statistics.

Serialized Data

Relational algebra works on structured data: pre-defined tables, columns, and relationships. Semi-structured data such as XML and JSON is not always predefined though. The workaround has traditionally been to serialize your XML or JSON and store it as a text record. Recent database releases (PostgreSQL, for example) have added native column types for JSON (original text stored) and JSONB (binary structure).

API responses are popular JSON structures to store in a database. Using a native JSON format, API response details can easily be queried in the database instead of in your programming language after deserialization. Note the ->> operator, which returns the value as text so it can be cast to an integer:

    SELECT * FROM stripe_charges WHERE (transfer->>'amount')::int > 1000;

Or you can pluck individual values from a large JSON object:

    SELECT (transfer->'created'), (transfer->>'amount')::int
    FROM stripe_charges;

Aggregate in the Database

Imagine you wanted to select all users who have purchased more than $100:

    users = User.includes(:payments).select do |u|
      u.payments.map(&:amount).sum > 100
    end

This will instantiate an ActiveRecord object for every user in the table and every payment in the table, and then throw away the records it doesn’t need. Instead, we can instruct the database to do the filtering and only return User records that meet the criteria:

    User.joins(:payments)
        .group('users.id')
        .select("users.*, SUM(payments.amount) AS total_amount")
        .having("SUM(payments.amount) > 100")

This will generate the appropriate JOIN, GROUP BY, and HAVING clauses, along these lines:

    SELECT users.*, SUM(payments.amount) AS total_amount
    FROM users
    INNER JOIN payments ON payments.user_id = users.id
    GROUP BY users.id
    HAVING SUM(payments.amount) > 100

Database Wrap-up

Databases are mature, highly-optimized data stores. Whenever possible, let databases do what they’re good at instead of doing it more slowly in your web programming language.
Appropriate Caching

A common saying in the development community is that there are only two hard things in computer science: naming, caching, and off-by-1 errors. Caching can be hard to get right, leading to hard-to-track-down stale value bugs. I’m definitely not claiming that caching is a silver bullet; but it is certainly a lead bullet that should be in your arsenal.

Often, entire responses can be cached, such as:

- API endpoints with the same response for every user (e.g. product catalog, search typeahead database)
- Logged out homepage
- sitemap.xml

These cacheable responses are some of my favorite low-hanging fruit for performance improvements. In the case of a cache hit, no template engine needs to be invoked and the web server immediately begins serving content to the client. It’s hard to beat a 0.1 ms response. If 50% of traffic enters via the logged out homepage, caching the entire page can be a huge win, especially because Google considers page response time as a signal when ranking search results. Beyond user and ranking benefits, the cached responses will also lighten server load.

If at first it doesn’t seem like the entire response is cacheable, think again. A logged-in homepage might just be a logged out homepage with a logged-in menu bar; in which case you can quickly serve the logged-out homepage, and then use AJAX to replace the logged-out menu with the logged-in menu.

Over 50% of requests to www.cameralends.com were served in under 10ms.

Even if an entire response isn’t cacheable, expensive bits can be:

- Results of expensive queries (e.g. aggregation queries like SELECT COUNT(*), complicated joins)
- View partials that require heavy computation, are used many times on the same page, or can be cached for a very long time
- 3rd-party API responses

A common problem with caching is the cache stampede problem:

Cache stampede: _A cache stampede is a type of cascading failure that can occur when massively parallel computing systems with caching…_ (en.wikipedia.org)

The problem deals with cache expiration; imagine a JSON endpoint that takes 10s to generate and is about to expire:

- Until expiration, it returns in 0.1 ms
- It expires and is evicted
- The first process that tries to access the cached value finds that it is missing, and begins performing the 10s calculation
- 0.1 seconds later, the second process tries to access the cached value, finds it missing, and also kicks off a 10s calculation

If requests come in every 0.1s, 100 expensive requests will have begun before the original 10s request finishes. If it ever finishes! If it’s an expensive database query, the load increases drastically. It could take 20s (or more!) for the first query to finish, and that could be enough to cause downtime. If there are only 100 web workers, then it will cause downtime.

A common solution is to use a scheduled job to cache the value before it expires (or maybe even never expire the value, so there’s always a fast response). This strategy can be a bit of boilerplate to add, which can discourage common use. That’s why I favor an asynchronous cache recompute strategy.

Something like a product catalog might only change a few times a day or week. And when it does, it’s acceptable for tens or hundreds of requests to be served a slightly stale catalog.
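To put numbers on the stampede scenario above, here is a small back-of-the-envelope sketch in Ruby. The constant names and the `stampede_size` helper are made up for this example, and it optimistically assumes the calculation still takes exactly 10s under load (in practice it would likely get slower, making things worse).

```ruby
# All times are integer milliseconds to avoid floating-point drift.
COMPUTE_MS     = 10_000 # the 10s calculation
ARRIVAL_GAP_MS = 100    # one request every .1s
WEB_WORKERS    = 100    # size of the web worker pool

# Number of requests that arrive while the cache is still empty, i.e. before
# the recompute started at t=0 has had time to write its result. Each of them
# starts its own copy of the expensive calculation.
def stampede_size(compute_ms, arrival_gap_ms)
  (0...compute_ms).step(arrival_gap_ms).to_a.size
end

concurrent = stampede_size(COMPUTE_MS, ARRIVAL_GAP_MS)
puts "recomputes started before the first one finishes: #{concurrent}"
puts "every web worker busy: #{concurrent >= WEB_WORKERS}"
```

With these inputs the count matches the 100 requests mentioned above, which is exactly the size of the assumed worker pool.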
The asynchronous caching strategy I’ve used works like this:

- Until expiration, it returns in 0.1 ms
- It soft expires: not evicted but marked as stale
- Soft expiration queues an async job to refresh the cached value
- Processes that access the stale value before the async job finishes will receive the stale cached value in 0.1 ms
- After the async job finishes, the cached value is updated and the stale flag is cleared

This strategy will let you always return quickly unless it’s the initial access and the value needs to be computed. Although, you could also have it return nil and degrade gracefully, and queue an async job to fill the cached value for future responses.

Rails provides a simple solution to cache stampedes via race_condition_ttl in [ActiveSupport::Cache::Store#fetch](http://api.rubyonrails.org/classes/ActiveSupport/Cache/Store.html#method-i-fetch). When this parameter is set, the first process that accesses the stale value will:

- Extend the expiration on the cache by race_condition_ttl
- Recompute the cached value

The second and further processes that access that cache key will be returned the slightly stale value. If the recomputed value hasn’t been cached within race_condition_ttl, the next process that accesses the key will bump the expiration and recompute. It’s worth noting that some unlucky requests will be stuck with the task of recomputing, so this strategy will have an effect on user experience.

Static Caching

With heavy amounts of unchanging content, static caching can be a good option. Static site generators like Jekyll can be used to build the entire file structure to serve through Nginx or Apache. On Heroku, you can simply use Rack to serve a static directory.

Concluding Thoughts

This collection of easy wins is far from a comprehensive optimization guide, but it can take the scalability of your web site surprisingly far. Before adding a complicated performance solution, it’s worth checking to see if a dumber, more straightforward solution will work.
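For readers who want to see the soft-expiration idea from the caching section in code, here is a minimal in-memory sketch in plain Ruby. Everything in it is hypothetical: `SoftExpiringCache` is not a Rails class, the `jobs` array merely stands in for a real async work queue, and the injectable clock exists only so the behavior can be exercised without sleeping. It is not thread-safe and is meant to show the flow, not to be dropped into production.

```ruby
# Hypothetical in-memory cache illustrating soft expiration: stale entries are
# kept and served while a queued job recomputes them.
class SoftExpiringCache
  Entry = Struct.new(:value, :fresh_until, :refreshing)

  attr_reader :jobs

  def initialize(ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl
    @clock = clock
    @entries = {}
    @jobs = [] # stand-in for a real async work queue
  end

  def fetch(key, &compute)
    entry = @entries[key]
    # Initial access: nothing to serve yet, so compute inline.
    return store(key, compute.call) if entry.nil?

    # Soft expiry: queue exactly one refresh job, keep serving the old value.
    if @clock.call >= entry.fresh_until && !entry.refreshing
      entry.refreshing = true
      @jobs << -> { store(key, compute.call) }
    end
    entry.value
  end

  # In a real system, workers would pull these jobs; here we run them inline.
  def run_jobs
    @jobs.shift.call until @jobs.empty?
  end

  private

  def store(key, value)
    @entries[key] = Entry.new(value, @clock.call + @ttl, false)
    value
  end
end

now = 0
cache = SoftExpiringCache.new(ttl: 60, clock: -> { now })
builds = 0
catalog = -> { builds += 1; "catalog v#{builds}" }

cache.fetch(:catalog, &catalog) # first access computes inline
now = 61                        # the entry is now stale
cache.fetch(:catalog, &catalog) # served stale, one refresh job queued
cache.run_jobs                  # the "worker" refreshes the value
puts cache.fetch(:catalog, &catalog)
```

The one design choice worth noting is the `refreshing` flag: without it, every request that sees the stale entry would queue its own refresh job, recreating the stampede inside the work queue.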
