This repository has been archived by the owner on Dec 18, 2019. It is now read-only.

Tuning tips

Abe Stanway edited this page Oct 27, 2013 · 8 revisions

Okay, so you've got everything all set up, data is flowing through, and...what? You can't consume everything on time? Allow me to help:

  1. Try increasing settings.CHUNK_SIZE - this increases the size of each chunk of metrics that gets added onto the queue. Bigger chunks == less network traffic.

  2. Try increasing settings.WORKER_PROCESSES - this will add more workers to consume metrics off the queue and insert them into Redis.

  3. Try decreasing settings.ANALYZER_PROCESSES - this all runs on one box (for now), so share the resources!

  4. Still can't fix the performance? Try reducing your settings.FULL_DURATION. If this is set to be too long, Redis will buckle under the pressure.

  5. Is your analyzer taking too long? Maybe you need to make your algorithms faster, or use fewer algorithms in your ensemble.

  6. Reduce your metrics! If you're using StatsD, it will spit out lots of variations for each metric (sum, median, lower, upper, etc.). These are largely redundant, so it might be worth putting them in settings.SKIP_LIST.

  7. Disable Oculus - if you set settings.OCULUS_HOST to '', Skyline will not write metrics into the mini. namespace - this should result in dramatic speed improvements.
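As a rough illustration of tip 6, here is a minimal sketch of how a skip list could thin out StatsD variations. It assumes that SKIP_LIST entries are matched as plain substrings of the metric name; the metric names, the entries, and the in_skip_list helper are all hypothetical, so check them against your own namespace before copying anything.

```python
# Hypothetical entries: StatsD timer variations we are willing to drop.
# Assumption: an entry matches when it appears anywhere in the metric name.
SKIP_LIST = [".lower", ".upper", ".sum", ".median"]


def in_skip_list(metric_name, skip_list=SKIP_LIST):
    """Return True if any skip entry is a substring of the metric name."""
    return any(fragment in metric_name for fragment in skip_list)


# Made-up metric names, shaped like typical StatsD timer output.
metrics = [
    "stats.timers.checkout.mean",
    "stats.timers.checkout.upper",
    "stats.timers.checkout.lower",
    "stats.timers.checkout.sum",
    "stats.timers.checkout.median",
]

kept = [name for name in metrics if not in_skip_list(name)]
print(kept)  # ['stats.timers.checkout.mean']
```

With substring matching, short entries can catch more than you intend (".sum" would also match a metric containing ".summary"), so it pays to keep the entries as specific as you can.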

At Etsy, we have a flow of about 5k metrics coming in every second on average (with 250k distinct metrics). We use a 32-core Sandy Bridge box with 64 GB of memory. We experience bursts of up to 70k TPS on Redis. Here are our relevant settings:

CHUNK_SIZE: 7000  
WORKER_PROCESSES: 2  
ANALYZER_PROCESSES: 25  
FULL_DURATION: 86400
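To get a feel for what those numbers imply, here is a back-of-the-envelope sketch. It assumes a steady 5k datapoints per second spread evenly across the 250k metrics, and that each metric keeps FULL_DURATION seconds of history. Real traffic is burstier than that, so treat the results as rough estimates only.

```python
# Figures quoted above; everything derived from them is an estimate.
METRICS_PER_SECOND = 5000    # average incoming datapoints per second
DISTINCT_METRICS = 250000    # distinct metric names
FULL_DURATION = 86400        # seconds of history kept (24 hours)
CHUNK_SIZE = 7000            # datapoints per chunk placed on the queue

# Average gap between two datapoints of the same metric.
seconds_between_points = DISTINCT_METRICS / float(METRICS_PER_SECOND)

# Datapoints held per metric over the full duration.
points_per_metric = FULL_DURATION / seconds_between_points

# Total datapoints held across all metrics.
total_points = METRICS_PER_SECOND * FULL_DURATION

# How often a full chunk is produced at the average rate.
chunks_per_second = METRICS_PER_SECOND / float(CHUNK_SIZE)

print(seconds_between_points)       # 50.0
print(points_per_metric)            # 1728.0
print(total_points)                 # 432000000
print(round(chunks_per_second, 2))  # 0.71
```

Halving FULL_DURATION halves the total number of datapoints held, which is why tip 4 is worth trying when Redis is struggling.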