I use memcachedb in my deployment of A/Bingo (it isn't limited to that), which avoids using a SQL database because I wanted it to be blazing fast for people who have significantly higher-traffic sites than I do. The typical-case performance on my site is fantastic. The worst-case performance is an abomination -- when it flushes, the site essentially pauses for all users for five seconds.
This is tolerable for me, but only just. The reason I use memcachedb as opposed to vanilla memcached is that I was worried about persistence in the event of a server failure, but given that I can count the resets of my server over 3.5 years on one hand, I might just decide "In the event of a server failure, I lose any A/B tests in progress and have to start over. Oh well!"
Or Tokyo Cabinet / Tyrant, which is still a high performance key/value store but doesn't need to fit everything in RAM. Depends on how much they're storing.
If the data isn't "optional" (i.e. if there's a cache miss, you can't just go to a traditional database for a somewhat higher, but acceptable, cost), the "memcached with persistence" approach isn't going to cut it. You now have a distributed system with state, which is a much more difficult problem.
That's why there are so many distributed storage systems: there's no "one-size-fits-all" solution that can handle every theoretical corner case (even Google hasn't solved that). In my (biased) view, the eventually consistent stores (e.g. the Dynamo-inspired ones, or even the "Friendfeed/Facebook model" of sharded and replicated MySQL databases storing blobs) do seem the most reasonable for web-type problems.
Virtual memory support is currently being added to Redis so that it won't need to fit everything in RAM either. I'm sure antirez can provide some better info on how it'll work.
Hello, VM is actually already in alpha on Git. There is some more work to do, but I don't think the Reddit use case needs Redis VM: they are using MemcacheDB as a persistent cache, and a cache should match the very high performance delivered by memcached and is not required to cache everything. Redis is as fast as or faster than memcached (using clients with the same performance) and is persistent, so it's probably a good fit for this problem.
Instead of VM, Reddit should use Redis EXPIRE I guess, that is, a time to live on cached keys so they auto-expire.
Btw, for people who don't know what Redis Virtual Memory is: with VM, Redis is able to swap rarely used keys out to disk. This makes a lot of sense when using Redis as a DB. When using Redis as a cache, the way to go is EXPIRE: rarely used things in the cache should simply expire and go away instead of being moved to disk.
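To make the EXPIRE idea concrete, here is a minimal sketch of a cache where every key carries a time to live. It is a toy in-memory stand-in, not Redis itself: the `TTLCache` class and its method names are made up for illustration, and this one only drops expired keys lazily when they are read.

```python
import time


class TTLCache:
    """Toy in-memory cache where each key carries a time to live (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable clock so the behaviour is easy to test
        self._data = {}       # key -> (value, absolute expiry time)

    def set(self, key, value, ttl_seconds):
        # Store the value together with the moment it stops being valid.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: the entry simply goes away, it is not moved to disk.
            del self._data[key]
            return None
        return value
```

A miss (expired or never set) returns None, and the caller is expected to rebuild the value from wherever it originally came from.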
EDIT: It would be very interesting to know where the Reddit performance problem is, but Redis Sorted Sets are a very good match for building social news sites like HN or Reddit in a scalable, distributed way, with a few workers processing recent news to update their scores in the sorted set. The home page can then be generated with ZREVRANGE without any computation.
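A rough sketch of that worker plus home page split, using a small in-memory class in place of a real sorted set so it runs without a server. Everything here is an assumption for illustration: `ToySortedSet` only imitates the ZADD and ZREVRANGE behaviour, and the `rank` formula is a made-up decay function, not what HN or Reddit actually use.

```python
class ToySortedSet:
    """In-memory stand-in for a single Redis sorted set (member -> score)."""

    def __init__(self):
        self._scores = {}

    def zadd(self, member, score):
        # Like ZADD: insert the member or overwrite its score.
        self._scores[member] = score

    def zrevrange(self, start, stop):
        # Like ZREVRANGE: highest score first, stop is inclusive,
        # negative indexes count from the end (-1 is the last element).
        ordered = sorted(self._scores,
                         key=lambda m: (self._scores[m], m),
                         reverse=True)
        n = len(ordered)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if stop < 0:
            return []
        return ordered[start:stop + 1]


def rank(points, age_hours):
    # Hypothetical score: more points rank higher, older stories decay.
    return points / (age_hours + 2) ** 1.5


def worker_update(front_page, stories):
    # A background worker recomputes scores for recent stories only.
    for story_id, points, age_hours in stories:
        front_page.zadd(story_id, rank(points, age_hours))


def render_home(front_page, count=30):
    # The home page is just a range read, with no scoring at request time.
    return front_page.zrevrange(0, count - 1)
```

The point of the split is that all the scoring cost sits in `worker_update`, while `render_home` stays a cheap read no matter how many requests hit it.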
It's a shame reddit is not sharing how this cache is used.
When Redis is used as a persistent cache, clients implementing consistent hashing are a perfect fit for a distributed Redis, as you don't need all this data safety. You can usually lose a node in a disaster without too many problems (as happens with memcached), but under normal conditions you want the cache not to be volatile.
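For anyone who hasn't seen client-side consistent hashing, this is roughly the idea. It is a sketch, not the code of any particular Redis or memcached client: the `HashRing` class, the `replicas` parameter, and the choice of SHA-256 are all my own assumptions for the example.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, replicas=100):
        self._replicas = replicas   # virtual points per node, smooths the spread
        self._ring = []             # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Place several points for this node around the ring.
        for i in range(self._replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        # Dropping a node only removes its own points from the ring.
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("no nodes in ring")
        # First point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]
```

When a node is removed, only the keys that lived on it move to another node; every other key keeps its placement, which is why losing one cache node does not invalidate the whole cache.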
It sounded like they want to use it as more of a "real" database than a cache, since rebuilding data in case of a hardware failure is so painful. <shrug>
Hard to say, but we increased speed pretty much across the board by about 50% from 0.4 to 0.4.2, by another 50% (compounded :) to 0.5, and we're already looking at 100% for the release after 0.5... and we're ready to help configure on IRC :)
(Also, I'm not sure when you were looking at it, but bootstrap -- adding nodes without any downtime -- is done now.)
Seriously?
Wow.