
As the author of the cited post on HBase, lemme clarify:

0. Here are more details on that: http://hadoop-hbase.blogspot.de/2013/07/protected-hbase-agai...

1. By default HBase "flushes" the WAL. Flush here means making sure that at least 3 machines have the change in memory (NOT on disk). So a datacenter-wide power outage can lose data.

2. As HDFS closes a block it is not by default forced to disk. So as HBase rewrites old data during compactions by default, old data can be lost during a power outage. Again, by default.

3. HDFS should be configured with sync-on-close, so that old data is forced to disk upon compactions (and with sync-behind-writes for performance).

4. HBase now has an option to force a WAL edit (and all previous edits) to disk (that's what I added in said jira).

5. This post is 4 years old for chrissake :)... Don't base decisions on 4 year old information.

HBase _is_ a database and it will keep your data safe. Unfortunately it requires some configuration and some knowledge.
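
For what it's worth, the HDFS side of point 3 could look roughly like the sketch below. It just renders the two settings as an hdfs-site.xml style fragment. The property names (dfs.datanode.synconclose and dfs.datanode.sync.behind.writes) are the ones I'd expect from hdfs-default.xml, but treat them as an assumption and check them against your Hadoop version. The helper name render_properties is made up for this example.

```python
from xml.sax.saxutils import escape

# Assumed HDFS property names for sync-on-close and sync-behind-writes;
# verify them against hdfs-default.xml for your Hadoop version.
DURABILITY_PROPS = {
    "dfs.datanode.synconclose": "true",
    "dfs.datanode.sync.behind.writes": "true",
}


def render_properties(props):
    """Render a dict of settings as a Hadoop-style <configuration> fragment."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append("    <name>" + escape(name) + "</name>")
        lines.append("    <value>" + escape(value) + "</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


# Hypothetical usage: print the fragment and merge it into hdfs-site.xml by hand.
hdfs_site_fragment = render_properties(DURABILITY_PROPS)
print(hdfs_site_fragment)
```

This only covers the HDFS settings. The WAL sync from point 4 is a separate HBase-side option and is not shown here.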


