Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Short story: storing identities in an integer field instead of a bigint.

Now would be a good time to check your own IDs and get the max value of each. Sure is easier to fix this ahead of time than when your app is down.



Consider using GUIDs/UUIDs instead of integers. GUIDs are friendlier for replication, fake/mock data, and other reasons. The only situations where I'd recommend against using GUIDs is if total row size is a concern (GUIDs are usually larger on disk than integers) or if there is some legacy requirement to use them (e.g., an integration with a third-party).

Another thing I've done is use GUIDs for the primary key, then have an identity/sequence column called something like Sequence, which retains the sequencing capabilities of a regular integer identity column.


The flip side is that GUIDs take up a ton of space and are full of entropy, which makes them very inefficient to work with, are not cache-friendly, and not sortable. Certain database engines (e.g., InnoDB on MySQL) also limit the length of primary keys for performance reasons, which disqualifies GUIDs from being used. With clustering, short keys are better as they reduce IOPS required to fetch data.

See https://www.percona.com/blog/2006/10/03/long-primary-key-for... for an interesting discussion of the topic.

In general, it's unwise to make recommendations about what data types to use, especially for primary keys, without thorough research on their implications.


I found https://tomharrisonjr.com/uuid-or-guid-as-primary-keys-be-ca... has a good rundown of the pros and cons.


Use twitter snowflake ids — they pulled their implementation as it was out of date but Sony has one:

https://github.com/sony/sonyflake


Vast majority of projects will not reach two billion row tables(ID>0), but it should be something considered when planning, and regularly revisited in the future. BIGINT from the start uses twice as much space, if it's not needed, it's just a waste. You have to think about memory, indexing, scans in the future. Hundreds of millions of rows x 4 more bytes, is A LOT.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: