我有一个缓存应用程序,它使用CRC64值来确保数据完整性。 我正在考虑放一个额外的领域,一个时间戳与数据一起传递 在各种缓存服务器之间,并相比看数据是否已更改。 但是,这需要协议更改。虽然这不是一个巨大的交易,但我已经拥有了 一个可以用作有一些改变的指标的CRC64。

任何人都知道两个数据块的统计数据,产生相同的CRC64?如果没有,我如何计算它或估计它的可能性?

有帮助吗?

解决方案

If you assume that crc64 is 'perfect', then the numbers are pretty reasonable:

For a 1% probability of collision, you need 6.1 × 10^8 entries. For a 50% probability of collision, you need 5.1 × 10^9 entries.

Of course, if the data is potentially supplied by malicious sources, then collisions in a hash as simple as crc64 can be generated easily, and collisions could be rampant. So whether or not you go this route depends on the source of input data and the potential ramifications of collisions.

其他提示

The probability of any two given blocks colliding is 1/264, or 1 in about 1.8 × 1019.

However, the probability rapidly becomes more likely if you are interested in the rate of collision out of any two blocks from a population of size N.

For more information, see Birthday Problem on Wikipedia, which has formulas and approximations.

The probability of two CRC64s over different random data being identical would be something close to 1 chance in 2** 64. But since CRCs are somewhat sensitive to data patterns, there could be degenerate cases where you'd lose several binary orders of protection. It's probably not possible to come up with a hard number, but you'd likely be safe in assuming the worst case chance of collision would be less than 1 chance in 2** 50 or so.

You'd be assured of getting closer to the theoretical limit if you used a cryptographic hash instead of a CRC64, but the crypto hash is generally much more expensive to compute.

许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top