¿Qué tan probable son dos bloques de datos que puedan producir el mismo valor CRC64?

StackOverflow https://stackoverflow.com/questions/6025445

  •  14-11-2019
  •  | 
  •  

Pregunta

Tengo una aplicación en caché que utiliza un valor CRC64 para garantizar la integridad de los datos. Estoy pensando en poner un campo extra, una marca de tiempo que se debe pasar con los datos. entre los diversos servidores de caché y en comparación con ver si los datos han cambiado.

Sin embargo, esto requiere cambios de protocolo.Si bien eso no es un trato enorme, ya tengo un CRC64 que podría usarse como un indicador de que algo ha cambiado.

¿Alguien conoce las estadísticas alrededor de dos bloques de datos que producen el mismo CRC64?Si no, ¿cómo podría calcularlo o estimar que es probatividad?

¿Fue útil?

Solución

If you assume that crc64 is 'perfect', then the numbers are pretty reasonable:

For a 1% probability of collision, you need 6.1 × 10^8 entries. For a 50% probability of collision, you need 5.1 × 10^9 entries.

Of course, if the data is potentially supplied by malicious sources, then collisions in a hash as simple as crc64 can be generated easily, and collisions could be rampant. So whether or not you go this route depends on the source of input data and the potential ramifications of collisions.

Otros consejos

The probability of any two given blocks colliding is 1/264, or 1 in about 1.8 × 1019.

However, the probability rapidly becomes more likely if you are interested in the rate of collision out of any two blocks from a population of size N.

For more information, see Birthday Problem on Wikipedia, which has formulas and approximations.

The probability of two CRC64s over different random data being identical would be something close to 1 chance in 2** 64. But since CRCs are somewhat sensitive to data patterns, there could be degenerate cases where you'd lose several binary orders of protection. It's probably not possible to come up with a hard number, but you'd likely be safe in assuming the worst case chance of collision would be less than 1 chance in 2** 50 or so.

You'd be assured of getting closer to the theoretical limit if you used a cryptographic hash instead of a CRC64, but the crypto hash is generally much more expensive to compute.

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top