Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Linked http://shattered.io/ has two PDFs that render differently as examples. They indeed have same SHA-1 and are even the same size.

  $ls -l sha*.pdf 
  -rw-r--r--@ 1 amichal  staff  422435 Feb 23 10:01 shattered-1.pdf
  -rw-r--r--@ 1 amichal  staff  422435 Feb 23 10:14 shattered-2.pdf
  $shasum -a 1 sha*.pdf
  38762cf7f55934b34d179ae6a4c80cadccbb7f0a  shattered-1.pdf
  38762cf7f55934b34d179ae6a4c80cadccbb7f0a  shattered-2.pdf
Of course other hashes are different:

  $shasum -a 256 sha*.pdf

  2bb787a73e37352f92383abe7e2902936d1059ad9f1ba6daaa9c1e58ee6970d0  shattered-1.pdf
  d4488775d29bdef7993367d541064dbdda50d383f89f0aa13a6ff2e0894ba5ff  shattered-2.pdf 

  $md5 sha*.pdf
  
  MD5 (shattered-1.pdf) = ee4aa52b139d925f8d8884402b0a750c
  MD5 (shattered-2.pdf) = 5bd9d8cabc46041579a311230539b8d1


As someone who knows only the very basics of cryptography - would verifying hashes of a file using SHA-1 and a "weak" hash function like MD5 provide any additional protection over just SHA-1? I.e. how much harder would be it be to create a collision in both SHA-1 and MD5 than in just SHA-1?

My common sense intuition is that it would be a lot harder, but I'm guessing theoretically/mathematically it's only a little bit harder given how easy collisions in MD5 are?


It’s actually not much harder at all, as first noted by Joux, “Multicollisions in iterated hash functions. Application to cascaded constructions”, 2004.

https://www.iacr.org/archive/crypto2004/31520306/multicollis...


Thanks, this is fascinating!


It would certainly be harder, but as with breaking any hash, it becomes more feasible every day.

It's likely easier to replace "sha1" with "sha256" in your code than it is to construct a Frankenstein's monster of hashing.


The differing bytes are interesting

http://imgur.com/a/wI4xb


The PDF example given is somewhat ridiculous.

"For example, by crafting the two colliding PDF files as two rental agreements with different rent, it is possible to trick someone to create a valid signature for a high-rent contract by having him or her sign a low-rent contract. "

Talk about ridiculous scenarios only people living in a tech bubble could come up with.

How many landlords do you imagine know what sha-1 checksums even are, let alone would try and use it as proof the evil version of the rental agreement is incorrect?

"I would have gotten away with it, if it wasn't for you meddling kids and your fancy SHA-256 checksums!"


Obviously the landlord doesn't know or care what SHA-1 is, but he may care if his Adobe Reader says "This file was signed by the Department of Housing" at the top of the screen when he opens the PDF. As Reader fully supports signed PDFs and they do get used, believe it or not, this attack is not theoretical.


This is not as far fetched as you think.

In UK, rental contracts are often digitally signed by the renter and landlord.

I am sure in finance world many other types of contracts are signed digitally, also under the assumption that both parties sign the same thing.


The "signing" usually doesn't involve any cryptography. You just express your agreement, which can be as trivial as typing your initials in a box.

It's purely a legal, not technical thing, so if you cleverly forge the document using collisions, you'll be shouting "but the SHA-1 matched!" from behind the bars.


The legal thing does make reference to the technical thing in Europe[1] (and probably elsewhere too), by making digital signatures (which use crypto) legally binding. The question is more how courts would rule in a case where a colliding document is signed. That would probably depend on whether you can prove which of the two parties authored the colliding document (since that's a requirement for this particular attack).

(Note: I don't know whether this attack is practical for qualified electronic signatures as used by EU countries.)

[1]: https://en.wikipedia.org/wiki/Qualified_electronic_signature


There's a difference between digital signatures and sha-1 checksums of the files, which is what they seem to have been demonstrating.

Mysteriously the text I quoted has now vanished from the original blog post.


Presumably they're talking about digital signatures. I don't know how common that is in the U.S., but it's used quite a bit in many European countries as a means to sign contracts and such. (I also don't know enough about the implementation details to say whether a colliding pdf file would be sufficient to trick these systems, but let's assume so for the sake of this argument.)




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: