Verifying predictions with the Bitcoin blockchain
Whenever we share our predictions, like when Seclytics’ Predictive Analytics predicted IPs used in ProjectSauron months before it was known, the first question we get is: How do I know you made that prediction on that date?
We needed a trusted third party where we could easily store our predictions and that would make it simple for our users to verify them themselves. After a suggestion from a colleague, we decided to store proof of our predictions in the bitcoin blockchain.
Bitcoin transactions allow you to store some arbitrary data using the OP_RETURN opcode. There are many services built around this concept, like Proof of Existence. But since the bitcoin protocol is pretty straightforward and distributed in nature, we decided to create our transactions manually instead of depending on a centralized service.
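To make the OP_RETURN idea concrete, here is a minimal sketch of the output script that carries the data. It is an illustration, not our production code: the helper name `op_return_script` and the placeholder payload are assumptions, and the sketch only builds the script bytes, leaving out transaction construction, signing, and fees. It relies on two facts about Bitcoin Script: OP_RETURN is opcode 0x6a, and a payload of 1 to 75 bytes is pushed with a single byte giving its length.

```python
import hashlib

OP_RETURN = 0x6A        # opcode that marks the output as an unspendable data carrier
MAX_DIRECT_PUSH = 75    # payloads of 1..75 bytes are pushed with one length byte


def op_return_script(payload: bytes) -> bytes:
    """Build the script bytes: OP_RETURN <length byte> <payload>.

    Hypothetical helper for illustration; longer payloads would need
    OP_PUSHDATA1, which this sketch deliberately does not handle.
    """
    if not 1 <= len(payload) <= MAX_DIRECT_PUSH:
        raise ValueError("this sketch only handles payloads of 1 to 75 bytes")
    return bytes([OP_RETURN, len(payload)]) + payload


# Placeholder 32-byte digest standing in for a real prediction-group hash.
digest = hashlib.sha256(b"placeholder payload").digest()
script = op_return_script(digest)
print(script.hex())  # starts with "6a20", followed by the 64 hex characters of the digest
```

A SHA256 digest is 32 bytes, so the whole script is 34 bytes, which is why a single hash fits comfortably in one transaction output.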
Currently, it costs us about eight cents to store a single hash in the blockchain. We have over 5 million predicted IPs, which means storing each IP individually would be cost prohibitive, and it would be too cumbersome for someone to verify all our predictions. To work around this obstacle, we group all the predictions made in a day and create a hash of the group. Not only is this more cost effective, it also allows users to verify that we have not hidden any false positives, because removing a single prediction would completely invalidate the hash.
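The arithmetic behind that decision is easy to check. The sketch below uses the two figures quoted above (about eight cents per hash, and 5 million IPs as a lower bound); the variable and function names are only for illustration, and amounts are kept in whole cents to avoid floating-point rounding.

```python
COST_PER_HASH_CENTS = 8      # approximate cost quoted above
PREDICTED_IPS = 5_000_000    # lower bound; the text says "over 5 million"


def dollars(cents: int) -> str:
    """Format an integer number of cents as a dollar amount."""
    return f"${cents // 100:,}.{cents % 100:02d}"


one_hash_per_ip = COST_PER_HASH_CENTS * PREDICTED_IPS  # one transaction per IP
one_hash_per_day = COST_PER_HASH_CENTS * 1             # one transaction per daily group

print(dollars(one_hash_per_ip))   # $400,000.00
print(dollars(one_hash_per_day))  # $0.08
```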
Let’s walk through the process of storing a prediction in the blockchain:
We predicted 45[.]32[.]196[.]115 to be malicious on August 27.
- We create a salted SHA256 hash for the individual prediction. The input to the hash is:
    45.32.196.115
    ---BEGIN HASH SALT---
    $2b$12$n/DsF2cZcvST6WsAcBf70e
- Which gives us a hash of: 2646e2918253883d85bb696f7a2e623ad112b4b16a023367f0f3c1aa0d87ef5b
- We then calculate the hash for each of the 7,500 IPs predicted that day, sort the hashes alphabetically, and calculate the hash of that sorted list:
    ....
    2646e2918253883d85bb696f7a2e623ad112b4b16a023367f0f3c1aa0d87ef5b
    c9d92d07f9c5f0d9f2429f8872d578256fe2f77abf025e6e585b8ea59b1caebf
    dfd080784078874884a6375e7754438b7a73f45743226566dacc22b3397ff866
    ffb4e693c34ad28da31d96b761ea70964a5e8ec7cbe1cc4f0b195167cf10d1d8
    ....
- The hash of this group of predictions is 693dfe8d6c2594e1197b31991c22de0e55332a55b67e296522b9d3e20b2ad304
- The prediction group's hash is stored in the bitcoin transaction using the OP_RETURN opcode and submitted to the blockchain. Once the transaction is confirmed, we can’t revoke or change the hash. To verify, search for the hash of the group (693dfe8d6c2594e1197b31991c22de0e55332a55b67e296522b9d3e20b2ad304) on the bitcoin transaction page.
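The steps above can be sketched in a few lines of Python. Treat this as a sketch under stated assumptions, not a byte-for-byte reproduction of our pipeline: the text does not specify how the IP, the salt marker, and the salt are joined, or how the sorted hashes are concatenated, so the sketch assumes newline separators and UTF-8 encoding, and it will not necessarily reproduce the hashes printed above. The function names are hypothetical, and the two extra IPs come from address ranges reserved for documentation, not from real predictions.

```python
import hashlib

SALT_MARKER = "---BEGIN HASH SALT---"


def prediction_hash(ip: str, salt: str) -> str:
    """Salted SHA256 of one prediction (assumed layout: ip, marker, salt on separate lines)."""
    payload = "\n".join([ip, SALT_MARKER, salt])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def group_hash(prediction_hashes) -> str:
    """SHA256 over the alphabetically sorted hashes (assumed joined with newlines)."""
    payload = "\n".join(sorted(prediction_hashes))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify(ip: str, salt: str, published_hashes, anchored_hash: str) -> bool:
    """Check that the prediction is in the published list and that the list matches the anchored hash."""
    in_list = prediction_hash(ip, salt) in published_hashes
    return in_list and group_hash(published_hashes) == anchored_hash


salt = "$2b$12$n/DsF2cZcvST6WsAcBf70e"
ips = ["45.32.196.115", "192.0.2.1", "198.51.100.7"]  # last two are placeholders
hashes = [prediction_hash(ip, salt) for ip in ips]
anchored = group_hash(hashes)  # this is the value that would go into OP_RETURN

print(verify("45.32.196.115", salt, hashes, anchored))      # True
print(verify("45.32.196.115", salt, hashes[1:], anchored))  # False: a prediction was removed
```

Sorting before hashing makes the group hash independent of the order in which predictions were generated, so anyone holding the same list arrives at the same value. Dropping or altering a single entry changes the group hash, which is what makes hidden false positives detectable.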
To verify our predictions, we aggregate over 140 premium and open source threat feeds and check whether any of the IPs we predicted have since been reported or seen. In this case, the IP was reported by Phishtank 10 days after we predicted it.
Ultimately, we are using the bitcoin blockchain to establish trust, which is especially needed when we are making predictions about events that have not yet happened or have not yet been detected by other vendors.
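As a rough illustration of that check, the sketch below matches a predicted IP against a list of feed reports and computes the lead time. The function name and the record layout are hypothetical, and the year 2016 is an assumption used only to build valid dates (the text gives August 27 and a 10-day gap, not a year).

```python
from datetime import date


def first_confirmation(ip, predicted_on, feed_reports):
    """Return (feed, lead_days) for the earliest report after the prediction, or None.

    feed_reports is an assumed layout: a list of (feed_name, ip, date_seen) tuples.
    """
    later = [
        (seen, feed)
        for feed, reported_ip, seen in feed_reports
        if reported_ip == ip and seen > predicted_on
    ]
    if not later:
        return None
    seen, feed = min(later)  # earliest report wins
    return feed, (seen - predicted_on).days


predicted_on = date(2016, 8, 27)  # year is an assumption
reports = [("Phishtank", "45.32.196.115", date(2016, 9, 6))]

print(first_confirmation("45.32.196.115", predicted_on, reports))  # ('Phishtank', 10)
```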