Blockchain across Oracle

Hashes

Until now, we have explored the concepts that make up a transaction and how you can send transactions yourself. One of these concepts is that transactions are signed. Signing a transaction produces a digital signature that is based on a hash of the transaction's content. The hash lets the receiver verify that the data sent in the transaction has not been altered or corrupted in transit.

A hash is calculated by executing a hash function; this process is called hashing. Before I go into the detail of the hash functions that are used within different blockchains, I'm first going to take you through the concept of such functions. A hash function takes input data of any size and produces an output of a fixed length. That length depends on the algorithm used, not on the size of the input.
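As a minimal sketch of this fixed-length behaviour, the following Python snippet hashes inputs of very different sizes with three algorithms from the standard hashlib module. Python is used here only for illustration, and the helper name digest_bits is hypothetical.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits that `algorithm` produces for `data`."""
    return len(hashlib.new(algorithm, data).digest()) * 8


# Three inputs of very different sizes: empty, one byte, one million bytes.
inputs = [b"", b"a", b"x" * 1_000_000]

for name in ("md5", "sha1", "sha256"):
    # Collect the distinct digest lengths seen across all inputs.
    lengths = {digest_bits(name, data) for data in inputs}
    print(name, lengths)
```

For each algorithm the set contains a single value, because the digest length is a property of the algorithm and not of the input.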

Dissecting a hash

A hash has a fixed length, which, in terms of computer data, is expressed in bits. A bit is the smallest unit of data, and it can be either a 0 or a 1. Think of it as a light bulb, where the light bulb can either be on (1) or off (0). Computer data can be represented as a series of light bulbs that have different patterns. Each pattern represents different data. A series of 8 bits, or light bulbs, forms 1 byte, or row, so a 256-bit string has 32 rows of 8 light bulbs, which can also be arranged as a 16-by-16 grid.
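To make the light bulb picture concrete, here is a small Python sketch that prints a 256-bit value as a 16-by-16 grid, using # for a bulb that is on and . for one that is off. The value is a SHA-256 digest of an arbitrary input, chosen only because it is exactly 256 bits long; the helper name bit_rows is hypothetical.

```python
import hashlib


def bit_rows(data: bytes, width: int = 16) -> list[str]:
    """Split the bits of `data` into rows of `width` characters ('0' or '1')."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return [bits[i:i + width] for i in range(0, len(bits), width)]


# Any 32-byte value works; this digest is just a convenient 256-bit example.
digest = hashlib.sha256(b"hello").digest()
rows = bit_rows(digest)

for row in rows:
    print(row.replace("1", "#").replace("0", "."))
```

Calling bit_rows(digest, 8) instead gives the 32 rows of 8 bits, one row per byte.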



A computer's memory can store billions or even trillions of these light bulbs. Nevertheless, a 256-bit string is more than enough for hashing since the number of mathematical possibilities is 2^256, which is an astronomical number.
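To get a feel for the size of this number, the following Python sketch prints it in full and estimates how long it would take to enumerate every 256-bit value. The rate of one trillion attempts per second is an arbitrary assumption chosen for illustration, not a measured figure.

```python
# Number of distinct 256-bit values.
possibilities = 2 ** 256
print(possibilities)
print("decimal digits:", len(str(possibilities)))

# Assumed, purely illustrative rate: one trillion attempts per second.
ATTEMPTS_PER_SECOND = 10 ** 12
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

years = possibilities // (ATTEMPTS_PER_SECOND * SECONDS_PER_YEAR)
print("years to try every value:", years)
```

Python integers have arbitrary precision, so the value is computed exactly rather than approximated.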

The output of a hash function is always a fixed-length string of bits, usually written in hexadecimal. For example, an MD5 hash is a 128-bit (16-byte) value, shown as 32 hexadecimal characters. The following example computes an MD5 hash, which is commonly used as a checksum to detect whether a file has changed:

echo "<Order xmlns='urn:hash'><id>123</id></Order>" | md5sum
168be32ad9a5c7c5764bc6e73690f2d9

This command takes the input string, an XML message, and produces an output, in this case a seemingly random sequence of hexadecimal digits: 168be32ad9a5c7c5764bc6e73690f2d9. Note that echo appends a newline character, which is part of the hashed input. This hash is known as the digital fingerprint of the input string and is also referred to as the message digest. The message digest for a given input is always the same, but if a single character changes, the hash is completely different:

echo "<Order xmlns='urn:hash'><id>223</id></Order>" | md5sum
e729cddd99382c3426b78ba749abaf0c
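The same comparison can be sketched in Python with the standard hashlib module. The snippet hashes both XML messages, confirms that hashing the same input twice gives the same digest, and counts how many of the 128 bits differ between the two digests. The trailing newline is added as an assumption to mirror what echo sends to md5sum; the helper names md5_hex and differing_bits are hypothetical.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` (UTF-8 encoded) as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two hexadecimal digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# The trailing "\n" mirrors the newline that echo appends (an assumption).
order_a = "<Order xmlns='urn:hash'><id>123</id></Order>\n"
order_b = "<Order xmlns='urn:hash'><id>223</id></Order>\n"

digest_a = md5_hex(order_a)
digest_b = md5_hex(order_b)

print(digest_a)
print(digest_b)
print("same input, same digest:", md5_hex(order_a) == digest_a)
print("bits that differ:", differing_bits(digest_a, digest_b), "of 128")
```

For a well-behaved hash function, a one-character change typically flips around half of the output bits, which is why the two digests look unrelated.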

All good hash functions are one-way only: the input cannot be derived from the hash, so the only way to find an input that produces a given hash is to try candidate inputs until one of them hashes to the same value. The output of a hash function should appear random, even though the same input always gives the same output. As in the previous example, changing a single character must lead to a completely different hash, or else guessing the input would be very easy. When you compare the message digest of the version you received with the digest provided by the sender and they match, you can be confident that you received the same version that the sender sent. Also, a hash function should be collision-resistant, which means that it should be practically impossible to find two different inputs that lead to the same output. The importance of these properties will be more evident when we look at the hash functions that are used in the context of blockchain.
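Two of these properties can be illustrated with a short Python sketch. It verifies a received message against an expected digest, and it recovers an order ID from its digest by trying every candidate, which only works here because the set of possible inputs is tiny. SHA-256 is used instead of MD5 as a choice made for this sketch; the message template, the ID range of 0 to 999, and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical message template, based on the XML order used earlier.
TEMPLATE = "<Order xmlns='urn:hash'><id>{}</id></Order>"


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` (UTF-8 encoded) as hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def digests_match(received_text: str, expected_hex: str) -> bool:
    """Check that the received text hashes to the digest the sender provided."""
    return hmac.compare_digest(sha256_hex(received_text), expected_hex)


def brute_force_id(target_hex: str, max_id: int = 999):
    """Try every ID from 0 to max_id; return the one that matches, or None."""
    for candidate in range(max_id + 1):
        if sha256_hex(TEMPLATE.format(candidate)) == target_hex:
            return candidate
    return None


target = sha256_hex(TEMPLATE.format(223))

print(digests_match(TEMPLATE.format(223), target))  # unchanged message
print(digests_match(TEMPLATE.format(123), target))  # altered message
print(brute_force_id(target))                       # exhaustive search
```

The search succeeds only because there are 1,000 candidates. With an unknown input of arbitrary content, the same approach would have to cover a space far too large to enumerate, which is what makes the function one-way in practice.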