What is a blockchain?
When we talk about a blockchain, we refer to a distributed ledger technology (DLT) that originated as the underlying, open source technology behind Bitcoin. A blockchain is a digital system for recording transactions of assets in a list that is replicated across the available nodes in a network, rather than being stored in a central data store, as is the case with traditional databases.
In a distributed ledger such as a blockchain, the data is distributed to all nodes in a trustless manner (meaning without a trusted third party such as VISA, MasterCard, or your bank) using a peer-to-peer protocol in near real time. Each node independently processes and verifies every transaction, bundles the verified transactions into a block, and broadcasts the block to all other nodes in the network. Through a consensus mechanism, the other nodes validate the block of transactions; a majority has to approve the block before it becomes final and is added to the blockchain. The blockchain uses a combination of digital signatures and other cryptographic techniques to prove the identity of participants and the authenticity of data, and to enforce read/write and execute permissions (access rights). This makes it possible to grant write access to certain participants and read access to other participants, or even to a wider audience; that is, everybody.
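The majority approval step described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any particular blockchain implements consensus: the names `majority_approves`, `amounts_are_positive`, and `always_reject` are hypothetical, each node is modeled as a plain function, and the validation rule is made up for the example.

```python
from typing import Callable, Iterable

# Hypothetical model: a node is a function that takes a block (here, a list
# of transaction dicts) and returns True if it accepts the block.
Validator = Callable[[list], bool]


def majority_approves(block: list, validators: Iterable[Validator]) -> bool:
    """Return True if strictly more than half of the validators accept the block."""
    votes = [validate(block) for validate in validators]
    return sum(votes) > len(votes) / 2


def amounts_are_positive(block: list) -> bool:
    # Illustrative rule only: every transaction must move a positive amount.
    return all(tx["amount"] > 0 for tx in block)


def always_reject(block: list) -> bool:
    # Stands in for a node that disagrees with the proposed block.
    return False


block = [{"sender": "alice", "receiver": "bob", "amount": 5}]
nodes = [amounts_are_positive, amounts_are_positive, always_reject]
approved = majority_approves(block, nodes)  # two of three nodes accept
```

Real consensus mechanisms differ widely in how votes are cast and weighted; the sketch only shows the idea that a block becomes final once more than half of the validating nodes accept it.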
If you loosely compare a blockchain to a traditional database, a blockchain is a system that contains an ordinary database and some extra software that verifies that submitted records conform to previously agreed-upon rules before adding the new records to the database. This extra software listens for new records and broadcasts them to all nodes, or peers, participating in the network, ensuring that each peer has the same data in its database. The following diagram is an overview of the capabilities that make up blockchain technology:
Technically, a blockchain is a new method of data storage. It is actually just a file with a predefined data structure (that is, how the data is logically put together). It can be compared with other data structures, such as relational databases (tables, columns, and rows), XML files, comma-separated values (CSV) files, Excel files, and binary files (images and videos). An analogy that I often use is that blocks in a chain are the same as pages in a book. Each page in a book, just like this one, has a bunch of text structured in paragraphs, and information about its context (also called metadata), such as the chapter number, chapter title, and page number. Similarly, in a blockchain, each block consists of a collection of content, for example, the list of transactions, and a header, which contains technical information about the block, a reference to the previous block, and a digital fingerprint (hash) of the data contained in the block.
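To make the page-and-metadata analogy concrete, here is a minimal sketch of such a block in Python. The class and field names (`BlockHeader`, `Block`, `index`, `previous_hash`, `data_hash`) are my own illustrative choices, and SHA-256 over a JSON encoding is an assumption made for the example; real blockchains define their own header fields and serialization.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class BlockHeader:
    index: int          # position in the chain, comparable to a page number
    previous_hash: str  # reference to the previous block
    data_hash: str      # fingerprint of the content stored in this block


@dataclass
class Block:
    header: BlockHeader
    transactions: list = field(default_factory=list)  # the block's content


def fingerprint(transactions: list) -> str:
    """SHA-256 over a JSON encoding of the transactions (keys sorted for stability)."""
    encoded = json.dumps(transactions, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


transactions = [{"sender": "alice", "receiver": "bob", "amount": 5}]
block = Block(
    header=BlockHeader(
        index=1,
        previous_hash="0" * 64,  # assumed placeholder: the first block has no predecessor
        data_hash=fingerprint(transactions),
    ),
    transactions=transactions,
)
```

The header plays the role of the page's metadata, while `transactions` is the text on the page.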
A blockchain, where blocks are linked to each other to make a chain, is analogous to pages in a book. Pages use sequential numbering that makes it easy to know their order. If pages were to be pulled out of the book and thrown into a pile, it would be easy to put them back in order. A blockchain, though, is cleverer: each block links back to the previous block via that block's fingerprint. The fingerprint is determined by the individual block's content and the fingerprint of the previous block, as demonstrated in the following diagram:
In a book, the ordering of pages is implicit in the page numbers: each page follows the page whose number is one less; that is, page 13 follows page 12 (13-1). Blocks, in contrast, are identified by fingerprints, or hashes, that build upon each other. For example, the hash 8ec6cc0 of block 3 is determined by hashing its data together with the hash 9a59c5f of the previous block. Because each fingerprint depends on the previous one, it can be used to validate the internal consistency of the data.
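The way each fingerprint builds on the previous one can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `block_hash` and `build_chain` are hypothetical, the data is a plain string, SHA-256 is used as the hash function, and the first block uses a row of zeros as its previous hash. It will not reproduce the example hashes mentioned above, which depend on data not shown here.

```python
import hashlib


def block_hash(data: str, previous_hash: str) -> str:
    """Fingerprint of a block: SHA-256 over the previous hash plus the block's data."""
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Link the payloads into a chain, feeding each block's hash into the next block."""
    chain = []
    previous_hash = "0" * 64  # assumed placeholder for the first block
    for number, data in enumerate(payloads, start=1):
        current_hash = block_hash(data, previous_hash)
        chain.append({
            "number": number,
            "data": data,
            "previous_hash": previous_hash,
            "hash": current_hash,
        })
        previous_hash = current_hash
    return chain


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
for block in chain:
    # Show shortened fingerprints, in the style of the 7-character hashes above.
    print(block["number"], block["previous_hash"][:7], "->", block["hash"][:7])
```

Because the hash of block 3 is computed over the hash of block 2, which in turn covers the hash of block 1, changing the data in any earlier block changes every fingerprint that follows it.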
This scenario is shown in the following screenshot:
You can check whether the data is consistent within a block by generating the fingerprint yourself and comparing it to the one that is part of the block's header. If someone wants to change the information stored in one of the earlier blocks, they need to regenerate all of the fingerprints from that point until the end of the chain. Even then, the blockchain will appear to be altered, and the change is instantly noticeable by others. Depending on the consensus method used, the creation of these fingerprints can be a very difficult and slow process, which makes it very impractical to rewrite the blockchain. Furthermore, the number of blocks already present in the blockchain can be huge, as is the case for Bitcoin (June 3, 2018: 512253 blocks with a size of 156 GB). The following screenshot shows that when changing the data, the hash is also changed and the block becomes invalid:
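The consistency check described above can also be sketched in code. As before, this is an illustration under assumptions rather than a real blockchain client: `block_hash`, `build_chain`, and `first_invalid_block` are hypothetical helpers, blocks are plain dictionaries, and SHA-256 over the previous hash plus the data stands in for whatever fingerprinting scheme a given blockchain uses.

```python
import hashlib
from typing import Optional


def block_hash(data: str, previous_hash: str) -> str:
    """Fingerprint of a block: SHA-256 over the previous hash plus the block's data."""
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    chain, previous_hash = [], "0" * 64  # assumed placeholder for the first block
    for data in payloads:
        current_hash = block_hash(data, previous_hash)
        chain.append({"data": data, "previous_hash": previous_hash, "hash": current_hash})
        previous_hash = current_hash
    return chain


def first_invalid_block(chain: list) -> Optional[int]:
    """Return the position of the first inconsistent block, or None if all blocks check out."""
    previous_hash = "0" * 64
    for position, block in enumerate(chain):
        # The block must point at the fingerprint of the block before it ...
        if block["previous_hash"] != previous_hash:
            return position
        # ... and its stored fingerprint must match the one we generate ourselves.
        if block["hash"] != block_hash(block["data"], block["previous_hash"]):
            return position
        previous_hash = block["hash"]
    return None


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
untouched = first_invalid_block(chain)          # None: the chain is consistent

chain[1]["data"] = "bob pays carol 200"         # tamper with the second block
after_edit = first_invalid_block(chain)         # 1: stored hash no longer matches the data

chain[1]["hash"] = block_hash(chain[1]["data"], chain[1]["previous_hash"])
after_rehash = first_invalid_block(chain)       # 2: the next block still points at the old hash
```

Regenerating only the tampered block's fingerprint is not enough, because the following block still references the old one. The attacker has to regenerate every fingerprint up to the end of the chain, which is exactly what makes rewriting the history so costly.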