1 Introduction to Blockchain and Big Data
Robbi Rahim, Rizwan Patan, R. Manikandan and S. Rakesh Kumar
CONTENTS
1.1 Blockchain Basic Technologies
1.1.1 Basics of Blockchain and Its Architecture
1.1.1.1 Block Header
1.1.1.2 Block Identifiers
1.1.1.3 Merkle Trees
1.1.1.4 Features of Blockchain
1.1.2 Blockchain and Bitcoin Transactions
1.1.3 Hyperledger Frameworks
1.1.4 Smart Contract Framework and Its Working
1.2 Big Data Source for Blockchain
1.2.1 Blockchain and Big Data to Secure Data
1.2.2 Blockchain and Big Data Technologies for Data Analysis
1.2.3 Blockchain for Private Big Data Management
1.2.4 Confidentiality, Data Integrity, and Authentication
1.2.4.1 Data Confidentiality
1.2.4.2 Data Integrity
1.2.4.3 Data Authentication
1.2.4.4 Security Management Scenario for User Big Data in Blockchain
1.3 Blockchain Use Cases in Big Data
1.3.1 Ensuring Data Integrity
1.3.2 Preventing Malicious Activities
1.3.3 Predictive Analysis
1.3.4 Real-Time Data Analysis
1.3.5 Managing Data Sharing
1.4 Applications of Blockchain Technology with Big Data Analytics
1.4.1 Anti Money Laundering
1.4.2 Cyber Security
1.4.3 Supply Chain Monitoring
1.4.4 Financial AI Systems
1.4.5 Medical Records
References
1.1 BLOCKCHAIN BASIC TECHNOLOGIES
The first application of blockchain technology was Bitcoin, a cryptocurrency designed by Nakamoto to enable swift, inexpensive, and transparent peer-to-peer money transactions. With the rapid growth of the internet era, the coming industrial revolution also calls for improved privacy in data-driven enterprise architecture, together with properties such as decentralization, persistency, anonymity, and auditability.
The term blockchain [1], formed from the words block and chain, refers to a continuously growing list of digital records, grouped into packets called blocks, that are linked and secured by cryptographic mechanisms. The blocks are stored in the form of a linear chain, and each block in the chain contains data such as Bitcoin transactions.
These transactions are secured by cryptographic hashing and timestamping. When a new block is formed, it contains a hash of the previous block. The blocks are therefore chronologically ordered, from the first block at the inception of the blockchain to the most recently formed block. This process repeats as the network grows and maintains the chain.
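The linking of each block to the hash of its predecessor can be illustrated with a minimal sketch. The JSON encoding, the field names, and the helper functions `block_hash`, `append_block`, and `is_valid` are assumptions made for illustration; they are not Bitcoin's actual block format.

```python
import hashlib
import json
import time


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (an assumed encoding)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> dict:
    """Create a timestamped block that stores the hash of the current last block."""
    previous_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "transactions": transactions,
        "previous_hash": previous_hash,
    }
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["first block"])
append_block(chain, ["alice pays bob 1"])
append_block(chain, ["bob pays carol 2"])
print(is_valid(chain))  # True

# Tampering with an earlier block breaks the link stored in its successor.
chain[1]["transactions"] = ["alice pays mallory 100"]
print(is_valid(chain))  # False
```

Because each block embeds the hash of the one before it, changing any earlier block invalidates every later link, which is what makes the chronological order tamper-evident.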
1.1.1 BASICS OF BLOCKCHAIN AND ITS ARCHITECTURE
In simple terms, a blockchain is a chain of blocks in digital form. It is also described as a decentralized ledger that records all transactions. Several researchers have already used blockchain for the management of individual identity. However, a new set of regulations has been introduced for handling the personal information of users on a blockchain [2].
Whenever a user buys digital coins through a decentralized exchange, or sells or transfers coins, a digital ledger records that transaction in an encrypted format that is not readable by outsiders. In this way, without the need for a third party, the recorded transaction is safeguarded from cybercriminals. A schematic representation of a blockchain is shown in Figure 1.1.
As shown in Figure 1.1, four blocks are connected to one another in a chain. Here, block 1 is linked to block 2, block 2 is linked to block 3, block 3 is linked to block 4, and block 4 is linked to block 1, forming a chain. As the figure illustrates, a block can be regarded as a container in which data is stored. The structure of a block is given below.
In the Bitcoin blockchain, each block is composed of data referring to the Bitcoin transactions, a Block Header, Block Identifiers, and Merkle Trees. This section provides a detailed description of the block structure (Table 1.1) [3]. Some commonly used terms in the design of blockchain architecture [4] are shown in Figure 1.2 and described below.
• Blockchain types
• Node
• Consensus algorithm
• Block
• Header
• Transaction counter
• Transaction data
Blockchain types: Based on its mode of operation, a blockchain falls into one of three types: public blockchain, private blockchain, and consortium blockchain. Private and consortium blockchains are used at the organization level. A public blockchain, on the other hand, offers a higher level of security but a lower level of privacy.
Node: A node is a computer owned by a user or an organization participating in the blockchain network. Nodes are the central owners of any blockchain. Their task is to verify transactions together with the other nodes. In the blockchain framework, the node forms the connection point between the blockchain technology and the user.
Consensus algorithm: The consensus algorithm is the mechanism by which the nodes of the blockchain reach agreement. It is used in the framework to approve decisions on behalf of the nodes or machines. Some familiar consensus algorithms are proof of work (PoW), proof of stake (PoS), and proof of burn (PoB).
Block: A block is a set of transaction decisions that is added to the current chain once consensus has been reached.
Header: The block version field denotes the version of the block currently in use in the blockchain network. The header also holds the hash of the predecessor block, which links the block to the chain. A nonce is varied to change the block hash output.
Transaction counter: The transaction counter records the number of transactions contained in the block.
Transaction data: The meaning of the transaction data field depends on the application. It may hold Bitcoin transactions, records, personal information of users, healthcare information, and so on.
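The terms above can be tied together in a small data-structure sketch. The class and field names (`Header`, `Block`, `transaction_counter`) are hypothetical and chosen only to mirror the list; the counter is taken here to be the number of transactions in the block, as in Bitcoin's block format.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Header:
    version: int        # block version used on the network
    previous_hash: str  # hash of the predecessor block
    nonce: int = 0      # value varied to change the block hash output


@dataclass
class Block:
    header: Header
    # Transaction data: its meaning depends on the application
    # (payments, records, healthcare information, and so on).
    transactions: List[Any] = field(default_factory=list)

    @property
    def transaction_counter(self) -> int:
        # Assumption: the counter is the number of transactions in the block.
        return len(self.transactions)


block = Block(
    header=Header(version=1, previous_hash="0" * 64),
    transactions=[{"from": "alice", "to": "bob", "amount": 1}],
)
print(block.transaction_counter)  # 1
```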
1.1.1.1 Block Header
The block header is composed of metadata about a specific block and is used to identify that block within the complete blockchain. Miners hash the header repeatedly to produce the proof of work that earns mining rewards. Because the blockchain comprises a sequence of blocks that store information about the transactions occurring on the network, the block header is what distinguishes one block from another.
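The repeated hashing behind proof of work can be sketched as a simple search over nonce values. This is a toy illustration under stated assumptions: the function name `mine`, the string encoding of the header, and the "leading zero hex digits" difficulty rule are simplifications; Bitcoin itself applies double SHA-256 to a binary header and compares the result with a numeric target.

```python
import hashlib
from typing import Tuple


def mine(header_data: str, difficulty: int) -> Tuple[int, str]:
    """Try successive nonces until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{header_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("example header", difficulty=4)
print(nonce, digest)
```

Each additional zero digit multiplies the expected number of attempts by 16, which is why finding a valid nonce is costly while checking one takes a single hash.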
Each block in the blockchain network has a unique header, and each block is identified by its block header hash. The Bitcoin header is an 80-byte structure consisting of a 4-byte version number, a 32-byte previous block hash, a 32-byte Merkle root, a 4-byte timestamp, a 4-byte difficulty target, and a 4-byte nonce used by miners. In terms of function, the block header includes the following:
• Cryptographic hash
• Mining competition
• Data structure to summarize the transactions in the block
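The header layout can be checked with a short serialization sketch. The function names `serialize_header` and `header_hash` are hypothetical. The sketch assumes the standard Bitcoin layout, which includes a 4-byte difficulty target (`bits`) between the timestamp and the nonce and brings the total to 80 bytes, with integers stored little-endian and hashes byte-reversed relative to their usual hex display. The field values are those of Bitcoin's first (genesis) block.

```python
import hashlib
import struct


def serialize_header(version: int, prev_hash_hex: str, merkle_root_hex: str,
                     timestamp: int, bits: int, nonce: int) -> bytes:
    """Pack the six header fields into the 80-byte wire format."""
    return (
        struct.pack("<i", version)                # 4 bytes: version
        + bytes.fromhex(prev_hash_hex)[::-1]      # 32 bytes: previous block hash
        + bytes.fromhex(merkle_root_hex)[::-1]    # 32 bytes: Merkle root
        + struct.pack("<III", timestamp, bits, nonce)  # 3 x 4 bytes
    )


def header_hash(header: bytes) -> str:
    """Double SHA-256 of the header, shown in the conventional reversed hex form."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


genesis = serialize_header(
    version=1,
    prev_hash_hex="00" * 32,  # the first block has no predecessor
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)
print(len(genesis))          # 80
print(header_hash(genesis))
```

The printed hash begins with a long run of zeros, which reflects the proof-of-work condition the nonce was chosen to satisfy.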
1.1.1.2 Block Identifiers
A block identifier allows a specific block to be referenced within the blockchain. Two block identifiers exist: the block header hash and the block height. The block header hash is a cryptographic hash that identifies the block uniquely; it is computed by each node as soon as the block is received from the network. The second way to identify a block is by its position in the blockchain, which is called the block height.
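The two identifiers can be sketched as two lookups into the same store. The class `BlockIndex` and its methods are hypothetical, a single SHA-256 stands in for the real header hash, and the sketch assumes a single chain without competing branches.

```python
import hashlib
from typing import Dict, List


class BlockIndex:
    """Look up block headers by header hash or by block height."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, bytes] = {}
        self._hash_at_height: List[str] = []  # list position = block height

    def add(self, header: bytes) -> str:
        """Compute the header hash on receipt and record the block's height."""
        digest = hashlib.sha256(header).hexdigest()
        self._by_hash[digest] = header
        self._hash_at_height.append(digest)
        return digest

    def get_by_hash(self, digest: str) -> bytes:
        return self._by_hash[digest]

    def get_by_height(self, height: int) -> bytes:
        return self._by_hash[self._hash_at_height[height]]


index = BlockIndex()
first = index.add(b"header of block 0")
index.add(b"header of block 1")
print(index.get_by_hash(first) == index.get_by_height(0))  # True
```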
1.1.1.3 Merkle Trees
The Merkle tree refers to the framework of transactions ...