How to Install BlockSci and Troubleshoot It

Installing BlockSci

https://citp.github.io/BlockSci/compiling.html

Requirements :

If you don't have enough memory, the build may not complete easily. In addition:

  • CMake version should be 3.9 or higher
  • GCC and G++ should be version 7

Complete instructions to install CMake 3.9, GCC, and G++ :

https://gist.github.com/black13/5c951d3073f8f57e9efa8e1a874b25f1

If the official manual does not work, you can follow the steps below:

  • sudo add-apt-repository ppa:ubuntu-toolchain-r/test -y
  • sudo apt-get update
  • sudo apt install libtool autoconf libboost-filesystem-dev libboost-iostreams-dev
    libboost-serialization-dev libboost-thread-dev libboost-test-dev libssl-dev libjsoncpp-dev
    libcurl4-openssl-dev libjsonrpccpp-dev libsnappy-dev zlib1g-dev libbz2-dev
    liblz4-dev libzstd-dev libjemalloc-dev libsparsehash-dev python3-dev python3-pip
  • sudo apt-get install g++-7
  • Install CMake 3.9 or higher instead of the 3.5 version that apt-get installs (or follow the gist linked above to remove the old version of CMake and get the new one).
  • sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 60 --slave /usr/bin/g++ g++ /usr/bin/g++-7
  • git clone https://github.com/citp/BlockSci.git
  • cd BlockSci
  • mkdir release
  • cd release
  • cmake -DCMAKE_BUILD_TYPE=Release ..

Build and install (make install) :

  • sudo make install
  • cd ..
  • sudo -H pip3 install -e blockscipy (this takes around 2 hours)
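
Once the build finishes, a quick way to confirm that the Python bindings work is a short script like the one below. This is only a minimal sketch: "/path/to/blocksci-data" is a placeholder for the parser output directory (you need to have run the BlockSci parser on your node's block data first).

# Minimal sanity check for the BlockSci Python bindings (a sketch, not from the
# official docs). "/path/to/blocksci-data" is a placeholder for the directory
# produced by the BlockSci parser.
import blocksci

chain = blocksci.Blockchain("/path/to/blocksci-data")

print("blocks parsed:", len(chain))            # number of parsed blocks
print("first block height:", chain[0].height)  # peek at the first block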

 

How to Write Clean Code – 6 Tips

This is a tiny post about writing clean code, aimed at freshers who are working hard to succeed in the IT industry.

1) Use meaningful variable names

Example:
$elapsed_time_in_days;
$days_since_deletion;

2) Adding comments is a good habit. Write a comment for every code modification.

3) Code scouting – when picking up new code from the Internet, take some time to read it and try to add methods or split out the required functionality.

4) Functions and sub-functions – make sure each function does only the job it is meant to do.

5) Testing – unit testing is very important; test each piece of functionality intensively and in isolation (see the sketch after these tips).

6) The final tip: practice, practice, practice.
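
To make tips 4 and 5 concrete, here is a small, purely illustrative Python example (the function name and dates are made up): one tiny function that does exactly one job, and a unit test that exercises it in isolation.

# A tiny function that does exactly one job (tip 4)...
from datetime import date
import unittest

def days_since_deletion(deleted_on, today):
    """Return the number of whole days between deletion and today."""
    return (today - deleted_on).days

# ...and a unit test that checks it in isolation (tip 5).
class DaysSinceDeletionTest(unittest.TestCase):
    def test_counts_whole_days(self):
        self.assertEqual(days_since_deletion(date(2019, 1, 1), date(2019, 1, 11)), 10)

if __name__ == "__main__":
    unittest.main()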

Good Luck!

Introduction to Apache Hadoop

What is Hadoop?

Apache Hadoop is open-source software that was born out of the need to process an avalanche (a sudden, overwhelming quantity) of big data.

The web was generating more and more information every day, and it was becoming very difficult to index over one billion pages of content. Hadoop provides storage and large-scale processing for such data.

Basically, it’s a way of storing enormous data sets across distributed clusters of servers and then running “distributed” analysis applications in each cluster.


Founders :

Hadoop was created by Doug Cutting and Mike Cafarella in 2005. It was named after Cutting's son's yellow toy elephant, which was called Hadoop.

Working:

Normal DB – stores data and retrieves it using SQL queries.

Hadoop – stores and retrieves data, but no SQL queries are involved.

Two main parts – a distributed file system for data storage and a data processing framework.

Distributed file-system  – HDFS

Data processing framework – MapReduce

HDFS :

HDFS writes data once to the server and then reads and reuses it many times.

Namenode – master
Datanode – slaves

Input data is split into equal-sized blocks (128 MB by default) and stored on the DataNodes. The NameNode maintains the mapping of blocks to DataNodes.
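
As a quick worked example of the block arithmetic (the file size below is invented), a 1 GB file ends up as eight 128 MB blocks:

# Rough arithmetic for how HDFS splits a file into blocks.
# (Illustrative only: 128 MB is the default block size in Hadoop 2+,
#  and the 1 GB file size is an invented example.)
import math

BLOCK_SIZE_MB = 128
file_size_mb = 1024          # a 1 GB file

blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
print(f"{file_size_mb} MB file -> {blocks} blocks of up to {BLOCK_SIZE_MB} MB each")
# 1024 MB file -> 8 blocks of up to 128 MB each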

MapReduce :

MapReduce is the heart of Hadoop. It is this programming paradigm that allows for massive scalability across hundreds or thousands of servers in a Hadoop cluster.

Map – takes a set of input data and converts it into another set of data tuples (key/value pairs).

Reduce – takes the output of the map as its input, combines those tuples, and produces the final output.

JobTracker (master) – splits the job submitted by the client into small sub-tasks, handles job scheduling, and monitors the TaskTrackers.

TaskTracker (slaves) – actually run the map and reduce tasks, in parallel and in a distributed manner, on the data stored in the DataNodes.

Instead of writing MapReduce jobs directly, we can use querying tools like Apache Pig and Apache Hive, which give data hunters more power and flexibility.

Example for MapReduce : find the maximum temperature of each city.
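
As a rough illustration (not actual Hadoop Java code), the sketch below simulates the map, shuffle, and reduce phases in plain Python for this example; the city/temperature records are made up.

# Toy, in-memory illustration of MapReduce for "maximum temperature per city".
# (Sketch only: the records are invented; in a real Hadoop job the map and
#  reduce functions run distributed across the cluster.)
from collections import defaultdict

records = [
    "Chennai,39", "Delhi,45", "Chennai,41",
    "Delhi,43", "Mumbai,35", "Mumbai,38",
]

# Map phase: turn every record into a (key, value) tuple.
def map_phase(lines):
    for line in lines:
        city, temp = line.split(",")
        yield city, int(temp)

# Shuffle: group all values that share the same key.
grouped = defaultdict(list)
for city, temp in map_phase(records):
    grouped[city].append(temp)

# Reduce phase: combine each key's values into the final output.
def reduce_phase(city, temps):
    return city, max(temps)

for city, temps in grouped.items():
    print(reduce_phase(city, temps))   # e.g. ('Chennai', 41)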

YARN : used in Hadoop v2 (an evolution of MapReduce v1's resource management).

Apache YARN – "Yet Another Resource Negotiator" – is the resource management layer of Hadoop v2.

YARN allows different data processing engines – graph processing, interactive processing, and stream processing as well as batch processing – to run and process data stored in HDFS alongside MapReduce.

YARN Architecture :

Resource Manager – (Master of YARN) (Similar to Job tracker)

RM manages the global assignments of resources (CPU and memory) among all the applications. It arbitrates system resources between competing applications.

  • Scheduler

The Scheduler is responsible for allocating resources to the running applications.

  • Application Manager

It manages the running applications in the cluster, i.e., it is responsible for starting the Application Masters and for monitoring and restarting them on different nodes in case of failure.

Node Manager – (Slave of YARN)  (Similar to Task tracker)

NM is responsible for containers, monitoring their resource usage, and reporting this to the Resource Manager. It also manages the user processes on that machine.

Application Master

One application master runs per application. It negotiates resources from the resource manager and works with the node manager. It manages the resource needs of an application.

Advantages of Hadoop :

1. Resilient to failure – HDFS replicates each block across multiple machines in the cluster (three copies by default), so if one machine fails the data can be retrieved from another.

2. Speedy output – because processing runs in parallel and close to the data, results come back quickly.

3. Cost effective – big data can be stored and processed very cost-effectively on commodity hardware.

4. Scalable – the cluster can grow to hundreds or thousands of servers, and data can be fetched from any of them.

Limitations of Hadoop :

1. No encryption by default – data can be read by anyone with access unless encryption is explicitly configured.

2. Written in Java – Java has been heavily exploited by cybercriminals.

3. Lacks the ability to efficiently support small files.

Companies currently using Hadoop :

  • Facebook
  • Twitter
  • Amazon
  • eBay
  • Apple
  • AOL
  • Yahoo

MySQL InnoDB Cluster Setup on CentOS 7 x64

Installing MySQL 5.7 on the CentOS 7 x64 operating system

Get the yum repository from the link below, as per your OS configuration:
https://dev.mysql.com/downloads/repo/yum/
Run the command below:
yum localinstall mysql57-community-release-el7-11.noarch.rpm

Grep the MySQL 5.7 releases from the OS repo:

yum repolist enabled | grep "mysql.*-community.*"

yum install mysql-community-server

After installation, restart the MySQL 5.7 server and grep the temporary password from the MySQL log file.

service mysqld restart

grep "password" /var/log/mysqld.log

Log in to the MySQL 5.7 server using the temporary password and change the root password as you wish.

mysql -uroot -p

ALTER USER 'root'@'localhost' IDENTIFIED BY '<Password@123>';


Part 1 – InnoDB Cluster – Configuring MySQL 5.7

We have to enforce the group replication configuration on the server. This is one of the mandatory pre-validation steps for initiating the cluster setup.

Open /etc/my.cnf and paste the following lines (make sure server-id is unique on each machine, e.g. 1, 2, and 3):

default-storage-engine=InnoDB
log_bin=mysql-bin
server-id=1
log_slave_updates=1

binlog_format=ROW
enforce_gtid_consistency=1
gtid_mode = ON
master_info_repository=TABLE

transaction_write_set_extraction=XXHASH64
binlog_checksum=NONE
relay_log_info_repository=TABLE

With this, you have finished the MySQL configuration required to start the cluster setup.

Installing MySQL Shell and MySQL Router

You need to install the following on every instance you wish to join to the InnoDB cluster.

yum install mysql-shell

yum install mysql-router

To access the MySQL admin shell, type the following:

mysqlsh --log-level=DEBUG3

mysql-js> dba.verbose=2

The above options will help us see extended log and error messages while configuring the cluster.

Let's say you want to connect three MySQL servers on three different machines. You need to configure the local instance on each machine.

Execute the following on all the machines.

mysql-js> dba.configureLocalInstance()

mysql-js> dba.checkInstanceConfiguration('<user1>@<hostname>:3306')

The above command prints a report on the instance configuration; at this stage it will show errors, because the configuration changes have not taken effect yet.

Now restart the MySQL server on each machine after completing the above steps, check the instance configuration again, and move on to the clustering process.

dba.checkInstanceConfiguration('<user1>@<hostname>:3306')

Deploying Cluster

From the primary node, create the InnoDB cluster and add our instances to it:

var cluster=dba.createCluster('mycluster',{ipWhitelist:"192.168.1.251, 192.168.1.252, 192.168.1.253"});

If you see "The group replication applier thread was killed" messages in mysqld.log, you can run the following query as the MySQL root user:

SET GLOBAL group_replication_ip_whitelist = '192.168.1.251,192.168.1.252,192.168.1.253';

Note: check the SELinux status, firewalld, and iptables, and make sure they do not block the MySQL connections between nodes. With the whitelist in place you should not get any errors.

Next, check the state of the instances you are about to add:

cluster.checkInstanceState("clusteruser@192.168.1.252:3306")

cluster.checkInstanceState("clusteruser@192.168.1.253:3306")

You will get a Status: OK output. Awesome! Now add the instances:

cluster.addInstance("clusteruser@192.168.1.252:3306")

cluster.addInstance("clusteruser@192.168.1.253:3306")

If the output is a START GROUP_REPLICATION error, you should whitelist the instance IP addresses on all three servers.

Check the cluster status.

mysql-js> cluster.status()

{
    "clusterName": "mycluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "192.168.1.251:3306",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "192.168.1.251:3306": {
                "address": "192.168.1.251:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "192.168.1.252:3306": {
                "address": "192.168.1.252:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "192.168.1.253:3306": {
                "address": "192.168.1.253:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}

Good work! You have created an InnoDB cluster with three instances in it.

"statusText": "Cluster is ONLINE and can tolerate up to ONE failure."

The above line states that if one server goes offline, the other two servers can still manage and serve the data to clients and customers.
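
If you want to double-check the group membership outside of mysqlsh, a small Python script can query the group replication tables directly. This is only a sketch: it assumes the mysql-connector-python package is installed, and the host, user, and password below are placeholders for your own values.

# Quick sanity check of the InnoDB cluster / group replication membership.
# Assumes `pip install mysql-connector-python`; host, user, and password
# below are placeholders, not values from this post.
import mysql.connector

conn = mysql.connector.connect(
    host="192.168.1.251",
    user="clusteruser",
    password="<Password@123>",
)
cur = conn.cursor()
cur.execute(
    "SELECT member_host, member_port, member_state "
    "FROM performance_schema.replication_group_members"
)
for host, port, state in cur.fetchall():
    print(f"{host}:{port} -> {state}")   # expect three rows, all ONLINE

cur.close()
conn.close()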

 

Convert HTML table tags to CSV using JavaScript

Recently, I was working in an old ColdFusion environment and I wanted to export data from a database to CSV.

Instead of writing the CSV generation in ColdFusion, I used this JavaScript code.

function downloadCSV(csv, filename) {
    var csvFile;
    var downloadLink;

    // CSV file
    csvFile = new Blob([csv], {type: "text/csv"});

    // Download link
    downloadLink = document.createElement("a");

    // File name
    downloadLink.download = filename;

    // Create a link to the file
    downloadLink.href = window.URL.createObjectURL(csvFile);

    // Hide download link
    downloadLink.style.display = "none";

    // Add the link to DOM
    document.body.appendChild(downloadLink);

    // Click download link
    downloadLink.click();
}


HTML code and calling the above function:
I displayed the data from the table in a web page, set the text color to white so the table stays effectively invisible, and then called the export function after a short delay. (exportTableToCSV() builds the CSV string from the table rows and calls downloadCSV(); its definition is in the CodexWorld article referenced below.)

setTimeout(function(){
    // exportTableToCSV() comes from the referenced article; it reads the table
    // rows, builds the CSV string and calls downloadCSV() above.
    exportTableToCSV('support_request.csv');
    document.getElementById("table").style.visibility = "hidden";
}, 1000);

I agree, this is an indirect approach, but it is useful when your data is less than 1 MB and you are working with older programming languages or frameworks like Adobe ColdFusion or COBOL 🙂

Reference: https://www.codexworld.com/export-html-table-data-to-csv-using-javascript/