Extract
Sqoop, Kafka, Flume, Spark Streaming, Storm, AWS – AWS Glue, AWS Data Pipeline, Azure – Azure Data Factory, Azure Event Hubs, etc.
Store
HDFS, Cassandra, MongoDB, HBase, AWS – S3, Azure – Data Lake Store, Blob Storage, etc.
Clean
Scala, Python, R, Pig, HiveQL (HQL), AWS – Elastic MapReduce (EMR), Azure – ADF Data Flow, Stream Analytics, etc.
Mine
Python, Spark, R
Visualize
Qlik, Sisense, Power BI, Tableau, AWS QuickSight, etc.
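
The five stages above chain together as Extract, Store, Clean, Mine, Visualize. As a rough illustration only, the sketch below walks a tiny dataset through each stage using the Python standard library. It does not use any of the tools listed: an in-memory SQLite table stands in for the store and a text bar chart stands in for a BI dashboard. The sample data and all function names are hypothetical.

```python
import csv
import io
import sqlite3
from statistics import mean

# Hypothetical raw input; one row has a missing temperature.
RAW_CSV = "city,temp_c\nOslo,4\nCairo,31\nLima,\nOslo,6\n"

def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def store(rows):
    """Store: load the raw rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (city TEXT, temp_c TEXT)")
    conn.executemany("INSERT INTO readings VALUES (:city, :temp_c)", rows)
    return conn

def clean(conn):
    """Clean: drop rows with a missing temperature and cast to float."""
    cur = conn.execute("SELECT city, temp_c FROM readings WHERE temp_c <> ''")
    return [(city, float(temp)) for city, temp in cur]

def mine(records):
    """Mine: compute the mean temperature per city."""
    by_city = {}
    for city, temp in records:
        by_city.setdefault(city, []).append(temp)
    return {city: mean(temps) for city, temps in by_city.items()}

def visualize(summary):
    """Visualize: render one text bar per city, sorted by city name."""
    return [
        f"{city:<6}{'#' * round(value)} {value:.1f}"
        for city, value in sorted(summary.items())
    ]

summary = mine(clean(store(extract(RAW_CSV))))
chart = visualize(summary)
print("\n".join(chart))
```

In a real deployment each function would be replaced by one of the tools named for that stage, for example Kafka for extract, S3 or HDFS for store, Spark for clean and mine, and Tableau or Power BI for visualize.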