Customer Insight Application Integration Hadoop/Spark
Environment: Hadoop YARN, Spark Core, Spark Streaming, Spark SQL, Scala, Python, Kafka, Hive, Sqoop, Amazon AWS, Elasticsearch, Impala, Cassandra, Tableau, Talend, Oozie, Jenkins, Cloudera, Oracle 12c, Linux.
Skills: Java, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, Unix, JavaScript, Shell Scripting, HDFS, YARN, MapReduce, Hive, Pig, Impala, Sqoop, Flume, Spark, Kafka, ZooKeeper, Oozie, Storm, Spark Streaming, Spark SQL, Spark MLlib, Spark RDDs, AWS (EC2 & EMR).
Description: The primary objective of this project is to integrate Hadoop (Big Data) with the Relationship Care Application to leverage the raw/processed data that the big data platform owns. It will provide an enriched customer experience by delivering customer insights, profile information and customer journey.
Hadoop Multi-Node Cluster: Data Ingestion into HDFS and Monitoring
Clustered machines in Hadoop fully distributed mode.
Used Pig Latin and Hive 0.8.0 to simplify MapReduce tasks.
Administered, managed, and monitored two Hadoop clusters of 20 nodes each, including cluster tuning, configuration, and maintenance.
Developed parser and loader MapReduce applications to store and retrieve data in HDFS and load it into HBase.
Installed and configured Hadoop for storing and retrieving data.