Industry exposure, POC and Academic Project

“There are many aspects involved in the development of a qualified student. For me, one of the most important has been industry exposure. A student who has been exposed to the industry is more aware of the opportunities in the field and demonstrates a greater level of commitment and dedication to what he or she is interested in.” –Gaby Baylon, Major, City College transfer to UCSD, Fall 2007

DataView.in provides industry-exposure training in emerging technologies such as Big Data processing; testing methodologies, tools and techniques; project management; Scrum and Agile development; and more.

Real-time scenarios and project exposure help you stay ahead and relate and apply what you have learnt.

We have the expertise to guide and mentor the design and development of project coursework for undergraduates, postgraduates and research scholars who want to present themselves as absorbable resources to the IT industry by executing their projects with real-time use cases.
Here are a few real-time use cases for academic project development:

  • Streaming data from Flat file using Apache Kafka

This video explains how to stream data continuously from a flat file into a multi-broker Apache Kafka cluster, and from there to other consumers or multiple data pipelines, using the Apache Kafka Connect framework.
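As a rough illustration of the Kafka Connect approach, the sketch below builds the JSON request that a Kafka Connect worker accepts on its POST /connectors REST endpoint to start the file source connector bundled with Apache Kafka. The connector name, file path and topic name are hypothetical placeholders, not values taken from the video.

```python
import json

# File source connector class that ships with Apache Kafka.
FILE_SOURCE_CLASS = "org.apache.kafka.connect.file.FileStreamSourceConnector"


def build_file_source_request(name, file_path, topic):
    """Build the JSON body for creating a file source connector via POST /connectors."""
    return {
        "name": name,
        "config": {
            "connector.class": FILE_SOURCE_CLASS,
            "tasks.max": "1",   # one task is enough for a single flat file
            "file": file_path,  # flat file to read from (hypothetical path)
            "topic": topic,     # destination Kafka topic (hypothetical name)
        },
    }


if __name__ == "__main__":
    request = build_file_source_request(
        name="flat-file-source",
        file_path="/tmp/orders.txt",
        topic="flat-file-topic",
    )
    # This JSON would be sent to a running Kafka Connect worker.
    print(json.dumps(request, indent=2))
```

Each new line appended to the file is published as a record to the topic, where any number of consumers or downstream pipelines can read it.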

  • Install MySQL Database as Metastore for Apache Hive-3.1.2

This video shows the steps to install MySQL Database as the metastore for Apache Hive-3.1.2 on a multi-node Hadoop cluster running on Ubuntu.
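For orientation, here is a minimal sketch of the connection properties that Hive reads from hive-site.xml when MySQL is its metastore. The host, database name, user and password are hypothetical, and the driver class shown assumes the older MySQL Connector/J 5.x (Connector/J 8.x uses com.mysql.cj.jdbc.Driver instead); the video's own values may differ.

```python
import xml.etree.ElementTree as ET


def metastore_properties(host, database, user, password):
    """Return the JDO connection properties Hive uses to reach a MySQL metastore."""
    return {
        "javax.jdo.option.ConnectionURL":
            f"jdbc:mysql://{host}:3306/{database}?createDatabaseIfNotExist=true",
        # Assumption: Connector/J 5.x driver class.
        "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }


def to_hive_site_xml(properties):
    """Render the properties as a hive-site.xml style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    props = metastore_properties("master-node", "metastore", "hiveuser", "change-me")
    print(to_hive_site_xml(props))
```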

  • Apache Hadoop-3.2.0 Installation on the Multi-Node Cluster using OpenJDK 11

This video explains how to set up a multi-node cluster for Apache Hadoop-3.2.0. Besides the video, you can find a step-by-step description here.
Prior to that, Hadoop-3.2.0 was deployed on a single-node cluster and ran successfully. If you wish to create a single-node cluster using Ubuntu 14.04 LTS and OpenJDK 11, you can find the steps here.
I modified this single-node system to act as the NameNode (master node) and registered all the individual DataNodes in the workers file (/hadoop-3-2-0/etc/hadoop).
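Registering DataNodes amounts to listing one hostname per line in the workers file inside the Hadoop configuration directory. The sketch below shows that step under stated assumptions: the hostnames are hypothetical, and a temporary directory stands in for the real configuration directory.

```python
import tempfile
from pathlib import Path


def render_workers_file(datanode_hosts):
    """Return workers file content: one DataNode hostname per line."""
    hosts = [h.strip() for h in datanode_hosts if h.strip()]
    return "\n".join(hosts) + "\n"


def write_workers_file(conf_dir, datanode_hosts):
    """Write the workers file into the given Hadoop configuration directory."""
    path = Path(conf_dir) / "workers"
    path.write_text(render_workers_file(datanode_hosts))
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the real etc/hadoop directory.
    with tempfile.TemporaryDirectory() as conf_dir:
        path = write_workers_file(conf_dir, ["datanode1", "datanode2"])
        print(path.read_text(), end="")
```

The master node reads this file when the cluster start scripts run, so every DataNode listed there must be reachable by hostname from the NameNode.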

  • Click here to see the issues faced while installing and running Apache Hadoop-3.2.0 on a single-node cluster on Ubuntu 14.04 LTS, running as a guest inside VMware Workstation 15 Player on top of Windows 10.

  • Click here to watch a walkthrough of installing Apache Hive-3.1.2 on a multi-node Hadoop-3.2.0 cluster, including configuring MySQL Database as the metastore and downgrading the JDK version on a multi-node Hadoop cluster.

  • Use Case 1: Structured data ingestion from an e-commerce application (Oracle 11g as the database) into the Hadoop Distributed File System (HDFS) installed on a single-node cluster.

Order placement data generated by a large multi-channel enterprise e-commerce application can be analyzed in a multi-node Hadoop cluster to inform future business strategy. This analysis is not achievable on a traditional data warehousing system because of its limitations on data format (structured only), volume and so on. Relevant information about products, user comments, reviews and ratings is also generated continuously on social media, blogs and emails in unstructured format. This unstructured data can be blended and processed together with the previously ingested data in the Hadoop cluster/Data Lake. With the results of the analysis presented in the form of a dashboard, the business/marketing team can make smart decisions to retain existing customers and encourage more purchases.

To execute the above as a real-time POC/project, with the additional combination of social media, blogs, emails etc., please write to us at [email protected] for design, development, environment setup and input data. This POC can be adopted by degree students, master's degree students or research scholars for their academic project development.