
Proof of concept to analyse huge application log files using Hadoop cluster on IBM Cloud Platform

Analysing application log files generated in a production environment is very challenging. The data in these files is unstructured, so it cannot be stored in an RDBMS or other traditional database system and queried there without first being converted to a structured format. As a result, if an application misbehaves for a very short period, troubleshooting it from the information recorded in a large log file, possibly hundreds of terabytes in size, is nearly impossible.

As part of our POC development, we found that an e-commerce application running on the Oracle Web Commerce platform (ATG) sometimes failed to establish asynchronous communication with a third-party vendor during order fulfilment. JMS messaging was responsible for delivering the order submission message from ATG to the third-party vendor and vice versa, but periodically it was unable to do so. Using a Hadoop cluster with a customized MapReduce program, we extracted the exact warnings and errors recorded in the log files produced by the out-of-the-box ATG component. After a detailed analysis of the framework component, based on the reports produced by Hadoop, we concluded that the issue lay within the ATG framework itself. We communicated this to the software vendor and subsequently received a patch from them.
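To make the extraction step concrete, here is a minimal sketch of the map and reduce logic in plain Python. It is not the program used in the POC. The line layout (`<timestamp> <LEVEL> <component> <message>`), the component names, and the function names `map_line`, `reduce_counts` and `run_job` are all assumptions made for illustration. The mapper keeps only WARNING and ERROR lines and emits a count of one per (level, component) pair, and the reducer sums those counts.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Assumed (hypothetical) line layout: "<timestamp> <LEVEL> <component> <message>".
# Real ATG log lines may look different; adjust the pattern accordingly.
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<component>\S+)\s+(?P<msg>.*)$"
)
WANTED_LEVELS = {"WARNING", "ERROR"}

Key = Tuple[str, str]  # (level, component)


def map_line(line: str) -> Iterator[Tuple[Key, int]]:
    """Map step: emit ((level, component), 1) for warning and error lines only."""
    match = LOG_LINE.match(line.strip())
    if match and match.group("level") in WANTED_LEVELS:
        yield (match.group("level"), match.group("component")), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Reduce step: sum the emitted counts for each (level, component) key."""
    totals: Dict[Key, int] = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run_job(lines: Iterable[str]) -> Dict[Key, int]:
    """Run map then reduce in-process, standing in for the cluster's shuffle."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # Made-up sample lines; the component paths are placeholders.
    sample = [
        "2016-03-01T10:00:00 INFO /example/OrderComponent order created",
        "2016-03-01T10:00:01 ERROR /example/MessagingComponent delivery failed",
        "2016-03-01T10:00:02 WARNING /example/MessagingComponent retrying",
    ]
    for (level, component), count in sorted(run_job(sample).items()):
        print(level, component, count)
```

On a real cluster the same two functions would sit behind the framework's mapper and reducer interfaces, and Hadoop would handle splitting the input and grouping the keys. Counting by component is only one possible report; the same mapper could emit the full message text when the exact wording of each error matters.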

