How Hadoop/Big Data works in the real industry

These days, Big Data is in a state of "rise and spread". There are many reasons behind this, but the common thread is that almost all companies are plotting long-term data plans, and they see the Hadoop ecosystem as the best tool to achieve their forecasted industry goals.

Here are some of the actual bullet points for taking Hadoop into consideration:

  1. There is a clear separation between the admin and development perspectives in Hadoop
  2. Mature distributions such as Hortonworks, Cloudera, and MapR are well established to ease development and maintenance
  3. Of course, setup and maintenance costs are quite low, as the data nodes used for storage are generally low-cost commodity hardware
  4. Even though the hardware is low cost, robustness is ensured by Hadoop's unique distributed replication of data (data recovery is very easy)
  5. Millions of dollars are saved at the scale of an organizational deployment, and at the same time the returns are high
  6. Processing of structured, semi-structured, and unstructured data (chip and sensor data) is possible in Hadoop with ease
  7. Cluster health checks can be done very easily, as Hadoop itself manages failover: if it encounters an unhealthy node, it migrates processing to a healthy node with minimal latency
  8. Data replication is done automatically, and the user is also given the flexibility to set the replication factor
  9. All basic Hadoop code, including the APIs, is available free of cost, taking advantage of open source
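As an illustration of points 3, 4, and 8 above: the HDFS replication factor is controlled by the standard `dfs.replication` property in `hdfs-site.xml` (3 is the shipped default). A minimal sketch:

```xml
<!-- hdfs-site.xml: every block of every file is stored on 3 DataNodes -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

The replication factor of an existing file can also be changed on the fly with the shell command `hdfs dfs -setrep`, e.g. `hdfs dfs -setrep -w 2 /some/path` (the path here is just an example).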


10. Hadoop processes data in parallel and in a distributed mode by default, so even when the data volume is very high, the processing time remains almost the same.

11. MapReduce is the basis of all Hadoop ecosystem functionality, but the good news is that hand-written MapReduce is now almost gone, replaced by easy-to-learn-and-use wrappers for faster development.
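To make point 11 concrete, here is a minimal pure-Python sketch of the MapReduce model itself (word count). This is only a conceptual illustration of the map, shuffle, and reduce phases, not the actual Hadoop Java API:

```python
from collections import defaultdict

# Conceptual MapReduce word count (illustration only, not Hadoop code):
# the map phase emits (key, value) pairs, the shuffle groups them by key,
# and the reduce phase aggregates each group.

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word, 1)          # emit (word, 1) for every word seen

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)    # group all values under their key
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data", "big cluster"]
counts = reduce_phase(shuffle(map_phase(lines)))
# counts == {"big": 2, "data": 1, "cluster": 1}
```

On a real cluster the map and reduce tasks run in parallel across the data nodes, and the framework performs the shuffle over the network; the wrappers mentioned above generate this kind of job for you.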

12. People with different technical backgrounds can migrate to Hadoop, if not to development then at least to the Hadoop admin side.

13. In real-world software companies, Hadoop admins are in very high demand and are paid well.

14. You can start Hadoop from the very basics and easily add the ecosystem components to your technology stack without much difficulty.

Hadoop comprises several components; all of them together are called the "Hadoop Ecosystem". The following are the basic components of the Hadoop ecosystem.

  • Hive
  • Pig
  • Spark
  • Flume
  • Kafka
  • Oozie
  • Zookeeper
  • Solr
  • HBase

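As a taste of one of these components: Hive is a good example of the "wrappers" from point 11 above. It lets you express the same word count as a SQL-like query that is compiled into distributed jobs under the hood. This sketch assumes a hypothetical table `docs` with a single string column `line`:

```sql
-- classic Hive word count: split each line into words, then group
SELECT w.word, COUNT(*) AS freq
FROM (SELECT explode(split(line, ' ')) AS word FROM docs) w
GROUP BY w.word;
```

The `split` and `explode` functions are built into Hive; the whole query replaces what would otherwise be a hand-written MapReduce job.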

Let me prepare real-world, industrial-level, in-depth information on these components in the upcoming sessions.

—– We will discuss Hadoop real-world industrial standards in the next session.


If you like my explanation, please subscribe for free!


Please leave a comment, as it really matters. Your comments are our energy boosters.
