How Hadoop/Bigdata works in real industry


These days, Big Data is in a state of rapid rise and spread, and there are many reasons behind this. Almost every company has a long-term data strategy in the works, and many of them see the Hadoop ecosystem as the best toolset for achieving their forecasted industry goals.

Here are some of the main reasons companies take Hadoop into consideration:
  1. There is a clean separation between the administration and development perspectives in Hadoop
  2. Well-established distributions such as Hortonworks, Cloudera, and MapR ease development and maintenance
  3. Setup and maintenance costs are quite low, as the data nodes used for data storage are generally low-cost commodity hardware
  4. Even though the hardware is low-cost, robustness is ensured by Hadoop's unique distributed replication of data (data recovery is very easy)
  5. Organizations save millions of dollars at the scale of their deployments, while at the same time returns are high
  6. Processing of structured, semi-structured, and unstructured data (e.g., chip and sensor data) is possible in Hadoop with ease
  7. Cluster health checks can be done very easily, as Hadoop itself manages failover: if it encounters an unhealthy node, it migrates processing to a healthy node with minimal latency
  8. Data replication is done automatically, and the user is given the flexibility to set the number of replicas
  9. All of Hadoop's core code, including the APIs, is available free of cost as open-source software


10. Hadoop processes data in parallel and in a distributed fashion by default, so even when the data volume is very high, the processing time stays roughly the same.
11. MapReduce is the basis for all Hadoop ecosystem functionality, but the good news is that it has now largely been superseded by easy-to-learn, easy-to-use wrappers for faster development.
12. People with different technical backgrounds can migrate to Hadoop, if not to development then at least to the Hadoop admin side.
13. In today's software industry, Hadoop admins are in very high demand and are paid well.
14. You can start Hadoop from the very basics and easily add the ecosystem components to your technology stack without much difficulty.
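To make point 11 concrete, here is a minimal, framework-free Python sketch of the MapReduce model that Hadoop implements at cluster scale: each input "split" is mapped independently (as mappers on different data nodes would be), the intermediate pairs are shuffled by key, and a reduce step aggregates them. The split data and function names here are purely illustrative, not Hadoop APIs.

```python
from collections import defaultdict

def map_phase(split):
    # Mapper: emit a (word, 1) pair for every word in one input split.
    for line in split:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle/sort: group all emitted values by key, as Hadoop does
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reducer: aggregate the grouped values for each key.
    return {word: sum(counts) for word, counts in grouped.items()}

# Two "splits", standing in for blocks stored on different data nodes.
splits = [["hadoop stores big data", "big data needs hadoop"],
          ["hadoop processes data in parallel"]]

pairs = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle(pairs))
print(counts["hadoop"])  # → 3
```

In real Hadoop the map calls run in parallel on the nodes that hold each block, which is why processing time grows slowly with data size; the wrappers mentioned in point 11 (such as Hive and Pig) generate this kind of job for you.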
Hadoop comprises several components; all of them together are called the “Hadoop Eco System”. The following are the basic components of the Hadoop ecosystem:
  • Hive
  • Pig
  • Spark
  • Flume
  • Kafka
  • Oozie
  • Zookeeper
  • Solr
  • Hbase
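As a small example of the replication flexibility mentioned in point 8, the replication factor is an ordinary HDFS setting. Below is a sketch of the relevant entry in hdfs-site.xml; the property name dfs.replication is standard HDFS configuration, and the value shown is only an example.

```xml
<!-- hdfs-site.xml: default block replication factor.
     3 is Hadoop's default; tune it per cluster needs. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

The replication of an existing file can also be changed after the fact from the HDFS shell, e.g. `hdfs dfs -setrep 2 /path/to/file`.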


Let me prepare real-world, industrial-level, in-depth information on these components in the upcoming sessions.

—– We will discuss Hadoop real-time industrial standards in the next session.


Please leave a comment, as it really matters. Your comments are our energy boosters.
