Digital data is growing exponentially. Most of this data is unstructured (images, text, comments), and processing large volumes of unstructured and semi-structured data is a challenge. For example, the Internet data generated every day by Google, Facebook and Twitter runs into terabytes.
Hadoop is one of the technologies used to process this Big Data, which consists of structured, unstructured and semi-structured data. It has features that make it stand out from other distributed and parallel processing technologies. A few of them are:
- Taking code to the data (data locality)
- Use of commodity hardware (commonly available computers, not vendor-specific)
- Fault tolerant
- Open source (Apache)
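To make the first point concrete, here is a small sketch of what "taking code to the data" means. It is plain Python and does not use any Hadoop API; the node names, the sample lines and the helper functions `count_words` and `run_on_cluster` are all made up for illustration. The idea it tries to show is that the function is run where each block of data lives, and only the small per-node results are combined afterwards.

```python
from collections import Counter

# Hypothetical cluster: each node holds its own block of the data locally.
# Node names and contents are invented for this illustration.
node_blocks = {
    "node-1": ["big data is big", "hadoop stores data"],
    "node-2": ["data moves rarely", "code moves to data"],
}

def count_words(lines):
    """The 'code' that is sent to a node and run against its local block."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def run_on_cluster(blocks, task):
    """Simulate running the task next to each block.

    Only the small per-node results are brought together,
    never the raw data itself.
    """
    partials = [task(lines) for lines in blocks.values()]
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

totals = run_on_cluster(node_blocks, count_words)
print(totals.most_common(3))
```

In a real cluster the blocks would be far too large to copy around cheaply, which is why moving a small piece of code to each node tends to be cheaper than moving the data to the code.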
Hadoop is still in a growth phase and is believed to have huge potential.