Knowing the software tools and analytic skills that have emerged to handle massive data volumes will help you navigate today’s technological landscape.
In only a few years, Big Data has moved from mere gigabytes to whopping zettabytes! But what is behind the hype, and what characteristics turn regular old data into “Big Data”? What type of data is Big Data? And, given these characteristics, how is useful information extracted from this tremendous growth in data?
The 5 Vs of Big Data:
The characteristics of Big Data are succinctly described using these concepts:
- Volume: just moving data files of tens of gigabytes or more around requires non-traditional methods.
- Velocity: data streams are enormous, so network and processing speed are critical.
- Variety: there is no definite structure; data can be anything from audio and video to unstructured text.
- Veracity: if we hope to learn something from the data, it had better be right; remember: “garbage in, garbage out.”
- Value: captures whether the data actually increases information content and therefore supports the correlations inferred downstream.
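To make two of these ideas concrete, here is a minimal Python sketch, not a definitive implementation. It reads records one at a time instead of loading everything into memory (a small-scale stand-in for the "non-traditional methods" that Volume calls for) and drops rows that fail to parse before they reach the result (Veracity). The column names `sensor` and `reading`, the helper names, and the sample data are all hypothetical, chosen only for illustration.

```python
import csv
import io
from typing import Iterable, Iterator, Optional


def clean_readings(rows: Iterable[dict]) -> Iterator[float]:
    """Yield readings that parse as numbers; skip rows that do not (veracity)."""
    for row in rows:
        try:
            yield float(row["reading"])
        except (KeyError, TypeError, ValueError):
            continue  # garbage in: drop the row rather than let it skew the result


def running_mean(values: Iterable[float]) -> Optional[float]:
    """Compute a mean in a single pass without storing the values (volume)."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
    return total / count if count else None


# Hypothetical sample standing in for a much larger file or stream.
sample = io.StringIO("sensor,reading\na,1.5\nb,oops\nc,2.5\n")
mean = running_mean(clean_readings(csv.DictReader(sample)))
print(mean)
```

Because both helpers work on iterators, the same code could in principle be pointed at an open file handle of any size; only one row is held in memory at a time.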
Big Data and the emerging Science of Data
Dealing with all these Vs of Big Data involves a wide mix of Technology Skills, summarized as:
- Data Analytics, Warehousing, and Database engineering
- Programming languages
- Statistics, machine learning modeling, and algorithm testing/tuning
- Data visualization
Those possessing this broad skill set are called Data Scientists. This new breed of scientist is discovering deeply hidden relationships in the constant streams of data produced every day.
Are you confused about which tools and platforms to use in order to get started as a Data Scientist or a Big Data Engineer?
Don’t worry, because here is a simple infographic that explains all you need to know about Big Data and Data Science together with tools and platforms.
Infographic brought to you by Digital Vidya