Big data is the general term for data collected in quantities so large, unstructured, and unorganized that it is difficult to access using only standard data management tools or traditional data applications. Technology and information advance quickly, particularly in the digital era. Talking about data in the digital age doesn’t necessarily mean talking about big data. Some of you may already be familiar with the term, but others may not yet know the types and characteristics of big data. Read on to understand them more easily.
The characteristics of big data cannot be separated from the types of data involved. Structured data, semi-structured data, and unstructured data are the three main types of big data. Here is the explanation:
- STRUCTURED DATA
The first type is structured data, which is accurate and well-defined. Both computers and people can easily understand it. It can also be processed, analysed, and presented in a consistent format.
- SEMI-STRUCTURED DATA
Next is semi-structured data. This data has some structure, but it is incomplete or does not meet the criteria for fully structured data, such as the tables in an RDBMS. Examples include CSV files and NoSQL documents, which carry keys that are used for processing.
- UNSTRUCTURED DATA
The final type of big data is unstructured data: information that is neither well-structured nor precisely defined, which makes it more difficult to access, comprehend, and analyze. Examples include social media comments, tweets, posts, and likes.
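To make the three types more concrete, here is a minimal Python sketch. The `users` table, the JSON document, the sample comment, and the `word_count` helper are all made up for illustration; they are assumptions, not data from a real system.

```python
import json
import sqlite3

# Structured: a row in a relational table with a fixed schema.
# The "users" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("INSERT INTO users (name, age) VALUES (?, ?)", ("Ana", 31))
structured_row = conn.execute("SELECT id, name, age FROM users").fetchone()
conn.close()

# Semi-structured: a JSON document. Keys label the values,
# but different documents may carry different fields.
semi_structured = json.loads(
    '{"name": "Ana", "tags": ["admin", "beta"], "profile": {"city": "Jakarta"}}'
)

# Unstructured: free text with no schema or keys at all.
unstructured = "Loving the new update! Best release so far :)"


def word_count(text: str) -> int:
    """Naive whitespace tokenization, the simplest way to impose some structure."""
    return len(text.split())


print(structured_row)                      # every value sits in a known column
print(semi_structured["profile"]["city"])  # reachable by key, if the key exists
print(word_count(unstructured))            # structure has to be derived first
```

The further down the list you go, the more work is needed before the data can be queried: the table row can be selected directly, the document must be navigated by key, and the comment has to be parsed before anything can be measured.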
BIG DATA: 10 CHARACTERISTICS
Big data has distinctive characteristics that differ from one another. Generally speaking, each characteristic relates to one particular aspect of the data. Here is a summary.
1. VOLUME
Volume is the primary trait of big data. This is not surprising, since more than 90% of the data that exists today is commonly said to have been created in just the last few years, and the amount of data available continues to grow significantly. On YouTube, for instance, roughly 300 hours of video are reported to be uploaded every minute.
2. VELOCITY
Velocity is the rate at which data is created, moved, and processed. This rate has a major impact on big data. Google is a good example: it is commonly reported to process about 40,000 search queries every second, which works out to roughly 3.5 billion searches each day.
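To get a feel for velocity, a per-second rate can be converted into a daily total. The sketch below assumes a hypothetical rate of 40,000 events per second; the function name and the rate are illustrative only, not measurements from a specific system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def events_per_day(events_per_second: int) -> int:
    """Scale a steady per-second rate up to a daily total."""
    return events_per_second * SECONDS_PER_DAY


assumed_rate = 40_000  # hypothetical events per second
daily_total = events_per_day(assumed_rate)
print(f"{daily_total:,} events per day")
```

Even a rate that sounds modest per second adds up to billions of events per day, which is why velocity is treated as a characteristic in its own right.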
3. VARIETY
Variety represents the diversity of big data across platforms. Data can be real-time, near real-time, or not real-time. Social media platforms, for example, carry many kinds of data, including photos, videos, forms, filters, and more. In the business world, data likewise comes in a variety of formats, such as documents and tables.
4. VERACITY
Veracity refers to how accurate the data is when it is put to use. Be cautious when a large volume of data is not matched by a high level of accuracy. It can lead to errors and other serious consequences, such as account names being removed from Facebook, Instagram, and other sites.
5. VALUE
Value describes big data that is created with a benefit to users or people in mind. According to Hootsuite, 500 million people use Instagram Stories every day. It is fair to say that this feature is extremely valuable and well-designed.
6. VISUALIZATION
Visualization is one of the most challenging characteristics of big data. A frequently cited example is a single map that presents data through colors, coordinates, and points, which shows one way that data can be visualized.
7. VOLATILITY
Volatility refers to the way data collected a day or several days ago may differ from the data available now. In short, this characteristic concerns changes in data over time that can affect the homogeneity of the data.
8. VALIDITY
This characteristic is very similar to veracity. Any data presented must be current and relevant before it is used, especially for its intended purpose, so that the main objective can be achieved.
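One way to picture a validity check is as a filter applied before data is used. The sketch below is only an illustration: the required fields, the seven-day freshness threshold, and the sample records are all hypothetical, and "relevant" is interpreted here simply as "has the fields the task needs".

```python
from datetime import datetime, timedelta

REQUIRED_FIELDS = {"user_id", "event", "recorded_at"}  # hypothetical schema
MAX_AGE = timedelta(days=7)                            # hypothetical freshness limit


def is_valid(record: dict, now: datetime) -> bool:
    """Accept a record only if it is complete and recent enough."""
    if not REQUIRED_FIELDS.issubset(record):
        return False  # missing a field the task needs
    return now - record["recorded_at"] <= MAX_AGE


now = datetime(2024, 1, 10)
fresh = {"user_id": 1, "event": "login", "recorded_at": datetime(2024, 1, 8)}
stale = {"user_id": 2, "event": "login", "recorded_at": datetime(2023, 12, 1)}
incomplete = {"user_id": 3, "recorded_at": datetime(2024, 1, 9)}

usable = [r for r in (fresh, stale, incomplete) if is_valid(r, now)]
print(len(usable))  # only the fresh, complete record passes
```

In practice the rules depend on the objective: a fraud detector may need data that is minutes old, while a yearly report can accept much older records.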
9. VULNERABILITY
A no less important characteristic of big data is vulnerability, which concerns security. Big data systems must be able to distinguish between desired and undesired outcomes. For instance, many organizations do not give data security their full attention.
10. VARIABILITY