{"id":178,"date":"2021-03-22T14:51:38","date_gmt":"2021-03-22T14:51:38","guid":{"rendered":"https:\/\/ibigworld.ath.edu.pl\/?p=178"},"modified":"2022-06-02T11:33:46","modified_gmt":"2022-06-02T11:33:46","slug":"big-data-nowadays","status":"publish","type":"post","link":"https:\/\/ibigworld.ath.edu.pl\/index.php\/en\/2021\/03\/22\/big-data-nowadays\/","title":{"rendered":"Big Data nowadays"},"content":{"rendered":"<p style=\"line-height: 20.4pt; background: white; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">Big Data is a relatively new area of research that merges fields such as cloud computing, data science and Artificial Intelligence. Its definition was proposed in 2012 [1]: a large Volume of data of various kinds (Variety) is processed with its Velocity taken into account (hence 3V). The definition has since evolved to 5V [2], which also takes into account Veracity (the quality of the captured data) and Value (its usefulness).<\/span><\/p>\n<p style=\"line-height: 20.4pt; background: white; text-align: start; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">Over the last decade a Big Data processing pattern has been established. It consists of the following elements: ingest (data collection), store (managing and storing data \u2013 also in real time), process (managing data), analyse (obtaining vital information) and insight (consuming data in the form of information or as input for further applications).<\/span><\/p>\n<p style=\"line-height: 20.4pt; background: white; text-align: start; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">The process usually starts with data collection (the data meet the 5V definition). 
The data are usually collected as logs (e.g., Flume), bulk data (e.g., Sqoop), messages (e.g., Kafka) or dataflows (e.g., NiFi). The large data sets are then processed by a computing engine in batches (e.g., MapReduce) or as streams (e.g., Flink, Spark, Storm). Data (structured or not) are analysed using machine learning methods (e.g., Caffe, TensorFlow, Python) or a statistical approach (SparkR, R) and then visualized (e.g., Tableau, GraphX). It is worth keeping in mind that a created solution is constantly changing and should be updated (e.g., with Oozie, Kepler or Apache NiFi). The obtained data can be managed by various solutions, e.g., Apache Falcon, Apache Atlas, Apache Sentry or Apache Hive. Data security is also an important issue (e.g., Apache Metron or Apache Knox), as are new technologies, such as InfiniBand or 5G, that change the ways data are handled and the types of data.<\/span><\/p>\n<p style=\"line-height: 20.4pt; background: white; text-align: start; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">Big Data is over 10 years old and is reaching new heights thanks to its wide adoption and to companies that keep delivering new tools. Looking at the summary [3], the number of technologies and solutions is overwhelming (<a href=\"http:\/\/mattturck.com\/wp-content\/uploads\/2020\/09\/2020-Data-and-AI-Landscape-Matt-Turck-at-FirstMark-v1.pdf\"><span style=\"color: #607d8b; text-decoration: none;\">look here<\/span><\/a>).<\/span><\/p>\n<p style=\"line-height: 20.4pt; background: white; text-align: start; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">In our research we are looking for the competencies required by the international and local markets. Based on our analysis and on trends [4], we identified the classic open-source Hadoop, Spark and Storm solutions as well as technologies that are gaining popularity. 
Our research focuses on open-source solutions that can be used on dedicated infrastructure or on Big Data cloud services provided by leading platforms such as AWS, Microsoft Azure or BigQuery by Google.<\/span><\/p>\n<p style=\"line-height: 20.4pt; background: white; text-align: start; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">In our research we keep in mind that the market is constantly flooded with new mechanisms and pipelines that aim to tackle big data in a simpler and more unified way. Vendors tend to create solutions that simplify Big Data analysis and make it easier to use. Several solutions [3] illustrate current trends:<\/span><\/p>\n<p style=\"line-height: 20.4pt; background: white; text-align: start; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">\u2022 visual analytical tools that allow users to focus on data analytics through simple calculations or a point-and-click approach, while providing support for big data storage, real-time management and security. Services of this kind include Arcadia Enterprise 4.0, AtScale 5.0 and Dataguise DgSecure 6.0.5;<br \/>\n\u2022 frameworks that allow applications based on Big Data to be created using DevOps capabilities and big data transformation support. They make it possible to use familiar languages such as R, Python or SQL. Examples are Attunity Compose 3.0, Cazena Data Science Sandbox as a Service and Lucidworks Fusion 3. Some solutions, such as the Couchbase suite, are aimed at web, mobile and Internet of Things (IoT) applications;<br \/>\n\u2022 solutions that help to provide data as a service for applications. 
They use pipelines such as Microsoft Azure or the Hadoop ecosystem and turn them into an information platform (Paxata Spring \u201917, Pentaho 7.0 or Qubole Data Service).<\/span><\/p>\n<p style=\"line-height: 20.4pt; background: white; text-align: start; margin: 0cm 0cm 18.0pt 0cm;\"><span style=\"font-size: 13.5pt; font-family: 'Helvetica',sans-serif; color: #424242;\">References<br \/>\n[1] Wu, X., Zhu, X., Wu, G.-Q. and Ding, W. (2014) Data Mining with Big Data. IEEE Transactions on Knowledge and Data Engineering, 26, 97-107.<br \/>\nhttps:\/\/doi.org\/10.1109\/TKDE.2013.109<br \/>\n[2] Nagorny, K., Lima-Monteiro, P., Barata, J. and Colombo, A.W. (2017) Big Data Analysis in Smart Manufacturing. Int. J. Commun. Netw. Syst. Sci., 10, 31\u201358.<br \/>\n[3] The Big Data technology map: http:\/\/mattturck.com\/wp-content\/uploads\/2020\/09\/2020-Data-and-AI-Landscape-Matt-Turck-at-FirstMark-v1.pdf<br \/>\n[4] Cui, Y., Kara, S. and Chan, K.C. (2020) Manufacturing Big Data Ecosystem: A Systematic Literature Review. Robotics and Computer-Integrated Manufacturing, 62, 101861.<br \/>\n[5] Article online: https:\/\/www.readitquik.com\/articles\/digital-transformation\/10-big-data-advances-that-are-changing-the-game\/<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Big Data is a relatively new area of research that merges fields such as cloud computing, data science and Artificial Intelligence. Its definition was proposed in 2012 [1]: a large Volume of data of various kinds (Variety) is processed with its Velocity taken into account (hence 3V). 
The definition has since evolved to 5V [2] and &hellip; <\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/ibigworld.ath.edu.pl\/index.php\/en\/2021\/03\/22\/big-data-nowadays\/\"> Read More<span class=\"screen-reader-text\">  Read More<\/span><\/a><\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[17],"tags":[56],"_links":{"self":[{"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/posts\/178"}],"collection":[{"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/comments?post=178"}],"version-history":[{"count":3,"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/posts\/178\/revisions"}],"predecessor-version":[{"id":190,"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/posts\/178\/revisions\/190"}],"wp:attachment":[{"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/media?parent=178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/categories?post=178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ibigworld.ath.edu.pl\/index.php\/wp-json\/wp\/v2\/tags?post=178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}