Erasmus+ project iBIGworld: Short-term Teacher’s Training C2 has been conducted at the University of Nis in Serbia

The Teacher’s Training C2 was held from May 16 to May 20, 2022. The training was organized by the University of Nis (UNi).

The five-day, 35-hour joint staff training aimed to transfer and share knowledge about the intellectual outputs developed in the Big Data area and to prepare the trainers for pilot training. The agenda of the event is available at https://docs.google.com/document/d/1M0URP5od4wvHT4cs44Wr0pTlLmCJ6Kqx/edit?usp=sharing&ouid=101046534126028994548&rtpof=true&sd=true

The University of Bielsko-Biala (UBB) and the University of Library Studies and Information Technologies (ULSIT) each sent four trainees: staff members and lecturers in ICT, databases and machine learning who are also experienced in continuous professional development training. Three trainees from Taras Shevchenko National University of Kyiv (TSNUK, Ukraine) participated in the event online via MS Teams.

The training was conducted using the previously created methodology and the training materials developed under Intellectual Outputs O1, O2, and O3 with the participation of members from the partner universities. The teams were formed at the second project meeting, M2, in Kyiv.

The format was organized as follows: morning sessions with conceptual work and presentations, and afternoon workshop sessions in which the 15 trainees were split into smaller groups to work on particular tasks and then came together again to present and discuss their results.

The topics of the training included:

– DataBased with good practices BigData case (covering O1 – A1.1. Data Collection and A1.2. Analysis)

– How to analyze the Big Data requirements and how to find the best solution to problems (covering O2 and O3)

– How to better prepare BigData specialists in the Data Lake ecosystem (outputs O3 and O4)

– How to help managers find the best way to use their BigData resources

All details can be found in ReportC2

 

We build possibilities grounded in data

Precisely is today’s leading data integrity company with a remarkable heritage.

With more than 50 years of unmatched data expertise, we empower businesses to make more confident decisions based on data that’s trusted to have maximum accuracy, consistency, and context.

We accomplish that through our unique combination of software, data enrichment, and strategic services – a portfolio that’s earned high recognition from leading industry analysts and customers alike.

Today, Precisely powers better decisions for more than 12,000 global organizations, including 99 of the Fortune 100 – and the possibilities continue to grow.

We help to achieve data integrity by ensuring the accuracy, consistency and context of your data. This means you can make better, faster, more confident decisions based on a deeper understanding of data you can trust.

With unmatched data expertise, we support you through a unique combination of software, data enrichment and strategic services.

And our Data Integrity Suite, with its flexible, modular approach, can meet your needs no matter where you are on your journey to data integrity.

This leaves you to focus on what matters most – building new possibilities.

Our web page: https://www.precisely.com/

 

Big Data nowadays

Big Data is a relatively new area of research that merges various fields such as cloud computing, data science and artificial intelligence. Its definition was proposed in 2012 [1]: a large Volume of data of great Variety is processed while taking into consideration its Velocity (hence 3V). The definition has since evolved to 5V [2], which also takes into consideration veracity (the quality of the captured data) and value (its usefulness).

Over the last decade, a Big Data processing pattern has been established that comprises the following elements: ingest (the data collection stage), store (managing and storing data, also in real time), process (managing data), analyse (obtaining vital information) and insight (consuming the data as information or as input for further applications).
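The five stages can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: the sensor records, the in-memory "store" and all function names are hypothetical and stand in for real ingestion tools, data stores and analytics engines.

```python
from collections import defaultdict
from statistics import mean


def ingest():
    """Ingest: collect raw records (hypothetical hard-coded sensor readings)."""
    return [
        {"sensor": "a", "value": "21.5"},
        {"sensor": "b", "value": "19.0"},
        {"sensor": "a", "value": "22.5"},
        {"sensor": "b", "value": "bad"},  # malformed record
    ]


def store(records):
    """Store: group raw values by sensor (an in-memory stand-in for a data store)."""
    table = defaultdict(list)
    for rec in records:
        table[rec["sensor"]].append(rec["value"])
    return dict(table)


def process(table):
    """Process: clean the stored data by dropping non-numeric values."""
    cleaned = {}
    for sensor, values in table.items():
        numeric = []
        for v in values:
            try:
                numeric.append(float(v))
            except ValueError:
                continue  # skip malformed values
        cleaned[sensor] = numeric
    return cleaned


def analyse(cleaned):
    """Analyse: derive information, here the mean value per sensor."""
    return {sensor: mean(vals) for sensor, vals in cleaned.items() if vals}


def insight(summary):
    """Insight: present the results in a form ready for consumption."""
    return [f"{sensor}: {avg:.1f}" for sensor, avg in sorted(summary.items())]


report = insight(analyse(process(store(ingest()))))
print(report)
```

Each function consumes only the output of the previous stage, which is the property that lets real pipelines swap one tool for another at a given stage.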

The process usually starts with the collection of data that meet the 5V definition. The data are usually ingested as data logs (e.g., Flume), bulk data (e.g., Sqoop), messages (e.g., Kafka) or dataflows (e.g., NiFi). Large data sets are then processed by a computing engine in batches (e.g., MapReduce) or as streams (e.g., Flink, Spark, Storm). Data (structured or not) are analysed using machine learning tools (e.g., Caffe, TensorFlow, Python) or a statistical approach (e.g., SparkR, R) and then visualized (e.g., Tableau, GraphX). It is worth keeping in mind that a deployed solution changes constantly and should be kept up to date (e.g., Oozie, Kepler, Apache NiFi). The obtained data can be managed by various solutions, e.g., Apache Falcon, Apache Atlas, Apache Sentry or Apache Hive. Other important issues are data security (e.g., Apache Metron or Apache Knox) and new technologies, such as InfiniBand or 5G, that change the ways data are handled and the types of data involved.
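The difference between batch and stream processing can be sketched in plain Python with a word count. This is an illustration of the two processing styles only, not the API of MapReduce, Flink, Spark or Storm; the input lines and all function names are hypothetical.

```python
from collections import Counter, defaultdict

lines = ["big data big ideas", "data streams", "big streams"]


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def batch_word_count(all_lines):
    """Batch style: the whole data set is available before processing starts."""
    pairs = [pair for line in all_lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


def stream_word_count(line_iter):
    """Stream style: update running counts as each line arrives."""
    running = Counter()
    for line in line_iter:
        running.update(line.split())
        yield dict(running)  # snapshot after each incoming line


batch_result = batch_word_count(lines)
stream_results = list(stream_word_count(iter(lines)))
print(batch_result)
print(stream_results[-1])
```

The batch version produces one answer after seeing all the data, while the streaming version yields an updated answer after every incoming line; once the stream is exhausted, both agree.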

Big Data is over 10 years old and is reaching new heights thanks to its wide adoption and to the companies that deliver new tools. As the summary in [3] shows, the number of technologies and solutions is overwhelming.

In our research we look for the competencies required by the international and local markets. Based on our analysis and on trends [4], we identified the classic open-source Hadoop, Spark and Storm solutions as well as technologies that are gaining popularity. Our research focuses on open-source solutions that can be used on dedicated infrastructure or on Big Data cloud services provided by leading platforms such as AWS, Microsoft Azure or BigQuery by Google.

In our research we keep in mind that the market is constantly flooded with new mechanisms and pipelines that allow Big Data to be tackled in a simpler and more unified way. Vendors tend to create solutions that simplify Big Data analysis and make it easier to use. Several solutions [3] illustrate the current trends:

• visual analytical tools that let users focus on data analytics through simple calculations or a point-and-click approach, while providing support for Big Data storage, real-time management and security. Services of this kind include Arcadia Enterprise 4.0, AtScale 5.0 and Dataguise DgSecure 6.0.5;
• frameworks for building Big Data applications with DevOps capabilities and Big Data transformation support. They make it possible to use well-known languages such as R, Python or SQL. Examples are Attunity Compose 3.0, Cazena Data Science Sandbox as a Service and Lucidworks Fusion 3. Some solutions, such as the Couchbase suite, are aimed at web, mobile and Internet of Things (IoT) applications;
• solutions that help provide data as a service for applications. They use pipelines such as Microsoft Azure or the Hadoop ecosystem and turn them into an information platform (Paxata Spring ’17, Pentaho 7.0 or Qubole Data Service).

References
[1] Wu, X., Zhu, X., Wu, G.-Q. and Ding, W. (2014) Data Mining with Big Data. IEEE Transactions on Knowledge and Data Engineering, 26, 97-107.
https://doi.org/10.1109/TKDE.2013.109
[2] Nagorny, K., Lima-Monteiro, P., Barata, J. and Colombo, A.W. (2017) Big Data analysis in smart manufacturing. Int. J. Commun. Netw. Syst. Sci., 10, 31-58.
[3] The Big data technology map: http://mattturck.com/wp-content/uploads/2020/09/2020-Data-and-AI-Landscape-Matt-Turck-at-FirstMark-v1.pdf
[4] Cui, Y., Kara, S. and Chan, K.C. (2020) Manufacturing big data ecosystem: A systematic literature review. Robotics and Computer-Integrated Manufacturing, 62, 101861.
[5] Article online: https://www.readitquik.com/articles/digital-transformation/10-big-data-advances-that-are-changing-the-game/

Big Data specialists needed?

According to three popular Polish job portals [1] [2] [3], the Polish market is in need of IT specialists. There are a lot of job opportunities for programmers (3988/830/3570 offers), web developers (2625/98/355 offers) and data analysts (668/356/115 offers). Recently, new job opportunities for Big Data specialists have been emerging (335/48/22 offers). This means that, on average, 5% of the Polish market is waiting for new employees with Big Data skills. However, particular Big Data skills (languages, data analysis and machine learning skills) are requested even more often. What is more, according to the Hays report [4], the average salary of a Big Data engineer in Poland is 25% higher than that of programmers and amounts to 17 thousand PLN on average. What skills are needed? The report will soon be available on this site as part of the project.

 

References

[1] praca.pl [access 22.03.2021]

[2] pracuj.pl [access 22.03.2021]

[3] jobs.pl [access 22.03.2021]

[4] Hays report 2021: hays.com

Research on the expectations and knowledge of Big Data issues

In the past months, the project team (project no. 2020-1-PL01-KA203-082197 “Innovations for Big Data in a Real World”) conducted research on the expectations and knowledge of Big Data issues among students and lecturers.

Information was collected on the competencies required for Big Data specialists among employers. Graduates were also monitored.

The results of the research are being compiled and their conclusions will be presented soon.

The same research is being carried out in the partner countries, i.e. Bulgaria, Ukraine and Serbia.

Start!

Welcome to the Erasmus+ Innovations for Big Data in a Real World (iBIG World) project website!

The Department of Computer Science and Automatics has launched a new Erasmus+ project entitled Innovations for Big Data in a Real World (iBIG World). The project aims to bring together HEIs and business in order to address the required competencies and a compatible job profile. This collaboration will provide innovative solutions for developing Big Data experts. The project’s learning framework is based on IEEE guidelines for big data in Machine Learning.

The project is carried out by a consortium of four universities: the University of Bielsko-Biała, the University of Library Studies and Information Technology (Bulgaria), the University of Nis (Serbia), and Taras Shevchenko National University of Kyiv. The consortium is coordinated by Professor Vasyl Martsenyuk.

 Find out more about the project: http://ibigworld.ni.ac.rs/

Contact the project team: erasmusibigdata@ath.edu.pl

iBIGWorld on ICICT’2021

The results of the iBIGWorld project were presented at the 6th International Congress on Information and Communication Technology (ICICT’2021), held on February 25-26 in London.

The work was co-funded by the European Union’s Erasmus+ Programme for Education under a KA2 grant (project no. 2020-1-PL01-KA203-082197 “Innovations for Big Data in a Real World”).
The conference was held via the Zoom digital platform.
The work considers ML problems in medical applications and presents a minimax approach to developing ML models that are resistant to aleatoric and epistemic uncertainties.
The main methods applied are linear regression, SVM and random forest for ML, PCA for dimension reduction, and cross-validation as a resampling strategy.
The approach is presented as a flowchart covering the basic steps of ML model development under uncertainties: importing and pre-processing the clinical data, stating the task, choosing a resampling strategy (including dimension reduction), choosing the methods (learners), tuning their parameters, and comparing the models on the basis of the minimax criterion.
The work is connected with the iBIGWorld project since it will be used to develop tutorials for Big Data Analytics.
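
The model comparison step can be sketched in Python as follows. This is a minimal illustration, not the method presented at the conference: it assumes that the minimax criterion means "choose the learner whose worst cross-validation fold error is smallest", it uses a mean predictor and one-dimensional least-squares regression as stand-ins for the learners named above, and the data and function names are hypothetical.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of test indices."""
    fold_size, remainder = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        end = start + fold_size + (1 if i < remainder else 0)
        folds.append(list(range(start, end)))
        start = end
    return folds


def fit_mean(xs, ys):
    """Baseline learner: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """One-dimensional least-squares linear regression."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fold_errors(fit, xs, ys, k):
    """Mean squared error of a learner on each held-out fold."""
    errors = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        errors.append(mean([(model(xs[i]) - ys[i]) ** 2 for i in test_idx]))
    return errors


def minimax_select(learners, xs, ys, k=3):
    """Pick the learner with the smallest worst-case fold error (assumed criterion)."""
    worst = {name: max(fold_errors(fit, xs, ys, k)) for name, fit in learners.items()}
    best = min(worst, key=worst.get)
    return best, worst


# Hypothetical toy data, roughly following y = 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 2.1, 3.9, 6.2, 7.8, 10.1]
best, worst = minimax_select({"mean": fit_mean, "linear": fit_linear}, xs, ys)
print(best, worst)
```

Ranking learners by their worst fold rather than their average fold favours models whose performance degrades least on the least favourable split, which is one way to read resistance to uncertainty.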