Technological solutions for intelligent analysis of Big Data. Programming languages


Bibliographic details
Date: 2019
Authors: Grishanova, I.Y., Rogushina, J.V.
Format: Article
Language: Ukrainian
Published: Institute of Software Systems of the NAS of Ukraine (Інститут програмних систем НАН України), 2019
Online access: https://pp.isofts.kiev.ua/index.php/ojs1/article/view/334
Journal title: Problems in programming

Repositories

Problems in programming
Description
Summary: We consider the problems that arise when data analysis methods are applied to Big Data. Modern programming languages are analyzed in terms of how efficiently they can be used to develop machine learning (ML) tools focused on Big Data.

We analyzed the main types of machine learning tasks associated with information acquisition from Big Data that can be of practical use. This analysis shows that these tasks are solved by methods of statistical processing and by training neural networks. Therefore, it is advisable for software tools aimed at solving these problems to include appropriate libraries.

The large number of ML algorithms, which target different types of input information and different representations of the resulting knowledge, indicates the need for specialized machine learning libraries that implement these algorithms. Another important factor in choosing a tool environment for solving ML tasks on Big Data is processing speed: this requirement stems from the large volumes of data to be processed.

External services for ML and Big Data processing, offered by Google, Amazon, etc., greatly simplify the development of intelligent data analysis tools in those programming languages that support the use of such services.

Thus, for creating experimental prototypes that combine modern approaches to machine learning with elements of artificial intelligence (AI), the most suitable programming language is Python. This conclusion is also confirmed by the results of worldwide surveys of developers in the field of Data Science. However, the other programming languages analyzed in this paper can be more useful under certain additional conditions: for example, C++ for projects oriented toward specific software and hardware, or Java and Scala for corporate applications.

Problems in programming 2018; 4: 45-58