Mining and cross-correlating heterogeneous data sources calls for machine learning and signal processing algorithms that can analyze big, dynamically evolving, heterogeneous data. Deep learning with deep sparse coding networks, deep recurrent neural networks, and convolutional neural networks is also among the services we can offer.

Our machine learning innovations:

  • Statistical models for big heterogeneous data. Machine learning algorithms typically rely on multivariate distribution models that impose marginal distributions of the same form and assume linear dependencies in the data. We have developed statistical models based on copula functions, which describe the dependence between variables separately from their marginal distributions. Copulas have a long history in econometrics but have only recently been explored in machine learning.
  • Sensing and classification of big heterogeneous data. Given the abundance of information in big datasets, sensing and processing must be performed on optimized low-dimensional features. This enables effective techniques that overcome the limitations imposed by the sheer volume of Big Data. We explore novel formulations of optimization problems suited to inference tasks such as recovery and classification. Our schemes adapt to the varying characteristics and rapidly shifting correlations in the data.
  • Distributed computing algorithms for big heterogeneous data. Big Data applications (e.g., recommendation systems, video search, financial decision support systems) are subject to delay constraints for user interactivity and must be optimized for energy efficiency. To address these requirements, we design novel scheduling methodologies that allocate the various processing and mining tasks across the available computational capacity. Our work has shown that exploiting hardware heterogeneity and accounting for the dynamically varying processing demands of a task yield substantial reductions in average energy consumption and processing delay.
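
To make the copula idea from the first item above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not our production models: the choice of a Gaussian copula, the exponential and Pareto marginals, and all function names (`sample_gaussian_copula`, `sample_joint`, `pearson`) are hypothetical and chosen only for this example. It shows how a copula lets two variables have marginals of different forms while their dependence is specified separately.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sample_gaussian_copula(rho, n, seed=0):
    """Draw n pairs (u, v) in (0, 1)^2 coupled by a Gaussian copula.

    rho is the correlation of the latent normal variables (assumed
    parameter for this sketch), with -1 < rho < 1.
    """
    rng = random.Random(seed)
    eps = 1e-12  # keep values strictly inside (0, 1) for the inverse CDFs
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u = min(max(normal_cdf(z1), eps), 1.0 - eps)
        v = min(max(normal_cdf(z2), eps), 1.0 - eps)
        pairs.append((u, v))
    return pairs


def exponential_ppf(u, rate):
    """Inverse CDF of an exponential distribution."""
    return -math.log(1.0 - u) / rate


def pareto_ppf(u, shape, scale):
    """Inverse CDF of a Pareto distribution (heavy-tailed)."""
    return scale / (1.0 - u) ** (1.0 / shape)


def sample_joint(rho, n, seed=0):
    """Map copula samples through two marginals of different forms."""
    return [
        (exponential_ppf(u, 2.0), pareto_ppf(v, 3.0, 1.0))
        for u, v in sample_gaussian_copula(rho, n, seed)
    ]


def pearson(xs, ys):
    """Sample Pearson correlation, used here only as a quick check."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)
```

Calling `sample_joint(0.8, 2000)` returns pairs whose first component is exponentially distributed and whose second is Pareto distributed, while `pearson` applied to the underlying copula samples confirms that the two remain strongly dependent. Swapping in a different copula family would change the dependence structure without touching the marginals.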