Ninth European Big Data Management & Analytics Summer School (eBISS 2019)

Invited Speakers & Tutors


  • Albert Bifet

    Albert Bifet

    Télécom ParisTech, France

    Albert Bifet is a Full Professor at Télécom ParisTech and the University of Waikato. He is one of the leaders of the MOA and Apache SAMOA software environments for implementing algorithms and running experiments for online learning from evolving data streams. He is co-author of a book on “Machine Learning from Data Streams”, published by MIT Press. Previously he worked at Huawei Noah's Ark Lab in Hong Kong, Yahoo Labs in Barcelona, and UPC BarcelonaTech. He served as Co-Chair of the Industrial track of IEEE MDM 2016 and ECML PKDD 2015, and as Co-Chair of BigMine (2012-2019) and the ACM SAC Data Streams Track (2012-2019).

    Email:   albert.bifet@telecom-paristech.fr
    Web:   http://albertbifet.com/

    Lecture: Machine Learning for Data Streams
    Big Data and the Internet of Things (IoT) have the potential to fundamentally shift the way we interact with our surroundings. The challenge of deriving insights from the Internet of Things (IoT) has been recognized as one of the most exciting and key opportunities for both academia and industry. Advanced analysis of big data streams from sensors and devices is bound to become a key area of data mining research as the number of applications requiring such processing increases. Dealing with the evolution over time of such data streams, i.e., with concepts that drift or change completely, is one of the core issues in stream mining. In this talk, I will present an overview of data stream mining, and I will introduce some popular open source tools for data stream mining.


  • Josep Carmona

    Josep Carmona

    Universitat Politècnica de Catalunya, Spain

    Josep is an Associate Professor at Universitat Politècnica de Catalunya. He received a PhD at the same university in 2004. His research interests include formal methods and concurrent systems, data and process science, business intelligence and business process management, and natural language processing. He has co-authored numerous research papers and organized various conferences and workshops. He is a member of the IEEE Task Force on Process Mining, and co-organizes the Process Discovery Contest.

    Prof. Carmona has served on the technical committees of several international conferences in different fields. He received best paper awards at the Int. Conf. on Application of Concurrency to System Design (ACSD 2009), at the Int. Conf. on Business Process Management (BPM 2013), and at the International Symposium on Data-driven Process Discovery and Analysis (SIMPDA 2016). A PhD supervised by Prof. Carmona won the IEEE Best Process Mining Dissertation Award (2015).

    Prof. Carmona was a Visiting Researcher at Leiden University (January-September, 2002), and a Visiting Professor at Mannheim University (January-July, 2015).

    Email:   jcarmona@cs.upc.edu
    Web:   https://www.cs.upc.edu/~jcarmona/

    Lecture: Conformance Checking of Business Processes
    Process mining bridges the gap between process modelling on the one hand and data science on the other. In many practical process mining applications, relating recorded event data and a process model is an important starting point for further discussion and analysis. Conformance checking provides the models and methods to analyze the relation between modelled and recorded behavior. In the course of the last decade, manifold approaches to conformance checking have been developed in academia. As the respective models and methods become more mature, the field of conformance checking is consolidating. In this tutorial I will present the conformance checking field, with an eye on the underlying applications. In particular, I will describe the main techniques for assessing conformance between processes and models.


  • Aymen Cherif

    Aymen Cherif

    Eura Nova, Belgium

    Aymen Cherif is a Data Scientist and R&D engineer at Eura Nova, where he works as a consultant for different clients (e.g., Huawei, AW Europe) and conducts internal research projects for Eura Nova. He obtained a Master's degree from a French Grande École (Telecom Bretagne) and a PhD focused on artificial neural networks, which gave him advanced theoretical and practical knowledge in Machine Learning and Artificial Intelligence. Aymen has practical experience with different sorts of learning problems, from supervised to unsupervised learning, and with different kinds of data (structured data, text data, images, etc.). His research at Eura Nova focuses on Deep Learning with applications to Information Retrieval, Natural Language Processing and Computer Vision.

    Email:   aymen.cherif@euranova.eu

    Lecture: Deep Learning: Current Applications and Future Trends
    The past few years have seen a dramatic increase in the performance of recognition systems thanks to the introduction of deep networks for representation learning. Important breakthroughs have been made in areas such as computer vision and speech recognition, and more advances may occur in AI systems in future years. In this talk, we will learn that deep neural networks are not new to machine learning. We will see that most of the success is driven by advances in technology and infrastructure. Finally, we will discuss some future directions and open research questions. By the end of the lecture, attendees will have a solid understanding of the most common neural network architectures and will be able to implement simple models with TensorFlow 2.0.


  • Begüm Demir

    Begüm Demir

    TU Berlin, Germany

    Begüm Demir is a Professor and Head of the Remote Sensing Image Analysis (RSiM) group at the Faculty of Electrical Engineering and Computer Science, Technische Universität Berlin (TU Berlin), Germany. From 2013 to 2017, she was an Assistant Professor at the Department of Computer Science and Information Engineering, University of Trento, Italy, and in 2017 she became an Associate Professor at the same department. She received the B.S. degree in 2005, the M.Sc. degree in 2007, and the Ph.D. degree in 2010, all in Electronic and Telecommunication Engineering from Kocaeli University, Turkey. Her main research interests include image processing and machine learning with applications to remote sensing image analysis. She was a recipient of a Starting Grant from the European Research Council (ERC) with the project “BigEarth-Accurate and Scalable Processing of Big Data in Earth Observation” in 2017, and the “2018 Early Career Award” presented by the IEEE Geoscience and Remote Sensing Society.

    Dr. Begüm Demir has been a senior member of IEEE since 2016. She is a Scientific Committee member of the Conference on Big Data from Space, the Living Planet Symposium, the International Joint Urban Remote Sensing Event and the SPIE International Conference on Signal and Image Processing for Remote Sensing. She is the founder and co-chair of the Image and Signal Processing for Remote Sensing Workshop, organized within the IEEE Conference on Signal Processing and Communications Applications since 2014. Currently, she is an Associate Editor of the IEEE Geoscience and Remote Sensing Letters.

    Email:   demir@tu-berlin.de
    Web:   https://www.rsim.tu-berlin.de/menue/team/prof_dr_beguem_demir/

    Lecture: Deep Earth Query: Advances in Remote Sensing Image Characterization and Indexing from Massive Archives
    Earth observation (EO) data archives are growing explosively as a result of advances in satellite systems. As an example, remote sensing (RS) images acquired by ESA’s Sentinel satellites (which are a part of the EU’s Copernicus program) reach the scale of more than 10 TB per day. This “big EO data” is a great source for information discovery and extraction for monitoring Earth from above. Thus, accurate and scalable techniques for RS image understanding, search and retrieval have recently emerged.
    In this talk, a general overview of scientific and practical problems related to RS image characterization, indexing and search from massive archives will first be given. Then, our recent developments that can overcome these problems will be presented. Particular attention will be given to introducing: (i) our multi-attention driven system that jointly exploits a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) in the context of multi-label RS image characterization and classification; and (ii) our deep hashing network that learns a semantic-based metric space, while simultaneously producing binary hash codes for scalable and accurate content-based indexing and retrieval of RS images. Finally, the BigEarthNet benchmark archive that we have constructed to drive deep learning studies in RS will be introduced. BigEarthNet opens up promising directions to advance research on the analysis of large-scale RS image archives and has reached more than 3000 users from space agencies, the space industry and the science community since its first release 4 months ago.


  • Agata Filipowska

    Agata Filipowska

    Poznan University of Economics, Poland

    Agata Filipowska is an Assistant Professor at Poznan University of Economics, Department of Information Systems in Poznan, Poland. She holds a PhD from Poznan University of Economics and Macquarie University in Sydney, Australia, under a Cotutelle Agreement (received in 2010, Summa Cum Laude). Her PhD thesis was acknowledged with prizes from the Polish Association of Information Systems and the Rector of Poznan University of Economics. Since 2004 she has been involved in projects funded by the European Commission under the 6th and 7th Framework Programmes: ASG, Insemtives, LOD2, Service Web 3.0, SUPER and USE-ME.GOV. Agata was also involved in several projects financed by Polish research agencies as well as by companies. Currently, she coordinates cooperation with Orange S.A. within the Innovation Farm Framework at Poznan University. The competence gained within these projects led to her being invited to serve as an expert for the European Commission. Agata Filipowska has been a member of the PC of over 70 workshops and conferences and an author (and lecturer) of a few tutorials on Semantic Business Process Management. She is an author of numerous conference and journal papers.

    Email:   agata.filipowska@ue.poznan.pl
    Web:   http://www.kie.ue.poznan.pl/pl/content/agata-filipowska

    Lecture: Text Analytics
    Big data analytics concerns processing a variety of data from a variety of sources. In recent years, companies have focused on users/customers to provide a personalised user experience, and therefore user-generated data is gaining importance. Such data is usually published online in textual form. Customer reviews, blogs, and comments on company profiles lead various entities not only to retrieve, but also to automatically analyse the content generated by customers, users, competitors, etc. This talk will focus on defining and studying various tasks of text analytics. Sources of textual data will be introduced and related challenges discussed. Then, following the process of text analytics, various examples will be presented to demonstrate how text analytics should be carried out (steps to be presented include tokenization, lemmatization, disambiguation, etc.). Finally, potential applications of text analytics will be discussed (including sentiment analysis and automatic generation of content). The lecture will be accompanied by examples showing the diversity of challenges, approaches and application scenarios.


  • Ralf-Detlef Kutsche

    Ralf-Detlef Kutsche

    Technische Universität Berlin, Germany

    Dr. Ralf-Detlef Kutsche holds a position as Academic Director in the Database and Information Management (DIMA) group at TU Berlin. His main topics in teaching and research are model building and modeling methodology, focusing on model-based semantic integration of heterogeneous information systems. Dr. Kutsche has many publications on the foundations and applications of this area, as well as comprehensive experience in national and international project leadership. In the past years, he was (co-)chair of several international conferences/workshops, and an invited speaker at conferences, in international cooperations and in industry, with a special emphasis on model-based software and data integration. He studied Mathematics at FU Berlin, focusing on Numerical Mathematics, and Computer Science (as a minor at TU Berlin). After his diploma degree, he worked as a scientist in Theoretical Computer Science at TU Berlin, and then on applications of formal methods to clinical information systems at the German Heart Institute Berlin (DHZB), concluding his Ph.D. in this area. As a senior scientist from 1994 at the TU chair 'Computation and Information Structures' (CIS), he established the focus area 'Heterogeneous Distributed Information Systems'. At the same time he was scientific coordinator, project leader and member of the board of leaders with the Fraunhofer Institute for Software and Systems Engineering ISST. For an intermediate period of more than 30 months from 2005, Dr. Kutsche acted as provisional head of the CIS group. He is the advisor of many diploma theses at TU Berlin and (co-)advisor of several Ph.D. dissertations within the Graduate School Distributed Information Systems, at TU Berlin, at Fraunhofer ISST, and at other universities. Presently, he is Science Chair of the BIZWARE initiative, which follows BIZYCLE as the second large-scale project of TU Berlin and industry (SMEs) in the area of model-based software and data integration in several business domains, funded by the German BMBF.

    Besides his activities at TU Berlin, he is again engaged in coordinating 'Modeling' research at Fraunhofer FIRST, now the Fraunhofer FOKUS institute.

    Email:   ralf-detlef.kutsche@tu-berlin.de
    Web:   https://www.dima.tu-berlin.de/menue/team/ralf_detlef_kutsche/

    Lecture: Science Methodology
    Scientific work (i.e. reading books and research papers, writing seminar and project reports, technical reports, publishing workshop or conference papers, and, of course, the Master's or PhD thesis) is the essence of what Master's and PhD students do over the forthcoming 2, 3, or 5 years, respectively, in education and research. Based on 40 years of teaching experience, 30 years of research experience, and 20 years of international teaching, research, and collaboration experience across the world, this tutorial will cover the essence of all “scientific activities”, from the basics of a qualified search for scientific information up to the high end of publishing first-class conference papers. The tutorial session will cover the main aspects of scientific reading, scientific writing and publishing, scientific presentations (how to be short and precise, how to focus on the essence, how to attract the audience, how to keep in touch with the audience, different styles and “psychologies” of the speaker) and scientific reviewing. Besides the lecture, we will have short “hands-on” parts, so that participants can take away very practical hints and results for “your future work and career”.


  • Sherif Sakr

    Sherif Sakr

    University of Tartu, Estonia

    Sherif Sakr is a Full Professor and the head of the Data Systems Group at the University of Tartu. He received his PhD degree in Computer and Information Science from Konstanz University, Germany, in 2007. He received his BSc and MSc degrees in Computer Science from the Information Systems department at the Faculty of Computers and Information at Cairo University, Egypt, in 2000 and 2003 respectively. Previously, he worked at the University of New South Wales Australia, CSIRO Australia, Microsoft Research, Nokia Bell Labs and King Saud Bin Abdulaziz University for Health Sciences. His research interests lie in data and information management in general and, in particular, in big data processing systems, big data analytics and data science. Sherif is an ACM Distinguished Speaker and an IEEE Distinguished Speaker. He is a (co-)author of several books in the domain of big data technologies. He is serving as the Editor-in-Chief of the Springer Encyclopedia of Big Data Technologies. He is also serving as a Co-Chair of the European Big Data Value Association (BDVA) TF6-Data Technology Architectures Group.

    Email:   sherif.sakr@ut.ee
    Web:   http://math.ut.ee/~sakr/, https://bigdata.cs.ut.ee/

    Lecture: Automated Machine Learning in The Big Data Era
    Nowadays, machine learning techniques and algorithms are employed in almost every application domain (e.g., financial applications, advertising, recommendation systems, and user-behavior analytics). They play a crucial role in harnessing the power of the Big Data that we produce every day in our digital world. In general, building a high-quality machine learning pipeline is an iterative, complex and time-consuming process that requires solid knowledge and understanding of the various techniques that can be employed in each step of the pipeline. With the continuous and vast increase in the amount of data in our digital world, it has been acknowledged that the number of knowledgeable data scientists cannot scale to address these challenges. Thus, there is a crucial need for automating the process of building good machine learning pipelines. In this talk, I will present an overview of the automated machine learning process, introduce some popular automated machine learning frameworks, and discuss some of the open challenges and research directions in this domain.


  • Kristian Torp

    Kristian Torp

    Aalborg University, Denmark

    Kristian Torp is an associate professor at the Department of Computer Science, Aalborg University in Denmark. He received a PhD at the same university in 1998. Kristian worked for two companies for a four-year period focusing on many low-level aspects of data management, e.g., modeling, tuning, and security. In 2003, Kristian rejoined Aalborg University and has since been working in the database group (the Daisy group). Kristian's main research interest is tracking objects that move in a road network, e.g., bikes, buses, and cars (electrical and conventional). Kristian has coordinated the development of a number of research prototypes for handling very large quantities of GPS data from many different vehicle types. These prototypes are documented in a series of demo, system, and research papers at, for example, the ACM SIGSPATIAL and MDM conferences. The first prototype is now more than 10 years old and has continually been evolved and extended. The datasets have passed 100 billion records and grow by 300-500 million records per week. These datasets are the foundation for most of Kristian's work.

    Email:   torp@cs.aau.dk
    Web:   http://www.cs.aau.dk/~torp

    Lecture: Handling very large GPS datasets
    The talk focuses on handling very large GPS datasets from different types of vehicles, e.g., electric and conventional vehicles. The usefulness of a GPS dataset depends significantly on the sampling period. As an example, if there are 60 seconds between GPS records from a vehicle, it is often very difficult to convert the records into a trajectory where you can see the exact route followed by the vehicle. However, if you have 1 (one) second between GPS records, you can make a very detailed analysis of vehicle movements. Such high-frequency GPS records are converted to trajectory data, which is useful for a number of traffic analyses, e.g., turn-time in intersections, fuel consumption on motorways, the effect of slopes on electric vehicle energy consumption, and the influence of wind and temperature on driving patterns. In the talk, the features of a number of GPS datasets are introduced and it is shown what the datasets can be used for in the traffic-analysis area. Further, the talk will try to outline where we are facing problems (read: “where I have previously failed, e.g., due to too little data, too dirty data, or non-existing data”).


  • Laurenz Wiskott

    Laurenz Wiskott

    Ruhr-Universität Bochum, Germany

    Laurenz Wiskott studied physics in Göttingen and Osnabrück, Germany, and received his PhD in 1995 at the Ruhr-University Bochum. The stages of his career include three years at The Salk Institute in San Diego, one year at the Institute for Advanced Studies in Berlin, and nine years at the Institute for Theoretical Biology, Humboldt-University Berlin, where he headed a junior research group and became Professor in 2006. Since 2008 he has been at the Institute for Neural Computation, Ruhr-University Bochum. He has been working in the fields of Computer Vision (face recognition with elastic bunch graph matching), Neural Networks and Machine Learning (in particular unsupervised learning and reinforcement learning; he developed slow feature analysis), and Computational Neuroscience (visual system and hippocampus).

    Email:   laurenz.wiskott@rub.de
    Web:   https://www.ini.rub.de/PEOPLE/wiskott/

    Lecture: Spectral graph theory for data visualization and feature extraction
    Graphs are a very natural representation of many types of data, in particular if the data is non-vectorial and has no natural metric but only a notion of distance or relatedness. Graphs can be represented as matrices that represent the connectivity within the graph. Eigenvalues and eigenvectors of such matrices often give interesting insights into the structure of the data, which can then be used for visualization, e.g. low dimensional metric embedding of high dimensional non-metric data, or feature extraction for classification or regression.
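
    As a minimal sketch of the idea in this abstract (an illustration only, not material from the lecture), the Python snippet below builds the Laplacian matrix of a small hypothetical graph and approximates the eigenvector of its second-smallest eigenvalue (often called the Fiedler vector) by power iteration. The example graph, the function names `laplacian` and `fiedler_vector`, and the choice of the unnormalized Laplacian are all assumptions made for this example.

```python
import math


def laplacian(n, edges):
    """Build the unnormalized graph Laplacian L = D - A as a list of lists."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L


def fiedler_vector(L, iters=500):
    """Approximate the eigenvector of the second-smallest eigenvalue of L.

    Runs power iteration on M = c*I - L, where c is an upper bound on the
    eigenvalues of L, while projecting out the constant vector (the
    eigenvector of eigenvalue 0 for a connected graph).
    """
    n = len(L)
    # Gershgorin bound: every eigenvalue of L is at most 2 * (max degree).
    c = 2.0 * max(L[i][i] for i in range(n))
    v = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant component
        w = [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical example: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
embedding = fiedler_vector(laplacian(6, edges))

# The sign of each coordinate gives a one-dimensional two-cluster embedding.
clusters = [0 if x < 0 else 1 for x in embedding]
print(embedding)
print(clusters)
```

    In this sketch, the nodes of each triangle end up on the same side of zero, so a one-dimensional embedding already separates the two groups. For real data one would more likely use a numerical linear algebra library and several eigenvectors rather than hand-written power iteration.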

News

22
The web site for the summer school was launched.

3
The application for the summer school was opened.

Sponsors