By Eleni Kamateri, Ioannis Tsampoulatidis, Stefanos Vrochidis, INFALIA
Aristeidis Bozas, Stelios Andreadis, Ilias Gialampoukidis, Centre for Research & Technology Hellas (CERTH)
DeepCube is an EU-funded research project that tackles new and ambitious problems with high environmental and societal impact, in order to enhance our understanding of Earth’s processes and of the current and future climate emergency. To achieve this, DeepCube will develop a number of ICT tools that leverage advances in AI and the semantic web to unlock the potential of Earth Observation (EO) data, correlated with non-conventional (non-EO) data that include, among others, social media data.
Social media play an important role in the daily life of people around the globe. Every day, more and more people use social media to share a wide range of information about themselves and the things that happen to them, or about the social context in which they live. This large amount of information can be monitored, analyzed and visualized by dedicated Artificial Intelligence (AI) systems to uncover relevant public insights and trends. In DeepCube, publicly available social media data will be used to enhance the data-intensive Deep Learning (DL) models, which typically use only EO data.
During the project, social media data will be monitored and collected based on several criteria defined by the problem settings that DeepCube aims to address. The data will then be processed by innovative AI mechanisms to extract high-level knowledge and insights, such as concepts detected in social media images, expressed sentiments, locations, etc. This knowledge can later be used directly by end users, or by other internal or external platform modules, to enhance EO-related analysis results and to provide additional learning features in the DL architectures.
In DeepCube, INFALIA, a spin-off company of the Centre for Research & Technology Hellas (CERTH), is responsible for the design and development of the social media AI mechanisms. INFALIA will use, adapt and, based on specific use-case needs, further extend an open-source social media visualization technology developed in the context of a past European project named EOPEN (grant agreement 776019), which the M4D/MKLab (https://m4d.iti.gr) group of CERTH technically coordinated.
Social media streams can also be visualized with advanced visual analytics, equipped with several filtering options, to improve the user’s interpretation and support decision making. DeepCube will develop such a visualization tool to explore the social media data in an efficient and user-friendly manner. The tool, served as an online Web application with a User Interface (UI), will display and filter posts relevant to the project’s problem settings in a variety of visualizations. The UI of the DeepCube social media visualization tool is depicted in Figure 1.
Users will be able to display the posts either as a scrollable list or as pop-ups on a map. In both views, high-level knowledge extracted from the social media data, such as detected locations and visual concepts, will also be displayed per post. Moreover, the visualization tool will provide filtering capabilities, e.g. the option to show data published in a certain time period, hide posts that do not contain an image, search for posts that convey a positive or negative sentiment, and fetch posts for a specific geographic location, either by choosing a predefined bounding box provided in the filter or by drawing one on the map. Figure 2 shows the drawing of a bounding box around Ethiopia and the visualization of the retrieved posts.
Additionally, users can request posts whose images contain a specific concept. For example, in Figure 3 the user has selected the concept filter “animal” in order to visualize posts whose image depicts the concept “animal”.
Finally, the UI includes a chart section with a line chart and a pie chart, where users can see at a glance the total number of retrieved posts and how many of them are positive or negative.
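The filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Post` record, its field names, the `filter_posts` and `sentiment_counts` helpers and the rough Ethiopia bounding box are all hypothetical and are not taken from the DeepCube or EOPEN code base.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime
from typing import Iterable, List, Optional, Set, Tuple

# Bounding box as (min_lon, min_lat, max_lon, max_lat), in degrees.
BBox = Tuple[float, float, float, float]


@dataclass
class Post:
    """Hypothetical post record; field names are illustrative only."""
    text: str
    published: datetime
    sentiment: str                       # e.g. "positive" or "negative"
    lon: Optional[float] = None          # detected location, if any
    lat: Optional[float] = None
    has_image: bool = False
    concepts: Set[str] = field(default_factory=set)  # visual concepts


def in_bbox(post: Post, bbox: BBox) -> bool:
    """True if the post has a location that falls inside the box."""
    if post.lon is None or post.lat is None:
        return False
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= post.lon <= max_lon and min_lat <= post.lat <= max_lat


def filter_posts(posts: Iterable[Post],
                 start: Optional[datetime] = None,
                 end: Optional[datetime] = None,
                 only_with_image: bool = False,
                 sentiment: Optional[str] = None,
                 concept: Optional[str] = None,
                 bbox: Optional[BBox] = None) -> List[Post]:
    """Keep the posts that satisfy every filter that is set."""
    selected = []
    for post in posts:
        if start is not None and post.published < start:
            continue
        if end is not None and post.published > end:
            continue
        if only_with_image and not post.has_image:
            continue
        if sentiment is not None and post.sentiment != sentiment:
            continue
        if concept is not None and concept not in post.concepts:
            continue
        if bbox is not None and not in_bbox(post, bbox):
            continue
        selected.append(post)
    return selected


def sentiment_counts(posts: Iterable[Post]) -> Counter:
    """Per-sentiment totals, e.g. as input for a pie chart."""
    return Counter(post.sentiment for post in posts)


# Rough, illustrative box around Ethiopia (not the tool's predefined box).
ETHIOPIA_BBOX: BBox = (33.0, 3.4, 48.0, 14.9)

posts = [
    Post("Herd near Addis Ababa", datetime(2021, 5, 1), "positive",
         lon=38.74, lat=9.03, has_image=True, concepts={"animal"}),
    Post("Dry season report", datetime(2021, 5, 2), "negative",
         lon=36.82, lat=-1.29),
    Post("No location given", datetime(2021, 5, 3), "negative"),
]
in_ethiopia = filter_posts(posts, bbox=ETHIOPIA_BBOX)
```

In the actual tool the filtering is presumably performed by the backend rather than in memory; the sketch only makes the filter semantics concrete, in particular that posts without a detected location can never match a bounding box.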
In the backend, the visualization tool will be served by an API, which will communicate with the database where the collected social media data and the analysis results are stored. The filtering options selected by the end user determine which API endpoint is called, and the API response is then used for the visualization.
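As a rough sketch of how selected filters could be turned into an API call, the snippet below builds a request URL from a dictionary of filter values. The base URL, the endpoint paths and the parameter names are placeholders invented for this example; the article does not describe the real DeepCube API.

```python
from urllib.parse import urlencode

# Placeholder base URL; the real service address is not given in the article.
BASE_URL = "https://example.org/api"


def build_request_url(filters: dict) -> str:
    """Map UI filter selections to a (hypothetical) endpoint and query string."""
    # Assumption: the list view and the map view use different endpoints.
    endpoint = "posts/map" if filters.get("view") == "map" else "posts/list"

    params = {}
    # Copy only the filters the user actually set, in a fixed order.
    for name in ("start", "end", "sentiment", "concept"):
        if filters.get(name):
            params[name] = filters[name]
    if filters.get("only_with_image"):
        params["has_image"] = "true"
    if filters.get("bbox"):
        # Bounding box sent as "min_lon,min_lat,max_lon,max_lat".
        params["bbox"] = ",".join(str(v) for v in filters["bbox"])

    query = urlencode(params)  # percent-encodes values, e.g. "," as %2C
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{query}" if query else url


url = build_request_url(
    {"view": "map", "sentiment": "positive", "concept": "animal"}
)
```

Building the query with `urlencode` rather than by string concatenation keeps user-supplied values, such as free-text concepts, safely encoded.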
The social media tools developed in DeepCube are designed to serve the specific needs defined by end users for two Use Cases (UC).
UC1: Climate induced migration in Africa
The first use case, named “Climate induced migration in Africa”, deals with the environmental and socio-economic drivers of human mobility especially in sub-Saharan Africa. In order to understand these drivers and impacts, social media data will be used to extract concepts that reveal patterns and socio-economic information with respect to migration flows.
In this use case, the social media data will be collected from Twitter in near real time using the platform’s API. First, each post is scanned for words that refer to areas of interest for the use case, such as Africa, Ethiopia, Somalia, and certain sub-areas of those countries. Then, the text of each post is searched for keywords that imply environmental disasters, droughts, hunger, conflicts and various other causes of displacement. Examples of keywords include “droughts”, “flood”, “famine”, “conflict”, “unemployment”, “refugees”, “climate-change”.
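The two-stage check described above (area of interest first, then displacement-related keywords) could look roughly like the following. The term lists are the examples given in the text, not the project's full lists, and the matching rule (case-insensitive, whole words only) is an assumption made for this sketch.

```python
import re
from typing import Iterable, Pattern

# Example terms taken from the text; the real lists are longer.
AREAS = ["Africa", "Ethiopia", "Somalia"]
KEYWORDS = ["droughts", "flood", "famine", "conflict",
            "unemployment", "refugees", "climate-change"]


def compile_terms(terms: Iterable[str]) -> Pattern[str]:
    """Build one case-insensitive pattern that matches any term as a whole word."""
    alternatives = "|".join(re.escape(term) for term in terms)
    # Lookarounds instead of \b so hyphenated terms are handled consistently.
    return re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)


AREA_RE = compile_terms(AREAS)
KEYWORD_RE = compile_terms(KEYWORDS)


def is_relevant(text: str) -> bool:
    """Stage 1: mentions an area of interest. Stage 2: contains a keyword."""
    if not AREA_RE.search(text):
        return False
    return KEYWORD_RE.search(text) is not None


relevant = is_relevant("Severe famine reported in northern Ethiopia")
```

Whole-word matching avoids false hits such as "conflicting" for "conflict", but it also means "drought" does not match the keyword "droughts"; a production pipeline would probably add stemming or list both forms.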
UC2: EO4tourism – Copernicus services for sustainable and environmentally-friendly tourism
The second use case in which social media analysis can add value is named EO4tourism, which stands for Copernicus services for sustainable and environmentally-friendly tourism. The use case aims to produce a pricing engine (for hotel rooms and tour packages) that incorporates the environmental and sustainable tourism dimension. To set up the demand model, sentiment analysis data from social networks will be used to reveal the interest that social media users express in the region under study.
In this use case, the social media data will be collected not only from Twitter but also from Instagram, which is considered well suited for collecting information about tourism. First, on both platforms the text will be parsed to find words that refer to areas of interest for the use case, i.e. Brazil and the city of São Paulo. Then, posts will be collected by hashtag search on Instagram and by keyword search in the text on Twitter. The Instagram hashtags refer to hotels, travelers and tourism in general, so that the sentiment of people towards the place they visit can be extracted (e.g. #hotel, #voyage, #travelgram, #tourism, #vacation). Similarly, the keywords used for the Twitter search are about hotels, travelers and tourism (e.g. “tourist destination”, “over-tourism”, “trip”, “voyage”, “hotel”).
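The Instagram side of this selection can be illustrated with a short hashtag check. This is a minimal sketch, assuming captions are available as plain text; the helper names are hypothetical, the hashtag set contains only the examples listed above, and the area check is a simple case-insensitive substring test.

```python
import re
from typing import Set

# Example hashtags from the text, stored without the leading "#".
TOURISM_HASHTAGS = {"hotel", "voyage", "travelgram", "tourism", "vacation"}

# Areas of interest, lower-cased for a case-insensitive substring check.
AREA_TERMS = ("brazil", "são paulo")

# "#" followed by word characters; \w is Unicode-aware for str patterns.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(caption: str) -> Set[str]:
    """Return the lower-cased hashtags found in a caption."""
    return {tag.lower() for tag in HASHTAG_RE.findall(caption)}


def matches_eo4tourism(caption: str) -> bool:
    """True if the caption mentions an area and carries a tourism hashtag."""
    lowered = caption.lower()
    mentions_area = any(area in lowered for area in AREA_TERMS)
    return mentions_area and bool(extract_hashtags(caption) & TOURISM_HASHTAGS)


tags = extract_hashtags("Sunset in São Paulo #Travelgram #hotel")
```

Because the area test is a substring match, "Brazilian" also counts as a mention of Brazil, and accented names are matched only when the caption uses the same Unicode form as the term list; both are simplifications of this sketch.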