I am an Associate Professor at the Department of Computer Science, in the School of Engineering at PUC Chile. I am a principal researcher at the National Center of Artificial Intelligence (CENIA) as well as a principal researcher at the Millennium Institute for Intelligent Healthcare Engineering (iHealth). I am also an adjunct researcher at the Millennium Institute for Research on Fundamentals of Data. I earned the professional title of Civil Engineer in Informatics from UACh, Valdivia, Chile, in 2004, and a Ph.D. in Information Science from the University of Pittsburgh, USA, advised by Professor Peter Brusilovsky. I was awarded a Fulbright scholarship to pursue my Ph.D. studies between 2008 and 2013.
My research interests are Recommender Systems, Intelligent User Interfaces, Applications of Machine Learning (Healthcare, Creative AI), and Information Visualization. I currently lead the Human-centered AI and Visualization (HAIVis) research group and co-lead the CreativAI Lab with Professor Rodrigo Cádiz. I am also a faculty member of the PUC IA Lab.
Indisputably, the Web has revolutionized how people receive, consume, and interact with information. At the same time, unfortunately, the Web offers fertile ground for online harms like the spread of hateful content and false information; hence there is a pressing need to develop techniques and tools to understand, detect, and mitigate these issues on the Web. In this talk, I will present our work on understanding and detecting hateful content using recent Artificial Intelligence (AI) advancements. The talk will focus on how we can use AI models to detect hateful content across multiple modalities (text and images) and understand the spread and evolution of hateful content online. I will conclude the talk with ongoing work on how prone Text-to-Image models (e.g., Stable Diffusion) are to generating unsafe content.
Savvas Zannettou is an Assistant Professor at Delft University of Technology (TU Delft) and an associated researcher with the Max Planck Institute for Informatics. Before joining TU Delft, he was a Postdoctoral Researcher at the Max Planck Institute for Informatics. He obtained his PhD from Cyprus University of Technology in 2020. His research focuses on applying machine learning and data-driven quantitative analysis to understand emerging phenomena on the Web, such as the spread of false information and hateful rhetoric. He is also interested in understanding algorithmic recommendations on the Web, their effect on end-users, and to what extent algorithms recommend extreme content. Finally, he is interested in analyzing content moderation systems to understand the effectiveness of moderation interventions on the Web.
Non-Fungible Tokens (NFTs) are units of data stored on a blockchain that certify a digital asset to be unique and therefore not interchangeable, while offering a unique digital certificate of ownership. Public attention towards NFTs exploded in 2021, when their market experienced record sales. For a long time, little was known about the overall structure and evolution of this market. To shed some light on its dynamics, we collected data concerning 6.1 million trades of 4.7 million NFTs between June 2017 and April 2021 to study the statistical properties of the market and to gauge the predictability of NFT prices. We also studied the properties of the digital items exchanged on the market and found that the emerging norms of NFT valuation thwart the non-fungibility properties of NFTs. In particular, rarer NFTs: (i) sell for higher prices, (ii) are traded less frequently, (iii) guarantee higher returns on investment (ROIs), and (iv) are less risky, i.e., less prone to yield negative returns.
Luca Maria Aiello
Associate Professor at the IT University of Copenhagen, Denmark
Recently there has been a surge of interest in designing graph embedding methods. Few, if any, can scale to a large graph with millions of nodes, due to both computational complexity and memory requirements. In this talk, I will present an approach to redress this limitation by introducing the MultI-Level Embedding (MILE) framework, a generic methodology allowing contemporary graph embedding methods to scale to large graphs. MILE repeatedly coarsens the graph into smaller ones using a hybrid matching technique to maintain the backbone structure of the graph. It then applies existing embedding methods on the coarsest graph and refines the embeddings to the original graph through a graph convolution neural network that it learns. Time permitting, I will then describe one of several natural extensions to MILE: a distributed setting (DistMILE) to further improve the scalability of graph embedding, or mechanisms to learn fair graph representations (FairMILE). The proposed MILE framework and its variants (DistMILE, FairMILE) are agnostic to the underlying graph embedding techniques and to their implementation language, and can be applied to many existing graph embedding methods without modifying them. Experimental results on five large-scale datasets demonstrate that MILE significantly boosts the speed of graph embedding (by an order of magnitude) while generating embeddings of better quality for the task of node classification. MILE can comfortably scale to a graph with 9 million nodes and 40 million edges, on which existing methods run out of memory or take too long to compute on a modern workstation. Our experiments demonstrate that DistMILE learns representations of similar quality with respect to other baselines while reducing the time of learning embeddings even further (up to a 40x speedup over MILE). FairMILE similarly learns fair representations of the data while reducing the time of learning embeddings.
Joint work with Jiongqi Liang (Google Brain), S. Gurukar (OSU) and Yuntian He (OSU)
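The coarsen, embed, refine loop described in the abstract above can be illustrated with a minimal Python sketch. It is not the MILE implementation, and every name in it (`coarsen`, `base_embed`, `refine`, `multilevel_embed`) is hypothetical. Three simplifications are assumed: a plain greedy neighbor matching stands in for the hybrid matching technique, random vectors stand in for an existing embedding method, and neighbor averaging stands in for the learned graph convolution network.

```python
import random


def coarsen(adj):
    """Merge each node with one unmatched neighbor (greedy matching).

    Assumption: a simple stand-in for MILE's hybrid matching.
    Returns the coarse adjacency and a map from fine node to coarse node.
    """
    matched = {}
    next_id = 0
    for u in sorted(adj):
        if u in matched:
            continue
        matched[u] = next_id
        for v in sorted(adj[u]):
            if v not in matched:
                matched[v] = next_id
                break
        next_id += 1
    coarse = {c: set() for c in range(next_id)}
    for u, nbrs in adj.items():
        for v in nbrs:
            cu, cv = matched[u], matched[v]
            if cu != cv:
                coarse[cu].add(cv)
                coarse[cv].add(cu)
    return coarse, matched


def base_embed(adj, dim, seed=0):
    """Placeholder for any existing embedding method: random vectors."""
    rng = random.Random(seed)
    return {u: [rng.uniform(-1, 1) for _ in range(dim)] for u in adj}


def refine(adj, mapping, coarse_emb):
    """Copy each coarse vector to its fine nodes, then average every node
    with its neighbors (assumption: replaces the learned refinement model)."""
    proj = {u: coarse_emb[mapping[u]] for u in adj}
    out = {}
    for u, nbrs in adj.items():
        group = [proj[u]] + [proj[v] for v in nbrs]
        out[u] = [sum(col) / len(group) for col in zip(*group)]
    return out


def multilevel_embed(adj, levels=2, dim=4):
    """Coarsen `levels` times, embed the coarsest graph, refine back down."""
    graphs, mappings = [adj], []
    for _ in range(levels):
        coarse, mapping = coarsen(graphs[-1])
        graphs.append(coarse)
        mappings.append(mapping)
    emb = base_embed(graphs[-1], dim)
    for graph, mapping in zip(reversed(graphs[:-1]), reversed(mappings)):
        emb = refine(graph, mapping, emb)
    return emb


# Usage: an 8-node ring stored as a dict of neighbor sets.
ring = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
embeddings = multilevel_embed(ring, levels=2, dim=4)
```

Because the embedding step only ever sees the coarsest graph, `base_embed` can be swapped for any method without touching the rest of the loop, which is the sense in which the framework is described as agnostic to the underlying technique.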
In the first part we cover five current specific problems that motivate the need for responsible AI: (1) discrimination (e.g., facial recognition, justice, sharing economy, language models); (2) phrenology (e.g., biometric-based predictions); (3) unfair digital commerce (e.g., exposure and popularity bias); (4) stupid models (e.g., minimal adversarial AI); and (5) indiscriminate use of computing resources (e.g., large language models). These examples do have a personal bias but set the context for the second part, where we address four challenges: (1) too many principles (e.g., principles vs. techniques); (2) cultural differences; (3) regulation; and (4) our cognitive biases. We finish by discussing what we can do to address these challenges in the near future to be able to develop responsible AI.
Ricardo Baeza-Yates is Director of Research at the Institute for Experiential AI of Northeastern University. Before that, he was VP of Research at Yahoo Labs, based in Barcelona, Spain, and later in Sunnyvale, California, from 2006 to 2016. He is co-author of the best-selling textbook Modern Information Retrieval, published by Addison-Wesley in 1999 and 2011 (2nd ed.), which won the ASIST 2012 Book of the Year award. From 2002 to 2004 he was elected to the Board of Governors of the IEEE Computer Society, and between 2012 and 2016 he was elected to the ACM Council. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow, among other awards and distinctions. He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989, and his areas of expertise are web search and data mining, information retrieval, bias in AI, data science, and algorithms in general.
In this talk I’ll present an overview of the challenges and opportunities in applying data mining and machine learning to tasks in personalized health, including the role of semantics. In particular, I’ll focus on the task of healthy recipe recommendation via the use of knowledge graphs, as well as generating summaries from personal health data, highlighting our work within the RPI-IBM Health Empowerment by Analytics, Learning, and Semantics (HEALS) project.
Mohammed J. Zaki is a Professor and Department Head of Computer Science at RPI. He received his Ph.D. degree in computer science from the University of Rochester in 1998. His research interests focus on novel data mining and machine learning techniques, particularly for learning from graph-structured and textual data, with applications in bioinformatics, personal health, and financial analytics. He has around 300 publications (and 6 patents), including the Data Mining and Machine Learning textbook (2nd Edition, Cambridge University Press, 2020). He founded the BIOKDD Workshop, and recently served as PC chair for CIKM’22. He currently serves on the Board of Directors for ACM SIGKDD. He was a recipient of the NSF and DOE Career Awards. He is a Fellow of the IEEE, a Fellow of the ACM, and a Fellow of the AAAS.