Mohammed Zaki

Mohammed J. Zaki is a Professor and Department Head of Computer Science at RPI. He received his Ph.D. degree in computer science from the University of Rochester in 1998. His research interests focus on developing novel data mining and machine learning techniques, especially for applications in text mining, social networks, bioinformatics and personal health. He has around 300 publications (and 6 patents), including the Data Mining and Machine Learning textbook (2nd Edition, Cambridge University Press, 2020). He is the founding co-chair for the BIOKDD series of workshops. He is currently an associate editor for Data Mining and Knowledge Discovery, and he has also served as Area Editor for Statistical Analysis and Data Mining, and as Associate Editor for ACM Transactions on Knowledge Discovery from Data, and Social Networks and Mining. He was the program co-chair for SDM’08, SIGKDD’09, PAKDD’10, BIBM’11, CIKM’12, ICDM’12, IEEE BigData’15, and CIKM’18, and he is co-chairing CIKM’22. He is currently serving on the Board of Directors for ACM SIGKDD. He was a recipient of the National Science Foundation CAREER Award and the Department of Energy Early Career Principal Investigator Award, as well as the HP Innovation Research Award and the Google Faculty Research Award. His research is supported in part by NSF, DARPA, NIH, DOE, IBM, Google, HP, and Nvidia. He is a Fellow of the IEEE and a Fellow of the ACM.

http://www.cs.rpi.edu/~zaki/

Luca Maria Aiello

I am an Associate Professor at the IT University of Copenhagen. I got my PhD in Computer Science from the University of Turin, Italy, in 2012. I conduct interdisciplinary research at the intersection of computational social science, digital health, network science, and urban informatics. I use large-scale digital data to quantify people’s well-being and build systems that can improve it. Currently, I am focusing on Natural Language Processing to quantify social and psychological experiences from text. I had a few past professional roles: Senior Research Scientist at Bell Labs in Cambridge, UK; Research Fellow of the ISI Foundation in Turin; Research Scientist at Yahoo Labs Barcelona and London; visiting scientist at the Center for Complex Networks and Systems at Indiana University.

http://www.lajello.com/

Ricardo Baeza-Yates

Ricardo Baeza-Yates has been Director of Research (part-time) at the Institute for Experiential AI of Northeastern University, Silicon Valley campus, since January 2021. He is also a member of the DATA Lab at the Khoury College of Computer Sciences. The rest of the time he consults for tech startups, companies and non-profit international institutions, particularly on responsible AI.

He is actively involved as an expert in many initiatives, committees and advisory boards related to Responsible AI all around the world: Global AI Ethics Consortium, Global Partnership on AI, IADB’s fAIr LAC Initiative (Latin America and the Caribbean), Spain’s Council of AI, and ACM’s US Technology Policy Committee. He is also a co-founder of OptIA in Chile, an NGO devoted to algorithmic transparency and inclusion, and a member of the editorial committee of the new AI and Ethics Journal, where he co-authored an article highlighting the importance of research freedom on AI ethics.

Between 2016 and 2020 he was CTO of NTENT, a search technology company based in Carlsbad, California. Previously, he was VP of Research at Yahoo Labs, based in Barcelona, Spain, and later in Sunnyvale, California, from January 2006 to February 2016. Between 2008 and 2012 he also supervised Yahoo Labs Haifa and between 2012 and 2014 Yahoo Labs London. Until 2005 he was the director of the Center for Web Research at the Department of Computer Science of the Engineering School of the University of Chile; and ICREA Professor and founder of the Web Science and Social Computing Research Group (formerly Web Research Group) at the Dept. of Information and Communication Technologies of Universitat Pompeu Fabra in Barcelona, Spain. He maintains ties with both mentioned universities as a part-time professor. Finally, he is also an adjunct professor at the CS department of the University of Waterloo, Canada.  

His research interests include algorithms and data structures, information retrieval, web search and data mining, and data science and visualization.

He is an ACM Fellow and an IEEE Fellow.

LinkedIn    Twitter    Google Scholar   DBLP   

Mapping the NFT Revolution

Non-Fungible Tokens (NFTs) are units of data stored on a blockchain that certify a digital asset to be unique and therefore not interchangeable, while offering a unique digital certificate of ownership. Public attention towards NFTs exploded in 2021, when their market experienced record sales. For a long time, little was known about the overall structure and evolution of this market. To shed some light on its dynamics, we collected data concerning 6.1 million trades of 4.7 million NFTs between June 2017 and April 2021 to study the statistical properties of the market and to gauge the predictability of NFT prices. We also studied the properties of the digital items exchanged on the market and found that the emerging norms of NFT valuation thwart the non-fungibility properties of NFTs. In particular, rarer NFTs: (i) sell for higher prices, (ii) are traded less frequently, (iii) guarantee higher returns on investment (ROIs), and (iv) are less risky, i.e., less prone to yield negative returns.

Luca Maria Aiello

Associate Professor at the IT University of Copenhagen, Denmark

http://www.lajello.com/
https://twitter.com/lajello

Analytical Thinking in the Optimization and Solution of Document Extraction Problems

MOST specializes in automating registration processes. One of our services, mostQI, focuses on classifying and extracting fields from complex documents. To do so, it requires a very robust optical character recognition (OCR) layer.
In this presentation we will discuss some of the problems encountered during the development of mostQI, as well as the way of thinking that led us to solutions and to some optimizations for these problems.

Marco Antônio Ribeiro @Most

Computer and Information Research Scientist

https://most.com.br/

Towards Democratizing AI: Scaling and Learning (Fair) Graph Representations in an Implementation Agnostic Fashion

Recently there has been a surge of interest in designing graph embedding methods. Few, if any, can scale to large graphs with millions of nodes, because of both computational complexity and memory requirements. In this talk, I will present an approach to redress this limitation by introducing the MultI-Level Embedding (MILE) framework, a generic methodology that allows contemporary graph embedding methods to scale to large graphs. MILE repeatedly coarsens the graph into smaller ones using a hybrid matching technique to maintain the backbone structure of the graph. It then applies existing embedding methods on the coarsest graph and refines the embeddings to the original graph through a graph convolution neural network that it learns. Time permitting, I will then describe one of several natural extensions to MILE: a distributed setting (DistMILE) that further improves the scalability of graph embedding, or mechanisms to learn fair graph representations (FairMILE).
The proposed MILE framework and its variants (DistMILE, FairMILE) are agnostic to the underlying graph embedding techniques and to their implementation language, and can be applied to many existing graph embedding methods without modifying them. Experimental results on five large-scale datasets demonstrate that MILE significantly boosts the speed of graph embedding (by an order of magnitude) while generating embeddings of better quality for the task of node classification. MILE can comfortably scale to a graph with 9 million nodes and 40 million edges, on which existing methods run out of memory or take too long to compute on a modern workstation. Our experiments demonstrate that DistMILE learns representations of similar quality with respect to other baselines while reducing the time of learning embeddings even further (up to 40x speedup over MILE). FairMILE similarly learns fair representations of the data while reducing the time of learning embeddings.
Joint work with Jiongqian Liang (Google Brain), S. Gurukar (OSU) and Yuntian He (OSU).
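
The coarsen, embed, refine loop described in this abstract can be illustrated with a minimal Python sketch. It is not the MILE implementation, and every name in it (`coarsen`, `base_embed`, `mile_sketch`) is hypothetical. Three simplifications are assumed: a greedy heaviest-edge matching stands in for MILE's hybrid matching, a one-dimensional weighted-degree "embedding" stands in for a real embedding method, and refinement simply copies each supernode's vector to its members instead of applying a learned graph convolution network.

```python
from collections import defaultdict


def coarsen(adj):
    """One coarsening level (simplified stand-in for hybrid matching).

    adj: symmetric adjacency dict, adj[u][v] = edge weight.
    Each unmatched node is merged with its heaviest unmatched neighbor.
    Returns (coarse_adj, mapping), where mapping[node] is its supernode id.
    """
    mapping = {}
    next_id = 0
    for u in sorted(adj):
        if u in mapping:
            continue
        mapping[u] = next_id
        candidates = sorted(v for v in adj[u] if v not in mapping)
        if candidates:
            # max() keeps the first maximum, so ties go to the smallest id.
            partner = max(candidates, key=lambda v: adj[u][v])
            mapping[partner] = next_id
        next_id += 1

    # Sum the weights of edges that cross between different supernodes.
    coarse = defaultdict(dict)
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            cu, cv = mapping[u], mapping[v]
            if cu != cv:
                coarse[cu][cv] = coarse[cu].get(cv, 0) + w
    for c in set(mapping.values()):
        coarse.setdefault(c, {})  # keep supernodes that lost all their edges
    return dict(coarse), mapping


def base_embed(adj):
    """Placeholder base method: a 1-D vector holding the weighted degree."""
    return {u: [float(sum(nbrs.values()))] for u, nbrs in adj.items()}


def mile_sketch(adj, levels):
    """Coarsen `levels` times, embed the coarsest graph, project back."""
    mappings = []
    graph = adj
    for _ in range(levels):
        graph, mapping = coarsen(graph)
        mappings.append(mapping)
    emb = base_embed(graph)
    # Naive refinement: every node inherits its supernode's vector.
    for mapping in reversed(mappings):
        emb = {node: list(emb[sup]) for node, sup in mapping.items()}
    return emb


# Example: a 4-node path graph 0 - 1 - 2 - 3 with unit weights.
path = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
embedding = mile_sketch(path, levels=1)
```

Because the base method only ever sees the coarsest graph, any embedding routine could be substituted for `base_embed` without touching the rest of the loop, which is the sense in which the framework is described as agnostic to the underlying technique.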

Srinivasan Parthasarathy

Professor of Computer Science and Engineering, The Ohio State University
https://web.cse.ohio-state.edu/~parthasarathy.2/

Responsible AI

In the first part we cover five current specific problems that motivate the need for responsible AI: (1) discrimination (e.g., facial recognition, justice, sharing economy, language models); (2) phrenology (e.g., biometric based predictions); (3) unfair digital commerce (e.g., exposure and popularity bias); (4) stupid models (e.g., minimal adversarial AI); and (5) indiscriminate use of computing resources (e.g., large language models). These examples do have a personal bias but set the context for the second part, where we address four challenges: (1) too many principles (e.g., principles vs. techniques); (2) cultural differences; (3) regulation; and (4) our cognitive biases. We finish by discussing what we can do to address these challenges in the near future to be able to develop responsible AI.

Ricardo Baeza-Yates

Ricardo Baeza-Yates is Director of Research at the Institute for Experiential AI of Northeastern University. Before that, he was VP of Research at Yahoo Labs, based in Barcelona, Spain, and later in Sunnyvale, California, from 2006 to 2016. He is co-author of the best-selling textbook Modern Information Retrieval, published by Addison-Wesley in 1999 and 2011 (2nd ed), which won the ASIST 2012 Book of the Year award. From 2002 to 2004 he was elected to the Board of Governors of the IEEE Computer Society and between 2012 and 2016 was elected to the ACM Council. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow, among other awards and distinctions. He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989, and his areas of expertise are web search and data mining, information retrieval, bias in AI, data science and algorithms in general.

Mining, Learning and Semantics for Personalized Health

In this talk I’ll present an overview of the challenges and opportunities for applying data mining and machine learning for tasks in personalized health, including the role of semantics. In particular, I’ll focus on the task of healthy recipe recommendation via the use of knowledge graphs, as well as generating summaries from personal health data, highlighting our work within the RPI-IBM Health Empowerment by Analytics, Learning, and Semantics (HEALS) project.

Mohammed J. Zaki is a Professor and Department Head of Computer Science at RPI. He received his Ph.D. degree in computer science from the University of Rochester in 1998. His research interests focus on novel data mining and machine learning techniques, particularly for learning from graph structured and textual data, with applications in bioinformatics, personal health and financial analytics. He has around 300 publications (and 6 patents), including the Data Mining and Machine Learning textbook (2nd Edition, Cambridge University Press, 2020). He founded the BIOKDD Workshop, and recently served as PC chair for CIKM’22. He currently serves on the Board of Directors for ACM SIGKDD. He was a recipient of the NSF and DOE Career Awards. He is a Fellow of the IEEE, a Fellow of the ACM, and a Fellow of the AAAS.

http://www.cs.rpi.edu/~zaki/

Data Science for Business Performance

Big Data, founded in 2012, is a pioneer in big data analytics in Brazil. In this talk we will share some of our experience applying AI and ML solutions at large companies and show how applying these technologies delivers real results. We will explain how many of the products you consume at the bar, at the pharmacy and in many other places, as well as their prices, are determined by our algorithms.

Roberto Nalon @BigData

Partner and Head of Data Science at BigData

https://bigdata.com.br/