Márcio Silva

Selected Publications

For the full list of publications, please see my Google Scholar profile.

Conferences

Anatomy of Hate Speech Datasets: Composition Analysis and Cross-dataset Classification
In Proceedings of the 34th ACM Conference on Hypertext and Social Media. November, 2023.

Abstract: Manifestations of hate speech in different scenarios are increasingly frequent on social platforms. In this context, there is a large number of works that propose solutions for identifying this type of content in these environments. Most efforts to automatically detect hate speech follow the same process of supervised learning, using annotators to label a predefined set of messages, which are, in turn, used to train classifiers. However, annotators can create labels for different classification tasks, with divergent definitions of hate speech, binary or multi-label schemes, and various methodologies for collecting data. In this context, we examine the principal publicly available datasets for hate speech research. We investigate the types of hate speech (e.g., ethnicity, religion, sexual orientation) present in their composition, explore their content beyond the labels, and use cross-dataset classification to examine the use of the labeled data beyond its original work. Our results reveal interesting insights toward a better understanding of the hate speech phenomenon and improving its detection on social platforms.

Characterizing Early Electoral Advertisements on Twitter: A Brazilian Case Study
In Proceedings of the Social Informatics (SocInfo'22). October, 2022.

Abstract: Some countries impose strict regulations regarding the distribution of electoral advertising during election periods. This is the case of Brazil, where electoral ads distributed before a predetermined period (called early ad) are prohibited by law. Whereas the enforcement of such regulation on traditional mass media technologies (e.g., radio and TV) is common practice in the country, the same is a very challenging task for content shared on social media platforms, mostly due to the lack of proper tools to automatically identify content containing (early) electoral ads. This study aims to develop fundamental knowledge about characteristics of textual content containing early ads shared on Twitter, so as to drive the future design of effective detection tools. We offer a broad characterization of the textual content associated with a set of early electoral ads shared on Twitter in pre-election periods of three recent elections in Brazil, comparing their textual properties with those of other (non ads) tweets. Our main findings are that ads tend to have a negative or neutral sentiment, a certain syntactic structure, while most tend to explicitly mention a candidate or party to be chosen or avoided.

COVID-19 Ads as Political Weapon
In Proceedings of the 36th ACM/SIGAPP Symposium on Applied Computing (SAC'21). March, 2021.

Abstract: In view of the emergence of mobility restrictions and social isolation imposed by the coronavirus or COVID-19 pandemic, digital media, especially social networks, have become a breeding ground for fake news, political attacks and large-scale misinformation. The impacts of this ‘infodemic’ can take even greater proportions when using sponsored content on social networks, such as Facebook ads. Using the Facebook Ad Library, we collected more than 391K Facebook ads from 90 different countries. Choosing ads from Brazil as the focus of research, we found ads with political attacks, requests for donations, doctors prescribing vitamin D as a weapon to fight coronavirus, among other contents with evidence of misinformation.

Analyzing the Use of COVID-19 Ads on Facebook
In Proceedings of the 24th Brazilian Symposium on Multimedia and the Web. San Francisco, USA. November, 2020.

Abstract: In view of the emergence of mobility restrictions and social isolation imposed by the coronavirus or COVID-19 pandemic, digital media, especially social networks, have become a breeding ground for fake news, political attacks and large-scale misinformation. The impacts of this 'infodemic' can take even greater proportions when using sponsored content on social networks, such as Facebook ads. Using the Facebook Ad Library, we collected more than 236k Facebook ads from 75 different countries. Choosing ads from Brazil as the focus of research, we found ads with political attacks, requests for donations, doctors prescribing vitamin D as a weapon to fight coronavirus, among other contents with evidence of misinformation.

A System for Monitoring Public Political Groups in WhatsApp
In Proceedings of the 24th Brazilian Symposium on Multimedia and the Web (Webmedia'18). Salvador, Brazil. October, 2018.

Abstract: In Brazil, 48% of the population use WhatsApp to share and discuss news. Currently, there are serious concerns that this platform can become a fertile ground for groups interested in disseminating misinformation, especially as part of articulated political campaigns. In particular, WhatsApp provides an important space for users to engage in public conversations that are worth attention: the public groups. These groups are suitable for political activism and social movement organization. Additionally, it is reasonable to assume that a malicious misinformation campaign might attempt to maximize the audience of a fake story by sharing it in existing public groups. In this paper, we present a system for gathering, analyzing and visualizing public groups in WhatsApp. In addition to describing our methodology, we also provide a brief characterization of the content shared in 127 Brazilian groups. We hope our system can help journalists and researchers to understand the repercussion of events related to the Brazilian elections within these groups.

Facebook Ads Monitor: An Independent Auditing System for Political Ads on Facebook
In Proceedings of The Web Conference 2020 (WWW'20). Taipei. April 2020.

Abstract: The 2016 United States presidential election was marked by the abuse of targeted advertising on Facebook. Concerned with the risk of the same kind of abuse happening in the 2018 Brazilian elections, we designed and deployed an independent auditing system to monitor political ads on Facebook in Brazil. To do that, we first adapted a browser plugin to gather ads from the timeline of volunteers using Facebook. We managed to convince more than 2000 volunteers to help our project and install our tool. Then, we use a Convolutional Neural Network (CNN) to detect political Facebook ads using word embeddings. To evaluate our approach, we manually label a data collection of 10k ads as political or non-political and then provide an in-depth evaluation of the proposed approach for identifying political ads by comparing it with classic supervised machine learning methods. Finally, we deployed a real system that shows the ads identified as related to politics. We noticed that not all political ads we detected were present in the Facebook Ad Library for political ads. Our results emphasize the importance of enforcement mechanisms for declaring political ads and the need for independent auditing platforms.

Measuring the Facebook advertising ecosystem
In Proceedings of the Network and Distributed System Security Symposium (NDSS'19). California, USA. January 2019.

Abstract: The Facebook advertising platform has been subject to a number of controversies in the past years regarding privacy violations, lack of transparency, as well as its capacity to be used by dishonest actors for discrimination or propaganda. In this study, we aim to provide a better understanding of the Facebook advertising ecosystem, focusing on how it is being used by advertisers. We first analyze the set of advertisers and then investigate how those advertisers are targeting users and customizing ads via the platform. Our analysis is based on the data we collected from over 600 real-world users via a browser extension that collects the ads our users receive when they browse their Facebook timeline, as well as the explanations for why users received these ads. Our results reveal that users are targeted by a wide range of advertisers (e.g., from popular to niche advertisers); that a non-negligible fraction of advertisers are part of potentially sensitive categories such as news and politics, health or religion; that a significant number of advertisers employ targeting strategies that could be either invasive or opaque; and that many advertisers use a variety of targeting parameters and ad texts. Overall, our work emphasizes the need for better mechanisms to audit ads and advertisers in social media and provides an overview of the platform usage that can help move towards such mechanisms.

Brazilian Venues

WhatsApp Monitor 2.0 - Monitoring Brazilian Political Groups on WhatsApp.
In Proceedings of the 39th Brazilian Symposium on Databases (SBBD), 2024.

Abstract: WhatsApp has become a crucial tool in communicating and disseminating (mis)information in Brazil. Since 2018, the tool has been widely used for disinformation and hate speech campaigns. In this work, we propose WhatsApp Monitor 2.0, a web-based system that aids researchers and journalists in tracking, in real-time, the most popular content shared in public WhatsApp political groups. Our tool monitors, processes, and ranks images, videos, audios, and text messages posted in these groups, presenting the most popular content daily. WhatsApp Monitor 2.0 provides a valuable resource for identifying viral content on WhatsApp, thus helping to combat misinformation.

Legal Electoral Campaign: Detection of Electoral Propaganda and Coordinated Campaign Actions.
Companion Proceedings of the 30th Brazilian Symposium on Multimedia and Web (WebMedia), 2024.

Abstract: Spreading electoral propaganda using Online Social Networks (OSNs) during elections is an important problem, and novel approaches are necessary to mitigate its effects. The lack of automatic electoral propaganda detection favors candidates: true digital podiums have emerged for them to spread their ideas, fight opponents, and ask for votes during the pre-electoral period. In Brazil, it is prohibited by law to declare candidacy in a political election and to make any (explicit or implicit) request to vote ahead of time. In this context, this work presents a system named Campanha Eleitoral Legal to help the detection of this type of propaganda on X (formerly Twitter), adopted by Ministério Público de Minas Gerais (MPMG). Our system is able to collect, categorize, and highlight posts that have a high probability of being electoral propaganda. Thus, this system can be a great tool for Brazilian authorities.

Characterizing Brazilian Political Ads on Facebook
In Proceedings of the Brazilian Symposium on Multimedia and the Web. Curitiba, PR - Brazil. Brazilian Computer Society, 2022.

Abstract: Most politicians, public figures and political candidates use online advertising platforms to spread their political values and messages. Since 2018, Facebook has made available an Ad Library providing advertising transparency to prevent interference in elections and other political issues. However, it is not explicit how the ads are selected for inclusion in this database and to what extent artificial intelligence is applied to this selection. In this work, we provide a categorization of the ads data in Brazil to understand the dynamics of political advertisements and what types of ads are present in this ad library. We analyze impressions, the money spent and who the advertisers are for ads from 2018 to 2021. Among our findings, we show that during the election months of 2018 and 2020 the volume of ads corresponds to approximately 30% of the ads in the dataset and the moving average of the money spent per ad increases about 200% after the first round of Brazilian elections.

Analysis of Early Electoral Advertisements on Twitter
In Proceedings of XI Brazilian Workshop on Social Network Analysis and Mining. Niterói, RJ - Brazil. Brazilian Computer Society, 2022.

Abstract: Electoral advertisements are an essential part of an election. The popularization of online social networks has offered a promising way for candidates to communicate with the electorate at large. In fact, the use of these applications to share electoral ads has already been pointed out, even outside the period allowed by Brazilian law. Yet, fighting this practice is hampered by the lack of broader knowledge about the characteristics of this type of content, knowledge that would allow effective detection solutions. This study aims to contribute to this knowledge through a broad characterization of the textual content associated with a set of early electoral advertisements shared on Twitter in pre-election periods associated with recent elections in Brazil (2016, 2018 and 2020). Our main findings are that ads tend to have a negative or neutral sentiment and a certain constant structure, and that more than half tend to explicitly mention a candidate or party to be chosen or avoided.

CaçaFake: A System for Monitoring and Analyzing Low Credibility Websites in Brazil
In Proceedings of the Workshop on Tools and Applications of the 28th Brazilian Symposium on Multimedia and WEB. Brazilian Computer Society, 2022.

Abstract: The popularization of the use of online social networks as platforms for political debate has brought new challenges like the improper propagation of electoral advertisements. On the one hand, voters use social networks to interact, inform themselves, and get to know their candidates. On the other hand, true digital podiums have emerged for candidates to spread their ideas, fight opponents and ask for votes. Thus, pre-candidates can use the platforms to request votes outside the electoral period, a practice known as early electoral propaganda. Although legislation on this exists, the lack of digital tools and detection methods for this practice can be exploited. In this context, this work presents a methodology to help the detection of this type of propaganda. We collect and characterize data from Twitter and Facebook during three pre-election periods in Brazil (2016, 2018, and 2020), presenting challenges and important findings about the use of social networks to create these ads.

Early Electoral Propaganda: A First Look at Pre-election Posts on Social Media
In Proceedings of X Brazilian Workshop on Social Network Analysis and Mining. Porto Alegre, RS - Brazil. Brazilian Computer Society, 2021.

Abstract: The popularization of the use of online social networks as platforms for political debate has brought new challenges like the improper propagation of electoral advertisements. On the one hand, voters use social networks to interact, inform themselves, and get to know their candidates. On the other hand, true digital podiums have emerged for candidates to spread their ideas, fight opponents and ask for votes. Thus, pre-candidates can use the platforms to request votes outside the electoral period, a practice known as early electoral propaganda. Although legislation on this exists, the lack of digital tools and detection methods for this practice can be exploited. In this context, this work presents a methodology to help the detection of this type of propaganda. We collect and characterize data from Twitter and Facebook during three pre-election periods in Brazil (2016, 2018, and 2020), presenting challenges and important findings about the use of social networks to create these ads.

Book Chapters

Disinformation on Digital Platforms: Concepts, Technological Approaches, and Challenges.
Learning Journey on Informatics 2023. 42ed.: SBC, 2023, p. 10-59.

Abstract: This chapter presents the current scenario of studies in the context of disinformation on digital platforms, offering an introduction to the researcher who intends to explore this topic. To achieve this, the authors (1) present and discuss basic concepts, (2) list data repositories helpful in studying this phenomenon, (3) summarize the main strategies explored for understanding disinformation, as well as technological approaches to its detection and monitoring on digital platforms, and (4) present a critical overview of the area, highlighting challenges and research opportunities in this context.

Characterizing Early Electoral Advertisements on Twitter: A Brazilian Case Study.
Lecture Notes in Computer Science. 136ed.: Springer, Cham, 2022, p. 257-272.

Abstract: Some countries impose strict regulations regarding the distribution of electoral advertising during election periods. This is the case of Brazil, where electoral ads distributed before a predetermined period (called early ad) are prohibited by law. Whereas the enforcement of such regulation on traditional mass media technologies (e.g., radio and TV) is common practice in the country, the same is a very challenging task for content shared on social media platforms, mostly due to the lack of proper tools to automatically identify content containing (early) electoral ads. This study aims to develop fundamental knowledge about characteristics of textual content containing early ads shared on Twitter, so as to drive the future design of effective detection tools. We offer a broad characterization of the textual content associated with a set of early electoral ads shared on Twitter in pre-election periods of three recent elections in Brazil, comparing their textual properties with those of other (non ads) tweets. Our main findings are that ads tend to have a negative or neutral sentiment, a certain syntactic structure, while most tend to explicitly mention a candidate or party to be chosen or avoided.

Press Coverage

Here is some coverage of my recent research in blogs, magazines, and newspapers.

Projeto Eleições sem Fake

Systems and Applications

Eleições sem Fake: a collection of systems to help tackle the fake news problem.

Awards

Inria and the CNIL award the 2020 Privacy Protection prize (in French), Europe (January 2020)
Honorable Mention Article "Facebook Ads Monitor: An Independent Auditing System for Political Ads on Facebook", WWW Conference 2019 (WWW'19)
Best paper nominee: WWW’19
