GenAI Fact-Checking: Penn State Center for Social Data Analytics

Bonisiwe Shabane

Last week we were thrilled to host Dr. Yuehong Cassandra Tai, assistant research professor and assistant director at C-SoDA, for a talk titled "GenAI vs. Human Fact-Checkers: Accurate Ratings, Flawed Rationales." Dr. Tai presented new research evaluating how well generative AI models assess the credibility of political content, specifically posts shared by subnational U.S. politicians on Facebook. While some GenAI models can approach (though never match) the performance of human fact-checkers in identifying low-credibility content, they often rely on surface-level features, like tone or source cues, rather than true reasoning about...

This work is forthcoming at WebSci 2025 and is available on arXiv: https://lnkd.in/eKBb6VyE #GenAI #FactChecking

It was a pleasure to share my research with the C-SoDA community. I enjoyed the discussions with students and social scientists. The C-SoDA Weekly Speaker Series offers an excellent platform for exchanging ideas on how computational approaches are advancing our understanding of both social science and the humanities.

A timely warning for social scientists to use AI smartly but cautiously!


SoDA is the integration of social scientific, computational, informational, statistical, and visual analytic approaches to the analysis of large or complex data that arise from human interaction. SoDA merges social science and data science to improve our ability to learn from social data. The mission of the Center for Social Data Analytics (C-SoDA) is to support science at Penn State that advances the state of the art in computationally and/or data-intensive social research. We are organized along three broad sub-missions. The first is to facilitate and amplify faculty and student research programs that feature a SoDA focus. The second is to broaden and diversify the community of scholars who are engaged with, and engaged in, SoDA research.

The third sub-mission is to integrate undergraduate and graduate students, as well as postdoctoral scholars, into the SoDA research community at Penn State, in close connection with the undergraduate and graduate programs in SoDA.

GenAI vs. Human Fact-Checkers: Accurate Ratings, Flawed Rationales

Despite recent advances in understanding the capabilities and limits of generative artificial intelligence (GenAI) models, we are just beginning to understand their capacity to assess and reason about the veracity of content. We evaluate multiple GenAI models across tasks that involve the rating of, and reasoning about, the credibility of information. The information in our experiments comes from content that subnational U.S. politicians post to Facebook. We find that GPT-4o, one of the most widely used AI models in consumer applications, outperforms other models, but all models exhibit only moderate agreement with human coders. Importantly, even when GenAI models accurately identify low-credibility content, their reasoning relies heavily on linguistic features and "hard" criteria, such as the level of detail, source reliability, and language formality, rather than an understanding... We also assess the effectiveness of summarized versus full content inputs, finding that summarized content holds promise for improving efficiency without sacrificing accuracy. While GenAI has the potential to support human fact-checkers in scaling misinformation detection, our results caution against relying solely on these models.

Dr. Cassandra Tai is an Assistant Research Professor at Penn State, co-funded by the College of the Liberal Arts and the Social Science Research Institute, and serves as Assistant Director of the Center for Social... Her research leverages large-scale text data, cross-national public opinion data, language models, generative AI, and Bayesian measurement models to examine elite political behavior and comparative public opinion.

Thrilled to be part of this fantastic project, led by Yuehong Cassandra Tai, on the feasibility and performance of using LLMs as human-like fact-checkers of elected officials' social media posts. It is now forthcoming at WebSci 2025!

Excited to share that our paper, "GenAI vs. Human Fact-Checkers: Accurate Ratings, Flawed Rationales," coauthored with Khushi Patni, Nick Hemauer, Bruce Desmarais, and Yu-Ru Lin, has been accepted to WebSci 2025!

We evaluate multiple GenAI models on tasks involving the rating of, and reasoning about, the credibility of information shared by state legislators on Facebook.

We explored:
1. Can GenAI models match or exceed the reliability of human coders?
2. Can summaries generated by zero-shot GenAI models provide efficient credibility ratings?
3. How do GenAI models reason about information credibility?

We find:
1. GPT-4o outperforms other models, but all models show only moderate agreement with human coders.
2. GenAI models rely on "hard" criteria, like detail, source quality, and linguistic features, which help them identify low-credibility content but leave them struggling to detect sophisticated misinformation that appears factual. This limitation is critical when addressing issues that haven't been widely debunked.
3. Summarized content holds promise for improving efficiency without sacrificing accuracy.

While GenAI can aid in scaling misinformation detection, it should be viewed as a complement to, not a replacement for, human fact-checkers. A hybrid human-AI or human-in-the-loop approach is essential for technology-assisted fact-checking.

Cool work! What was the human fact-checker baseline?

Did you use fact-checks themselves, or something like Truth seeker, which links fact-checks to tweets containing the check?

Equally thrilled to have worked with you on this project! 😁

Generative Artificial Intelligence (GenAI) has transformed workflows across industries and academia since its surge in popularity in 2022 (Abels, 2023).

In scientific research, its adoption has accelerated at an unprecedented pace (Zhao et al., 2023), demonstrating transformative potential in diverse applications. Researchers have successfully employed GenAI models to impersonate survey respondents (Bisbee et al., 2024), generate counterfactual images for experimental research (Davidson, 2024), annotate text with human-comparable accuracy (Gilardi et al., 2023), identify conspiracy theories (Diab... Among these diverse applications of GenAI, a fundamental question remains about its ability to assess content credibility. We assess the capacity of GenAI models to rate the veracity of content. The online proliferation of unreliable and misleading information, including misinformation, poses threats to democracies and societies: it spurs political violence, undermines trust in democratic institutions globally, and endangers public health (Eliassi-Rad et al., 2020; Lazer... Combating misinformation requires reliable detection tools; however, the complexity of fact-checking remains largely beyond the capabilities of even state-of-the-art AI systems (Neumann et al., 2024).

Given these limitations, one well-established alternative is to assess content reliability by using the credibility of source domains as a proxy (Lasser et al., 2023; Guess et al., 2020). This domain-based approach raises concerns about false positives, as not all content from unreliable domains is misinformation. Scholars have explored the potential of large language models (LLMs) as fact-checkers, building on foundational work showing that language models can verify claims without external knowledge bases (Lee et al., 2020). For example, using datasets like news headlines labeled for credibility (Gabriel et al., 2021), researchers found that GenAI models without fine-tuning achieve moderate performance in detecting political misinformation (Ziems et al., 2024). This suggests that if GenAI can reliably assess credibility through zero-shot prompting, it could alleviate human resource demands in large-scale fact-checking. To advance this line of inquiry, we explore the perceived reasoning patterns that emerge during content credibility assessment in zero-shot settings, offering insights into the internal processes of GenAI.
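Zero-shot credibility assessment of the kind described above can be sketched in a few lines: build a rubric prompt, send it to a model, and parse a structured reply. The prompt wording, the 1-5 scale, and the JSON reply format below are illustrative assumptions, not the paper's actual protocol, and the model call is a stub standing in for a real chat-completion request to a hosted model such as GPT-4o.

```python
import json

# Illustrative zero-shot rubric; not the paper's exact prompt.
PROMPT_TEMPLATE = (
    "Rate the credibility of the following social media post on a 1-5 scale "
    "(1 = very low, 5 = very high). Respond only with JSON of the form "
    '{{"rating": <int>, "rationale": "<one sentence>"}}.\n\nPost: {post}'
)

def build_prompt(post: str) -> str:
    """Fill the zero-shot template with the post to be rated."""
    return PROMPT_TEMPLATE.format(post=post)

def parse_rating(reply: str) -> int:
    """Extract and validate the integer rating from a model's JSON reply."""
    data = json.loads(reply)
    rating = int(data["rating"])
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    return rating

# Stub standing in for a real API call; swap in an actual client here.
def fake_model(prompt: str) -> str:
    return '{"rating": 2, "rationale": "Vague sourcing and sensational tone."}'

prompt = build_prompt("BREAKING: officials are hiding a miracle cure!!!")
print(parse_rating(fake_model(prompt)))  # prints 2
```

Because zero-shot prompting needs no labeled training data, a pipeline like this is what would let GenAI "alleviate human resource demands" at scale; the open question the paper raises is whether the rationale field reflects genuine reasoning rather than surface cues.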

Specifically, we address the following research questions:

RQ1 (Viability): Can GenAI models match or exceed the reliability of human coders?
RQ2 (Efficiency): Can summaries generated by zero-shot GenAI models provide efficient credibility ratings?
RQ3 (Functionality): How do GenAI models reason about information credibility?

Let's start with the basics. Artificial Intelligence (AI) is a field of computer science focused on creating systems capable of performing tasks that typically require human intelligence, such as problem-solving, language understanding, and interaction. Generative AI (GenAI) refers to a category of AI technologies designed to generate new content, such as text, images, video, audio, or code, that is similar to what a human might produce.
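The recurring finding above, "only moderate agreement with human coders," is typically quantified with a chance-corrected statistic such as Cohen's kappa, which discounts the agreement two raters would reach by chance. Below is a minimal pure-Python sketch; the ten ratings and the 1-5 scale are invented for illustration, not data from the paper.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical credibility ratings (1 = low, 5 = high) for ten posts.
human = [5, 4, 1, 5, 2, 4, 5, 1, 3, 5]
model = [5, 4, 2, 5, 1, 4, 5, 1, 4, 5]
print(round(cohens_kappa(human, model), 3))  # prints 0.589
```

By the common Landis-Koch benchmarks, kappa values between 0.41 and 0.60 are read as "moderate" agreement, which is the band the paper's GenAI-versus-human comparisons fall into.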
