Fake News Detection Using Machine Learning Github

Bonisiwe Shabane

-Dec 5, 2025, 5:45 PM


This repository contains a complete project for detecting fake news using machine learning and natural language processing (NLP). The project includes data analysis, model training, and a web application for real-time fake news detection. The model classifies news articles as either real or fake based on their content.

We aim to develop a machine learning program that identifies when a news source may be producing fake news. The model focuses on identifying fake news sources based on multiple articles originating from each source. Once a source is labeled as a producer of fake news, we can predict with high confidence that future articles from that source will also be fake news.

Focusing on sources widens our tolerance for misclassifying individual articles, because each source contributes multiple data points. The intended application is assigning visibility weights in social media: using the weights produced by this model, social networks can make stories that are highly likely to be fake news less visible. The repository is organized into the following directories and files: A full training dataset with the following attributes:

In this project, we have used various natural language processing techniques and machine learning algorithms to classify fake news articles, using the scikit-learn library in Python.
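As a rough illustration of the source-level idea described above, the sketch below averages per-article fake probabilities for each source and turns the result into a visibility weight. The function names, the `floor` parameter, the example domains, and the scores are all hypothetical; the original project does not specify how its weights are computed.

```python
from collections import defaultdict
from statistics import mean


def source_fake_scores(article_scores):
    """Average per-article fake probabilities for each source.

    article_scores: iterable of (source, fake_probability) pairs. The
    probabilities are assumed to come from some article-level classifier.
    """
    by_source = defaultdict(list)
    for source, prob in article_scores:
        by_source[source].append(prob)
    return {source: mean(probs) for source, probs in by_source.items()}


def visibility_weight(source_score, floor=0.1):
    """Map a source's fake score in [0, 1] to a weight in [floor, 1].

    The linear mapping and the floor are illustrative assumptions.
    """
    return max(floor, 1.0 - source_score)


# Invented example scores: one misclassified article (0.4) does not
# change the overall picture for siteA, because the source is judged
# on several articles at once.
scores = [
    ("siteA.example", 0.9), ("siteA.example", 0.8), ("siteA.example", 0.4),
    ("siteB.example", 0.1), ("siteB.example", 0.3),
]
per_source = source_fake_scores(scores)
weights = {source: visibility_weight(score) for source, score in per_source.items()}
print(weights)
```

Averaging is only one possible aggregation; a median or a vote count over articles would serve the same purpose.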

The data source for this project is the LIAR dataset, which consists of three .tsv files: train, test, and validation. Below is a description of the data files used for this project. LIAR: A Benchmark Dataset for Fake News Detection. The original dataset contained 13 variables (columns) for the train, test, and validation sets, as follows: To keep things simple, we have chosen only two variables from the original dataset for this classification. The other variables can be added later to enrich the features and add complexity.
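A minimal sketch of reducing a LIAR-style .tsv file to two columns follows. The text does not say which two variables were kept, so this assumes they are the label and the statement, and that they sit at column positions 1 and 2 of a header-less file; the sample rows are invented.

```python
import csv
import io

# Hypothetical two-row sample in a LIAR-like layout (tab-separated, no header).
# Assumption: column 1 holds the label and column 2 the statement text.
sample_tsv = (
    "1.json\tfalse\tThe moon is made of cheese.\tscience\n"
    "2.json\ttrue\tWater boils at 100 degrees Celsius at sea level.\tscience\n"
)

LABEL_COL, STATEMENT_COL = 1, 2


def load_statement_and_label(handle):
    """Return (statement, label) pairs and ignore every other column."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        (row[STATEMENT_COL], row[LABEL_COL])
        for row in reader
        if len(row) > STATEMENT_COL  # skip blank or malformed lines
    ]


rows = load_statement_and_label(io.StringIO(sample_tsv))
print(rows)
```

To read a real file, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.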

Detect fake vs. real news articles using machine learning, TF-IDF, and Logistic Regression, complete with training scripts, evaluation charts, and an interactive Streamlit web app. The Fake News & Misinformation Detector is an end-to-end Natural Language Processing (NLP) project that classifies news headlines and articles as REAL or FAKE. It combines TF-IDF feature extraction with a Logistic Regression classifier and reports perfect accuracy on the cleaned dataset. When launched, the app lets you paste or type any news headline or paragraph and analyze its credibility in real time. Dataset source: this project uses and modifies the Fake and Real News Dataset by Clément Bisaillon (Kaggle). The data was cleaned, its headers were fixed, and it was downsampled to 999 REAL and 999 FAKE news articles for balanced training and clear visualization.
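To make the TF-IDF plus Logistic Regression combination concrete, here is a small from-scratch sketch using only the standard library. It is not the project's code, which presumably relies on a library such as scikit-learn; the helper names, the four toy headlines, and the hyperparameters are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(text, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(vectors, labels, epochs=300, lr=0.5):
    """Stochastic gradient descent on the logistic loss (label 1 = FAKE)."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            p = sigmoid(bias + sum(weights[t] * v for t, v in vec.items()))
            error = p - y
            for t, v in vec.items():
                weights[t] -= lr * error * v
            bias -= lr * error
    return weights, bias


def fake_probability(text, idf, weights, bias):
    vec = tfidf(text, idf)
    return sigmoid(bias + sum(weights[t] * v for t, v in vec.items()))


# Toy headlines, invented for illustration only.
texts = [
    "shocking miracle cure doctors hate",
    "you will not believe this shocking secret",
    "city council approves annual budget",
    "central bank holds interest rates steady",
]
labels = [1, 1, 0, 0]  # 1 = FAKE, 0 = REAL

idf = fit_idf(texts)
vectors = [tfidf(t, idf) for t in texts]
weights, bias = train_logreg(vectors, labels)

print(fake_probability("shocking secret cure", idf, weights, bias))
print(fake_probability("council approves budget", idf, weights, bias))
```

Words that appear only in FAKE examples end up with positive weights and words that appear only in REAL examples with negative ones, which is also why a classifier like this can look perfect on a small, cleanly separated dataset.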

Used purely for educational and research purposes. Run the following command from the project root:

Fake news is spreading widely across platforms and is a matter of serious concern, as it causes social conflict and can permanently damage the bonds between people. A lot of research already focuses on the classification of fake news. Here we will try to address this issue with the help of machine learning in Python. Before starting the code, download the dataset by clicking the link.

The shape of the dataset can be found with the code below. The title, subject, and date columns are not going to help in identifying fake news, so we can drop these columns.

Fake news is a significant concern in the modern digital age, as it can negatively influence perceptions and even lead to real-world incidents. This issue has been exacerbated by the widespread dissemination of misleading information, particularly during the 2016 U.S. election and the onset of the 2020 pandemic.

With the majority of U.S. news consumers being worried about the authenticity of news they consume, there’s an urgent need for tools to verify news authenticity. The proposed False News Detection project intends to address this problem by creating a tool that uses machine learning and natural language processing to detect fake news. The primary objectives of this project are to reduce the circulation of unreliable articles, boost readers’ confidence in news authenticity by providing a reliability score, and caution users against unreliable articles with warning labels. This report delves into natural language processing, exploring how machines interpret human communication and contextually process text. The main focus is on distinguishing between “real” and “fake” news using various classification models, including Decision Tree, Logistic Regression, Random Forest, Gradient Boosting, SVM, Multinomial Naive Bayes, and Neural Network with LSTM.

The study follows a three-step approach: preprocessing the data, converting text to numeric representation using models like TF-IDF, Count Vectorizer, and Word2Vec, and then classifying this numeric data with advanced machine learning techniques. The study employed the WELFake dataset from Kaggle, comprising 72,134 news articles, with 35,028 being real and 37,106 being fake. This dataset has four attributes: serial number, title, text, and a label indicating real (1) or fake (0) news. The primary focus during analysis and modeling was the text attribute, given its significance in distinguishing between real and fake news. However, the dataset has limitations, including the absence of links or sources for validation, unclear classification criteria for articles, and uncertainty about the data curation process, which could introduce potential biases. The project follows a structured pipeline: starting with data collection and cleaning, followed by feature extraction and exploratory analysis.
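The first two steps of that approach (preprocessing, then converting text to numbers) can be sketched with a plain bag-of-words count vector, the simplest of the representations named above. The stop-word list, function names, and example sentences are illustrative assumptions, not taken from the study.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real pipelines use a much longer one.
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "in"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Assign each distinct token a column index, in sorted order."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    return {term: i for i, term in enumerate(vocab)}


def count_vector(tokens, vocabulary):
    """Dense bag-of-words count vector; out-of-vocabulary tokens are skipped."""
    vec = [0] * len(vocabulary)
    for term, n in Counter(tokens).items():
        if term in vocabulary:
            vec[vocabulary[term]] = n
    return vec


docs = [
    "The senator voted to pass the bill.",
    "Aliens voted in the election, the blog claims!",
]
tokens = [preprocess(d) for d in docs]
vocabulary = build_vocabulary(tokens)
matrix = [count_vector(t, vocabulary) for t in tokens]

print(sorted(vocabulary, key=vocabulary.get))
print(matrix)
```

Each row of `matrix` is the numeric form of one article, and it is this matrix that the classifier in the third step consumes.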

The primary focus for modeling was the text attribute. The dataset was then divided into training and testing sets. The training data was used to train various models, which were subsequently evaluated. If necessary, models were refined and retrained until a satisfactory classification of the articles was achieved.
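The split-and-evaluate step described in that pipeline might look like the following standard-library sketch. The 80/20 ratio, the fixed seed, and the function names are assumptions; the study does not state which split it used.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=42):
    """Shuffle a copy of the data and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Ten placeholder (text, label) pairs standing in for real articles.
articles = [(f"article {i}", i % 2) for i in range(10)]
train, test = train_test_split(articles)
print(len(train), len(test))

# Evaluating a set of hypothetical predictions against held-out labels.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Keeping the test part untouched until evaluation is what makes the reported score meaningful; refining and retraining should only ever look at the training part.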

Uncovering the truth has never been easier! Learn how machine learning algorithms can help combat fake news with our fake news detection project tutorial! Imagine a scenario where a false news story spreads rapidly on social media, claiming that a particular medication is a cure for a deadly disease. People start hoarding the medication, causing scarcity and preventing those who need it from accessing it.

This example scenario shows one of the several real-world risks of fake news. The rapid spread of fake news has become a major issue worldwide. The spread of false and misleading news has led to significant social and economic consequences, impacting industries from finance to healthcare. For example, in 2020, during the COVID-19 pandemic, several countries witnessed a spike in false news about the virus, leading to confusion and panic among people. Misinformation and fake news can have a long-term impact, especially when people rely on accurate information to make critical decisions. The need for detecting fake news has never been more crucial.

Machine learning techniques can help us detect fake news efficiently and accurately. Using natural language processing, machine learning algorithms can categorize news as true or false by analyzing patterns in the language and sources used in news reports. This blog will explore a fake news detection project using machine learning and discuss how machine learning algorithms can distinguish false news from real news. We will also explore the key machine learning algorithms used to identify false and true news, along with real-world use cases of fake news detection.

In the recent era of information transfer, we have seen how major shortcomings in technology have affected people's lives.

The age of social media has accelerated the spread of fake news by anti-social elements all across the world. We as a group want to solve this problem by applying the machine learning concepts we learned in class and obtaining results that help address it. We arrived at this project after talking with many of our friends and peers, which made us aware of the issue. One can also see how, in the past as well as in the present, there has been chaos that has even led to loss of human lives due to transfer of incorrect news in... So, all in all, our group wants to contribute to ensuring peace and sanity by identifying which news is fake and which is real through the use of machine learning concepts. All of us are CS undergrads at IIIT Delhi.

Here you will find a detailed report and presentation covering even the most minute details of our project. They contain our motivation, methodology, results, and future work. This is clearly a binary classification problem in which we need to classify a news article or statement as fake or not. The following steps need to be performed:

Though technology has been the reason for recent positive developments in human history, it has also had its fair share of disadvantages.

There was a time when we had to search books to gather information or read newspapers for news, but now people have both information and news in their... News now arrives from many channels, such as social media and digital news outlets, and people tend to rely on things that are not true. This results in the propagation of wrong information. This is happening extensively nowadays because of the bias with which some journalists report incidents, owing to their involvement with a particular political organization. Just recently we saw riots in India caused by the circulation of news in which a person belonging to a certain community was accused of killing someone from another community.

We have also observed how political parties instigate the public by using their IT cell networks to hide the truth and polarize voters. So, all in all, our problem statement is to analyze news that we get from social media and label it as fake or real. In this project we will get the data in the form of paragraphs and see which person, from which political party, has spoken the words. The sentiment analysis would be handled by applying Natural Language Processing, with which we can analyze the text and convert it into numeric data that would help us to... Several machine learning models have already been applied to this problem. But most of them have been tested only on particular datasets, thereby inducing dataset bias.

So if a particular algorithm works on a specific type of dataset, it may work poorly on another. In our case we preprocess the data and train seven different models: Logistic Regression, Naïve Bayes, Decision Tree, Random Forest, Neural Networks, Support Vector Machine, and AdaBoost Classifier. Here Naïve Bayes gives us the baseline level of accuracy, and we work on the other models accordingly. Neural networks were our application of deep learning. They help in learning the data better, which lets us achieve better accuracies than the models used earlier. While using neural networks, we also used two algorithms that support better data visualization.
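Since Naïve Bayes is the baseline against which the other models are measured, a compact sketch of a multinomial Naïve Bayes classifier with add-one smoothing is given here. It is a generic textbook version written with the standard library, not the group's implementation; the class name and the four training sentences are invented.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesBaseline:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_score(self, text, label):
        """Log prior plus smoothed log likelihood of each known word."""
        total = sum(self.word_counts[label].values())
        score = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        for word in text.lower().split():
            if word in self.vocab:  # words never seen in training are ignored
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total + len(self.vocab)))
        return score

    def predict(self, text):
        return max(self.class_counts, key=lambda label: self.log_score(text, label))


# Invented toy training data.
train_texts = [
    "miracle cure shocks doctors",
    "secret miracle they hide",
    "council passes budget",
    "parliament holds budget vote",
]
train_labels = ["fake", "fake", "real", "real"]

baseline = NaiveBayesBaseline().fit(train_texts, train_labels)
print(baseline.predict("miracle cure secret"))
print(baseline.predict("council budget vote"))
```

A baseline like this trains almost instantly, so any heavier model (for example a neural network) has to beat its accuracy to justify the extra cost.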
