Introduction to Sentiment Analysis with ML.NET
Sentiment analysis, also known as opinion mining, is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially to determine whether the writer's attitude towards a particular topic, product, etc., is positive, negative, or neutral.
In this tutorial, you will learn how to build a machine learning model using ML.NET that can predict the sentiment of a given text. We'll cover the entire process, from data preparation to model training and evaluation.
This tutorial is designed for developers who want to integrate sentiment analysis capabilities into their .NET applications.
1. Setting Up Your Environment
Before we begin, ensure you have the following installed:
- Visual Studio 2022 or later (Community, Professional, or Enterprise)
- .NET SDK (typically included with Visual Studio)
Create a New Project
Open Visual Studio and create a new Console Application project.
- Select Create a new project.
- Search for Console App.
- Choose the C# template for .NET.
- Name your project SentimentAnalysisDemo and click Create.
2. Installing ML.NET NuGet Packages
You need to add the necessary ML.NET packages to your project.
Add Microsoft.ML NuGet Package
Right-click on your project in Solution Explorer and select Manage NuGet Packages....
In the Browse tab, search for Microsoft.ML and install the latest stable version.
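Add Microsoft.ML.FastTree NuGet Package
The pipeline in this tutorial uses the FastTree trainer, which ships in a separate package from Microsoft.ML. In the same Browse tab, search for Microsoft.ML.FastTree and install the version that matches your Microsoft.ML package. Without it, the call to Trainers.FastTree in Program.cs will not compile.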
3. Preparing the Dataset
A good dataset is crucial for training an effective sentiment analysis model. We'll use a simplified dataset for demonstration. In a real-world scenario, you would use a much larger and more diverse dataset.
Create a new text file named sentiment_dataset.tsv in your project's root directory. Make sure to set its "Copy to Output Directory" property to "Copy if newer" in the Solution Explorer.
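Add the following sample data to sentiment_dataset.tsv. Separate the review text from the label with a single tab character, which is the separator the ML.NET text loader expects by default: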
"This was a great movie, I really enjoyed it." 1
"The acting was terrible and the plot was boring." 0
"An absolutely fantastic experience from start to finish!" 1
"I wouldn't recommend this product to anyone." 0
"It was okay, nothing special." 0
"Loved the new features, very innovative!" 1
"Very disappointing, a complete waste of time." 0
"A masterpiece of cinema, truly moving." 1
"The service was slow and the food was cold." 0
"Highly recommend this book, a real page-turner." 1
In this dataset:
- The first column is the text review.
- The second column is the sentiment label: 1 for positive, 0 for negative.
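Because a stray space in place of a tab is easy to miss, it can help to check the file before training. The following is a minimal sketch in plain C#, independent of ML.NET; the ParseRow helper is a hypothetical name introduced here for illustration, and the file name is assumed to match the one above.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper (not part of ML.NET): parses one "text<TAB>label" line.
static (string Text, bool Label) ParseRow(string line)
{
    var parts = line.Split('\t');
    if (parts.Length != 2)
        throw new FormatException($"Expected 2 tab-separated columns but found {parts.Length}: {line}");

    // Only 1 (positive) and 0 (negative) are accepted as labels.
    bool label = parts[1].Trim() switch
    {
        "1" => true,
        "0" => false,
        _ => throw new FormatException($"Label must be 0 or 1: {line}")
    };
    return (parts[0], label);
}

const string path = "sentiment_dataset.tsv";
if (File.Exists(path))
{
    // Skip blank lines, then parse every remaining row.
    var rows = File.ReadLines(path)
        .Where(line => !string.IsNullOrWhiteSpace(line))
        .Select(ParseRow)
        .ToList();

    Console.WriteLine($"Rows: {rows.Count}, positive: {rows.Count(r => r.Label)}, negative: {rows.Count(r => !r.Label)}");
}
else
{
    Console.WriteLine($"File not found: {path}");
}
```

If any line is missing its tab or carries a label other than 0 or 1, the sketch throws a FormatException that quotes the offending line.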
4. Defining Data Structures
We need to define classes to represent our input data and the model's prediction.
Create a new file DataStructures.cs and add the following code:
using Microsoft.ML.Data;

public class SentimentData
{
    [LoadColumn(0)]
    public string SentimentText;

    [LoadColumn(1), ColumnName("Label")]
    public bool Sentiment;
}

public class SentimentPrediction : SentimentData
{
    [ColumnName("PredictedLabel")]
    public bool PredictedLabel;

    public float Score;

    public float Probability;
}
5. Building the Sentiment Analysis Model
Now, let's write the core logic in your Program.cs file to load data, train the model, and make predictions.
Initialize MLContext and DataView
The MLContext class is the starting point for all ML.NET operations. We'll also load our dataset into an IDataView.
Define the Training Pipeline
We'll create a pipeline that transforms the text data and trains a binary classification model.
Train the Model
Train the pipeline using the loaded dataset.
Create a Prediction Engine
A prediction engine allows you to make predictions on single instances of data.
Make Predictions
Test the model with new text inputs.
Replace the content of your Program.cs with the following code:
using System;
using Microsoft.ML;
using Microsoft.ML.Data;
// Ensure DataStructures.cs is in the same project or referenced.
var mlContext = new MLContext();
// 1. Load Data
var dataPath = "sentiment_dataset.tsv";
IDataView trainingDataView = mlContext.Data.LoadFromTextFile<SentimentData>(dataPath, hasHeader: false);
// 2. Define Training Pipeline
// Convert the review text into a numeric feature vector
var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", "SentimentText")
    // Append a binary classification trainer (FastTree, from the Microsoft.ML.FastTree package)
    .Append(mlContext.BinaryClassification.Trainers.FastTree(labelColumnName: "Label", featureColumnName: "Features"));
// 3. Train the Model
Console.WriteLine("Training the sentiment analysis model...");
var model = pipeline.Fit(trainingDataView);
Console.WriteLine("Model training complete.");
// 4. Create Prediction Engine
var predictionEngine = mlContext.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(model);
// 5. Make Predictions
Console.WriteLine("\n--- Making Predictions ---");
var sampleStatementPositive = new SentimentData { SentimentText = "This is a wonderful day!" };
var predictionPositive = predictionEngine.Predict(sampleStatementPositive);
Console.WriteLine($"Text: '{sampleStatementPositive.SentimentText}'");
Console.WriteLine($"Prediction: {(predictionPositive.PredictedLabel ? "Positive" : "Negative")}");
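// Probability is the probability of the positive class, so report 1 - Probability for a negative prediction
Console.WriteLine($"Confidence: {(predictionPositive.PredictedLabel ? predictionPositive.Probability : 1 - predictionPositive.Probability):P2}");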
var sampleStatementNegative = new SentimentData { SentimentText = "I am very unhappy with the service." };
var predictionNegative = predictionEngine.Predict(sampleStatementNegative);
Console.WriteLine($"\nText: '{sampleStatementNegative.SentimentText}'");
Console.WriteLine($"Prediction: {(predictionNegative.PredictedLabel ? "Positive" : "Negative")}");
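// Probability is the probability of the positive class, so report 1 - Probability for a negative prediction
Console.WriteLine($"Confidence: {(predictionNegative.PredictedLabel ? predictionNegative.Probability : 1 - predictionNegative.Probability):P2}");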
Console.WriteLine("\nPress any key to exit.");
Console.ReadKey();
6. Running the Application
Press F5 or click the Start button in Visual Studio to run your application.
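You should see output similar to this in the console. The confidence values shown are illustrative and yours will differ; with only ten training rows, treat the predictions as a smoke test rather than a reliable result.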
Training the sentiment analysis model...
Model training complete.
--- Making Predictions ---
Text: 'This is a wonderful day!'
Prediction: Positive
Confidence: 98.50%
Text: 'I am very unhappy with the service.'
Prediction: Negative
Confidence: 97.20%
Press any key to exit.
7. Evaluating the Model (Optional)
For a more robust solution, you would typically split your data into training and testing sets and evaluate the model's performance using metrics like accuracy, F1-score, and AUC.
ML.NET provides tools for this:
- Split your dataset using mlContext.Data.TrainTestSplit.
- Train on the training set.
- Evaluate on the testing set using mlContext.BinaryClassification.Evaluate.
Although this step is optional for the demo, evaluating on held-out data is the only way to understand how well your model generalizes to unseen data.
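The three steps above can be sketched roughly as follows. This is a minimal sketch, not a tuned setup: it assumes sentiment_dataset.tsv has been extended well beyond the ten sample rows, because a test split of one or two rows gives no meaningful metrics and may not contain both classes. The 20% test fraction and the fixed seed are illustrative choices, not values taken from this tutorial.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// A fixed seed (an illustrative choice) keeps the split repeatable between runs.
var mlContext = new MLContext(seed: 0);

// Assumption: the file now holds many more rows than the ten-row sample.
IDataView data = mlContext.Data.LoadFromTextFile<SentimentData>(
    "sentiment_dataset.tsv", hasHeader: false);

// Hold out 20% of the rows for testing.
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

// Same pipeline as in Program.cs: featurize the text, then train FastTree.
var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", "SentimentText")
    .Append(mlContext.BinaryClassification.Trainers.FastTree(
        labelColumnName: "Label", featureColumnName: "Features"));

// Train on the training set only.
var model = pipeline.Fit(split.TrainSet);

// Score the held-out rows and compute binary classification metrics.
IDataView predictions = model.Transform(split.TestSet);
var metrics = mlContext.BinaryClassification.Evaluate(predictions, labelColumnName: "Label");

Console.WriteLine($"Accuracy: {metrics.Accuracy:P2}");
Console.WriteLine($"AUC:      {metrics.AreaUnderRocCurve:P2}");
Console.WriteLine($"F1 score: {metrics.F1Score:P2}");

// Same shape as the SentimentData class in DataStructures.cs; omit this copy
// when that file is already part of the project.
public class SentimentData
{
    [LoadColumn(0)]
    public string SentimentText;

    [LoadColumn(1), ColumnName("Label")]
    public bool Sentiment;
}
```

Training on split.TrainSet rather than on the full data view is what makes the metrics informative: the model never sees the test rows before it is scored on them.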
8. Saving the Model
You can save the trained model for later use without retraining.
// Add this after training, before the prediction section
string modelPath = "sentiment_model.zip";
mlContext.Model.Save(model, trainingDataView.Schema, modelPath);
Console.WriteLine($"Model saved to: {modelPath}");
// To load the model later:
// var loadedModel = mlContext.Model.Load(modelPath, out var modelSchema);
// var predictionEngine = mlContext.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(loadedModel);
This will create a sentiment_model.zip file in your project's output directory, containing the trained model.
Next Steps
Congratulations on building your first sentiment analysis model with ML.NET!
- Explore different trainers and parameters for better accuracy.
- Integrate this model into a Web API or a desktop application.
- Experiment with larger, real-world datasets.
- Learn about other ML.NET capabilities like text classification, recommendation, and forecasting.