Leveraging Recurrent Neural Networks for Text Understanding
Sentiment analysis, the task of determining the emotional tone behind a piece of text, is a fundamental problem in Natural Language Processing (NLP). Recurrent Neural Networks (RNNs), particularly LSTMs and GRUs, are well-suited for sequential data like text, allowing them to capture dependencies and context over time. This tutorial demonstrates how to build and train a sentiment analysis model using TensorFlow and RNNs.
Prerequisites: Basic understanding of Python, TensorFlow, and neural networks.
We will use a classic dataset for sentiment analysis, such as the IMDB movie reviews dataset. This dataset contains movie reviews labeled as either positive or negative. TensorFlow Datasets (TFDS) provides an easy way to access this dataset.
import tensorflow_datasets as tfds
import tensorflow as tf
# Load the IMDB reviews dataset
(ds_train, ds_test), ds_info = tfds.load(
    'imdb_reviews',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)
# Explore dataset information
print(f"Dataset info: {ds_info}")
Before feeding text data into a neural network, it needs to be processed. This typically involves tokenization, vocabulary building, and padding.
We'll create a vocabulary that maps words to integer indices. TensorFlow's TextVectorization layer is excellent for this.
VOCAB_SIZE = 10000
MAX_SEQUENCE_LENGTH = 128 # Maximum length of the input sequences
vectorize_layer = tf.keras.layers.TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_sequence_length=MAX_SEQUENCE_LENGTH,
    standardize='lower_and_strip_punctuation',
    split='whitespace'
)
# Adapt the layer to the training data to build the vocabulary
train_texts = ds_train.map(lambda text, label: text)
vectorize_layer.adapt(train_texts.take(10000).batch(256))  # Adapt on a batched subset for efficiency
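Once adapt has run, the learned vocabulary can be inspected directly. A quick check of the most frequent tokens (index 0 is reserved for padding, index 1 for out-of-vocabulary words):

# The first entries are the padding token '' and the OOV token '[UNK]'
vocab = vectorize_layer.get_vocabulary()
print(f"Vocabulary size: {len(vocab)}")
print(f"Most frequent tokens: {vocab[:10]}")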
# Example of vectorization
example_text = "This movie was absolutely fantastic and I loved it!"
print(f"Vectorized example: {vectorize_layer([example_text])}")
Reviews vary in length, but every batch needs a uniform shape. The output_sequence_length argument above already handles this: each review is truncated or zero-padded to exactly MAX_SEQUENCE_LENGTH tokens.
# Vectorize batches of raw strings into padded integer sequences
def preprocess_text(text, label):
    return vectorize_layer(text), label

# Batch first, then vectorize: TextVectorization maps a batch of strings
# of shape (batch,) to integer ids of shape (batch, MAX_SEQUENCE_LENGTH)
BATCH_SIZE = 64
ds_train_processed = ds_train.batch(BATCH_SIZE).map(preprocess_text).cache().prefetch(tf.data.AUTOTUNE)
ds_test_processed = ds_test.batch(BATCH_SIZE).map(preprocess_text).cache().prefetch(tf.data.AUTOTUNE)
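It is worth confirming the pipeline produces the shapes the model will expect. A quick sanity check on one batch:

# Each batch should be (BATCH_SIZE, MAX_SEQUENCE_LENGTH) token ids plus (BATCH_SIZE,) labels
for batch_texts, batch_labels in ds_train_processed.take(1):
    print(f"Texts shape: {batch_texts.shape}")    # (64, 128)
    print(f"Labels shape: {batch_labels.shape}")  # (64,)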
We will construct a model using Keras's Sequential API. The model will include an Embedding layer, an RNN layer (e.g., GRU), and a Dense layer for classification.
EMBEDDING_DIM = 16 # Dimension of the word embeddings
model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_SEQUENCE_LENGTH,)),    # integer token ids per review
    # Maps each token id to a dense EMBEDDING_DIM-dimensional vector
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    # GRU: a gated RNN that is cheaper than an LSTM while still capturing long-range context;
    # return_sequences=False keeps only the final hidden state, one vector per review
    tf.keras.layers.GRU(32, return_sequences=False),
    tf.keras.layers.Dense(6, activation='relu'),     # small dense layer before the output
    tf.keras.layers.Dense(1, activation='sigmoid')   # binary classification: P(positive)
])
model.summary()
Note: You can experiment with different RNN cells (e.g., LSTM), stacked or bidirectional layers, and other hyperparameters; a bidirectional variant is sketched below.
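For example, a bidirectional LSTM reads each review both forwards and backwards, which often helps when the decisive phrase sits early in the text. A minimal sketch of that variant (layer sizes here are illustrative, not tuned):

# Illustrative alternative: bidirectional LSTM in place of the unidirectional GRU
bi_lstm_model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_SEQUENCE_LENGTH,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    # Runs the LSTM forward and backward and concatenates the two final states
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])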
Compile the model with an optimizer, loss function, and metrics, then train it on the processed dataset. For simplicity this tutorial validates on the test split; in a real project you would carve a separate validation set out of the training data (e.g., with TFDS split strings like 'train[:90%]' and 'train[90%:]').
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              metrics=['accuracy'])
EPOCHS = 5
history = model.fit(
    ds_train_processed,
    epochs=EPOCHS,
    validation_data=ds_test_processed
)
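The history object records per-epoch metrics, which makes it easy to spot overfitting (training accuracy climbing while validation accuracy stalls). A plotting sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

# Plot training vs. validation accuracy per epoch
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()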
Evaluate the trained model on the test set to assess its performance.
loss, accuracy = model.evaluate(ds_test_processed)
print(f'Test Loss: {loss:.4f}')
print(f'Test Accuracy: {accuracy:.4f}')
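Accuracy alone hides how the errors split between the two classes. As a quick check, a confusion matrix over the test set (a sketch using tf.math.confusion_matrix; variable names are illustrative):

import numpy as np

y_true, y_pred = [], []
for batch_texts, batch_labels in ds_test_processed:
    y_true.append(batch_labels.numpy())
    # Threshold the sigmoid outputs at 0.5 to get hard class predictions
    y_pred.append((model.predict_on_batch(batch_texts) > 0.5).astype(int).ravel())

# Rows: true class (0=negative, 1=positive); columns: predicted class
print(tf.math.confusion_matrix(np.concatenate(y_true), np.concatenate(y_pred)))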
Use the trained model to predict the sentiment of new, unseen text.
def predict_sentiment(text_list):
    # Vectorize the raw strings exactly as during training
    processed_texts = vectorize_layer(tf.constant(text_list))
    # model.predict returns shape (n, 1); flatten to one score per review
    predictions = model.predict(processed_texts).ravel()
    return ["Positive" if p > 0.5 else "Negative" for p in predictions]
# Example prediction
new_reviews = [
    "This film was a masterpiece, truly captivating from start to finish.",
    "A complete waste of time, the plot was predictable and the acting was terrible.",
    "It was okay, not great but not bad either."
]
sentiments = predict_sentiment(new_reviews)
for review, sentiment in zip(new_reviews, sentiments):
    print(f"Review: '{review}' -> Sentiment: {sentiment}")