Jayita Gulati
2024-10-31 10:00:00
T5, short for “Text-to-Text Transfer Transformer,” is a powerful model created to help computers understand and generate human language. It can handle many language tasks because it treats every task as a text-to-text problem: the input is text and the output is text. In this article, we will learn how to fine-tune T5 for question answering.
Install the Required Libraries
First, we must install the necessary libraries:
pip install transformers datasets torch
- transformers: The Hugging Face library that provides the T5 model and other transformer architectures
- datasets: A library for accessing and processing datasets
- torch: A deep learning library that helps build and train neural networks
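Before loading any data, it helps to see what “text-to-text” means in practice. The short sketch below (using the pre-trained t5-small checkpoint and one of T5’s built-in task prefixes, before any fine-tuning) shows that both the input and the output are plain strings:
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is phrased as text; here we use the translation prefix T5 was pre-trained with
input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
Question answering will follow exactly the same pattern: a text prompt in, a text answer out.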
Load the Dataset
To fine-tune T5 for question answering, we will use the BoolQ dataset, which pairs each yes/no question with a passage and a binary (true/false) answer. You can load BoolQ with Hugging Face’s datasets library.
from datasets import load_dataset
# Load the BoolQ dataset
dataset = load_dataset("boolq")
# Display the first few rows of the dataset
print(dataset['train'].to_pandas().head())
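Each BoolQ example has three fields we will use: a yes/no question, a passage that provides the evidence, and a boolean answer. A quick look at a single record (a small sketch, assuming the standard BoolQ schema) makes the structure clear:
# Inspect a single training example
example = dataset['train'][0]
print(example['question'])        # a yes/no question
print(example['passage'][:200])   # the supporting passage (truncated for display)
print(example['answer'])          # True or False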
Preprocessing the Data
T5 requires its input in a specific format. We need to transform the dataset so that both the questions and answers are plain text. Each input will have the form Question: {question} Passage: {passage}, and each target will be the string "true" or "false".
from transformers import T5Tokenizer, T5ForConditionalGeneration, Trainer, TrainingArguments
# Initialize the T5 tokenizer and model (T5-small in this case)
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
# Preprocessing the dataset: Prepare input-output pairs for T5
def preprocess_function(examples):
    inputs = [f"Question: {question} Passage: {passage}" for question, passage in zip(examples['question'], examples['passage'])]
    targets = ['true' if answer else 'false' for answer in examples['answer']]

    # Tokenize inputs and outputs
    model_inputs = tokenizer(inputs, max_length=512, truncation=True, padding='max_length')
    labels = tokenizer(targets, max_length=10, truncation=True, padding='max_length')
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
# Preprocess the dataset
tokenized_dataset = dataset.map(preprocess_function, batched=True)
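One optional refinement, not part of the original preprocessing: because the labels are padded to a fixed length, the padding token IDs are also counted in the loss. A common practice is to replace padding positions in the labels with -100 so the loss function ignores them. A minimal sketch:
def mask_label_padding(examples):
    # Replace pad token IDs in the labels with -100 so they are ignored by the loss
    examples["labels"] = [
        [(token if token != tokenizer.pad_token_id else -100) for token in label]
        for label in examples["labels"]
    ]
    return examples

tokenized_dataset = tokenized_dataset.map(mask_label_padding, batched=True)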
Fine-Tuning T5
Now that our data is prepared, we can fine-tune the T5 model. Hugging Face’s Trainer API simplifies this process by handling the training loop, optimization, and evaluation.
from transformers import Trainer, TrainingArguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
)
# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],
)
# Fine-tune the model
trainer.train()
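The prediction step later in this article loads the model from ./results, so it is worth saving the fine-tuned weights and the tokenizer there explicitly once training finishes (by default, the Trainer only writes periodic checkpoints into subfolders of the output directory):
# Save the fine-tuned model and tokenizer for later reuse
trainer.save_model("./results")
tokenizer.save_pretrained("./results")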
Evaluating the Model
After fine-tuning, it’s important to evaluate the model on the validation set to see how well it answers questions. You can use the evaluate method of the Trainer.
# Evaluate the model on the validation dataset
eval_results = trainer.evaluate()
# Print the evaluation results
print(f"Evaluation results: {eval_results}")
Evaluation results: {'eval_loss': 0.03487783297896385, 'eval_runtime': 37.2638, 'eval_samples_per_second': 87.753, 'eval_steps_per_second': 10.976, 'epoch': 3.0}
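The evaluation loss shows that the model fits the validation data, but for a yes/no task accuracy is easier to interpret. As a rough sketch (generating answers for a small sample of the validation set and comparing them with the gold labels), you could compute it like this:
# Estimate accuracy on a sample of the validation set
correct = 0
sample = dataset["validation"].select(range(100))
for example in sample:
    input_text = f"Question: {example['question']} Passage: {example['passage']}"
    input_ids = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512).input_ids.to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=5)
    prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
    gold = "true" if example["answer"] else "false"
    if prediction == gold:
        correct += 1
print(f"Accuracy on the sample: {correct / len(sample):.2%}")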
Making Predictions
Once the T5 model is fine-tuned and evaluated, we can use it to predict new question-answering tasks. To do this, we can prepare a new input (question and context), tokenize it, and generate the output (answer) from the model.
from transformers import T5Tokenizer, T5ForConditionalGeneration
# Load the fine-tuned model and tokenizer
model = T5ForConditionalGeneration.from_pretrained("./results")
tokenizer = T5Tokenizer.from_pretrained("./results")
# Prepare a new input
input_text = "question: Is the sky blue? context: The sky is blue on a clear day."
# Tokenize the input
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
# Generate the answer using the model
output_ids = model.generate(input_ids)
# Decode the generated tokens to get the predicted answer
predicted_answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)
# Print the predicted answer
print(f"Predicted answer: {predicted_answer}") # Predicted answer: yes
Conclusion
In conclusion, fine-tuning T5 makes it much better at answering questions. We learned how to prepare the data and train the model, and the Hugging Face Transformers library made the process straightforward. After training, T5 can understand questions and produce correct answers, which is useful for many applications such as chatbots and search engines.
Jayita Gulati is a machine learning enthusiast and technical writer driven by her passion for building machine learning models. She holds a Master’s degree in Computer Science from the University of Liverpool.