Python · PyTorch · Fine-Tuning · GPT-2

Deep abstract generator

By Arapsih Güngör
Duration: 6 months
Category: Master's thesis
Role: Developer

Project Overview

This master's thesis explores the adaptation of GPT-2, a pre-trained language model released by OpenAI, to the task-specific challenge of generating coherent and contextually relevant scientific abstracts. Through fine-tuning, the project tailors the model's capabilities to the unique requirements of academic literature synthesis.

Methodology

The project involved two primary approaches:

  1. Fine-tuning GPT-2: Leveraging a pre-trained GPT-2 model and refining it on a curated dataset of scientific papers to improve its abstract generation capabilities (a minimal fine-tuning sketch follows this list).
  2. Developing a Custom Transformer Model: Constructing a transformer model from scratch to address the nuances of scientific text and compare its effectiveness against the fine-tuned GPT-2 model.
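
For illustration, the sketch below shows what such a fine-tuning run could look like with PyTorch and the Hugging Face `transformers` library. The checkpoint name (`gpt2`), the corpus file (`abstracts.txt`, one abstract per line), and the hyperparameters are assumptions for this example, not necessarily the exact setup used in the thesis.

```python
# Minimal GPT-2 fine-tuning sketch (assumed setup, not the thesis's exact pipeline).
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast


class AbstractDataset(Dataset):
    """Tokenizes one abstract per line into fixed-length input ids."""

    def __init__(self, path, tokenizer, max_len=512):
        with open(path, encoding="utf-8") as f:
            texts = [line.strip() for line in f if line.strip()]
        self.enc = tokenizer(
            texts, truncation=True, max_length=max_len,
            padding="max_length", return_tensors="pt",
        )

    def __len__(self):
        return self.enc["input_ids"].size(0)

    def __getitem__(self, idx):
        return self.enc["input_ids"][idx], self.enc["attention_mask"][idx]


device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)

dataset = AbstractDataset("abstracts.txt", tokenizer)  # hypothetical corpus file
loader = DataLoader(dataset, batch_size=4, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):
    for ids, mask in loader:
        ids, mask = ids.to(device), mask.to(device)
        labels = ids.clone()
        labels[mask == 0] = -100  # ignore padded positions in the LM loss
        loss = model(input_ids=ids, attention_mask=mask, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("gpt2-abstracts")
tokenizer.save_pretrained("gpt2-abstracts")
```

Masking the padded positions with -100 keeps them out of the language-modeling loss, which would otherwise be skewed on short abstracts.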

Key Techniques Employed

  • Dataset Assembly: Compilation of a substantial corpus from scientific journals.
  • Model Training and Evaluation: Utilizing Python and PyTorch for model training, with evaluation based on METEOR, ROUGE, and BLEU metrics.
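
As a rough illustration of how the evaluation could be wired up, the snippet below scores one generated abstract against its reference using common open-source implementations (`nltk` for BLEU and METEOR, `rouge-score` for ROUGE-L); the example sentences and library choices are assumptions, not the thesis's actual pipeline.

```python
# Sketch: scoring one candidate abstract against a reference with BLEU, METEOR, ROUGE-L.
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from rouge_score import rouge_scorer

nltk.download("wordnet", quiet=True)  # METEOR needs WordNet synonym data
nltk.download("punkt", quiet=True)    # tokenizer model for word_tokenize

reference = "We propose a transformer-based approach to abstract generation."  # placeholder text
candidate = "We present a transformer approach for generating abstracts."      # placeholder text

ref_tokens = nltk.word_tokenize(reference.lower())
cand_tokens = nltk.word_tokenize(candidate.lower())

bleu = sentence_bleu([ref_tokens], cand_tokens,
                     smoothing_function=SmoothingFunction().method1)
meteor = meteor_score([ref_tokens], cand_tokens)

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = scorer.score(reference, candidate)["rougeL"].fmeasure

print(f"BLEU: {bleu:.3f}  METEOR: {meteor:.3f}  ROUGE-L: {rouge_l:.3f}")
```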

Results and Findings

The fine-tuned GPT-2 model significantly outperformed the custom transformer model, offering:

  • Faster training and evaluation times.
  • Higher accuracy in generating text that closely mirrors human-written abstracts.
  • Improved efficiency in processing, making it a viable tool for real-time applications.
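
As an illustration of what inference could look like with such a fine-tuned checkpoint, the sketch below prompts the model with a paper title and samples an abstract; the checkpoint name, prompt format, and decoding settings are assumptions for demonstration.

```python
# Sketch: generating an abstract from the fine-tuned checkpoint saved above (assumed path).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-abstracts")
model = GPT2LMHeadModel.from_pretrained("gpt2-abstracts").to(device).eval()

prompt = "Title: Neural approaches to automated abstract generation\nAbstract:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,       # nucleus sampling for more varied phrasing
        top_p=0.92,
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```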

Comparative Analysis

Metric          GPT-2 (Fine-tuned)   Custom Transformer
Training Time   9 hours              75 hours
METEOR Score    18.1%                3.7%
ROUGE Score     18.3%                1.8%
BLEU Score      21.6%                3.1%

Contributions to the Field

This thesis underscores the potential of fine-tuning pre-trained models over building new ones from scratch for specific tasks. The findings advocate for the application of deep learning techniques in automating aspects of scientific writing, potentially transforming how literature reviews are conducted.

Future Work

Suggestions for future research include:

  • Expanding the training dataset to cover a broader range of disciplines.
  • Exploring the integration of multi-lingual capabilities to support non-English texts.
  • Enhancing the model's understanding of complex scientific concepts and terminologies.