Text Summarization using a Transformer Architecture
An Attention-based Transformer Approach to Abstractive Summarization
Information
Author: Jonas Jons
Estimated completion: 2024-05
Supervisor: Mikael Axelsson
Supervisor's company/institution: Consid AB
Subject reader: Ingela Nyström
Other: Seeking an opponent, ready for presentation
Presentation
Presenter: Jonas Jons
Presentation time: 2024-05-31 09:15
Opponent: Linnea Lisper
Abstract
With an ever-growing volume of data on the internet and in the literature, grasping the full
picture of a subject becomes increasingly difficult. One heavily studied subject is the
COVID-19 pandemic, which prompted a global scientific race to stop the virus, mitigate its
spread, and protect against it.
To help manage the vast number of studies on COVID-19 and consolidate their scientific
findings, the White House launched the CORD-19 dataset, an open-source project compiling many
of these studies. Hosted on the popular data science community website Kaggle, the project
called for the community's aid in deriving new insights from the extensive research available.
Inspired by these developments, this study focuses on two main subjects: the ever-growing
amount of data and the CORD-19 open-source project.
Grounded in the influential 2017 paper "Attention Is All You Need", which introduced the
original transformer model (now used in applications such as ChatGPT), this study aims to
build a transformer model from scratch to perform text summarization on the samples in the
CORD-19 dataset. By combining this transformer technology with an analysis of the CORD-19
dataset, the study contributes to both areas. This is especially relevant because the
scientific literature on transformers is still limited, given how recently this type of deep
learning network was developed.
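For reference, the central operation of the transformer introduced in that paper is scaled
dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. The following is a
minimal illustrative sketch in Python/NumPy, not the thesis implementation; all names here are
illustrative:

import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (seq_len, d_k); V: (seq_len, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # scaled query-key similarities
    weights = softmax(scores, axis=-1)  # attention distribution per query
    return weights @ V                  # weighted sum of value vectors

Multi-head attention, as used in the full model, runs several such attention operations in
parallel over learned linear projections of Q, K, and V.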
In this thesis, the CORD-19 data is downloaded, cleaned, and processed. It is then fed into a
custom-built transformer neural network, adapted specifically for the task of text
summarization. Building the network from scratch, rather than using a pre-built model, aims to
foster a deeper understanding of the underlying technology and its mechanics. The conclusions
and results of this thesis offer insights into both the transformer model and the challenges
of analyzing the CORD-19 dataset.
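To make the described pipeline concrete, below is a minimal sketch of an encoder-decoder
summarizer in PyTorch. This is not the thesis code: the class name and hyperparameters are
placeholders, and positional encodings are omitted for brevity even though a real model
requires them.

import torch
import torch.nn as nn

class SummarizerSketch(nn.Module):
    # Hypothetical minimal encoder-decoder transformer for abstractive
    # summarization; all hyperparameters below are placeholder values.
    def __init__(self, vocab_size, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)  # per-token vocabulary logits

    def forward(self, src_ids, tgt_ids):
        src = self.embed(src_ids)  # article token IDs -> vectors
        tgt = self.embed(tgt_ids)  # summary-so-far token IDs -> vectors
        # Causal mask: each summary position may only attend to earlier ones
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.transformer(src, tgt, tgt_mask=tgt_mask)
        return self.out(hidden)

Training such a model would minimize cross-entropy between the predicted logits and the
reference summary tokens shifted by one position (teacher forcing); at inference time, the
summary is generated token by token.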