
Exploring NMF and LDA Topic Models of Swedish News Articles

Information

Authors: Karin Svensson, Johan Blad
Expected completion: 2020-12
Supervisor: Lovisa Bergström
Supervisor's company/institution: Dagens Nyheter
Subject reviewer: Niklas Wahlström
Other: -


Presentations

Presentation by Karin Svensson
Presentation time: 2020-12-16 13:15

Presentation by Johan Blad
Presentation time: 2020-12-16 14:15

Opponents: Patrik Björklund, Anna Rydin

Abstract

Automatically analyzing and segmenting news articles by their content is a growing research field. This thesis explores topic modeling, an unsupervised machine learning method, applied to Swedish news articles to generate topics that describe and segment the articles. Specifically, the algorithms non-negative matrix factorization (NMF) and latent Dirichlet allocation (LDA) are implemented and evaluated. Their usefulness in the news media industry is assessed by their ability to serve as a uniform categorization framework for news articles. This thesis fills a research gap by studying the application of topic modeling to Swedish news articles and contributes by showing that this can yield meaningful results. It is shown that Swedish text data requires extensive data preparation for successful topic models and that using nouns exclusively, especially common nouns, gives the most suitable vocabulary. Furthermore, the results show that both NMF and LDA are valuable as content analysis tools and categorization frameworks, but that they have different characteristics and are hence best suited to different use cases. Lastly, the conclusion is that topic models have limitations, since they can generate unreliable topics that could mislead news consumers, but that they can nonetheless be powerful methods for organizations to analyze and segment articles efficiently and at large scale internally. The thesis project was a collaboration with one of Sweden’s largest media groups, and its results led to a topic modeling implementation for large-scale content analysis to gain insight into readers’ interests.
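
As a rough illustration of the NMF side of the abstract, the sketch below factorizes a tiny document-term count matrix into a document-topic matrix W and a topic-term matrix H using the standard multiplicative update rules for the Frobenius-norm objective. It is not the thesis implementation: the toy documents, the noun lists, the helper names (`build_count_matrix`, `nmf`, `top_terms`, `frobenius_error`) and all parameter values are assumptions made for illustration, and plain Python is used only so that the example has no dependencies.

```python
import random


def build_count_matrix(docs):
    """Turn tokenized documents into a document-term count matrix."""
    vocab = sorted({term for doc in docs for term in doc})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = [[0.0] * len(vocab) for _ in docs]
    for i, doc in enumerate(docs):
        for term in doc:
            matrix[i][index[term]] += 1.0
    return matrix, vocab


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H, the quantity NMF tries to make small."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Approximate V (docs x terms) as W (docs x topics) times H (topics x terms)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


def top_terms(h, vocab, n=3):
    """List the n highest-weighted terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


# Hypothetical toy corpus: each document is already reduced to its nouns.
docs = [
    ["fotboll", "match", "mål", "match"],
    ["match", "mål", "lag"],
    ["regering", "val", "parti"],
    ["val", "parti", "regering", "val"],
]

if __name__ == "__main__":
    v, vocab = build_count_matrix(docs)
    w, h = nmf(v, n_topics=2)
    for k, terms in enumerate(top_terms(h, vocab)):
        print(f"topic {k}: {', '.join(terms)}")
```

Each row of H can be read as a topic (a weighting over nouns), and each row of W gives how strongly a document loads on each topic. LDA would instead fit a probabilistic model in which documents are mixtures of topics and topics are distributions over words; a real pipeline would more likely rely on an established library than on hand-written updates like these.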
