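Sentiment analysis of financial news articles with Gemini 2.5: A comparative study on integrating news sentiment into Random Forest models [original title: Sentimentanalys av finansiella nyhetsartiklar med Gemini 2.5: En komparativ studie om integrering av nyhetssentiment i Random Forest-modeller]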
University of Gävle, Faculty of Engineering and Sustainable Development, Department of Computer and Geospatial Sciences, Computer Science.
2025 (Swedish). Independent thesis, Basic level (degree of Bachelor), 180 HE credits. Student thesis
Abstract [en]

This study investigates whether the use of sentiment data from news articles, generated using the language model Gemini 2.5 Pro, can enhance the ability of Random Forest models to predict stock returns for the OMXS30 index. Using a quantitative approach, monthly data for stock prices (2021-2025) and news articles were analyzed, with Gemini 2.5 Pro employed to generate sentiment scores. Two Random Forest models were developed: one incorporating NLP-generated sentiment data and one without. Their predictive capabilities were evaluated by simulating long-short portfolio strategies.

The results indicated that the model integrating sentiment data achieved a statistically significant annualized risk-adjusted excess return (Jensen’s alpha) of 2.64 %, outperforming the model without sentiment, which had an alpha close to zero. Although R² values for predicting individual stock returns were generally low, the portfolio based on sentiment signals demonstrated improved performance. However, a paired t-test on the average monthly returns of the two portfolio strategies revealed no statistically significant difference (p = 0.15).

The study concludes that sentiment data from advanced language models like Gemini 2.5 Pro can contribute to enhancing portfolio performance in terms of risk-adjusted returns on the Swedish stock market, although the effect on average monthly returns was not statistically established within the study’s scope.
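
The two evaluation measures named in the abstract, Jensen's alpha and the paired t-test, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `jensens_alpha` and `paired_t_statistic`, the sample return series, and the annualization by simple multiplication by 12 are all assumptions made for illustration. Alpha is taken as the intercept of an ordinary least-squares regression of portfolio excess returns on market excess returns, and the paired t-statistic is computed on the month-by-month return differences between two portfolios.

```python
from math import sqrt
from statistics import mean, stdev


def jensens_alpha(portfolio_excess, market_excess):
    """Return (alpha, beta) from an OLS fit of portfolio excess returns
    on market excess returns. Alpha is the regression intercept."""
    mx = mean(market_excess)
    my = mean(portfolio_excess)
    sxx = sum((x - mx) ** 2 for x in market_excess)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(market_excess, portfolio_excess))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def paired_t_statistic(returns_a, returns_b):
    """Return the paired t-statistic for two equally long return series.
    It has len(returns_a) - 1 degrees of freedom."""
    diffs = [a - b for a, b in zip(returns_a, returns_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n))


if __name__ == "__main__":
    # Made-up monthly excess returns, for illustration only.
    market = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]
    with_sentiment = [0.014, -0.018, 0.020, 0.008, -0.006, 0.025]
    without_sentiment = [0.011, -0.021, 0.014, 0.006, -0.011, 0.019]

    alpha, beta = jensens_alpha(with_sentiment, market)
    # Assumption: annualize a monthly alpha by multiplying by 12.
    print("annualized alpha:", alpha * 12, "beta:", beta)
    print("paired t:", paired_t_statistic(with_sentiment, without_sentiment))
```

A p-value for the t-statistic would additionally require the Student t distribution, which the standard library does not provide; a statistics package would be the usual choice for that step.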

Place, publisher, year, edition, pages
2025. p. 30
Keywords [translated from sv]
Machine learning, Language models, OMXS30, Random Forest, Sentiment, Efficient-market hypothesis
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hig:diva-47641
OAI: oai:DiVA.org:hig-47641
DiVA, id: diva2:1976119
Subject / course
Computer science
Supervisors
Examiners
Available from: 2025-06-26. Created: 2025-06-24. Last updated: 2025-10-02. Bibliographically approved.

Open Access in DiVA

fulltext (781 kB), 51 downloads
File information
File name: FULLTEXT01.pdf
File size: 781 kB
Checksum (SHA-512): fe4109ce71b8fe4af1010fad83fb33b07304bea8b81c53879c7f061441066255d0e8ce57bb215ca6d52e0816e0cb46906afa647a8c04f37ceebf4d0e886ed174
Type: fulltext
Mimetype: application/pdf

Author
Öhrn, Ferdinand
