Leveraging LLMs for detecting phishing websites

Bibliographic Details
Author: Tacons Vega, Marc
Resource type: master's thesis
Publication date: 2025
Country: Spain
Institution: Universitat Politècnica de Catalunya (UPC)
Repository: UPCommons. Portal del coneixement obert de la UPC
Language: English
OAI Identifier: oai:upcommons.upc.edu:2117/452103
Online access: https://hdl.handle.net/2117/452103
Access level: open access
Keywords: Computer security
Machine learning
Cybersecurity
LLM
LMM
Large language models
Phishing detection
Seguretat informàtica
Aprenentatge automàtic
Àrees temàtiques de la UPC::Informàtica::Seguretat informàtica
Description
Abstract: Phishing attacks are among the most common threats in the cybersecurity world, and because of this, they pose a major risk to online security. To address the continuous evolution of these attacks, researchers have developed several solutions that aim to mitigate and identify their patterns. Many of these approaches have adopted Large Language Models (LLMs) or multimodal Large Language Models (LMMs) to leverage their text and image processing capabilities for phishing domain detection. However, most existing methodologies rely on similar strategies such as brand-based detection or HTML/screenshot analysis. In this thesis, we propose a new phishing detection system that leverages the capabilities of LLMs to interact with real online domains and, through a step-by-step reasoning process, determine whether a domain is phishing or legitimate. This document presents the development of the proposed solution, detailing the different phases involved in building our system, and evaluates its performance in comparison with other state-of-the-art approaches to assess its effectiveness. The code of our system is available at: https://github.com/Teiconsmarc/PhishingVoyager.