M.S. Thesis

Övgü Özdemir, Exploring the Capabilities of Large Language Models in Visual Question Answering: A New Approach Using Question-Driven Image Captions as Prompts

Visual Question Answering (VQA) is defined as an AI-complete task that requires understanding, reasoning, and inference of both visual and language content. Despite recent advancements in neural architectures, zero-shot VQA remains a significant challenge due to the demand for advanced generalization and reasoning skills. This thesis aims to explore the capabilities of recent Large Language Models (LLMs) in zero-shot VQA. Specifically, it evaluates the performance of multimodal LLMs such as CogVLM, GPT-4, and GPT-4o on the GQA dataset, which includes a diverse range of questions designed to assess reasoning abilities. A new framework for VQA is proposed, leveraging LLMs and integrating image captioning as an intermediate step. Additionally, the thesis examines the effect of different prompting techniques on VQA performance. Evaluations are conducted on questions that vary semantically and structurally. The findings highlight the potential of using image captions and optimized prompts to enhance VQA performance in the zero-shot setting.

Date: 04.09.2024 / 13:30 Place: A-212

English

Pelin Dayan Akman, Analysis of Technical Debt in ML-based Software Development Projects

This research addresses the multifaceted nature of Technical Debt (TD) in Machine Learning (ML) projects, distinct from traditional software projects due to their probabilistic nature and data dependency. The study systematically examines how TD manifests across various dimensions in ML projects, identifying root causes, impacts, and band-aid solutions contributing to its persistence. ML-specific TD was categorized through thematic analysis of interviews with industry professionals. The findings were reviewed by academic experts in multiple iterations. This study fills a gap in the literature and offers practical insights for managing TD in ML contexts, as well as a TD-oriented structure for its assessment.

Date: 06.09.2024 / 09:30 Place: A-212

English

Engin Uzun, Simulating and Augmenting Turbulent Thermal Images for Deep Object Detection Models

Atmospheric turbulence, caused by factors such as temperature, wind speed, and humidity, leads to random fluctuations in the atmosphere's refractive index. This phenomenon degrades the image quality of long-range observation systems through geometric distortions and spatially and temporally varying blur. Turbulence can affect various imaging spectra, including visible and thermal bands. This thesis addresses the challenge of atmospheric turbulence in thermal imagery and its impact on object detection models. To tackle this challenge, we propose a data augmentation method that enhances the performance of object detectors by utilizing turbulent images with varying severity levels as training data. We generate training samples using a geometric turbulence simulator and use Geometric, Zernike-based, and P2S-based simulators to create the turbulent test sets, confirming the effectiveness of our augmentation method across different types of simulated turbulence. Our results demonstrate that this data augmentation approach significantly improves performance for both turbulent and non-turbulent thermal test images.

Date: 03.09.2024 / 13:30 Place: B-116

English

Burak Sevsay, Infrared Domain Adaptation with Zero-Shot Quantization

The quantization of neural networks is essential to meet real-time requirements. Zero-shot quantization is a key approach when training data is unavailable. To the best of our knowledge, zero-shot quantization in the infrared domain has not been explored before. This thesis examines the performance of batch normalization statistics-based zero-shot quantization on models trained with infrared imagery. We fine-tuned models pretrained on RGB images using infrared images and carefully investigated the data generation process to achieve optimal results for YOLOv8 and RetinaNet. Our results demonstrate that zero-shot quantization is more effective in the infrared domain.

Date: 03.09.2024 / 11:00 Place: B-116

English

Utku Mert Topçuoğlu, Efficient Pretraining of Vision Transformers: A Layer-Freezing Approach with Local Masked Image Modeling

This thesis explores efficient pretraining methods for Vision Transformers by integrating progressive layer freezing with local masked image modeling. The study assesses the computational demands and extended training periods typical of self-supervised learning methods for ViTs. Key innovations include implementing the FreezeOut method within the LocalMIM architecture to significantly enhance training efficiency. Experimental results show a reduction in training time by about 12.5% while maintaining competitive accuracy, demonstrating the effectiveness of strategic layer freezing combined with tailored learning rate scheduling. This approach promotes more accessible self-supervised learning on constrained computational resources.

Date: 03.09.2024 / 09:30 Place: B-116

English

Yasin Aksüt, An Analysis of Kerberoasting Attack and Detection with Supervised Machine Learning Algorithms

Perimeter security is no longer a barrier to accessing networks and critical data, making traditional security measures outdated. A robust security strategy is crucial to prevent and detect Active Directory (AD) attacks, which can be difficult to detect because they blend in with normal network traffic. One such attack is the Kerberoasting attack, which exploits weaknesses in the Kerberos authentication protocol. To detect these attacks, supervised machine learning algorithms are proposed. In addition, a publicly available dataset for measuring the efficiency of these algorithms on Kerberoasting attacks was created and shared.

Date: 05.09.2024 / 10:00 Place: II-06

English

Anıl Öğdül, A Continuation-Based Compositional Account for Syntax-Semantics of Turkish Perfective-Evidential Suffix -mış

This work investigates the meaning of the perfective/evidential suffix -mIş, focusing on its perfect interpretation. It has been argued that there are two distinct syntactic structures for simple verbal sentences [verb+past] and complex verbal sentences [verb+part+cop+past] (Kornfilt, 1996; Kelepir, 2001). Demirok and Sağ (2023) offer a compositional account for these two structures, taking the temporal relations as the basis. Building on that, we propose an Aktionsart-oriented analysis of the verb-participle relation. We offer a continuation-based compositional account within quantificational event semantics (Champollion, 2015) to reconcile the syntactic account of Kelepir (2001) and observations on the perfect meaning of -mIş.

Date: 06.09.2024 / 15:30 Place: A-212

English

Tuğçe Vural, Exploration of Practitioners’ Continuance Intention toward Agile Methodology Usage: An Empirical Investigation

This thesis aims to identify the factors influencing practitioners' continuance intention toward Agile methodology usage. The study also examines the influence of the identified factors on the continuance intention of Agile methodology usage and proposes a model in the context of Agile methodology. The model was verified with reliability tests, Exploratory Factor Analysis, Confirmatory Factor Analysis, and Structural Equation Modeling. Using Structural Equation Modeling, the influencing factors and the relationships among them were analyzed, and the final model was proposed.

Date: 05.09.2024 / 09:45 Place: A-108

English

Nisa Demir, Identification of Critical Success Factors in Data Analytics Projects

This thesis explores the identification and prioritization of Critical Success Factors (CSFs) in data analytics projects. Through a systematic literature review and semi-structured interviews with data professionals, a comprehensive list of CSFs was developed, structured hierarchically, and refined based on expert feedback. The study addresses gaps in existing literature by providing a cross-disciplinary CSF framework applicable to various fields like AI, big data, and business intelligence. Additionally, the research prioritizes these factors through the semi-structured interviews based on organizational contexts such as company size, project complexity, and technological maturity.

Date: 05.09.2024 / 11:00 Place: A-212

English

Ümit Eronat, A Comparative Analysis of Various 3D Mesh Optimization Algorithms for Assessing Effectiveness on Sustaining Virtual Visual Illusion

This thesis presents a method of comparing the cost-effectiveness of 3D mesh simplification algorithms using the McGurk effect, where visual and auditory cues are combined to create an illusion. The study involves designing a human head mesh, animating mouth movements, and recording certain syllable sounds to produce a virtual scene. Using this virtual scene and applying three different mesh simplification algorithms to the animated head, a user study was conducted to test and measure the effectiveness of each algorithm for each syllable at medium and high difficulty levels. Results highlight the balance between computational efficiency and perceptual accuracy, providing insights for 3D modeling and virtual reality applications.

Date: 04.09.2024 / 10:00 Place: II-06

English
