SNE Master Research Projects Web Page


Home Previous years


This page reports the list of student projects with the type (long or short), the contact person for each project ("@" is replaced by "=>"), the status (available or assigned) the warning level (low, medium or high; where high means that is strongly suggested to submit the project proposal head of time to not incur in delays). New projects will be added at the end. All other information related with the projects are available on the course pages on Canvas.

Number and Type Title and Abstract Supervisor Status Warning
1 - short
Title: Pipeline of Mass Disclosure

Want your RP to create immediate societal impact? The Dutch Institute of Vulnerability Disclosure is known for responding to emerging high-impact and high-exposure vulnerabilities on the full IPv4-space. The recent trends and exploitation of these vulnerabilities no longer allows for a manual scanning process and requires increasingly fast responses. This research focuses on realizing modular and standardized automation for DIVDís Scan-Notify process to further mature DIVDís processes and enable faster and more accurate response.

Topics: requirements engineering, architecture design, terraform, ansible, Linux, networking, vulnerability management, knowledge management, data engineering
Max van der Horst max=>divd.nl available low
2 - short or long
AutoML for Side-Channel Analysis


Description: Side-channel analysis (SCA) is an important aspect of modern hardware security and cryptography. It consists of acquiring physical measurements and analyzing them using machine learning. Any machine-learning processing pipeline can quickly get very large and complex: as a result, the hardware hacker may not be able to pick the optimal attack parameters. To that end, we have developed a framework that automates ML in the context of hardware security (called MetaHive), using heuristics like particle swarm optimization (PSO).

Goals:
-learn the basics of SCA and PSO
-implement PSO in an existing python framework for automated ML
-potential extensions: measure the effectiveness of PSO in optimizing the the parameters of a side-channel attack
Kostas Papagiannopoulos k.papagiannopoulos=>uva.nl available low
3 - long
Investigating Parsing Differentials in micro services

Environments that use micro services often have a wide variety of programming languages and frameworks. Therefore, we suspect that parser differentials vulnerabilities are common in micro service architectures. For example how two libraries parse (malformed) JSON, HTTP requests etc. This could lead to interesting vulnerabilities that are hard to find. The goal of this project is to find such parser differentials in commonly used libraries (first period) and see if this could lead to real vulnerabilities (second period). Reference:
https://www.sonarsource.com/blog/security-implications-of-url-parsing-differentials/
Matthijs Melissen mmelissen=>computest.nl and Daan Keuper dkeuper=>computest.nl
available low
4 - short
Detecting DDoS

DDoS attacks are still a large problem for internet-connected applications. In DDoS attacks, an attacker often abuses software of third parties without these third parties awareness. The goal of this project is to be able to test whether an organisation's internet-exposed network can be exploited to cause a DDoS. To do so, the student will first find out which amplification factor is in use for common software. Then the student will build a tool to verify if this software is running within the organisaton.
Reference:
https://www.shadowserver.org/news/over-18-8-million-ips-vulnerable-to-middlebox-tcp-reflection-ddos-attacks/
Matthijs Melissen mmelissen=>computest.nl and Daan Keuper dkeuper=>computest.nl available low
5 - long
Browser-powered Desync Attacks

Browser-powered desync attacks are a new class of web-based attacks using HTTP Request Smuggling. In the first period, the student will create a test methodology and possible tool suite that Computest can use to test for these types of attacks. In the second period, the student will investigate if common software is vulnerable to this type of attack.
Reference:
https://portswigger.net/research/browser-powered-desync-attacks
Matthijs Melissen mmelissen=>computest.nl and Daan Keuper dkeuper=>computest.nl available medium
6 - short
Interpreting automated tool output for cloud security testing

There exist a lot of automated security tools for cloud environments (Azure, AWS and Google Cloud). An example of such a tool is Defender for Cloud. We notice that the results of these tools are often hard to interpret. During this project, the student will look for a tool that is able to map the technical findings originating from the automated tool to the customer's risk assessment.
Matthijs Melissen mmelissen=>computest.nl and Daan Keuper dkeuper=>computest.nl available low
7 - short
Hardware dropper

In redteaming, we often try to get access to network ports at the customer location. These network ports might be in use, so in this case we would like to introduce a piece of hardware that acts as a transparent proxy for the existing user, and also allows the security tester to access the network. The device probably should have wifi and/or 5G, so that data can be exfiltrated invisible to the tested organisation. During this project, the student will create such a hardware dropper (perhaps using existing hardware), and implements the correct software to use the device.
Matthijs Melissen mmelissen=>computest.nl and Daan Keuper dkeuper=>computest.nl available low
8 - short
Static code analysis using large language models

Large language models such as GPT have attracted a lot of attention in artificial intelligence. One application of such models is the generation of code (e.g. Copilot). This raises the question whether such models can also be applied for detecting and repairing vulnerabilities in source code, and how they perform compared to traditional code analysis tools.
Matthijs Melissen mmelissen=>computest.nl and Daan Keuper dkeuper=>computest.nl available low
9 - long
Automatic evidence processing and analysis in the cloud

Description: In this research the student will research how the cloud can be used to perform automatic evidence processing and analysis. Invictus performs incident response engagements in cloud environments and is often tasked with finding the needle in a large haystack. The cloud offers lots of possibilities to automate the evidence collection, processing and analysis components of an incident response using serverless components and temporarily using cloud resources to quickly spin up and process data. Ideally the student first analyses the options available in the cloud for automatic evidence processing and analysis and will later build a PoC in the cloud where we can securely upload data that will be picked up automatically for processing and analysis.
Korstiaan Stam
korstiaan=>invictus-ir.com
available low
10 - long
Research the use and applicability of honeypots in cloud environments

Description: The student will investigate how honeypots can be used in cloud environments to attract threat actors. The idea is that it's trivial to spin up a 'fake' company in the cloud to serve as a honeypot which makes it interesting to see if a honeypot environment can be created that mimics an actual organization which can then be used to attract real attackers. The student will start with theoretical research to investigate honeypot environments in the cloud and how to make them interesting for attackers. Next the student will apply that knowledge and build a PoC using Infrastructure as a Code in order to make the process repeatable. The student can use whatever cloud platform or technology they prefer, Invictus will provide access to required resources and support the student in any way possible
Korstiaan Stam
korstiaan=>invictus-ir.com
available low
11 - short or long
(CSA CAIQ or European Cybersecurity Compliance framework)  into the DevSecOps pipeline for security improvement and risk assessment of applications

Analise the structure of the selected (one of) compliance frameworks and map it to the typical (cloud-based) application architecture, suggest architecture design patterns. Identify what security controls can be used and how they can be applied to different CI/CD stages.
Yuri Demchenko Y.Demschenko=>uva.nl available low
12 - short
Malicious G-Code Characterization in Additive Manufacturing

Abstract: Additive manufacturing/ 3d printing is an emerging technology that has a wide range of applications, from printing foods to printing jet engines. And all of the 3D printers are controlled by a language called G-Code. It is crucial to thoroughly study all possible G-code instructions and understand the potential security threats that can be introduced by malicious G-code. The goal of the project is to thoroughly understand the behaviors and effects of potentially malicious G-code
Chenglu Jin Chenglu.Jin=>cwi.nl available low
13 - short
Malicious G-Code Detection in Additive Manufacturing

Abstract: After how G-code can be abused by malicious users, a detection tool needs to be developed to prevent attackers from downloading/executing malicious G-code on a protected additive manufacturing machine/ 3d printer. Ideally, the detection needs to be efficient and robust.
Chenglu Jin Chenglu.Jin=>cwi.nl available low
14 - short
Autoencoder for Detecting Malicious Model Updates in Differentially Private Federated Learning

Project motivation:
Imagine a world where your smart devices, like your phone or smartwatch, can learn new things and get smarter without sharing your private information. Federated learning makes this possible by letting devices work together on training a smart model without sharing personal data. However, there's a challenge Ė some bad actors might try to trick the system by sending fake updates to mess up the learning process. 
In our project, we want to use a special kind of computer program called an "autoencoder" to catch these tricky updates. Autoencoders are like detectives that can spot unusual patterns. By having them watch over the learning process, we aim to make sure our smart devices stay smart and secure, protecting your privacy while they learn.
Project description:
In this research project, we seek to address the critical challenges related to security and privacy in differentially private federated learning environments. Federated learning allows decentralized devices to collaboratively train machine learning models without sharing raw data, promoting privacy preservation. However, the federated learning process is susceptible to malicious activities, particularly in the form of poisoning attacks, where adversaries intentionally inject misleading model updates to compromise the system's integrity.
Our proposed solution involves the deployment of autoencoders, a class of neural networks, to detect and mitigate these malicious model updates effectively. Autoencoders, with their ability to capture intrinsic features and patterns in the data, will be tailored to identify anomalies introduced by poisoning attacks during the federated learning aggregation process. This approach aims to bolster the security of differentially private federated learning systems by providing an additional layer of defense against potential threats.
Research Questions:
In this project we aim to answer the following research questions:
    1) How can autoencoders be seamlessly integrated into the federated learning pipeline to detect and mitigate the impact of poisoning attacks on model updates?
    2) To what extent does the incorporation of the proposed autoencoder-based detection mechanism enhance the robustness of federated learning models against poisoning attacks, and how does it impact the overall performance under differential privacy constraints?

References:
    1) Detecting Malicious Model Updates from Federated Learning on Conditional Variational Autoencoder, Zhipin Gu, Yuexiang Yang, 2021
    2) Federated Learning with Differential Privacy: Algorithms and Performance Analysis, Kang Wei et. Al., 2020

Sheikhalishahi, Mina mina.sheikhalishahi=>ou.nl available low
15 - short
Secure Collaborative Data Sharing for Enhanced Machine Learning Insights

Project Motivation:

In today's world, data is gold, but sharing it comes with challenges. People have valuable information, but they're worry of exposing too much. We aim to empower data owners to collaboratively contribute their data for collective insights without compromising their privacy. By employing advanced privacy-preserving techniques, we're creating a secure environment for sharing without revealing sensitive details.

Project Description:

Our project envisions a scenario where multiple entities (e.g., hospitals), each holding a piece of valuable information, want to pool their resources for a broader understanding. However, they're cautious about divulging sensitive details. To address this, we introduce a sophisticated approach: these entities engage in a privacy-aware conversation, leveraging Local Differential Privacy (LDP) techniques with frequency estimation. This dialogue helps them collectively decide which aspects of their data to share and which to keep confidential. The result? A unified dataset that combines their insights in a way that respects privacy constraints (in terms of entropy or probably k-anonymity). 

Now, the central question emerges: Can this unified dataset, crafted through intelligent privacy-preserving collaboration, effectively train machine learning models? And how does this collaborative approach enhance privacy against potential attacks seeking to discern individual identities within the aggregated data? Our exploration seeks to uncover the delicate balance between insightful collaboration and safeguarding individual privacy in the realm of shared data (in terms of membership inference attack).

Research Questions:

We plan to answer the following research questions:

    1) To what extent does the proposed framework ensure the accuracy of classifiers trained on the consolidated dataset generated through collaboration among distributed entities? 
    2) In what ways does the framework contribute to privacy improvement for each participating entity?

Sheikhalishahi, Mina mina.sheikhalishahi=>ou.nl available low
16 - short
Local Differential Privacy for Data Clustering

Project Motivation:

In our increasingly digital world, the information we share online is like a puzzle piece revealing a bit about us. Protecting our privacy while still contributing to data-driven projects is tricky but crucial. Imagine you're part of a big picture, but you only share a tiny piece of it without giving away too much about yourself. That's what we aim to explore with this project - finding a smart way to keep your info private while still helping out in projects that use your piece of the puzzle.

Project Description:

In this project, we're going to use a clever technique called "Local Differential Privacy (LDP)." Think of it as a way for you to share your puzzle piece without directly revealing what it looks like. We'll break down a space into grids (like squares on a map), and everyone will find which square their piece belongs to. Using a secret method, you'll then tell us just the (perturbed) number of your square without spilling the beans about your actual info.  Now, we'll use this "hidden" data to create a new version of our puzzle pieces - a kind of secret club where we can still see the big picture without knowing too much about each piece. We'll run some tests to see how well this secret club version works compared to the original one.

In addition to our clever privacy protection, we're diving into the world of "clustering." Imagine you have a bunch of puzzle pieces, and you want to group them based on similar patterns or colors. That's what clustering does - it helps us organize information in a way that makes sense.  So, we're not only exploring how well our secret-sharing method preserves privacy, but we're also curious about how this secret-shared data can help us make sense of patterns and groups within the big picture. Can we still figure out which pieces belong together without knowing everything about each piece?

That's the exciting puzzle we aim to solve in this project!

Research Questions:

In this project we aim to answer the following research questions:
    1) How does our sneaky method of sharing data (LDP) affect the accuracy of our big-picture understanding (Clustering outcome), and how does it stack up against knowing everything about each piece? 
    2) Does the size of our puzzle (how many pieces we have) change how well our secret-sharing method works (the impact of dataset size)? We want to find the sweet spot between keeping things private and still making sense of the big picture. 
    3) What happens if we change the size of our grid squares? Does it help us keep more secrets while still putting together a clear picture?

We're curious to find out!
Sheikhalishahi, Mina mina.sheikhalishahi=>ou.nl available low
17 - short
Evaluating Fairness in k-Anonymized Datasets

Project Motivation:
In today's digital landscape, ensuring the privacy of individuals in shared datasets is like guarding a treasure chest of personal information. One way we secure this treasure is by using techniques such as k-anonymity. Imagine each person in a dataset is like a superhero, and k-anonymity is the cloak that helps them hide in a crowd, making it challenging to identify any one superhero. It's a vital tool for protecting personal details while still sharing valuable information.  However, there's a catch. While we're busy protecting our superheroes, we want to make sure the cloak we're using doesn't accidentally make things unfair, especially for certain groups of superheroes. It's like making sure all superheroes, regardless of their background or characteristics, get a fair chance.  So, in this project, we're not just interested in how well our privacy cloak works but also in whether it treats all superheroes equally. We want to understand if, by using this cloak, some superheroes end up more hidden than others. It's about striking the right balance between keeping secrets safe and making sure no superhero feels left in the shadows. This project is an exciting quest to explore how our privacy tools can be superheroes themselves, defending fairness along with confidentiality.

Project Description:
This research project seeks to provide a nuanced exploration of the implications of k-anonymity on the fairness of anonymized datasets. Unlike traditional privacy-preserving methodologies, such as differential privacy, k-anonymity focuses on data generalization to achieve anonymity guarantees. Our study will go beyond the mere assessment of privacy protection and delve into the impact of k-anonymity on the fairness of generated data. This investigation will particularly emphasize the potential disparate impacts on underrepresented classes and subgroups within the dataset. 

By scrutinizing the fairness implications of k-anonymization, we aim to offer insights into how privacy measures can be fine-tuned to align with the principles of fairness and equity. The ultimate objective is to strike a delicate balance between preserving individual privacy and ensuring that privacy measures contribute positively to the overall fairness of the data-sharing ecosystem.

Research Questions:
This prject aims to answer the following research questions:
    1) To what extent does the fairness of the original dataset influence the observed disparate impact (and other fairness metrics) in the k-anonymous dataset?
    2) How can adjustments or enhancements to the k-anonymity process  (different values of k and different k-anonymity techniques) be implemented to mitigate potential fairness concerns without compromising individual privacy?
    3) How the properties of dataset (size, numerical or categorical feature, the number of features) affect the final outcome?

Sheikhalishahi, Mina mina.sheikhalishahi=>ou.nl available low
18 - short
Using ZIP files to smuggle malware through scanners undetected

Abstract: The ZIP file format is an ubiquitous file format that is widely supported by tools and operating systems. The specifications are open and anyone can reimplement these. The specifications are also unclear and various tools and libraries have implemented the specifications differently and at times in ways that directly contradict the specification. Some tools can create files that cannot be processed correctly by other tools, leading to crashes or unexpected behaviour. Interestingly, there are ways to add a malicious payload to ZIP files in ways that make it hard or impossible to detect for tools that implement ZIP in a straightforward and simplistic way.

Your task would be to research how various virus/malware scanners behave given unusual ZIP files and to see if a payload can be smuggled past these tools. There might also be some coding involved using Python.
Armijn Hemel armijn=>tjaldur.nl unavailable low
19 - short
Electronic voting

In this project, you will build and analyse solutions to implement electronic voting in the following client-server setting: a group of voters (acting as clients) send votes to a set of servers who are responsible for collecting the votes and computing the result.

The security goals are correctness (the result correctly reflects the votes cast, even if some of the servers are corrupt) and privacy (only the final voting result becomes known, the individual votes remain private).

The project approach will be as follows: You will study and implement (in python, preferably)  solutions tolerating corrupt servers that suffer from crash-failures and build upon this to tolerate corrupt servers that can deviate from the protocol in any arbitrary manner.

Using the tool of secret sharing is one approach for project. Based on time and interest you could opt for alternate approaches such as combining this project with exploring existing e-voting systems such as Helios or Election Guard or look at homomorphic encryption based solutions for e-voting (https://link.springer.com/chapter/10.1007/3-540-68339-9_7).
Divya Ravi d.ravi=>uva.nl and Kostas Papagiannopoulos k.papagiannopoulos=>uva.nl unavailable low
20 - short
Picture to datasheet correlation

Object Recognition Algorithm (AI) to automatically detect chips on PCBs and generate a link to their data sheets (could be using a camera on a phone and uploading it through a webapp).  With correlation to exploits available for that chip model. Possibility to identify counterfeits based on the manuals. Goal - Speeding up the identification process of hardware hacking projects.
Slobbe, Jeroen JSlobbe=>deloitte.nl unavailable low
21 - short
Automating asset inventory in sites (like factories, power plants)

Usually asset inventory is pulled from client docs like Excel and monitoring tools like Nozomi, Claroty. How to ensure completeness? Is there a better way of creating asset inventory passively? How can the data be consolidated? Preferably automatic tooling.
Slobbe, Jeroen JSlobbe=>deloitte.nl unavailable low
22 - short or long
Energy Cost of PETs

Comparison between using synthetic data, multi-party computation (and perhaps differential privacy) to safely do various ML tasks on sensitive data. The comparison should cover energy consumption, data utility, and anonymisation metrics (similar to anonymeter, but that only works for synthetic data).
Ana Oprescu ana.oprescu=>gmail.com
available low
23 - short or long
P2P eduVPN

eduVPN provides an open-source VPN solution allowing ISPs, hosting providers and businesses to easily set up a secure VPN service. Currently eduVPN has been engineered in a traditional client-server VPN model. Basically connecting the client with VPN technology into the organization where the VPN server is deployed.
 Roughly 140 organisations, and estimated 300K users, around the globe are using eduVPN. The current client-server model of eduVPN doesnít facilitate directly connecting devices located in various places, like IoT devices at home or services offered in various datacenters or (public) cloud environments. This project focusses on engineering a P2P solution integrated with eduVPN, which empowers users to connect safely to all their devices, anywhere on the internet.
Rogier Spoor Rogier.Spoor=>SURF.nl and Jeroen Wijenbergh jeroen.wijenbergh=>geant.org
unavailable low
24 - short or long
State of the art of log collection methods for security and application monitoring purposes

Log collectors and log collector architectures are relied upon for application and security monitoring purposes in mature organisations, to provide for proactive resolution of incidents. Students of SNE should review the state of the art, choose products that worth to compare and provide a report on how the chosen products compare regarding reliability, maintainability, support etc. Example products that can be part of the scope of the project include: ELK, OpenSearch, Splunk, NXLog, Beats, Fluentbit, Rsyslog, WEF. Project can be security / application focused, Linux or Windows or both. Suitable for pairs or individuals, and depending on the student it can be long term / short term project.
Peter Prjevara peter=>securitymindset.eu available low
25 - short
Testing for vulnerable SAML signature validation

SURFconext is the national login platform for all (higher) education and research processing 250 million logins per year, many of which use the SAML 2.0 protocol. A key element of the SAML 2.0 protocol is that signed XML message is sent to the service the user is logging in to. If this signature is not verified (correctly), the security of the receiving party is fully compromised. Sadly this seems to happen in practice. While it's easy to test that a correct login works, devise a way or ways for SURFconext to test service providers whether they not only accept good messages but properly reject bad messages.
Thijs Kinkhorst thijs.kinkhorst=>surf.nl and Wim Biemolt wim.biemolt=>surf.nl
available low
26- short
DDoS mitigation

The network of SURF currently uses various techniques to protect the network and the connected institutions against various kinds of (DDoS) attacks. But as attacks and/or networks evolve we need to investigate different solutions. For a wide range of attacks.
Thijs Kinkhorst thijs.kinkhorst=>surf.nl and Wim Biemolt wim.biemolt=>surf.nl available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low
XX - short
Title

Abstract
Supervisor available low