SNE Master Research Projects Web Page


Home Previous years


This page reports the list of student projects with the type (long or short), the contact person for each project ("@" is replaced by "=>"), the status (available or assigned) the warning level (low, medium or high; where high means that is strongly suggested to submit the project proposal head of time to not incur in delays). New projects will be added at the end. All other information related with the projects are available on the course pages on Canvas.

Number and Type Title and Abstract Supervisor Status Warning
1 - short
Security impact of DNS over TLS (DoT) and DNS over HTTPS (DoH)

DNS resolution is a critical and sensitive service. By default, DNS queries and responses are sent in plaintext. There are mainly two recently developed protocols, DNS over TLS (DoT) and DNS over HTTPS (DoH), which are of growing importance aiming to protect DNS privacy. Such encrypted protocols are cleary of benefit by protecting integrity and confidentiality of DNS traffic. However, they can effectively disrupt security controls and network monitoring solutions. The goal of this research is to analyse the security impact of DoT and DoH in order to securely implement encrypted DNS without compromising network security.
Silke Knossen <silke.knossen=>kpn.com> unavailable
low
2 - short
Topic: TR-369 research

TR-69 is a commonly used protocol for remote management of modems/routers/gateways, which has been around for 15 years. Until now, this is how most consumer modems are remotely managed at KPN. A new protocol has been developed by the Broadband Forum, which is called TR-369. It is intended to replace TR-69. It offers a new architecture where multiple "controllers" (providers, vendors, or end users) can interact with endpoint devices (modems/routers, wifi controllers, iot etc). It supports multiple transport protocols, including websockets/COAP/MQTT/etc. KPN REDteam recently did a time boxed test on a test setup for a new modem which is controlled through TR369 (in this case, over MQTT), and we found some security issues.

Goals:
* Review TR369/transport protocol "suite" with regards to security.
* Create tooling/pentesting a modem with TR369 backend infrastructure.

References:
https://www.avsystem.com/blog/TR-369/
https://www.broadband-forum.org/download/TR-369.pdf

Notes:
Project available only for a group of two students
Anand Groenewegen <anand.groenewegen=>kpn.com> and Stef van Dop <stef.vandop=>kpn.com>
unavailable
medium
3 - short
Privacy and Robustness in DP-based (Differential Private based) Federated Learning

Federated learning is a collaborative learning infrastructure in which the data owners do not need to share raw data with one another or rely on a single trusted entity. Instead, the data owners jointly train a Machine Learning model through executing the model locally on their own data and only share the model parameters with the aggregator. While the participants only share the updated parameters, still some private information about underlying data can
be revealed from the shared parameters. To address this issue, Differential Privacy has been used as effective tool to protect information leakage over shared parameters in Federated Learning, say DP-FED. However, it has not yet been investigated whether (and to what extent) the DP-FED is resistant against attacks.

This project aims to evaluate the resistance of DP-FED against different attacks and to explore the possibilities of reducing the success rate of these attacks. To conduct this research, at least three datasets, three different DP-FED techniques, and three different privacy threat models should be selected. Then, a comparison of DP-FED and FED (without DP) should be performed to evaluate how much embedding Differential Privacy in Federated Learning
algorithms makes them robuster.

The following papers are suggested to be studied for this work:
1. Mohammad Naseri, Jamie Hayes, and Emiliano De Cristofaro; Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy, CoRR, 2020.
 
2. Lingjuan Lyu, Han Yu, Xingjun Ma, Lichao Sun, Jun Zhao, Qiang Yan, Philip S. Yu, Privacy and Robustness in Federated Learning: Attacks and Defenses, arXiv, 2022.

3. Ahmed El Ouadrhiri, Ahmad Abdelhadi, Differential Privacy for Deep and Federated Learning: A Survey, IEEE Access, 2022.

4. Malhar Jere, Tyler Farnan, and Farinaz Koushanfar; A Taxonomy of Attacks on Federated Learning, IEEE Security & Privacy, 2021.

5. Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong, Data Poisoning Attacks to Local Differential Privacy Protocols, CoRR, 2019.

6. Minghong Fang, Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong; Local Model Poisoning Attacks to Byzantine- Robust Federated Learning, the 29th Usenix Security Symposium, 2020.
Mina Sheikhalishahi <mina.sheikhalishahi=>ou.nl> available medium
4 - short
Private GAN for Machine Learning

Generative Adversarial Network (GAN) provide a promising direction in research studies where data availability is limited. One common issue in GANs is that due to the high model complexity of deep networks, they are vulnerable in revealing information about training samples. This issue has been addressed in several studies by designing Differentially Private GAN (DPGAN) models, in which DP is adopted in training GANs. While DPGANs serve as effective tools in this regard, still a comprehensive understanding of the utility of this new generated data, with the purpose of being used as the source data of Machine Learning algorithms, is missing. Also, it is not clear how much each DPGAN technique is resistant against privacy threats compared to other DPGAN methodologies.

In this project, we select several DPGAN techniques, several datasets (with different properties), several ML algorithms, and two/three privacy attacks. We first train DPGAN techniques on selected datasets. We next evaluate the utility of data by employing ML algorithms on generated data. We compare the utility of generated data based on ML model accuracy. Also, we analyze how the dataset properties and the ML technique properties affect the effectiveness of data. We then employ privacy attacks on DPGANs and compare the results with GANs to evaluate and compare the robustness of different DPGANs.

The following studies are recommended:

1. Liyue Fan, A Survey of Differentially Private Generative Adversarial Networks, 2021.

2. Liyang Xie, Kaixiang Lin, Shu Wang, Fei Wang, Jiayu Zhou, Differentially Private Generative Adversarial Network, 2018.

3. Chugui Xu, Ju Ren, Deyu Zhang, Yaoxue Zhang, Senior , Zhan Qin, Kui Ren, GANobfuscator: Mitigating Information Leakage Under GAN via Differential Privacy, IEEE Transaction on Information Forensics and Security, 2018.
Mina Sheikhalishahi <mina.sheikhalishahi=>ou.nl> available medium
5 - Long
Comparison of state-of-the-art endpoint defence solutions to (partially) open-source endpoint defence

Endpoint defence evolved a lot in the last decade and the old anti-malware / anti-virus software a small sub-section of the state-of-the-art endpoint defence solutions. Instead of anti-malware / anti-virus, we are now talking about Endpoint Defense and Repsonse (EDR), Data Loss Protection (DLP), File Integrity Monitoring (FIM) and other fancy words that suppliers have the creativity to come up with. The biggest suppliers on the market are busy expanding their software with new features. This project will allow the students to get access to some vendor trial licences (1 or more) and compare the functionality of the products with free and open-source product offerings. Depending on student ability the project can result in the development of new features into open-source products. A minimum expected deliverable of the project is a comparison report and proposed development path to improve the open-source or proprietary products.

This long project is divided in the following way:

*) Phase 1: building on the RP of Dennis from 2021, further develop an open criteria of assessing and quantifying the effectiveness of a modern EDR (qualitative theoretical study)
*) Phase 2: put this theory into practice by putting several state of the art tools to test, possibly in a specific context (Office IT or possibly SCADA) depending on availability of opportunities
Peter Prjevara <peter=>securitymindset.eu> unavailable low
6 - Short
Comparison of architectures supporting high integrity and secure data pipelines

Tennet TSO is a leading European grid operator committed providing secure and reliable supply of electricity 24 hours a day 365 days a year, while helping to drive the energy tranisition. As a first cross-border Transmission System Operator (TSO), we design, build maintain and operate 23,900 km of high-voltage electricity grid in the Netherlands and large parts of Germany and facilitate the European energy market, through 16 interconnectors to neighboring countries. As part of this effort some of our teams are committed to deliver a private cloud infrastructure that house the data pipelines we use to interface between our internal departments and with our external partners. In these data pipelines data integrity and security is of high importance, so we must use modern technologies and data architectures that support this data integrity and security. However we also have legacy requirements, which must integrate securely with the modern technologies. Modern technologies we use include k8s and Apahce Kafka and MinIO, while some of the legacy requirements we have is the need for SQL based querying methods, or file based data transfers (SCP / SFTP).We would like to offer a project to SNE students where they explore the possibilities of architecting data pipelines combining these technologies - or even newer / better ones. Some of the questions that can form a basis for research questions are as follows:

- How are these technologies can be best combined to offer maximum data integrity?
- How can the technologies be best used to create long term, highly integer data archiving?
- What are the limits of this integration (on the available hardware to the students)?
- What are the advantages / disadvantages of implementing the architecture as a service-mesh instead of traditional architectures?

As the students will require to build their own test environment, this project is suitable for 2 candidates. Tennet will facilitate engineering support where students will gain insight into what problems the engineers and architects find important during the design of such architectures, and how the Agile teams in Tennet work together to deliver similar systems and architectures.
Peter Prjevara <peter=>securitymindset.eu> available low
7 - Short
Parser differentials in micro services

Environments that use micro services often have a wide variety of programming languages and frameworks. Therefore, we suspect that parser differentials vulnerabilities are common in micro service architectures. For example how two libraries parse (malformed) JSON, HTTP requests etc. This could lead to interesting vulnerabilities that are hard to find. The goal of this project would be to find such parser differentials in commonly used libraries and see if this could lead to real vulnerabilities.
Daan Keuper <dkeuper=>computest.nl>
available  medium
8 - Short
Race conditions in web applications

In local applications race conditions are well understood and we have tons of examples that were affected by this vulnerability class. However, in web applications research on this topic seems to be scarce. We’ve found some real life vulnerabilities abusing race conditions (for example, claiming a coupon code more than once), but we suspect that more of such cases could be found. The goal of this project is to find more examples of race conditions  in web applications in real life applications.
Daan Keuper <dkeuper=>computest.nl>
available  medium
9 - Short
Purple teaming for telecom operators

During the last 5-10 years, a large number of organisations have adopted RED and BLUE teams. A new trend can be seen where these offensive and defensive teams work in harmony. Recent whitepapers affirm this trend[1] and outline the benefits[2]. As the largest telecom operator in The Netherlands, KPN is continuously strengthening the ties between its BLUE- and REDteam. By working together (purple teaming), we increase knowledge and effectiveness on both sides. This research is divided into a theoretical part, what does literature state regarding purple teaming best practices, and a case study by designing/building a purple team CTF combining the studied literature with a telco perspective.

Goals
* Literature study on purple team
* Design a purple team capture the flag

References:
[1] https://danielmiessler.com/study/red-blue-purple-teams/
[2] https://www.redscan.com/news/purple-teaming-can-strengthen-cyber-security/

Notes:
Project available only for a group of two students
Anand Groenewegen <anand.groenewegen=>kpn.com> and Stef van Dop <stef.vandop=>kpn.com> unavailable medium
10 - Long
XDP-based DNS hot cache

The eBPF and specifically XDP paradigms enable for processing of packets in the Linux kernel without touching the full network stack and user space.  While the flexibility of, and resources available to such XDP programs are limited, simple programs can reduce system load significantly. In DNS for example, if we can determine we can not or will not answer a DNS query at such a very early stage, we do not need to bother the software running in user space with it.

For this project, the goal is to design, develop and assess a BPF/XDP program that serves as a DNS Hot Cache, serving answers to often asked queries from kernel space.

# Part 1: design and development

In the first part of the project, the students familiarize themselves with the BPF/XDP paradigm and tool chain. At NLnet Labs, we have experience with using XDP for DNS, so we will be up to speed quickly. The final program will need to store DNS answers coming from user space, and re-use them to answer subsequent queries from kernel space directly. In preparation for part 2, we deploy the program at an actual nameserver/resolver, gathering measurements for assessment and the final report.

# Part 2: assessment of measurement results, reporting

At this stage, the developed XDP program has been running for several months, generating data such as log entries and measurements. Based on the collected insights, the students assess if and to what extent the program has affected the performance of the DNS service. (A possible outcome could be an advice on which parameters require fine-tuning for certain use-cases or networks.)

Luuk Hendriks <luuk=>nlnetlabs.nl> and Willem Toorop <willem=>nlnetlabs.nl>
available  low
11 - Short
What are the practical implementation limits of eBPF (programs)?

eBPF (which is no longer an acronym for anything) is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules." - https://ebpf.io/what-is-ebpf

eBPF sounds like the holy grail for developing 'user space'-like applications inside kernel space in a safe manner, but what can and can't you achieve as a developer of eBPF programs?

- What categories of applications can and cannot be implemented in eBPF?
- What are technical limitations that are preventing the developer of creating an application of such a category?
- What can be done to remove this limitation?
Serge van Namen <serge.van.namen=>sue.nl> and Chris Hendriks <chris=>sue.nl>
available  low
12 - Short
What is the current security posture of eBPF and implied risk of using eBPF programs?

eBPF (which is no longer an acronym for anything) is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules." - https://ebpf.io/what-is-ebpf

- What is the current security posture?
- What are the current risks of running eBPF programs?
- What are the attack surfaces?
- What is the impact upon compromise?
- How can these programs be protected?
Serge van Namen <serge.van.namen=>sue.nl> and Chris Hendriks <chris=>sue.nl> available low
13 - Short
The security state of Kubernetes

Kubernetes is becoming more and more the 'universal controle plane' for (cloud) computing. Inherent to significant growth in a technology domain is the decision of not degrading security when migrating workloads to new technology.

- What is the current security posture of Kubernetes with regards to container runtime e.g. selinux, seccomp, etc in contrast to usability?
- What can be improved?
- How can this be improved?
- What is the impact of these improvements on the usability of Kubernetes?
Serge van Namen <serge.van.namen=>sue.nl> and Chris Hendriks <chris=>sue.nl> available  low
14 - Short
eBPF based Malware

eBPF (which is no longer an acronym for anything) is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules." - https://ebpf.io/what-is-ebpf

- What types of malware can be developed inside an eBPF program?
- How can eBPF based malware be detected?
- How can a system be hardened against eBPF based malware?
- What persistency capabilities does eBPF facilitate for malware?
Serge van Namen <serge.van.namen=>sue.nl> and Chris Hendriks <chris=>sue.nl> available  medium
15 - Long
EPI - Enabling Personalized Interventions

We propose the EPI* Framework to enable secure data sharing within the healthcare context. The framework addresses multiple concerns across different levels; namely: policy level, data level, application level, and network level. Within this project proposal, we mainly focus on the last network level. To abide by security requirements at the low level of packets, we instantiate and provision Virtualised Network Functionalities (VNF) on the fly. Moreover, we containerise said VNF for higher efficiency and easier deployment. As a result, we bridge any existing security gap between the end nodes of the data-sharing session via containerised VNF or Bridging Functions (BF’s).

The framework utilises Kubernetes to orchestrate and schedule resources to run microservices across distributed clusters of proxy nodes. The goal of this project is to evaluate the framework setup via a specific threat model, and define the best practices/ mitigations in terms of security configurations. Moreover, we aim to investigate that by simulating a number of attacks to confirm the evaluation further experimentally.

Potential questions to investigate:
1- There are a number of available threat modelling methods like: STRIDE, LINDDUN, CVSS, etc. Threat models can be software centric, attacker centric, and asset centric depending on what level of security you are investigating. With the goal of evaluating the framework in mind, how to choose the appropriate methodology to use?
2-  Based on that, what threat model to use to create a system abstraction, identify security requirements, potential vulnerabilities, and mitigations while running network-based microservices with Kubernetes? Example: key management, worker node authentication, etc.

*EPI - Enabling Personalized Interventions: https://delaat.net/epi/index.html

BF chaining and proxy implementation: https://github.com/onnovalkering/socksx
Jamila Alsayed Kassem <j.alsayedkassem=>uva.nl> available  low
16 - Long
Side-channel analysis using on-line statistics

Side-channel analysis is the art of cracking cryptographic implementations by observing unintended signals such as the algorithm’s execution time or the power consumption of a device
[https://youtu.be/OlX-p4AGhWs]

Industrial-level side-channel analysis requires lengthy signal measurements over multiple days. Acquiring and processing such a large datasets is a very demanding computational task that must be carried out within specific time constraints.

In this project we will utilize efficient online statistical computations that can pinpoint the useful part of the signal in a very large dataset. To do so efficiently, we will “drill” for useful leakage information through reinforcement learning algorithms. Our final goal is to develop an efficient processing strategy that will maximize our ability to detect and perform side-channel attacks

Statistical formulas: [https://eprint.iacr.org/2015/207.pdf]

Matlab code for statistical formulas(also available in Python): [https://github.com/kostaspap88/statistics]


1st part: study and utilize online statistics
2nd part: adaptive “drilling” for leakage
Kostas Papagiannopoulos <k.papagiannopoulos=>uva.nl> available  low