Contact: Francesco Regazzoni, Cees de Laat
Course Codes:
Research Project 1 | 53841REP6Y
Research Project 2 | 53842REP6Y
TimeLine
RP1 (January):
- Wednesday Nov 10, 10h00-10h30: Introduction to the
Research Projects.
- Wednesday Dec 8, 10h00-12h00; 14:00-16:00:
Detailed discussion on selections for RP1.
- Friday Jan 14, 24h00: research plan due.
- Tuesday Feb 8, 10h00-17h00: (updated)
Presentations RP1 and out of order RP2.
- Wednesday Feb 9, 10h00 - 17h00: (updated)
Presentations RP1 and out of order RP2.
- Sunday Feb 13, 23h59: RP - reports due
|
RP2 (June):
- Wednesday May 25 10h15-13h00, Detailed
discussion on selections for RP2 + delivery of
the
preliminary analysis of the ethical implications
of the project (one or two paragraphs max).
- Friday Jun 10 24h00: research plan due.
- Every Wednesday of June, 9:00 - 10:00: virtual BBB
room open for spontaneous questions (voluntary)
- Tuesday Jul 5, 10h00-17h00 - SP C0.110: (updated) Presentations RP1
and RP2.
- Wednesday Jul 6, 10h00-17h00 - SP C0.110:
(updated) Presentations RP1 and RP2.
- Friday Jul 8, 17h00 (updated): RP - reports
due.
|
Projects
Here is a list of student projects. New
ones are added at the end. Old and unavailable
RPs are removed, including their numbers, hence the
gaps. Remaining RPs carry over to the next year. They can be found here.
In a futile attempt to prevent spam "@" is replaced
by "=>" in the table. Color of cell background:
Project available |
Presentation received. |
Confidentiality was requested. |
Currently chosen project. |
Report
received. |
Blocked,
not available. |
Project plan received. |
Completed
project. |
Report
but no presentation |
|
|
|
title summary |
supervisor contact students |
RP |
1 / 2 |
1 |
Blockchain's Relationship
with Sovrin for Digital Self-Sovereign Identities.
Summary: Sovrin (sovrin.org) is a blockchain for
self-sovereign identities. TNO operates one of the
nodes of the Sovrin network. Sovrin enables easy
exchange and verification of identity information
(e.g. “age=18+”) for business transactions.
Potential savings are estimated to be over 1 B€
per year for just the Netherlands. However, Sovrin
provides only an underlying infrastructure.
Additional query-response protocols are needed. This
is being studied in e.g. the Techruption
Self-Sovereign-Identity-Framework (SSIF); project.
The research question is which functionalities are
needed in the protocols for this. The work includes
the development of a datamodel, as well as an
implementation that connects to the Sovrin network.
(2018-05) |
Oskar van Deventer
<oskar.vandeventer=>tno.nl>
|
|
|
2 |
Sensor data streaming
framework for Unity.
In order to build a Virtual Reality “digital
twin” of an existing technical framework (like a
smart factory), the static 3D representation needs
to “play” sensor data which either is directly
connected or comes from a stored snapshot. Although
a specific implementation of this already exists,
the student is asked to build a more generic
framework for this, which is also able to “play”
position data of parts of the infrastructure (for
example moving robots). This will enable the
research on virtually working on a digital twin
factory.
Research question:
- What are the requirements and limitations of a
seamless integration of smart factory sensor
data for a digital twin scenario?
There are existing network capabilities of Unity,
existing connectors from Unity to ROS (Robot
Operating System) for sensor data transmission and
an existing 3D model which uses position data.
The student is asked to:
- Build a generic infrastructure which can
either play live data or snapshot data.
- The sensor data will include position data,
but also other properties which are displayed in
graphs and should be visualized by 2D plots
within Unity.
The software framework will be published under an
open source license after the end of the project. |
Doris Aschenbrenner
<d.aschenbrenner=>tudelft.nl>
|
|
|
3 |
To optimize or not: on
the impact of architectural optimizations on
network performance.
Project description: Networks are becoming extremely
fast. On our testbed with 100Gbps network cards, we
can send up to 150 million packets per second
with under 1us of latency. To support such speeds,
many microarchitectural optimizations such as the
use of huge pages and direct cache placement of
network packets need to be in effect. Unfortunately,
these optimizations, if not done carefully, can
significantly harm performance or security. While
the security aspects are becoming clear [1], the
end-to-end performance impacts remain unknown. In
this project, you will investigate the performance
impacts of using huge pages and last level cache
management in high-performance networking
environments. If you were always wondering what
happens when receiving millions of packets at
nanosecond scale, this project is for you!
Requirements: C programming, knowledge of computer
architecture and operating systems internals.
[1] NetCAT: Practical Cache Attacks from the
Network, Security and Privacy 2020.
|
Animesh Trivedi <animesh.trivedi=>vu.nl>
Kaveh Razavi <kaveh=>cs.vu.nl> |
|
|
4 |
The other faces of RDMA
virtualization.
Project description: RDMA is a technology that
enables very efficient transfer of data over the
network. With 100Gbps RDMA-enabled network cards, it
is possible to send hundreds of millions of messages
with under 1us latency. Traditionally RDMA has
mostly been used in single-user setups in HPC
environments. However, recently RDMA technology has
been commoditized and used in general purpose
workloads such as key-value stores and transaction
processing. Major data centers such as Microsoft
Azure are already using this technology in their
backend services. It is not surprising that there is
now support for RDMA virtualization to make it
available to virtual machines. We would like you to
investigate the limitations of this new technology
in terms of isolation and quality of service between
different tenants.
Requirements: C programming, knowledge of computer
architecture and operating systems internals.
Supervisors: Animesh Trivedi and Kaveh Razavi, VU
Amsterdam
|
Animesh Trivedi <animesh.trivedi=>vu.nl>
Kaveh Razavi <kaveh=>cs.vu.nl>
|
|
|
5 |
Verification of Object
Location Data through Picture Data Mining
Techniques.
Shadows in the open reveal information about
the location of the objects in a picture.
From the position, length, and direction
of a shadow, the location
information found in the metadata of a picture can
be verified. The objective of this project is to
develop algorithms that find freely available
images on the internet where the
location data has been tampered with. The deliverables
from this project are the location verification
algorithms, a live web service that verifies the
location information of the object, and a non-public
facing database that contains information about
images that had the location information in their
meta-data, removed or falsely altered.
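One such algorithm can start from a simple geometric consistency check: a vertical object's shadow length follows from the sun's elevation at the claimed place and time. A minimal sketch in Python, where the function names and the 15% tolerance are assumptions, not part of the project description:

```python
import math

def expected_shadow_length(object_height_m, sun_elevation_deg):
    """Length of the shadow cast by a vertical object for a given sun elevation."""
    return object_height_m / math.tan(math.radians(sun_elevation_deg))

def location_consistent(object_height_m, measured_shadow_m,
                        sun_elevation_deg, tolerance=0.15):
    """Flag metadata as suspect when the measured shadow deviates too much
    from what the claimed location/time (via sun elevation) predicts."""
    expected = expected_shadow_length(object_height_m, sun_elevation_deg)
    return abs(measured_shadow_m - expected) / expected <= tolerance

# A 2 m pole under a sun at 45 degrees elevation should cast a ~2 m shadow.
print(location_consistent(2.0, 2.0, 45.0))  # True
print(location_consistent(2.0, 6.0, 45.0))  # False: shadow far too long
```

A real implementation would derive the sun's elevation and azimuth from the photo's GPS coordinates and timestamp, and estimate object and shadow lengths from the image itself.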
|
Junaid Chaudhry <chaudhry=>ieee.org>
|
|
|
6
|
Artificial
Intelligence Assisted carving.
Problem Description:
Carving for data and locating files belonging to
the principal can be hard if we only use keywords. This
still requires a lot of manual work to create
keyword lists, which might not even be sufficient to
find what we are looking for.
Goal:
- Create a simple framework to detect documents
of a certain set (or company) within carved data
by utilizing machine learning. Closely related
to document identification.
The research project below is currently the only
open project at our Forensics department rated at
MSc level. Of course, if your students have any
ideas for a cybersecurity/forensics related project
they are always welcome to contact us.
|
Danny Kielman <danny.kielman=>fox-it.com>
Mattijs Dijkstra
<mattijs.dijkstra=>fox-it.com> |
|
|
7
|
Usage
Control in the inter data spaces data exchange.
Data Spaces is a new concept and a model for
organising and managing data in domain specific data
ecosystems. Data spaces include
technical/infrastructure aspects, semantic aspects,
organisational/governance aspects, and legal
frameworks. Data exchange and data processing are
main technical activities in data spaces that may be
organised in a data workflow that can span over
multiple domains and systems. Important aspects in
managing data workflows include access policy
enforcement and usage control that defines both
enforcement of the data usage policy (i.e. allowed
uses and actions) and recording of all activities on
data.
The thesis will involve the following steps:
- Propose an architecture based on IDS for a
selected use case incorporating the enforcement of
usage control policies
- Implement the architecture and evaluate its
performance.
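As a rough illustration of usage control in the spirit of the sticky policies of [4], the sketch below attaches an allowed-actions policy and an audit log to a piece of data; the class layout and all names are invented for illustration and are not part of the IDS specifications:

```python
from dataclasses import dataclass, field

@dataclass
class StickyPolicy:
    """Usage policy travelling with the data: allowed actions per consumer,
    plus an audit log recording every attempted use (UCON-style)."""
    allowed: dict                      # consumer -> set of permitted actions
    log: list = field(default_factory=list)

    def authorize(self, consumer: str, action: str) -> bool:
        ok = action in self.allowed.get(consumer, set())
        self.log.append((consumer, action, "permit" if ok else "deny"))
        return ok

policy = StickyPolicy(allowed={"analytics-connector": {"read", "aggregate"}})
print(policy.authorize("analytics-connector", "read"))   # True
print(policy.authorize("analytics-connector", "export")) # False, but logged
print(len(policy.log))                                   # 2
```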
References
[1] IDS Connector Architecture https://www.dataspaces.fraunhofer.de/en/software/connector.html
[2] IDS Connector Framework https://github.com/International-Data-Spaces-Association/IDS-Connector-Framework
https://www.dataspaces.fraunhofer.de/en/software/connector.html
[3] Jaehong Park, Ravi S. Sandhu: The UCONABC usage
control model. ACM Trans. Inf. Syst. Secur. 7(1):
128-174 (2004)
[4] Slim Trabelsi, Jakub Sendor: "Sticky policies
for data control in the cloud" PST 2012: 75-80
|
Yuri Demchenko <y.demchenko=>uva.nl> |
|
|
8
|
Security of embedded
technology.
Analyzing the security of embedded technology, which
operates in an ever changing environment, is
Riscure's primary business. Therefore, research and
development (R&D) is of utmost importance for
Riscure to stay relevant. The R&D conducted at
Riscure focuses on four domains: software, hardware,
fault injection and side-channel analysis. Potential
SNE Master projects can be shaped around the topics
of any of these fields. We would like to invite
interested students to discuss a potential Research
Project at Riscure in any of the mentioned fields.
Projects will be shaped according to the
requirements of the SNE Master.
Please have a look at our website for more
information: https://www.riscure.com
Previous Research Projects conducted by SNE
students:
- https://www.os3.nl/_media/2013-2014/courses/rp1/p67_report.pdf
- https://www.os3.nl/_media/2011-2012/courses/rp2/p61_report.pdf
- http://rp.os3.nl/2014-2015/p48/report.pdf
- https://www.os3.nl/_media/2011-2012/courses/rp2/p19_report.pdf
If you want to see what the atmosphere is at
Riscure, please have a look at: https://vimeo.com/78065043
Please let us know if you have any additional
questions! |
Ronan Loftus <loftus=>riscure.com>
Alexandru Geana <Geana=>riscure.com>
Karolina Mrozek <Mrozek=>riscure.com>
Dana Geist <geist=>riscure.com>
|
|
|
9 |
Cross-blockchain oracle.
Interconnection between different blockchain
instances, and smart contracts residing on those,
will be essential for a thriving multi-blockchain
business ecosystem. Technologies like hashed
timelock contracts (HTLC) enable atomic swaps of
cryptocurrencies and tokens between blockchains. A
next challenge is the cross-blockchain oracle, where
the status of an oracle value on one blockchain
enables or prevents a transaction on another
blockchain.
The goal of this research project is to explore the
possibilities, impossibilities, trust assumptions,
security and options for a cross-blockchain oracle,
as well as to provide a minimal viable
implementation.
(2018-05)
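The hashed timelock contracts (HTLC) mentioned above can be modelled in a few lines. This toy Python sketch (not real chain code; the class and its fields are invented for illustration) shows the claim-with-preimage-before-timeout rule:

```python
import hashlib
import time

class HashedTimelockContract:
    """Toy HTLC: funds are claimable with the hash preimage before the
    timeout; afterwards they would be refundable to the sender."""
    def __init__(self, hashlock_hex, timeout_ts):
        self.hashlock = hashlock_hex
        self.timeout = timeout_ts
        self.claimed = False

    def claim(self, preimage: bytes, now=None) -> bool:
        now = now if now is not None else time.time()
        ok = (not self.claimed
              and now < self.timeout
              and hashlib.sha256(preimage).hexdigest() == self.hashlock)
        self.claimed = self.claimed or ok
        return ok

secret = b"my-atomic-swap-secret"
htlc = HashedTimelockContract(hashlib.sha256(secret).hexdigest(),
                              timeout_ts=time.time() + 3600)
print(htlc.claim(b"wrong-secret"))  # False
print(htlc.claim(secret))           # True
```

A cross-blockchain oracle adds the harder part: making the enabling condition depend on state recorded on a different chain.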
|
Oskar van Deventer
<oskar.vandeventer=>tno.nl>
Maarten Everts <maarten.everts=>tno.nl> |
|
|
24
|
Network aware performance
optimization for Big Data applications using
coflows.
Optimizing data transmission is crucial to improve
the performance of data intensive applications. In
many cases, network traffic control plays a key role
in optimising data transmission, especially when data
volumes are very large. Often,
data-intensive jobs can be divided into multiple
successive computation stages, e.g., in MapReduce
type jobs. A computation stage relies on the outputs
of the previous stage and cannot start until all
its required inputs are in place. Inter-stage data
transfer involves a group of parallel flows, which
share the same performance goal such as minimising
the flow's completion time.
CoFlow is an application-aware network control model
for cluster-based data centric computing. The CoFlow
framework is able to schedule the network usage
based on the abstract application data flows (called
coflows). However, customizing CoFlow for different
application patterns, e.g., choosing proper network
scheduling strategies, is often difficult, in
particular when the high level job scheduling tools
have their own optimizing strategies.
The project aims to profile the behavior of CoFlow
with different computing platforms, e.g., Hadoop and
Spark etc.
- Review the existing CoFlow scheduling
strategies and related work
- Prototyping test applications using big data
platforms (including Apache Hadoop, Spark, Hive,
Tez).
- Set up coflow test bed (Aalo, Varys etc.)
using existing CoFlow installations.
- Benchmark the behavior of CoFlow in different
application patterns, and characterise the
behavior.
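One quantity such a benchmark will report is the coflow completion time (CCT). A coflow cannot finish faster than its most heavily loaded port allows; a minimal sketch of that bottleneck bound, with made-up link speed and flow sizes:

```python
def coflow_completion_time(flows, link_gbps=10.0):
    """Lower bound on a coflow's completion time: the bottleneck port
    (sender or receiver) that must move the most data.
    flows: list of (sender, receiver, gigabits) tuples."""
    load = {}
    for src, dst, gbits in flows:
        load[("out", src)] = load.get(("out", src), 0.0) + gbits
        load[("in", dst)] = load.get(("in", dst), 0.0) + gbits
    return max(load.values()) / link_gbps

# Shuffle stage: mapper m1 sends 8 Gb to each of two reducers.
cct = coflow_completion_time([("m1", "r1", 8.0), ("m1", "r2", 8.0)])
print(cct)  # 1.6 -- m1's outgoing port carries 16 Gb at 10 Gb/s
```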
Background reading:
- CoFlow introduction: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-211.pdf
- Junchao Wang, Huan Zhou, Yang Hu, Cees de
Laat and Zhiming Zhao, Deadline-Aware Coflow
Scheduling in a DAG, in NetCloud 2017, Hong Kong,
to appear [upon request]
More info: Junchao Wang, Spiros Koulouzis, Zhiming
Zhao |
Zhiming Zhao <z.zhao=>uva.nl> |
|
|
10 |
Elastic data services for
time critical distributed workflows.
Large-scale observations over extended periods of
time are necessary for constructing and validating
models of the environment. Therefore, it is
necessary to provide advanced computational
networked infrastructure for transporting large
datasets and performing data-intensive processing.
Data infrastructures manage the lifecycle of
observation data and provide services for users and
workflows to discover, subscribe and obtain data for
different application purposes. In many cases,
applications have high performance requirements,
e.g., disaster early warning systems.
This project focuses on data aggregation and
processing use-cases from European research
infrastructures, and investigates how to optimise
infrastructures to meet critical time requirements
of data services, in particular for different
patterns of data-intensive workflow. The student
will use some initial software components [1]
developed in the ENVRIPLUS [2] and SWITCH [3]
projects, and will:
- Model the time constraints for the data
services and the characteristics of data access
patterns found in given use cases.
- Review the state of the art technologies for
optimising virtual infrastructures.
- Propose and prototype an elastic data service
solution based on a number of selected workflow
patterns.
- Evaluate the results using a use case provided
by an environmental research infrastructure.
Reference:
- https://staff.fnwi.uva.nl/z.zhao/software/drip/
- http://www.envriplus.eu
- http://www.switchproject.eu
More info: Spiros Koulouzis, Paul Martin, Zhiming
Zhao |
Zhiming Zhao <z.zhao=>uva.nl>
|
|
|
11 |
Contextual information
capture and analysis in data provenance.
Tracking the history of events and the evolution of
data plays a crucial role in data-centric
applications for ensuring reproducibility of
results, diagnosing faults, and performing
optimisation of data-flow. Data provenance systems
[1] are a typical solution, capturing and recording
the events generated in the course of a process
workflow using contextual metadata, and providing
querying and visualisation tools for use in
analysing such events later.
Conceptual models such as W3C PROV (and extensions
such as ProvONE), OPM and CERIF have been proposed
to describe data provenance, and a number of
different solutions have been developed. Choosing a
suitable provenance solution for a given workflow
system or data infrastructure requires consideration
of not only the high-level workflow or data
pipeline, but also performance issues such as the
overhead of event capture and the volume of
provenance data generated.
The project will be conducted in the context of EU
H2020 ENVRIPLUS project [1, 2]. The goal of this
project is to provide practical guidelines for
choosing provenance solutions. This entails:
- Reviewing the state of the art for provenance
systems.
- Prototyping sample workflows that demonstrate
selected provenance models.
- Benchmarking the results of sample workflows,
and defining guidelines for choosing between
different provenance solutions (considering
metadata, logging, analytics, etc.).
References:
- About project: http://www.envriplus.eu
- Provenance background in ENVRIPLUS: https://surfdrive.surf.nl/files/index.php/s/uRa1AdyURMtYxbb
- Michael Gerhards, Volker Sander, Torsten
Matzerath, Adam Belloum, Dmitry Vasunin, and
Ammar Benabdelkader. 2011. Provenance
opportunities for WS-VLAM: an exploration of an
e-science and an e-business approach. In
Proceedings of the 6th workshop on Workflows in
support of large-scale science (WORKS '11). http://dx.doi.org/10.1145/2110497.2110505
More info: Zhiming Zhao, Adam Belloum, Paul Martin
|
Zhiming Zhao
<z.zhao=>uva.nl>
Rik Janssen <Rik.Janssen=>os3 |
|
2
|
12 |
Profiling Partitioning
Mechanisms for Graphs with Different
Characteristics.
In computer systems, graphs are an important model for
describing many things, such as workflows, virtual
infrastructures, ontological models, etc. Partitioning
is a frequently used graph operation in contexts
like parallelizing workflow execution,
mapping networked infrastructures onto distributed
data centers [1], and controlling the load balance of
resources. However, developing an effective
partitioning solution is often not easy; it is often a
complex optimization problem involving constraints
like system performance and cost.
A comprehensive benchmark of graph partitioning
mechanisms is helpful for choosing a partitioning
solver for a specific model. This portfolio can also
give advice on how to partition based on the
characteristics of the graph. This project aims at
benchmarking the existing partition algorithms for
graphs with different characteristics, and profiling
their applicability for specific types of graphs.
This project will be conducted in the context of the
EU SWITCH [2] project. The students will:
- Review the state of the art of the graph
partitioning algorithms and related tools, such
as Chaco, METIS and KaHIP, etc.
- Investigate how to define the characteristics
of a graph, such as sparse graph, skewed graph,
etc. This can also be discussed with different
graph models, like planar graph, DAG,
hypergraph, etc.
- Build a benchmark for different types of
graphs with various partitioning mechanisms and
find the relationships behind them.
- Discuss how to choose a partitioning
mechanism based on the graph characteristics.
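Two of the quantities such a benchmark would record for every (graph, partitioner) pair are the edge-cut and the balance of the resulting partition; a minimal Python sketch of both metrics:

```python
def edge_cut(edges, partition):
    """Number of edges whose endpoints land in different blocks.
    edges: iterable of (u, v); partition: dict node -> block id."""
    return sum(1 for u, v in edges if partition[u] != partition[v])

def balance(partition, num_blocks):
    """Largest block size divided by the ideal block size (1.0 = perfect)."""
    sizes = [0] * num_blocks
    for block in partition.values():
        sizes[block] += 1
    return max(sizes) / (len(partition) / num_blocks)

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
part = {0: 0, 1: 0, 2: 1, 3: 1}
print(edge_cut(edges, part))   # 3 cut edges: (1,2), (3,0), (0,2)
print(balance(part, 2))        # 1.0 -- both blocks hold two nodes
```

Tools like METIS and KaHIP minimise the edge-cut subject to a balance constraint; the benchmark would compare how well they do so across graph classes.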
Reading material:
- Zhou, H., Hu Y., Wang, J., Martin, P., de
Laat, C. and Zhao, Z., (2016) Fast and Dynamic
Resource Provisioning for Quality Critical Cloud
Applications, IEEE International Symposium On
Real-time Computing (ISORC) 2016, York UK http://dx.doi.org/10.1109/ISORC.2016.22
- SWITCH: www.switchproject.eu
More info: Huan Zhou, Arie Taal, Zhiming Zhao
|
Zhiming Zhao <z.zhao=>uva.nl> |
|
|
13
|
Auto-Tuning for GPU
Pipelines and Fused Kernels.
Achieving high performance on many-core accelerators
is a complex task, even for experienced programmers.
This task is made even more challenging by the fact
that, to achieve high performance, code optimization
is not enough, and auto-tuning is often necessary.
The reason for this is that computational kernels
running on many-core accelerators need ad-hoc
configurations that are a function of kernel, input,
and accelerator characteristics to achieve high
performance. However, tuning kernels in isolation
may not be the best strategy for all scenarios.
Imagine having a pipeline that is composed of a
certain number of computational kernels. You can
tune each of these kernels in isolation, and find
the optimal configuration for each of them. Then you
can use these configurations in the pipeline, and
achieve some level of performance. But these kernels
may depend on each other, and may also influence
each other. What if the choice of a certain memory
layout for one kernel causes performance degradation
on another kernel?
One of the existing optimization strategies to deal
with pipelines is to fuse kernels together, to
simplify execution patterns and decrease overhead.
In this project we aim to measure the performance of
accelerated pipelines in three different tuning
scenarios:
- tuning each component in isolation,
- tuning the pipeline as a whole, and
- tuning the fused kernel.
By measuring the performance of one or more
pipelines in these scenarios we hope, on one
level, to determine the best strategy for the
specific pipelines on different hardware
platforms, and, on another level, to better
understand which characteristics
influence this behavior.
|
Rob van Nieuwpoort
<R.vanNieuwpoort=>uva.nl>
|
|
|
14
|
Auto-tuning for Power
Efficiency.
Auto-tuning is a well-known optimization technique
in computer science. It has been used to ease the
manual optimization process that is traditionally
performed by programmers, and to maximize the
performance portability. Auto-tuning works by just
executing the code that has to be tuned many times
on a small problem set, with different tuning
parameters. The best performing version is then
used for the real problems. Tuning can
be done with application-specific parameters
(different algorithms, granularity, convergence
heuristics, etc) or platform parameters (number of
parallel threads used, compiler flags, etc).
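The tuning loop described above, which tries every configuration on a small problem and keeps the best one, can be sketched as follows; the fake_energy function is an invented stand-in for a real power measurement of one kernel run:

```python
import itertools

def auto_tune(param_space, measure):
    """Try every combination of tuning parameters and return the
    configuration with the lowest measured cost."""
    best_cfg, best_cost = None, float("inf")
    for values in itertools.product(*param_space.values()):
        cfg = dict(zip(param_space.keys(), values))
        cost = measure(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost

# Invented stand-in: "energy" modelled as a function of the configuration.
def fake_energy(cfg):
    return abs(cfg["threads"] - 128) + abs(cfg["mem_clock"] - 3000) / 100

space = {"threads": [32, 64, 128, 256], "mem_clock": [2000, 3000, 4000]}
cfg, cost = auto_tune(space, fake_energy)
print(cfg)  # {'threads': 128, 'mem_clock': 3000}
```

For power tuning, the measure function would read an energy counter around the kernel launch instead of a wall clock.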
For this project, we apply auto-tuning on GPUs. We
have several GPU applications where the absolute
performance is not the most important bottleneck for
the application in the real world. Instead the power
dissipation of the total system is critical. This
can be due to the enormous scale of the application,
or because the application must run in an embedded
device. An example of the first is the Square
Kilometre Array, a large radio telescope that
currently is under construction. With current
technology, it will need more power than all of the
Netherlands combined. In embedded systems, power
usage can be critical as well. For instance, we have
GPU codes that make images for radar systems in
drones. The weight and power limitations are an
important bottleneck (batteries are heavy).
In this project, we use power dissipation as the
evaluation function for the auto-tuning system.
Earlier work by others investigated this, but only
for a single compute-bound application. However,
many realistic applications are memory-bound. This
is a problem, because loading a value from the L1
cache can already take 7-15x more energy than an
instruction that only performs a computation (e.g.,
multiply).
There also are interesting platform parameters than
can be changed in this context. It is possible to
change both core and memory clock frequencies, for
instance. It will be interesting to see if we can,
at runtime, achieve the optimal balance between these
frequencies.
We want to perform auto-tuning on a set of GPU
benchmark applications that we developed. |
Rob van Nieuwpoort
<R.vanNieuwpoort=>uva.nl> |
|
|
15
|
Applying and Generalizing
Data Locality Abstractions for Parallel Programs.
TIDA is a library for high-level programming of
parallel applications, focusing on data locality.
TIDA has been shown to work well for grid-based
operations, like stencils and convolutions. These
are an important building block for many
simulations in astrophysics, climate simulations and
water management, for instance. The TIDA paper gives
more details on the programming model.
This project aims to achieve several things and
answer several research questions:
TIDA currently only works with up to 3D. In many
applications we have, higher dimensionalities are
needed. Can we generalize the model to N dimensions?
The model currently only supports a two-level
hierarchy of data locality. However, modern memory
systems often have many more levels, both on CPUs
and GPUs (e.g., L1, L2 and L3 cache, main memory,
memory banks coupled to a different core, etc). Can
we generalize the model to support N-level memory
hierarchies?
The current implementation only works on CPUs; can
we generalize to GPUs as well?
Given the above generalizations, can we still
implement the model efficiently? How should we
perform the mapping from the abstract hierarchical
model to a real physical memory system?
We want to test the new extended model on a real
application. We have examples available in many
domains. The student can pick one that is of
interest to her/him. |
Rob van Nieuwpoort
<R.vanNieuwpoort=>uva.nl> |
|
|
16
|
Ethereum Smart Contract
Fuzz Testing.
An Ethereum smart contract can be seen as a computer
program that runs on the Ethereum Virtual Machine
(EVM), with the ability to accept, hold and transfer
funds programmatically. Once a smart contract has
been placed on the blockchain, it can be executed by
anyone. Furthermore, many smart contracts accept
user input. Because smart contracts operate on a
cryptocurrency with real value, security of smart
contracts is of the utmost importance. I would like
to create a smart contract fuzzer that will check
for unexpected behaviour or crashes of the EVM.
Based on preliminary research, such a fuzzer does
not exist yet.
|
Rodrigo Marcos
<rodrigo.marcos=>secforce.com>
|
|
|
17
|
Smart contracts specified
as contracts.
Developing a distributed state of mind: from control
flow to control structure
The concepts of control flow, of data structure, as
well as that of data flow are well established in
the computational literature; in contrast, one can
find different definitions of control structures,
and typically these are not associated to the common
use of the term, referring to the power
relationships holding in society or in
organizations.
The goal of this project is the design and
development of a social architecture language that
cross-compiles to a modern concurrent programming
language (Rust, Go, or Scala), in order to make
explicit a multi-threaded, distributed state of
mind, following results obtained in agent-based
programming. The starting point will be a minimal
language subset of AgentSpeak(L).
Potential applications: controlled machine learning
for Responsible AI, control of distributed
computation |
Giovanni Sileno <G.Sileno=>uva.nl>
Mostafa Mohajeriparizi
<m.mohajeriparizi=>uva.nl> |
|
|
18
|
Zero Trust Validation
ON2IT advocates the Zero Trust Validation
conceptual strategy [1] to strengthen
information security at the architectural level.
Zero Trust is often mistakenly perceived as an
architectural approach. However, it is, in the end,
a strategic approach towards protecting assets
regardless of location. To enable this approach,
controls are needed to provide sufficient insight
(visibility), to exert control, and to provide
operational feedback. However, these controls/probes
are not naturally available in all environments.
Finding ways to embed such controls, and
finding/applying them, can be challenging,
especially in the context of containerized, cloud
and virtualized workflows.
At the strategic level, Zero Trust is not
sufficiently perceived as a value contributor. At
the managerial level, it is perceived mainly as an
architectural ‘toy’. This makes it hard to
translate a Zero Trust strategic approach to the
operational level; there’s a lack of overall
coherence. For this reason, ON2IT developed a Zero
Trust Readiness Assessment framework which
facilitates testing the readiness level on three
levels: governance, management and operations.
Research (sub)questions that emerge:
- What is missing in the current approach of ZTA
to make it resonate with the board?
- What are Critical Success Factors for
drafting and implementing ZTA?
- What is an easy to consume capability
maturity or readiness model for the adoption
of ZTA that guides boards and management
teams in making the right decisions?
- What does a management portal with
associated KPIs need to offer in order to
enable board and management to manage and
monitor the ZTA implementation process and
take appropriate ownership?
- How do we add the necessary controls and
efficiently leverage the control and monitoring
facilities they provide?
- Zero Trust Validation
- "On Exploring Research
Methods for Business Information Security
Alignment and Artefact Engineering" by Yuri
Bobbert, University of Antwerp
|
Jeroen Scheerder
<Jeroen.Scheerder=>on2it.net>
|
|
|
19
|
OSINT Washing Street.
At the moment more and more OSINT is available via
all kinds of sources; a lot of them are legitimate
services that are abused by malicious actors. Examples are
GitHub, Pastebin, Twitter, etc. If you look at
Pastebin data you might find IOCs/TTPs, but usually
the payloads are delivered in many stages, so it is
important to have a system that follows the path
until it finds the real payload. The question here
is how you can build a generic pipeline that
unravels data like a matryoshka doll: no matter
the input, the pipeline will try to decode, query or
perform whatever relevant action is needed.
This would result in better insight into the later
stages of an attack. An example of a framework using
this method is Stoq
(https://github.com/PUNCH-Cyber/stoq), but it
lacks research into usability and into whether its
results add value compared to other OSINT sources. |
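The matryoshka-style unravelling boils down to applying decoders repeatedly until none fits. A minimal sketch with only a base64 decoder (a real pipeline such as Stoq chains many more decoders and scanners):

```python
import base64

def unravel(blob: bytes, max_depth=10):
    """Repeatedly try known decodings until none applies, matryoshka-style.
    Returns every layer, outermost first."""
    layers = [blob]
    for _ in range(max_depth):
        current = layers[-1]
        try:
            # validate=True rejects input with non-base64 characters
            decoded = base64.b64decode(current, validate=True)
        except ValueError:
            break
        if decoded == current:
            break
        layers.append(decoded)
    return layers

payload = base64.b64encode(base64.b64encode(b"real payload"))
print(unravel(payload)[-1])  # b'real payload'
```

A generic pipeline would try several decoders (base64, hex, zlib, URL encoding, ...) per layer and dispatch each recovered layer to scanners for IOC extraction.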
Joao Novaismarques
<joao.novaismarques=>kpn.com> |
|
|
20 |
Building an
open-source, flexible, large-scale static code
analyzer.
Background information
Data drives business, and maybe even the world.
Businesses that make it their business to gather
data are often aggregators of client-side generated
data. Client-side generated data, however, is
inherently untrustworthy. Malicious users can
construct their data to exploit careless, or naive,
programming and use this malicious, untrusted data
to steal information or even take over systems.
It is no surprise that large companies such as
Google, Facebook and Yahoo spend considerable
resources in securing their own systems against
would-be attackers. Generally, many methods have
been developed to make untrusted data cross the
trust boundary to trusted data, and effectively make
malicious data harmless. However, securing your
systems against malicious data often requires
expertise beyond what even skilled programmers might
reasonably possess.
Problem description
Ideally, tools that analyze code for vulnerabilities
would be used to detect common security issues. Such
tools, or static code analyzers, exist, but are
either outdated
(http://rips-scanner.sourceforge.net/) or part of
very expensive commercial packages
(https://www.checkmarx.com/ and
http://armorize.com/). Next to the need for an
open-source alternative to the previously mentioned
tools, we also need to look at increasing our scope.
Rather than focusing on a single codebase, the tool
would ideally be able to scan many remote, large
scale repositories and report the findings back in
an easily accessible way.
An interesting target for this research would be
very popular, open source (at this stage) Content
Management Systems (CMSs), and specifically plugins
created for these CMSs. CMS cores are held to a very
high coding standard and are often relatively
secure. Plugins, however, are often less so,
but are generally as popular as the CMSs they are
created for. This is problematic, because an
insecure plugin is as dangerous as an insecure CMS.
Experienced programmers and security experts
generally audit the most popular plugins, but this
is: a) very time-intensive, b) prone to errors and c)
of limited scope, i.e. not every plugin can be
audited. For example, if it was feasible to audit
all aspects of a CMS repository (CMS core and
plugins), the DigiNotar debacle could have easily
been avoided.
Research proposal
Your research would consist of extending our proof
of concept static code analyzer written in Python
and using it to scan code repositories, possibly of
some major CMSs and their plugins, for security
issues and finding innovative ways of reporting on
the massive amount of possible issues you are sure
to find. Help others keep our data that little bit
safer. |
Patrick Jagusiak
<patrick.jagusiak=>dongit.nl>
Wouter van Dongen
<wouter.vandongen=>dongit.nl>
|
|
|
21
|
Developing a Distributed
State of Mind.
A system required to be autonomous needs to be more
than just a computational black box that produces a
set of outputs from a set of inputs. Interpreted as
an agent provided with (some degree of) rationality,
it should act based on desires, goals and internal
knowledge for justifying its decisions. One could
then imagine a software agent much like a human
being or a human group, with multiple parallel
threads of thought and consideration which more
often than not are in conflict with each other. This
distributed view contrasts with the centralized
view common in agent-based programming, and opens up
potential cross-fertilization with distributed
computing applications that remain largely
unexplored.
The goal of this project is the design and
development of an efficient agent architecture in a
modern concurrent programming language (Rust, Go, or
Scala), in order to make explicit a multi-threaded,
distributed state of mind. |
Giovanni Sileno
<G.Sileno=>uva.nl>
Mostafa Mohajeriparizi
<m.mohajeriparizi=>uva.nl>
|
|
|
22
|
Development of a control
framework to guarantee the security of a
collaborative open-source project.
We’re now living in an information society, and
everyone is expecting to be able to find everything
on the Web. IT developers are no exception and
spend a large part of their working hours searching
for and reusing pieces of code found on public
repositories (e.g. GitHub, GitLab …) or web forums
(e.g. StackOverflow).
The use of open-source software has long been seen
as a secure alternative as the code is available for
review by everyone, and as a result, bugs and
vulnerabilities should be found and fixed more easily.
Multiple incidents related to the use of Open-source
software (NPM, Gentoo, Homebrew) have shown that the
greater security of open-source components turned
out to be theoretical.
This research aims to highlight the root causes of
major recent incidents related to open-source
collaborative projects, as well as to propose a
global open-source security framework that could
address those issues.
References:
|
Huub van Wieren <vanWieren.Huub=>kpmg.nl>
|
|
|
23
|
Security of IoT
communication protocols on the AWS platform.
In January 2020, Jason and Hoang from the OS3 master
worked on the project “Security Evaluation on
Amazon Web Services’ REST API Authentication
Protocol Signature Version 4”[1]. This project has
shown the resilience of the Sigv4 authentication
mechanism for HTTP protocol communications.
In June 2017, AWS released a service called AWS
Greengrass[2] that can be used as an intermediate
server for low connectivity devices running AWS IoT
SDK[3]. This is an interesting configuration, as it
makes it possible to further challenge Sigv4
authentication in a disconnected environment using
the MQTT protocol.
Reference:
- https://homepages.staff.os3.nl/~delaat/rp/2019-2020/p65/report.pdf
- https://docs.aws.amazon.com/greengrass/latest/developerguide/what-is-gg.html
- https://github.com/aws/aws-iot-device-sdk-python
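For reference, the Sigv4 signing-key derivation evaluated in [1] can be sketched in Python with the standard library (function names are ours; canonical-request construction and the string-to-sign are omitted here).

```python
import hashlib
import hmac

# Sketch of the SigV4 signing-key derivation: the documented HMAC-SHA256
# chain that scopes a long-term secret key to a date, region and service.
def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """date is YYYYMMDD; each HMAC step narrows the key's scope."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")

def sign(string_to_sign: str, signing_key: bytes) -> str:
    """Final signature: hex HMAC of the string-to-sign under the derived key."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

The same derivation applies whether the transport is HTTPS or (as in the Greengrass setup) MQTT, which is what makes the disconnected scenario comparable to [1].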
|
Huub van Wieren
<vanWieren.Huub=>kpmg.nl>
|
|
|
25
|
Version
management of project files in ICS.
Research in Industrial Control Systems: It is
difficult to have proper version management of the
project files as they usually are stored offline. We
would like to come up with a solution to backup and
store project files in real time on a server, with
the capability to revert to, or take snapshots of,
the versions used. Something like
Puppet/Chef/Ansible, but for ICS.
|
<mvanveen=>deloitte.nl>
|
|
|
26
|
Future tooling
and cyber defense strategy for ICS.
Research in Industrial Control Systems: Is zero
trust networking possible in ICS? This is one of the
questions we are wondering about to sharpen our
vision and story around where ICS security is going
and which solutions are emerging. |
Pavlos Lontorfos
<plontorfos=>deloitte.nl>
Dominika Rusek-Jonkers
<drusek=>deloitte.nl>
Leroy van der Steenhoven
<lsteenhoven=>os3.nl>
|
|
1
|
27
|
End-to-end
encryption for browser-based meeting technologies.
Investigating the possibilities and limitations of
end-to-end encrypted browser-based video
conferencing, with a specific focus on security and
preserving privacy.
- What are possible approaches?
- How would they compare to each other?
|
Jan Freudenreich
<jfreudenreich=>deloitte.nl>
|
|
|
38
|
Evaluation of
the Jitsi Meet approach for end-to-end encrypted
browser-based video conferencing.
Determining the security of the library,
implementation and the environment setup.
|
Jan Freudenreich
<jfreudenreich=>deloitte.nl>
|
|
|
31
|
Approximate
computing and side channels.
Approximate computing is an emerging computing
paradigm where the precision of the computation is
traded against other metrics such as energy
consumption or performance. This paradigm has been
shown to be effective in various applications,
including machine learning and video streaming.
However, the effects of approximate computing on
security are still unknown. This project
investigates the effects of the approximate
computing paradigm on side-channel attacks.
The specific use case considered here is the
exploration of the resistance against power analysis
attacks of devices when classical techniques used in
the approximate computing paradigm to reduce the
energy consumption (such as voltage scaling) are
applied. The research will address the following
challenges:
- Selection of the most appropriate techniques
for energy saving among the ones used in the
approximate computing paradigm
- Realization of a number of simple
cryptographic benchmarks in an HDL (VHDL or
Verilog)
- Simulation of the power consumption in the
different scenarios
- Evaluation of the side channel resistance of
each
This thesis is in collaboration with University of
Stuttgart (Prof. Ilia Polian) |
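As background for the side-channel evaluation step, here is a toy correlation power analysis (CPA) in Python; it is a stand-in for the HDL/simulation flow described above, and the S-box and noise model are assumptions, with the `noise` parameter loosely mimicking the SNR change that voltage scaling would cause.

```python
import random

# Toy CPA sketch: each simulated trace sample is the Hamming weight of an
# S-box output plus Gaussian noise; the key byte is recovered by Pearson
# correlation between hypothesized leakage and the traces.
_rng = random.Random(1)
SBOX = list(range(256))
_rng.shuffle(SBOX)                      # random permutation as a stand-in S-box

def hw(v: int) -> int:
    return bin(v).count("1")

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def simulate_traces(key, n, noise=1.0, seed=42):
    rng = random.Random(seed)
    plaintexts = [rng.randrange(256) for _ in range(n)]
    traces = [hw(SBOX[p ^ key]) + rng.gauss(0, noise) for p in plaintexts]
    return plaintexts, traces

def cpa_recover(plaintexts, traces):
    def score(k):
        return abs(pearson([hw(SBOX[p ^ k]) for p in plaintexts], traces))
    return max(range(256), key=score)
```

Increasing `noise` (lower supply voltage tends to mean lower leakage amplitude relative to noise) raises the number of traces needed for `cpa_recover` to succeed, which is exactly the kind of resistance metric the project would measure on simulated power traces.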
Francesco Regazzoni
<f.regazzoni=>uva.nl>
Steef van Wooning (swooning=>os3.nl)
Brice Habets (bhabets=>os3.nl)
|
|
1
|
32
|
Decentralize a
legacy application using blockchain: a crowd
journalism case study.
Blockchain technologies have demonstrated huge
potential for application developers and operators
to improve service trustworthiness, e.g., in
logistics, finance and provenance. Migrating a
centralized application to a decentralized paradigm
often requires not only a conceptual re-design of
the application architecture, but also a profound
understanding of how the business logic technically
integrates with the blockchain technologies. In this
project, we will use a social network application
(crowd journalism) as a test case to investigate the
integration possibilities between a legacy system
and the blockchain. Key activities in the project:
- investigate the integration possibilities
between social network application and
permissioned blockchain technologies,
- make a rapid prototype to demonstrate the
feasibility, and
- assess the operational cost of blockchain
services.
The software of the crowd journalism use case will
be provided by an SME partner of the EU ARTICONF
project.
References: http://www.articonf.eu
|
Zhiming Zhao <z.zhao=>uva.nl>
|
|
|
33
|
Location aware
data processing in the cloud environment.
Data-intensive applications are often workflows
involving distributed data sources and services.
When the data volumes are very large, especially
with different access constraints, the workflow
system has to decide suitable locations to process
the data and to deliver the results. In this
project, we perform a case study of eco-LiDAR data
from different European countries; the processing
will be done using the test bed offered by the
European Open Science Cloud. The project will
investigate data location aware scheduling
strategies, and service automation technologies for
workflow execution. The data processing pipeline and
data sources in the use case will be provided by
partners in the EU Lifewatch, and the test bed will
be provided by the European Open Science Cloud
early adopter program. |
Zhiming Zhao <z.zhao=>uva.nl> |
|
|
34
|
Trust
bootstrapping for secure data exchange
infrastructure provisioned on demand.
Data exchange in a data market requires more than
just an end-to-end secure connection, which is
already well supported by VPNs. Data markets and
data exchange integrated into complex research,
industrial and business processes may require
connections to data market and data exchange
services supporting data search, combination and
quality assurance, as well as delivery to data
processing or execution facilities. This can be
achieved by providing a trusted data exchange and
execution environment on demand using a cloud
hosting platform. This project will (1) investigate
the current state of the art in trust management,
trust bootstrapping and key management in
cloud-based services provisioned on demand; (2) test
several available solutions; and (3) implement a
selected solution in a working prototype.
References
[1] Bootstrapping and Maintaining Trust in the Cloud
https://www.ll.mit.edu/sites/default/files/publication/doc/2018-04/2016_12_07_SchearN_ACSAC_FP.pdf
[2] Keylime: Bootstrap and Maintain Trust on the
Edge/Cloud and IoT https://github.com/keylime
|
Yuri Demchenko <y.demchenko=>uva.nl> |
|
|
35
|
Supporting
infrastructure for distributed data exchange
scenarios when using IDS (Industrial Data Spaces)
Trusted Connector.
This project will investigate the International Data
Spaces Association (IDSA) Reference Architecture
Model (RAM) and the proposed IDS Connector and its
applicability to complex data exchange scenarios
that involve multiple data sources/suppliers and
multiple data consumers in a complex multi-staged
data centric workflow.
The project will assess the UCON library, which
provides a native IDS Connector implementation, test
it in a proposed scenario supporting one of the
general use cases for secure and trusted data
exchange, and identify the infrastructure components
necessary to support the IDS Connector and RAM, such
as trust management, data identification and
lineage, multi-stage session management, etc.
References
[1] International Data Spaces Architecture Reference
Architecture Model 3.0 (IDS-RAM) https://internationaldataspaces.org/ids-ram-3-0/
[2] IDS Connector Framework https://github.com/International-Data-Spaces-Association/IDS-Connector-Framework
https://www.dataspaces.fraunhofer.de/en/software/connector.html
|
Yuri Demchenko <y.demchenko=>uva.nl>
|
|
|
36
|
Security
projects at KPN.
The following are some RP ideas that we would like
to propose from KPN Security. We are also open to
other ideas, as long as they are related to the
proposed ones. To give a better impression, the
"rough ideas" section lists examples of topics we
would be interested in supervising. We are more than
happy to assist students in finding the right angle
for their research.
Info stealer landscape 2021
Create an overview
of the info stealer landscape 2020/2021. What
stealers are used, how do they work, what are
similarities, config extraction of samples, how to
detect the info stealers. Ideally this could lead
to something similar to
https://azorult-tracker.net/, where data from
automatically analyzed info stealers is published.
An example of what could be used for that is openCTI
(https://github.com/OpenCTI-Platform/opencti).
Hacked wordpress sites
In today’s threat
landscape, several malicious groups, including
REvil, Emotet, Qakbot and Dridex, are using
compromised WordPress websites to aid in their
operations. This RP would analyze how many of
those vulnerable websites are out there, using
OSINT techniques such as urlscan.io, Shodan and
RiskIQ. Identifying the vulnerable components,
and whether they have already been hacked, would
also help fight this problem. Ideally, a
notification system is put in place to warn
owners and hosting companies about their websites.
Rough ideas (freestyle)
- Literature review of the state of the art of a
given malware category (Trojans, info stealers,
ransomware, etc.), some examples:
- Which cloud services are most abused
for distributing malware? (Pastebin, GitHub,
Drive, Dropbox, etc.). URLHaus, public
sandboxes, and other sources could be starting
points. (Curious about CDNs and social
applications like Discord, Telegram, and
others.)
- Looking at raccine
https://github.com/Neo23x0/Raccine, what steps
do ransomware malware take and what
possibilities are there to create other vaccines
or how to improve Raccine.
- Building a non-detectable web scraper
- Often, data from the darknet is available
on a website with no option for an API/feed.
These websites tend to have scraping
detection in several ways, ranging from rate
limiting to “human” behavior checks. What
is the best way to scrape this type of
website such that it is hard to impossible
to detect that a bot is retrieving data?
Can this be done while still maintaining a
good pace of retrieving data?
- Malware Aquarium
- Inspired by XKCD: https://xkcd.com/350/.
Can you create an open-source malware
aquarium? There are several challenges: how
to set it up, how to get infections going,
how to keep it contained, and how to keep
track of everything (alerts on changes).
|
Jordi Scharloo
<jordi.scharloo=>kpn.com>
Tom van Gorkom
<tom.vangorkom=>os3.nl>
|
|
1
|
37
|
Assessing data
remnants in modern smartphones after factory
reset.
Description:
Factory reset is a function built in modern
smartphones which restores the settings of a device
to the state it was shipped from the factory. While
its user data becomes inaccessible through the
device's user interface, research performed in 2018
reports that mobile forensic techniques can still
recover old data even after a smartphone undergoes
factory reset.
In recent smartphones, however, multiple security
measures are implemented by the vendors due to
growing concerns over security and privacy. The
implementation of encryption, in particular, is
supposed to be effective for protecting user data
from an attacker after a factory reset. Meanwhile,
its
impact on the digital forensics domain has not yet
been explored.
In this project, the effectiveness of factory reset
against digital forensics will be evaluated using
modern smartphones. Using the latest digital
forensic techniques, data remnants in factory-reset
smartphones will be investigated, and their
relevance to the forensic domain will be evaluated.
Related research:
|
Zeno Geradts
<zeno=>holmes.nl>
Aya Fukami <ayaf=>safeguardcyber.com>
Mattijs Blankesteijn
<mblankesteijn=>os3.nl> |
|
|
39
|
Vocal Fakes.
Deep fakes are in the news, especially those where
real people are being copied. You see that really
good deepfakes use doubles and voice actors. Audio
deepfakes are not that good yet, and the available
tools are mainly trained on the English language.
Voice clones can be used for good (for example,
for ALS patients), but also for evil, such as in CEO
fraud. It is important for the police to know the
latest state of affairs, on the one hand to combat
crime (think not only of fraud, but also of access
systems where the voice is used as biometric access
controls). But there are also applications where the
police can use voice cloning.
The central question is what the latest state of
technology is, specifically also for the Dutch
language, what the most important players are and
what are the starting points for recognizing it
and… to make a demo application with which the
possibilities can be demonstrated.
On the internet, Corentin's real-time voice cloning
is promoted, with which you can create your own
voice-cloning framework and clone other people's
voices. The repository on GitHub was open-sourced
last year as an implementation of a research paper
about a real-time working "vocoder". Perhaps a good
starting point?
|
Zeno Geradts
<zeno=>holmes.nl>
|
|
|
40
|
Web of Deepfakes.
According to the well-known magazine Wired, Text
Synthesis is at least as great a threat as
deepfakes. Thanks to a new language model, called
GPT-3, it has now become much easier to analyze
entered texts and generate variants and extensions
in large volumes. This can be used for guessing
passwords, automating social engineering and in many
forms of scams (friend-in-need fraud) and extortion.
It is therefore not expected that this will be used
to create incidents like deepfakes, but to create a
web of lies, disguised as regular conversations on
social media. This can also undermine the sincerity
of online debate. Europol also warns against text
synthesis because it allows the first steps of
phishing and fraud to be fully automated.
A lot of money is also being invested in text
synthesis by marketing and services: for chatbots, but also
because you can tailor campaigns with the specific
language use of your target group. This technology
can also be used by criminals.
The central question is what the latest state of
affairs is, what the most important players are and
what are the starting points for recognizing text
synthesis in, for example, fraudulent emails /
chats, and for (soon) distinguishing real people
from chatbots. Perhaps interesting to build your own
example in slang or for another domain? |
Zeno Geradts
<zeno=>holmes.nl>
Steef van Wooning
<Steef.vanWooning=>os3.nl>
Danny Janssen <Danny.Janssen=>os3.nl>
|
|
2
|
42
|
Zero Trust architectures
applications in the University ICT environment.
Traditionally, security in ICT is managed by
creating zones within which everything is trusted to
be secure, and security is seen as defending the
inside from attacks originating from the outside.
For that purpose, firewalls and intrusion detection
systems are used. That model is considered broken.
One reason is that a significant part of the
security incidents are inside jobs with grave
consequences. Another reason is that even
well-meaning insiders (employees) may inadvertently
become the source of an incident because of phishing
or brute-force hacking. For organizations such as
the university, an additional problem is that an
ever-changing population of students, (guest)
researchers, educators and staff with wildly varying
functions and goals (education, teaching, research
and basic operations) puts an enormous strain on the
security and integrity of the ICT at the university.
A radically different approach is to trust nothing
and start from that viewpoint. This RP is to create
an overview of zero-trust literature and propose a
feasible approach & architecture that can work
at the University's scale of about 40,000 persons.
|
Roeland Reijers
<r.reijers=>uva.nl>
Cees de Laat <C.T.A.M.deLaat=>uva.nl>
|
|
|
43
|
High-speed implementation of lightweight
ciphers.
|
Kostas Papagiannopoulos
<k.papagiannopoulos=>uva.nl>
Gheorghe Pojoga
<Gheorghe.Pojoga=>os3.nl>
|
|
1
|
44 |
Federated Authentication
platform.
SURF operates a federated authentication platform
which, amongst others, can interface with 4500
Identity Providers (universities etc.) from 73
countries, based on SAML 2.0. In the authentication
flow, the Service Provider (SP) can ask the
Identity Provider (IdP) to force the user to
present a second factor during login. The SP
does this by adding a specific value in the AuthnContextClassRef
field. Unfortunately, the values for AuthnContextClassRef
are not standardized. Especially in an
international context with so many different
actors, this poses a huge problem and causes strong
authentication and second-factor logins to be
disregarded in federated contexts, even though
many IdPs support them. In this project, you will
investigate possible solutions for this problem
and build a proof of concept with your own
mock-federation consisting of an SP and IdPs from
multiple vendors (in particular Microsoft
ADFS/Azure and Shibboleth), implement the chosen
approach and determine if this could be used in
practice without interfering with the user
experience too much.
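To make the moving part concrete: a minimal, unsigned AuthnRequest carrying a RequestedAuthnContext can be built with Python's standard library. The helper below is ours, not SURF's; the REFEDS MFA profile URI is one real candidate value an SP might request.

```python
import xml.etree.ElementTree as ET

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"

def authn_request(issuer: str, class_ref: str,
                  request_id: str, issue_instant: str) -> str:
    """Build a minimal (unsigned) SAML 2.0 AuthnRequest asking the IdP
    for a specific AuthnContextClassRef."""
    ET.register_namespace("samlp", SAMLP)
    ET.register_namespace("saml", SAML)
    req = ET.Element(f"{{{SAMLP}}}AuthnRequest", {
        "ID": request_id, "Version": "2.0", "IssueInstant": issue_instant,
    })
    ET.SubElement(req, f"{{{SAML}}}Issuer").text = issuer
    # Comparison="exact": the IdP must match this context class exactly.
    rac = ET.SubElement(req, f"{{{SAMLP}}}RequestedAuthnContext",
                        {"Comparison": "exact"})
    ET.SubElement(rac, f"{{{SAML}}}AuthnContextClassRef").text = class_ref
    return ET.tostring(req, encoding="unicode")

if __name__ == "__main__":
    print(authn_request("https://sp.example.org",
                        "https://refeds.org/profile/mfa",
                        "_id1", "2021-06-01T12:00:00Z"))
```

The interoperability question in this project is precisely what goes in that `AuthnContextClassRef` text node, and how different IdP products (ADFS/Azure, Shibboleth) interpret it.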
|
Bas Zoetekouw
<bas.zoetekouw=>surf.nl>
Hilco de Lathouder
<hilco.delathouder=>os3.nl>
|
|
1
|
45 |
Researching efficiency of
Trendmicro's HAC-T algorithm.
HAC-T is
an algorithm for clustering TLSH hashes so that
TLSH hashes can be compared efficiently
( O(log(n)) ). Trend Micro recently published a
Python implementation of the HAC-T algorithm,
using scikit-learn.
The examples given for this implementation use
~50,000 TLSH hashes. The question is: does this
particular implementation scale well enough for it
to be used in production when looking at many
millions of hashes?
A test set (either hashes of binary files, or
pre-processed open source license texts) will be
provided.
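A baseline for the scaling question can be sketched as a micro-benchmark of brute-force nearest-hash lookup, the O(n)-per-query approach HAC-T is meant to beat. Everything here is a stand-in: random hex strings instead of real TLSH digests, and a crude position-count distance instead of TLSH difference scoring.

```python
import random
import time

# Micro-benchmark sketch: how does nearest-hash lookup time grow with the
# number of stored hashes? Brute force is O(n) per query; the project asks
# whether HAC-T's clustered lookup stays near O(log n) at millions of hashes.
def fake_tlsh(rng):
    """Random 70-hex-digit string standing in for a real TLSH digest."""
    return "".join(rng.choice("0123456789abcdef") for _ in range(70))

def distance(a, b):
    # Crude stand-in for TLSH difference scoring: count differing positions.
    return sum(x != y for x, y in zip(a, b))

def brute_force_nearest(query, corpus):
    return min(corpus, key=lambda h: distance(query, h))

def bench(n, queries=20, seed=7):
    """Average seconds per brute-force query against a corpus of n hashes."""
    rng = random.Random(seed)
    corpus = [fake_tlsh(rng) for _ in range(n)]
    qs = [fake_tlsh(rng) for _ in range(queries)]
    t0 = time.perf_counter()
    for q in qs:
        brute_force_nearest(q, corpus)
    return (time.perf_counter() - t0) / queries

if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        print(n, f"{bench(n) * 1e3:.2f} ms/query")  # expect roughly linear growth
```

The research would swap the brute-force lookup for the HAC-T implementation and real TLSH scoring, and extend the corpus sizes toward the millions.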
Techniques: Linux, Python, scikit-learn
Links:
|
Armijn Hemel
<armijn=>tjaldur.nl>
Tijmen van der Spijk
<tspijk=>os3.nl>
Imre Fodi <Imre.Fodi=>os3.nl>
|
|
1
|
46 |
Towards unified software
package dependency resolution strategy.
To install code, software package management tools
need to determine which dependent package of which
version to install. Each ecosystem has evolved its
own ways to deal with versioning and resolve
dependencies.
The goal of this project is to:
- Inventory and document the many different ways
dependencies are resolved today across ecosystems
such as for example Maven/Java, RPM, Debian, npm,
Rubygems, PyPI, Conda, R, Perl, Go, Dart, Rust,
Swift, Eclipse, Conan and PHP.
- Propose and apply a dependency resolution
classification based on the specific semantics of
each resolution approach
- Suggest a possible unified strategy for dependency
resolution to "debabelize" package dependency
resolution across ecosystems
The research question is: Is a unified dependency
resolution strategy attainable across all
ecosystems?
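One resolution style from the inventory above, npm-style semver ranges, can be sketched as follows. This is a deliberately simplified model for illustration: real semver also has pre-release tags, tilde ranges, build metadata, and ecosystem-specific quirks.

```python
# Sketch of one classification axis: resolving a version constraint to the
# highest satisfying release, as npm-style ranges do. Real resolvers (npm,
# pip, cargo, apt, ...) differ both in range syntax and in whether multiple
# versions of one package may coexist in the dependency graph.

def parse(v):                        # "1.4.2" -> (1, 4, 2)
    return tuple(int(x) for x in v.split("."))

def satisfies(version, constraint):
    if constraint.startswith("^"):   # caret: same major, >= given version
        base = parse(constraint[1:])
        v = parse(version)
        return v[0] == base[0] and v >= base
    if constraint.startswith(">="):
        return parse(version) >= parse(constraint[2:])
    return parse(version) == parse(constraint)   # exact pin

def resolve(available, constraint):
    """Pick the highest available version satisfying the constraint."""
    ok = [v for v in available if satisfies(v, constraint)]
    return max(ok, key=parse) if ok else None

if __name__ == "__main__":
    releases = ["1.2.0", "1.4.2", "1.9.9", "2.0.0"]
    print(resolve(releases, "^1.4.0"))   # -> 1.9.9 (2.0.0 excluded by caret)
```

A unified strategy would have to express, per ecosystem, both the range grammar (this function) and the conflict policy (highest-wins, SAT-solving as in Conda/apt, or duplicated installs as in npm).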
|
Philippe Ombredanne
<pombredanne=>nexb.com>
|
|
|
47 |
In search of popularity
and prominence metric for software packages.
Software is consumed as packages, such as
Maven/Java, RPM, Debian, npm, Rubygems. Each
ecosystem typically
offers a centralized package repository though some
are fully decentralized (such as Go). Determining
the popularity and prominence of a software package
within its ecosystem in a somewhat unbiased way is
an unresolved issue and goes well beyond just
counting stars on GitHub.
The goal of this project is to:
- Inventory and research existing efforts to provide
metric proxies, such as Libraries.io
SourceRank, Sonatype MTTU, OpenSSF Criticality, and
others
- Inventory and document the metrics and data that
could be available in key ecosystems such as
Maven/Java, RPM, Debian, npm, Rubygems, PyPI, Conda,
R, Perl, Go, Dart, Rust, Swift, Eclipse, Conan and
PHP.
- Propose new metrics directions and a validation
process for a possible unified approach (or multiple
ecosystem-specific approaches)
The research question is: How to rank open-source
packages' relative prominence and popularity?
Bonus: write actual code to compute these.
|
Philippe Ombredanne
<pombredanne=>nexb.com>
|
|
|
48 |
Towards an estimation of
the volume of open source code.
There is no clear estimate of how much open source
code there is in the whole wide world today. A
simple count of the number of repos on GitHub, or of
the number of packages in an ecosystem such as at
http://www.modulecounts.com/, provides an incomplete
and likely misleading estimate, as each package
may be of a vastly different size (such as npm
one-liner packages).
The goal of this project is to:
- Research existing metrics for quantifying open
source projects (such as number of packages, files,
lines of code, etc.), possibly specialized by
ecosystem and language
- Propose new metrics directions and a
validation process to establish an improved estimate
of the volume of open source
- Using existing available data from sources such as
SWH, ClearlyDefined, Libraries.io, GitHub or past
research projects, provide a rough estimation using
these new metrics.
The research question is: How much open source code
is there in the world?
Bonus: write actual code to compute these
|
Philippe Ombredanne
<pombredanne=>nexb.com>
|
|
|
49 |
Energy consumption of
secure neural network inference using
multi-party computation.
Secure neural network inference assumes the
following setup. A service provider Bob offers
neural network inference as a service, and a client
Alice wants to use this service for a particular
input. The aim is that Alice gets the output of the
neural network for her input, but Alice should not
learn anything about the parameters (weights, biases
etc.) of the neural network, and Bob should not
learn anything about the input provided by Alice.
Secure multi-party computation is a family of
techniques that enable two or more actors to jointly
evaluate a function on their private inputs without
revealing anything but the function's output. Using
secure multi-party computation to achieve secure
neural network inference has been an active research
area in recent years. The main research objective
has been to make secure multi-party computation
efficient enough.
In this project, the student(s) will work with
software from Microsoft Research that implements
secure neural network inference based on secure
multi-party computation, available from
https://github.com/mpc-msri/EzPC and described in
[1-3]. While these papers showed that secure
multi-party computation has the potential to
efficiently perform secure neural network inference,
the evaluation in these papers was limited to
efficiency (latency, amount of data transfer).
However, energy consumption is a growing concern for
multiple reasons, including environmental impact,
energy costs, and the limited energy of
battery-powered devices. The aim of this project is
to evaluate the energy consumption of secure neural
network inference.
Specific questions to answer in the project may
include the following:
1. How much energy does secure neural network
inference consume on the client side, on the server
side, and in the network?
2. What factors influence the energy consumption?
3. From the methods available in the software
library, which is the most energy-efficient?
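For question 1, one assumed measurement approach on Linux hosts is Intel RAPL via the powercap sysfs interface; the paths below are hardware-dependent examples, the counters cover only the CPU package, and network-side energy would need a different method entirely.

```python
# Sketch: read RAPL energy counters around an inference run. energy_uj is a
# monotonically increasing microjoule counter that wraps at
# max_energy_range_uj, so deltas must handle one wraparound.
RAPL = "/sys/class/powercap/intel-rapl:0"   # example path; varies per machine

def read_uj(path=RAPL):
    with open(f"{path}/energy_uj") as f:
        return int(f.read())

def delta_joules(before_uj, after_uj, max_range_uj):
    """Energy between two counter reads, handling one counter wraparound."""
    d = after_uj - before_uj
    if d < 0:                       # counter wrapped past max_energy_range_uj
        d += max_range_uj
    return d / 1_000_000

# usage (on a machine exposing RAPL), with max_range read from
# max_energy_range_uj in the same sysfs directory:
#   before = read_uj()
#   run_secure_inference()          # the EzPC/CrypTFlow workload under test
#   joules = delta_joules(before, read_uj(), max_range)
```

Sampling both client and server this way, per inference, would give the per-party numbers that the latency/bandwidth evaluations in [1-3] leave out.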
References
[1] Nishant Kumar, Mayank Rathee, Nishanth Chandran,
Divya Gupta, Aseem Rastogi, Rahul Sharma. CrypTFlow:
Secure TensorFlow Inference. 2020 IEEE Symposium on
Security and Privacy (SP 2020), pp. 336-353, 2020
[2] Deevashwer Rathee, Mayank Rathee, Nishant Kumar,
Nishanth Chandran, Divya Gupta, Aseem Rastogi, Rahul
Sharma. CrypTFlow2: Practical 2-Party Secure
Inference. 2020 ACM SIGSAC Conference on Computer
and Communications Security (CCS '20), pp. 325-342,
2020
[3] Deevashwer Rathee, Mayank Rathee, Rahul Kranti
Kiran Goli, Divya Gupta, Rahul Sharma, Nishanth
Chandran, Aseem Rastogi. SiRnn: A math library for
secure RNN inference. 2021 IEEE Symposium on
Security and Privacy (SP 2021), pp. 1003-1020, 2021
|
Zoltan Mann
<z.a.mann=>uva.nl>
Daphne Chabal <d.n.m.s.chabal=>uva.nl>
|
|
|
50
|
Key Management as a
Service for Blockchain Access Control
Applications.
Project description: Blockchain is a
distributed database that is shared and synchronised
across a peer-to-peer network with no single or
central control point. A fundamental aspect of
blockchain is the transparency of the transactions
on the ledger for its validity and auditability.
Transparency is a core feature of blockchain that
leads to a challenge: because some data we want to
store on the blockchain are sensitive, we may not
want to expose them to other peers in the network.
Some blockchain solutions suggest that the answer to
this problem is to store sensitive data off the
blockchain altogether. Those solutions use smart
contract facilities to enable access control for the
sensitive data stored off-chain. However, as good as
these facilities might be, access control alone will
not address the data protection requirements, such
as confidentiality and integrity. Thus, such
solutions additionally need to encrypt sensitive
data off-chain and manage the keys.
Research proposal: Your research will
investigate the existing key management mechanisms
hosted on a decentralised platform to enable users
to manage the private keys required to protect
sensitive data off-chain on the blockchain
applications. Then, you will develop and implement a
mechanism that makes encryption keys available to
the users and smart contracts that need them. The
mechanism also restricts access to encryption keys,
granting and revoking access to these keys. This
project would extend our proof-of-concept access
control protocol based on smart contracts to access
control for sensitive data.
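The grant/revoke logic such a mechanism needs can be sketched without any blockchain or cryptography machinery. This is a hypothetical in-memory vault for illustration only; a real system would wrap the data keys cryptographically and anchor the access-control list in the smart contract.

```python
import secrets

# Minimal sketch of key-release access control: a vault maps each off-chain
# object to a data encryption key and an ACL; releasing the key only to
# authorized identities is the enforcement point.
class KeyVault:
    def __init__(self):
        self._keys = {}      # object_id -> 32-byte data encryption key
        self._acl = {}       # object_id -> set of authorized identities

    def create(self, object_id, owner):
        self._keys[object_id] = secrets.token_bytes(32)
        self._acl[object_id] = {owner}

    def grant(self, object_id, grantor, grantee):
        if grantor not in self._acl[object_id]:
            raise PermissionError("only authorized parties may grant")
        self._acl[object_id].add(grantee)

    def revoke(self, object_id, grantor, grantee):
        if grantor not in self._acl[object_id]:
            raise PermissionError("only authorized parties may revoke")
        self._acl[object_id].discard(grantee)

    def get_key(self, object_id, requester):
        if requester not in self._acl[object_id]:
            raise PermissionError("access denied")
        return self._keys[object_id]
```

Note the limitation this sketch makes visible: revocation stops future key releases, but cannot recall a key already handed out, which is why real designs combine revocation with re-encryption.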
|
Marcela Tuler de Oliveira
<m.tuler=>amsterdamumc.nl>
Dr. Silvia Olabarriaga
<s.d.olabarriaga=>amsterdamumc.nl>
|
|
|
51 |
Modeling of medical data
access logs for understanding and detecting
potential privacy breach.
Healthcare organizations keep electronic medical
records (EMRs) that provide information for patient
care. These organizations have the legal duty of
safeguarding access to EMRs by establishing
procedures to control, track and monitor all
accesses to the data, as well as to detect and act
upon intrusions. Extensive data access logs need to
be analysed regularly to detect illegitimate data
access actions, but such analysis is challenging due
to the volume of the logs and the difficulty to
recognize such rare events. Moreover, log data are
extremely sensitive because they contain references
to patients, employees and organizations. This
hampers access to such log data for research
purposes, for example, to develop machine learning
methods that can aid in the detection of
illegitimate events.
In this project we aim to take initial steps towards
understanding and modelling the statistical
properties of medical data access logs, with the goal
of developing a computational model that enables the
generation of synthetic datasets to aid in the
development of new approaches for intrusion
detection in such logs. The structure of the logs
available at the Amsterdam UMC - location AMC will
be used as a starting point for the modelling.
The code of the model will be published as open
source at the end of the project.
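A first toy version of such a generative model might look like this. Every distribution and parameter below is an assumption for illustration, not derived from the AMC's logs; the project would replace them with the statistics observed in the real log structure.

```python
import random

# Toy generative model for synthetic access logs: each employee mostly
# accesses patients in their own department; with a small probability an
# access targets a random patient, modelling the rare illegitimate events
# an intrusion detector should find.
def synthetic_log(n_events, n_employees=50, n_patients=500,
                  n_departments=5, p_anomalous=0.01, seed=0):
    rng = random.Random(seed)
    dept_of_emp = [rng.randrange(n_departments) for _ in range(n_employees)]
    dept_of_pat = [rng.randrange(n_departments) for _ in range(n_patients)]
    by_dept = {d: [p for p in range(n_patients) if dept_of_pat[p] == d]
               for d in range(n_departments)}
    log = []
    for t in range(n_events):
        emp = rng.randrange(n_employees)
        if rng.random() < p_anomalous or not by_dept[dept_of_emp[emp]]:
            pat, label = rng.randrange(n_patients), "anomalous"
        else:
            pat, label = rng.choice(by_dept[dept_of_emp[emp]]), "normal"
        log.append({"time": t, "employee": emp, "patient": pat, "label": label})
    return log
```

Because the generator keeps ground-truth labels, the synthetic data can directly score candidate detection methods, something the sensitive real logs cannot easily do.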
|
Dr. Silvia
Olabarriaga
<s.d.olabarriaga=>amsterdamumc.nl>
Zwinderman, A.H.
(Koos)<a.h.zwinderman=>amsterdamumc.nl>
Luc Vink
<luc.vink=>os3.nl> |
|
1
|
52
|
Automated
Incident Response in the Cloud: An Environment
Agnostic Solution in AWS.
Organizations are moving to the cloud; some
organizations go for a hybrid setup and some go
full cloud. From an incident response perspective,
the cloud offers some great possibilities (and
challenges). For this research we are looking for
one or more students that want to dive into one of
the big 3 clouds Microsoft
Azure, Amazon AWS or Google Cloud Platform to
research how the cloud can be leveraged for
incident response automation. Often during an
investigation you need to acquire data from a
cloud environment. What we are interested in is if
we can leverage serverless functions like AWS
Lambda or Azure Functions to efficiently acquire
and process data. The end goal is a PoC with
several serverless functions that can be used for
incident response cases. Together we can scope
this into something manageable for the project
time period.
|
Korstiaan Stam
<korstiaan=>invictus-ir.com>
Antonio Macovei
<Antonio.Macovei=>os3.nl>
Rares Bratean <Rares.Bratean=>os3.nl> |
|
1
|
53
|
Cloud forensics of Docker
containers on Amazon AWS.
One of the challenges of the cloud
from an incident response or forensics
perspective is the volatility of data. This is
especially challenging for cloud environments
that make use of containers such as
Kubernetes/Docker. In this research we are
looking for someone to investigate what options
an investigator has when it comes to
investigating container-based systems. Ideally
at the end of the research you can answer
questions related to the availability, content,
and acquisition methods of containers in the
cloud.
|
Korstiaan Stam
<korstiaan=>invictus-ir.com>
Artemis Mytilinaios
<amytilinaios=>os3.nl> |
|
2
|
54
|
Forensic analysis of Google
Workspace evidence.
Google Workspace is the suite used
by many organizations around the world for email
and productivity tooling. As such, investigating
a Google Workspace environment for possible misuse,
an insider threat or a Business Email Compromise (BEC)
attack is becoming more common. The main
evidence for Google Workspace is stored in
Google Workspace Audit Logs. We are looking for
a student that wants to create a method for
forensic analysis of those logs. During the
research you will have a test environment where
you can simulate attacks. We want you to come up
with a forensic analysis method for identifying
attacks based on the available audit logs. We
want to publish this research to the world, and
this is your chance to participate in that
effort.
|
Korstiaan Stam
<korstiaan=>invictus-ir.com>
Bert-Jan Pals
<bpals=>os3.nl>
Greg Charitonos <gcharitonos=>os3.nl>
|
|
1
|
55
|
Poisoning Attacks against
LDP-based Federated Learning.
Federated learning is a
collaborative learning infrastructure in which
the data owners do not need to share raw data
with one another or rely on a single trusted
entity. Instead, the data owners jointly train a
Machine Learning model by training it locally
on their own data and only sharing
the model parameters with the aggregator.
While the participants only share the updated
parameters, some private information about the
underlying data can still be revealed from them.
To address this issue, Local Differential Privacy
has been used as an effective tool to protect
against information leakage over the shared
parameters in Federated Learning, an approach
known as LDP-FED. However, it has not yet been
investigated whether (and to what extent) LDP-FED
is resistant against data and model poisoning
attacks, and, if it is not, how we can design a
robust LDP-FED whose performance is only
negligibly affected by such attacks.
This project aims to evaluate the
resistance of the LDP-FED against poisoning
attacks and to explore the possibilities of
reducing the success rate of these attacks. The
following papers are suggested to be studied for
this work:
1. Stacey Truex, Ling Liu, Ka-Ho
Chow, Mehmet Emre Gursoy, Wenqi Wei; LDP-Fed:
Federated Learning with Local Differential
Privacy, CoRR, 2020.
2. Mohammad Naseri, Jamie Hayes,
and Emiliano De Cristofaro; Toward Robustness
and Privacy in Federated Learning: Experimenting
with Local and Central Differential Privacy,
CoRR, 2020.
3. Lingjuan Lyu, Han Yu, Xingjun
Ma, Lichao Sun, Jun Zhao, Qiang Yan, Philip S.
Yu, Privacy and Robustness in Federated
Learning: Attacks and Defenses, arXiv, 2020.
4. Malhar Jere, Tyler Farnan, and
Farinaz Koushanfar; A Taxonomy of Attacks on
Federated Learning, IEEE Security & Privacy,
2021.
5. Xiaoyu Cao, Jinyuan Jia, Neil
Zhenqiang Gong, Data Poisoning Attacks to Local
Differential Privacy Protocols, CoRR, 2019.
6. Minghong Fang, Xiaoyu Cao,
Jinyuan Jia, Neil Zhenqiang Gong; Local Model
Poisoning Attacks to Byzantine-Robust Federated
Learning, the 29th Usenix Security Symposium,
2020.
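As a rough illustration of the setting studied here (our own toy sketch, not taken from the papers above), binary randomized response is a basic LDP primitive: each client flips its bit with a fixed probability, and the aggregator debiases the observed mean. The sketch also shows how a small fraction of poisoned reports biases the aggregate:

```python
import math, random

def rr_perturb(bit, eps, rng):
    """Randomized response: keep the true bit with probability p = e^eps / (e^eps + 1)."""
    p = math.exp(eps) / (math.exp(eps) + 1)
    return bit if rng.random() < p else 1 - bit

def rr_estimate(reports, eps):
    """Debias the observed mean: E[report] = f*(2p - 1) + (1 - p), solve for f."""
    p = math.exp(eps) / (math.exp(eps) + 1)
    mean = sum(reports) / len(reports)
    return (mean - (1 - p)) / (2 * p - 1)

rng = random.Random(0)
true_bits = [1] * 300 + [0] * 700                  # true frequency of 1s: 0.30
reports = [rr_perturb(b, 2.0, rng) for b in true_bits]
honest_est = rr_estimate(reports, 2.0)

# A naive poisoning attack: 10% of clients ignore the protocol and always report 1,
# biasing the aggregate upwards without being individually detectable.
poisoned = reports[:900] + [1] * 100
poisoned_est = rr_estimate(poisoned, 2.0)
```

Because the aggregator only ever sees perturbed bits, it cannot tell a poisoned report from an honestly randomized one, which is exactly the robustness question this project investigates.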
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl>
|
|
|
56
|
Privacy by Design in Smart
Cities.
The smart city has emerged as a
new paradigm aiming to provide citizens with
better facilities and quality of life in terms
of transportation, healthcare, environment,
entertainment, education, and energy. To this
end, a smart city monitors the physical world in
real time and collects data from sensing
devices, heterogeneous networks, and RFID
devices. As cities become smarter, the
security and privacy of citizens are increasingly
threatened, which needs to be carefully
addressed. Accordingly, it is of crucial
importance to understand the privacy threats in
smart cities so that researchers,
stakeholders, and engineers can design a
privacy-friendly smart city.
This project aims to explore the
existing and future privacy threats in a smart
city and how they can be addressed by design.
The following papers are suggested to be studied
for this work:
1. Mehdi Sookhak, Helen Tang, Ying
He, F. Richard Yu; Security and Privacy of Smart
Cities: A Survey, Research Issues and
Challenges, IEEE Communications Surveys &
Tutorials, 2019.
2. David Eckhoff, Isabel Wagner;
Privacy in the Smart City-Applications,
Technologies, Challenges, and Solutions, IEEE
Communications Surveys & Tutorials, 2018.
3. Kuan Zhang, Jianbing Ni, Kan
Yang, Xiaohui Liang, Ju Ren, and Xuemin
(Sherman) Shen; Security and Privacy in Smart
City Applications: Challenges and Solutions,
IEEE Communications Magazine, 2017.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl>
Babak Rashidi
<brashidi=>os3.nl>
Cesar Panaijo <cpanaijo=>os3.nl>
|
|
1
|
57
|
Privacy Preserving
k-means/k-median Distributed Learning using
Local Differential Privacy.
The classical clustering algorithms
were designed to run on a central
server. In recent years, however, data is
generally located at distributed sites in
different locations. Due to privacy concerns, the
data owners are unwilling to share their
original data with the public or even with each other.
Several approaches have been proposed in the
literature that protect the data owners'
privacy while a clustering
algorithm is still built over the protected data.
Given that the proposed solutions mainly require
the presence of a trusted party, Local
Differential Privacy (LDP) can be used as an
effective alternative that protects each data
owner's data on their own device.
This study aims to investigate the
application of LDP in the distributed learning
of two well-known clustering algorithms, namely
k-means and k-medians, in terms of utility loss
and privacy leakage. Specifically, it explores
the resistance of LDP-based k-means/k-medians
clustering against poisoning attacks.
The following papers are suggested
to be studied for this work:
1. Maria Florina Balcan, Steven
Ehrlich, Yingyu Liang; Distributed k-Means and
k-Median Clustering on General Topologies, NIPS,
2013.
2. Geetha Jagannathan, Rebecca N.
Wright; Privacy-Preserving Distributed k-Means
Clustering over Arbitrarily Partitioned Data,
ACM SIGKDD, 2005.
3. Chang Xia, Jingyu Hua, Wei Tong,
Sheng Zhong; Distributed K-Means clustering
guaranteeing local differential privacy,
Computer & Security journal, 2020.
4. Pathum Chamikara, Mahawaga
Arachchige, Peter Bertok, Ibrahim Khalil, Dongxi
Liu, Seyit Camtepe, Mohammed Atiquzzaman; Local
Differential Privacy for Deep Learning, IEEE
Internet of Things Journal, 2020.
5. Malhar Jere, Tyler Farnan, and
Farinaz Koushanfar; A Taxonomy of Attacks on
Federated Learning, IEEE Security & Privacy,
2021.
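A minimal sketch of the setting (our own illustration, not taken from the papers above): each data owner adds Laplace noise to their point locally, and the server then runs plain k-means on the perturbed points. The data layout, noise scale, and deterministic initialization are illustrative assumptions:

```python
import math, random

def laplace_noise(scale, rng):
    # Inverse-CDF sampling of a Laplace(0, scale) variate
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def kmeans(points, centers, iters=10):
    # Plain Lloyd's algorithm on 2-D points
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        centers = [(sum(p[0] for p in cl) / len(cl),
                    sum(p[1] for p in cl) / len(cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

rng = random.Random(1)
# Two well-separated blobs; each owner perturbs their point locally before sharing.
truth = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)] + \
        [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(200)]
scale = 2.0  # illustrative; in practice derived from the sensitivity and epsilon
noisy = [(x + laplace_noise(scale, rng), y + laplace_noise(scale, rng))
         for x, y in truth]
centers = kmeans(noisy, centers=[noisy[0], noisy[-1]])
```

The utility question the project asks is visible even in this toy: the recovered centers drift away from the true ones as the noise scale grows relative to the cluster separation.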
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl> |
|
|
58
|
Scalable Blockchain-based
framework in Internet of Things (IoT).
The Internet of Things (IoT) is an
ever-growing technology in which many of our
daily objects are connected via the Internet (or
other networks) and transfer data for analysis
or for performing certain tasks. This property makes
the IoT vulnerable to security threats that need
to be addressed to build trust among
clients.
Blockchain, a technology born with
cryptocurrency, has shown its effectiveness and
robustness against some security threats when
integrated with the IoT. Despite these capabilities, the
main issue of integrating blockchain with the IoT is
its scalability and efficiency in a
large-scale network such as the IoT.
In this project we aim to explore the existing
solutions addressing the scalability of
blockchain in IoT and to investigate the
possibilities of improving the existing ones by
proposing new solutions.
The following papers are suggested
to be studied for this work:
1. Hong-Ning Dai, Zibin Zheng, Yan
Zhang; Blockchain for Internet of Things: A
Survey, IEEE Internet of Things Journal, VOL. 6,
NO. 5, 2019.
2. Hany F. Atlam, Muhammad Ajmal
Azad, Ahmed G. Alzahrani, Gary Wills; A Review
of Blockchain in Internet of Things and AI, Big
Data and Cognitive Computing, MDPI, 2020.
3. Tiago M. Fernandez-Carames,
Paula Fraga-Lamas; A Review on the Use of
Blockchain for the Internet of Things, IEEE
Access, 2020.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl> |
|
|
59
|
Deep Learning for Partial
Image Encryption.
Face recognition has increasingly
gained importance for a variety of applications,
e.g., surveillance in public places, access
control in organizations, photo tagging in
social networks, and border control at airports.
The widespread application of face recognition,
however, raises privacy risks, as
individuals' biometric information can be used
to profile and track people against their
will.
A typical solution to this problem is the
application of Homomorphic Encryption, where an
encrypted image is checked against a
list of images for a possible match. However,
this solution is heavy in terms of both
computation and communication costs, as it
requires all of an image's pixels to be encrypted,
even though not all of them contain
privacy-sensitive information.
In this project, we plan to
investigate the application of Deep Learning in
detecting the user's identifiable pixels
(instead of all pixels) for partial encryption
of an image.
1. Zekeriya Erkin, Martin Franz, Jorge Guajardo,
Stefan Katzenbeisser, Inald Lagendijk, Tomas
Toft; Privacy-Preserving Face Recognition,
International Symposium on Privacy Enhancing
Technologies Symposium, 2009.
2. Peiyang He, Charlie Griffin,
Krzysztof Kacprzyk, Artjom Joosen, Michael
Collyer, Aleksandar Shtedritski, Yuki M. Asano;
Privacy-preserving Object Detection, arXiv,
2021.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl>
Carmen Veenker
<c.m.i.veenker=>uva.nl>
Danny Opdam <dopdam=>os3.nl>
|
|
1
|
60
|
Deep Learning for Detecting
Network Traffic Attack.
The expansion of new communication
technologies and services, along with an
increasing number of interconnected network
devices, web users, services, and applications,
contributes to making computer networks
ever larger and more complex as systems.
However, network anomalies pose significant
challenges to many online services whose
performance is highly dependent on network
performance. For instance, a faulty airport
network caused a nine-hour
delay to all of its flights in 2007. To
address the issues related to network anomalies,
security solutions need to analyze, detect,
and stop such attacks in real time. Although
there is a significant amount of technical and
scientific literature on anomaly detection
methods for network traffic, still 1) a newly
generated (simulated) dataset that contains a
wide range of network attacks (detectable
through network traffic monitoring) is missing;
2) the valuable step of feature selection is
often underrepresented and treated inattentively
in the literature; and 3) the detection
techniques suffer from considerable error
rates.
The aim of this project is to
address these issues by analyzing network
traffic using Deep Learning.
The following papers are suggested
to be studied for this project:
1. A. Kind, M. P. Stoecklin, and X.
Dimitropoulos; Histogram-based traffic anomaly
detection, IEEE Transactions on Network and
Service Management, vol. 6, no. 2, 2009.
2. R. Chapaneri and S. Shah; A
comprehensive survey of machine learning based
network intrusion detection, in Smart
Intelligent Computing and Applications, S. C.
Satapathy, V. Bhateja, and S. Das, Eds.
Springer, 2019.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl> |
|
|
61
|
Local Differential Privacy
(LDP) in Protecting the Privacy of the
Internet of Things (IoT).
The Internet of Things (IoT)
includes physical objects with sensors, software,
and some processing technologies, which connect and
exchange data with other devices and systems
over the Internet or other communication
networks. One of the main challenges in the Internet
of Things is the users' privacy. While several
approaches have been proposed in the literature
to protect information privacy in the IoT, a
thorough analysis of the application of Local
Differential Privacy (LDP) in this setting is
still missing. LDP offers a strong level of privacy in
which individuals perturb their data locally
before sending them to the third party (named
the aggregator). This means that LDP eliminates
the need for a trusted party in the middle. In
this project, we aim to investigate the
application of LDP in protecting the privacy of
IoT data while keeping the results of
statistical analyses over the protected data
practically useful.
The following papers are suggested
to be studied for this project:
1. Chao Li, Balaji Palanisamy; Privacy in
Internet of Things: From Principles to
Technologies, IEEE Internet Of Things Journal,
VOL. 6, NO. 1, 2019.
2. Diego Mendez, Ioannis
Papapanagiotou, Baijian Yang; Internet of
Things: Survey on Security and Privacy,
https://arxiv.org/pdf/1707.01879.pdf, 2017.
3. Mengmeng Yang, Lingjuan Lyu, Jun
Zhao, Tianqing Zhu, Kwok-Yan Lam; Local
Differential Privacy and Its Applications: A
Comprehensive Survey,
https://arxiv.org/pdf/2008.03686.pdf, 2020.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl> |
|
|
62
|
Game Theory Meets
Privacy-preserving Distributed Learning.
Companies, organizations, and even
individuals find mutual benefit in sharing
their own information to make better decisions
or to increase their revenues. However,
generally due to privacy concerns, the data holders
are unwilling to share their own tables of data
but are interested in getting information from
other parties' data. Hence, it is essential
to define a platform in which several
aspects of data sharing come under consideration
and, through a game-theoretic approach, all
parties relax their privacy requirements as much
as possible to obtain a more effective output.
In this project, we plan to model
data sharing as a game in which several aspects
are considered: 1) the value of the shared data
(freshness, size, ...), 2) privacy gain (in
terms of anonymization, differential privacy,
etc.), 3) trust or reputation, and 4) the utility
of the result. The outcome of the game is a
Nash equilibrium in which the best balance
between utility and privacy is obtained.
The following papers are suggested
to be studied for this project:
1. Ningning Ding, Zhixuan Fang, Jianwei Huang;
Incentive Mechanism Design for Federated
Learning with Multi-Dimensional Private
Information, 18th International Symposium on
Modeling and Optimization in Mobile, Ad Hoc, and
Wireless Networks (WiOPT), 2020.
2. Yufeng Zhan, Jie Zhang, Zicong
Hong, Leijie Wu, Peng Li, Song Guo; A Survey of
Incentive Mechanism Design for Federated
Learning, IEEE Transactions on Emerging Topics
in Computing, 2021.
3. Ningning Ding; Zhixuan Fang;
Lingjie Duan; Jianwei Huang; Incentive Mechanism
Design for Distributed Coded Machine Learning,
IEEE Conference on Computer Communications
(InfoComm), 2021.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl> |
|
|
63
|
The Output Privacy of
Collaborative Classifiers Learning.
Privacy-preserving data mining has
focused on obtaining valid results when the input
data is private. For example, secure multi-party
computation techniques are utilized to construct
a data-mining algorithm over the whole distributed
data without revealing the original data.
However, these approaches might still leave
potential privacy breaches, e.g., through
the structure of a decision tree constructed on
the protected shared data.
The aim of this project is to
investigate how the output of a classifier
constructed collaboratively over private data
violates the privacy of the input data. We then plan to
propose solutions to reduce the privacy leakage
in this setting.
The following papers are suggested
to be studied for this project:
1. Qi Jia, Linke Guo, Zhanpeng Jin,
Yuguang Fang; Preserving Model Privacy for
Machine Learning in Distributed Systems, IEEE
Transactions on Parallel and Distributed
Systems, 2018.
2. Reza Shokri, Marco Stronati,
Congzheng Song, Vitaly Shmatikov; Membership
Inference Attacks Against Machine Learning
Models, IEEE Symposium on Security and Privacy,
2017.
3. Ting Wang, Ling Liu, Output
Privacy in Data Mining, ACM Transactions on
Database Systems, 2011.
4. Radhika Kotecha, Sanjay Garg;
Preserving output-privacy in data stream
classification, Progress in Artificial
Intelligence, June 2017, Volume 6, Issue 2, pp
8710.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl> |
|
|
64
|
On the Trade-off between
Utility Loss and Privacy Gain in
LDP-based Distributed Learning.
Local Differential Privacy (LDP) is
a notion of privacy that provides a very strong
privacy guarantee by protecting confidential
information on the users' side. In this setting,
the users generally employ a randomization
mechanism to perturb the data on their own devices.
The collected data, when aggregated, preserves some
statistical properties, e.g., the mean value can be
computed from the perturbed data. This interesting
property of LDP has led to its wide application
in many real-world scenarios. In particular, it
has been used as an effective tool in privacy-preserving
distributed machine learning.
However, a thorough analysis of the
trade-off between utility loss and
privacy gain in LDP-based distributed learning
is missing.
In this project we plan to
investigate the utility-privacy trade-offs in
learning some well-known classifiers when they
are trained on distributed data respecting LDP.
The following papers are suggested to be studied
for this project:
1. Emre Yilmaz, Mohammad Al-Rubaie,
Morris Chang; Locally Differentially Private
Naive Bayes Classification,
https://arxiv.org/pdf/1905.01039.pdf, 2019.
2. Mengmeng Yang, Lingjuan Lyu, Jun
Zhao, Tianqing Zhu, Kwok-Yan Lam; Local
Differential Privacy and Its Applications:
A Comprehensive Survey,
https://arxiv.org/pdf/2008.03686.pdf, 2020.
3. Mario S. Alvim, Miguel E.
Andres, Konstantinos Chatzikokolakis, Pierpaolo
Degano, Catuscia Palamidessi; Differential
Privacy: on the trade-off between Utility and
Information Leakage, 2011.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl> |
|
|
65
|
Deep Learning for Private
Text Generation.
The recent development of Deep
Learning has led to its success in tasks related
to text processing. In particular, Recurrent
Neural Networks (specifically LSTMs) have served as
effective tools for next-word prediction. However,
the application of Deep Learning to 1)
generating text that respects some privacy
constraints and 2) predicting the next word in a
sentence in such a way that confidential
information is protected is currently missing.
In this project, we plan to employ
Deep Learning as a tool to detect the
words and sentences that cause privacy violations
by uniquely identifying a person (or other
confidential information linked to a person) in
a text, and to replace them with meaningful
substitute words. We also plan to design an LSTM
that suggests next words
while taking the protection of text privacy into account.
1. Shervin Minaee, Nal
Kalchbrenner, Erik Cambria, Narjes Nikzad,
Meysam Chenaghlu, Jianfeng Gao; Deep
Learning-based Text Classification: A
Comprehensive Review, ACM Computing Surveys,
2021.
2. Ankush Chatterjee, Umang Gupta,
Manoj Kumar Chinnakotla, Radhakrishnan Srikanth,
Michel Galley, Puneet Agrawal; Understanding
Emotions in Text Using Deep Learning and Big
Data; Computers in Human Behavior, 2019.
3. Hong Liang, Xiao Sun, Yunlei Sun
& Yuan Gao; Text feature extraction based on
deep learning: a review, EURASIP Journal on
Wireless Communications and Networking volume,
2017.
4. Andrew Hard, Kanishka Rao, Rajiv
Mathews, Swaroop Ramaswamy, Francoise Beaufays,
Sean Augenstein, Hubert Eichner; Federated
Learning for Mobile Keyboard Prediction, 2019.
|
Mina Alishahi
<mina.sheikhalishahi=>ou.nl>
|
|
|
66
|
Automatically calculate the
financial footprint of an application
(container) inside the (public) cloud.
Background information
Using (public) cloud resources is becoming a
more popular option today [1]. However, moving (part
of) the infrastructure/applications to a
cloud environment introduces several
challenges; one of these is determining the
financial footprint of said applications. Many
cloud providers offer tools to determine a
(rough) estimate of the cost of moving to the cloud (e.g.
the AWS Pricing Calculator). However, much of the
information needed to determine the cost has to be
entered manually into such tools. A
tool/framework that allows automatic
calculation/estimation of the cost of a (web) application
would provide valuable and more accurate insight
for companies that want to move to the cloud.
Problem Description
Currently it is hard to determine up front, based
on static cost analysis, what the effective costs
of an application will be inside the public cloud.
Effective application behaviour cannot easily
be taken into account up front, which could
be a blocking factor for a client wanting to migrate to
the cloud or even adopt a cloud (native)
strategy.
Research
Determine the feasibility of developing a
method/framework to automatically determine the
effective financial footprint of an application
in the public cloud, and prove it with a
Proof of Concept.
Scope suggestions and requirements
on Proof of Concept implementation:
- an application is already containerized
- an application is not (yet) containerized
- input/output specification is part of the
framework to generate input for the application
to start generating CPU cycles (inside the
public cloud) and to fetch the bill from e.g.
AWS after X amount of time.
[1] https://www.gartner.com/en/newsroom/press-releases/2021-04-21-gartner-forecasts-worldwide-public-cloud-end-user-spending-to-grow-23-percent-in-2021
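As a toy illustration of the kind of estimate such a framework could automate, a first-order serverless cost model might look like the sketch below. The rate defaults are illustrative placeholders, not authoritative AWS pricing, and a real framework would fetch the actual bill instead:

```python
def lambda_cost(invocations, avg_ms, memory_mb,
                gb_second_rate=0.0000166667, request_rate=0.0000002):
    """First-order serverless cost: compute (GB-seconds) plus per-request charges.
    The default rates are illustrative placeholders, not authoritative pricing."""
    gb_seconds = invocations * (avg_ms / 1000.0) * (memory_mb / 1024.0)
    return gb_seconds * gb_second_rate + invocations * request_rate

# Example: one million 200 ms invocations at 512 MB per month.
monthly = lambda_cost(1_000_000, avg_ms=200, memory_mb=512)
```

The gap between such a static estimate and the effective bill, once real application behaviour is generated against the deployment, is exactly what the proposed framework would measure.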
|
Maurice Mouw
<maurice.mouw=>sue.nl>
Serge van Namen <serge.van.namen=>sue.nl>
|
|
|
67
|
Future
proofing networks: On core routing and SRv6
IPv6 was introduced in 1998 and is intended to be
the successor of IPv4. The address space of IPv6 is
2^96 times bigger than that of IPv4, and the transition
to IPv6 has now been going on for two decades.
Where there is the ability to move away from MPLS,
SRv6 might become an alternative. The Label
Distribution Protocol (LDP) is used over
IPv4/IPv6 to exchange labels in MPLS environments;
how can this task be accomplished in SRv6? Would it
be possible to operate both stacks side by side
while moving away from MPLS towards SRv6? How does
4PE differ from the SRv6 technology, and can they be
related at all? And does SRv6 have the same key
capabilities as MPLS, such as L2VPNs? It is interesting
to look at how SRv6 can be implemented in
existing environments instead of focussing on
greenfield situations.
This project focuses on the workings of routing IPv4
traffic over SRv6. Additionally, it will cover
route exchange mechanisms using MP-BGP and the
differences between the OSPF and IS-IS extensions for
SRv6. This research project will answer the question
of whether the switch from MPLS towards SRv6 would be
feasible in an existing environment while providing
comparable features. |
Ruben Valke
<ruben=>2hip.nl>
Sander Post
<sander.post=>os3.nl>
Krik van der Vinne
<krik.vandervinne=>os3.nl> |
|
1
|
68
|
Post-Exploitation
Defence of Git Repositories using Honey Tokens
It is not unheard of that secrets leak through
open-source Git repositories or similar version
control software. Many tools have been developed
that assist in detecting secrets that are
accidentally pushed to the repository, notifying the
developer upon detection. However, much less
attention seems to be given to incidents in which
the leak is another type of sensitive data: the
complete content of the repository itself. This
study aims to shed some light on this issue by
determining whether an additional level of security
can be added to Git repositories in the form of
honey tokens. Adding honey tokens to Git
repositories means creating a Defense-in-Depth
measure that raises an alarm once a repository is
cloned or viewed. Moreover, the possibilities of
trip-wiring repositories with honey tokens are
reviewed by considering the applicability, usability
and effectiveness of the created tokens along with
the options of using these notifications to start
Incident Response workflows. Overall, the study
presents a way for security teams to create 'tokened
repositories' as a last line of defense for
compromised credentials to Git repositories. |
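A minimal sketch of what planting such a token could look like (our own illustration; the decoy file name and format are assumptions, and the alerting backend that watches for token use is out of scope here):

```python
import json, tempfile, uuid
from pathlib import Path

def plant_honey_token(repo_dir, registry_path):
    """Drop a decoy secrets file into a repository and record its unique token,
    so that any later use of the token can be traced back to the cloned repo.
    (The alerting backend that monitors for token use is not part of this sketch.)"""
    token = uuid.uuid4().hex
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)
    # A decoy file that anyone scanning a cloned repository is likely to grab.
    (repo / ".env").write_text(
        "AWS_ACCESS_KEY_ID=AKIA" + token[:16].upper() + "\n"
        "AWS_SECRET_ACCESS_KEY=" + token + "\n"
    )
    # Map the token back to the repository it was planted in.
    registry = Path(registry_path)
    entries = json.loads(registry.read_text()) if registry.exists() else {}
    entries[token] = str(repo)
    registry.write_text(json.dumps(entries))
    return token

# Demo: plant a token in a scratch directory.
workdir = tempfile.mkdtemp()
token = plant_honey_token(workdir + "/repo", workdir + "/registry.json")
```

When a planted credential later shows up in authentication logs, the registry identifies which repository was cloned, which is the trigger for the Incident Response workflows discussed above.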
Melanie Rieback
<melanie=>radicallyopensecurity.com>
Max van der Horst
<Max.vanderHorst=>os3.nl> |
|
1
|
69
|
Audio
as an entropy source for true random number
generation.
There are many fields where there is a big reliance
on generating nondeterministic and unpredictable
sequences of numbers. True random number generators
(TRNGs) try to achieve this by extracting randomness
- also sometimes called entropy - from some type of
physical source. A wide variety of these physical
sources are available to choose from. Thermal noise,
natural clock jitter, and keyboard timing and mouse
movements - used by Intel, AMD, and Linux,
respectively - are all examples of physical sources
to extract randomness from. Sound is also an
interesting physical source where randomness could
be extracted from. This research will focus on how
random the numbers from a sound-based TRNG are, how
these numbers compare to numbers supplied by other
forms of random number generation, and how viable
the practical use of a sound-based TRNG is when
looking at efficiency and throughput. |
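To make the pipeline concrete, here is a sketch of one plausible design (our own illustration, using simulated samples in place of real microphone capture): take the least-significant bit of each audio sample as the raw entropy source, then apply von Neumann debiasing:

```python
import random

def lsb_stream(samples):
    """Use the least-significant bit of each 16-bit sample as the raw bit stream."""
    return [s & 1 for s in samples]

def von_neumann(bits):
    """Debias: read non-overlapping pairs, emit the first bit of '01'/'10', drop '00'/'11'."""
    return [a for a, b in zip(bits[0::2], bits[1::2]) if a != b]

# Stand-in for microphone capture: simulated samples with a deliberately biased LSB.
# Real code would read frames from the sound card (e.g. via an ALSA or PyAudio stream).
rng = random.Random(42)
samples = [rng.randrange(65536) | (1 if rng.random() < 0.2 else 0)
           for _ in range(20000)]
raw = lsb_stream(samples)          # biased: ones occur roughly 60% of the time
debiased = von_neumann(raw)        # ~unbiased, at the cost of throughput
```

The throughput cost is visible immediately: von Neumann extraction discards more than half of the raw bits, which is one of the efficiency trade-offs this research would quantify.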
Taco Walstra
<t.r.walstra=>uva.nl>
Oscar Muller
<Oscar.Muller=>os3.nl>
|
|
1
|
70
|
Bring
your own Living of the Land binaries
Malware has repeatedly found and actively used new
techniques to perform malicious activities
undetected. Living Off The Land Binaries and Scripts
(LOLBAS) are Microsoft-signed files, either native to
the OS or downloaded from Microsoft. They sometimes
include extra "unexpected" functionality that is of
no interest to regular users, or that is deliberately
left undocumented, and that can be misused for malware
or red teaming. Example use cases are: executing code;
file operations like downloading, uploading, and
copying; persistence; UAC bypass; dumping process
memory; and/or DLL side-loading. Various software
packages simplify maintainability in enterprise
environments and are therefore installed by sysadmins.
This research focuses on a set of commonly trusted,
third-party enterprise applications that unpack
binaries and libraries during installation which can
be misused for malicious activities.
|
Roy Duisters
<roy.duisters=>shell.com>
Vincent Denneman
<vincent.denneman=>os3.nl> |
|
1
|
71
|
Development
of an open source malicious network traffic
generator based on MITRE ATT&CK
Currently, most network intrusion detection systems
incorporate artificial intelligence, specifically
machine learning and deep learning. Such systems
need to be trained on a dataset that embeds
realistic malicious traffic within regular network
traffic. These datasets are hard to find because
they either contain sensitive information or
outdated traffic, or lack realistic malicious traffic.
For that reason, this study aims to build a
framework through which malicious network traffic
can safely be generated and included in a
dataset of realistic synthetic network traffic.
|
Irina Chiscop
<irina.chiscop=>tno.nl>
Jeroen van Saane
<jeroen.vansaane=>os3.nl>
Dennis van Wijk <dennis.vanwijk=>os3.nl> |
|
1
|
41
|
An
analysis of the security of LiFi and WiFi
systems.
WiFi is the de facto standard for Wireless Local
Area Networks for communication service providers
globally. LiFi is a relatively new technology using
Optical Wireless Communication. The term LiFi was
coined in a TED Talk in July 2011 [1]. LiFi has now
been commissioned in defence [2] and standardisation
has commenced in the ITU [3] and IEEE [4].
One of the key claims of LiFi is the additional
security that the restriction of the physical
transmission medium brings. Light is unable to
penetrate solid objects and so any transmission in a
room, stays internal to the room. In addition,
standard AES encryption is added to communication
links. There are numerous claims that LiFi is more
secure than WiFi [5] but WiFi has made enormous
strides in recent years with the introduction of
WPA3 and other mechanisms.
Our challenge is to understand whether LiFi, as a
wireless transmission medium, is as secure as, or
more secure than, WiFi (both legacy and latest
versions). We propose testing LiFi and WiFi in a
proof-of-concept environment based on the latest
generally available equipment to provide a side-by-side
comparison.
[1] TED Talks July 2011.
https://www.ted.com/talks/harald_haas_wireless_data_from_every_light_bulb
[2] BBC News April 2021
https://www.bbc.co.uk/news/uk-scotland-scotland-business-56900762
[3] ITU G9991 May 2019
https://www.itu.int/ITU-T/workprog/wp_item.aspx?isn=13397
[4] IEEE 802.11bb Nov 2021
https://www.ieee802.org/11/Reports/tgbb_update.htm
[5] Why LiFi is more secure than WiFi
https://lifi.co/why-lifi-is-more-secure-than-wifi/
|
Vegt, Arjan van der
<avdvegt=>libertyglobal.com>
Carmen Veenker
<c.m.i.veenker=>uva.nl> |
|
|
72
|
Misuse
of vulnerable WordPress websites by malicious
actors.
This research will look into the misuse of
vulnerable WordPress websites by malicious actors for
resource development as an aid to their cyber
operations. Using OSINT techniques and data analysis
on open HTTP & HTTPS data, we will look at how
many vulnerable websites are currently running
globally. Furthermore, we will analyse how many of
these websites are compromised and classified as
malicious by comparing the data against Open Cyber
Threat Intelligence feeds. We will determine what
purpose the compromised websites serve within the
cyber operations, and which actors and groups are
mostly affiliated with this attack methodology.
Lastly, we will propose a proof-of-concept mechanism
that uses Threat Intelligence to mitigate such
attacks on the webserver side. |
Jordi Scharloo
<jordi.scharloo=>kpn.com>
Talha Uar
<tucar=>os3.nl>
|
|
1
|
73
|
Detecting
NTLM relay attacks using a honeypot.
Abstract:
NTLM relay attacks are very prevalent at the moment
and new ways to trigger NTLM authentication methods
have been found this year (printerbug and
petitpotam). These authentications can then be
relayed to targets like Active Directory Certificate
Services. In this research we will make a framework
to detect NTLM relay attacks in a generic way. The
goal is to build a framework that can detect all
NTLM relay attacks instead of just the
vulnerabilities that are already known. This way new
vulnerabilities can be detected, investigated and
mitigated. We will build a honeypot as a PoC for the
framework.
Steps:
* Giving a theoretical overview of NTLM
authentication and NTLM relay attacks
* Creating a framework to detect NTLM relay attacks
based on the behaviour of the systems
* Building a honeypot as PoC for the detection of
NTLM relay attacks
* Evaluating the framework based on the honeypot
data
Sources:
https://posts.specterops.io/certified-pre-owned-d95910965cd2
https://github.com/topotam/PetitPotam |
Robert Diepeveen
<robert.diepeveen=>northwave.nl>
Maurits Maas
<Maurits.Maas=>os3.nl>
Freek Bax <fbax=>os3.nl>
|
|
2
|
74
|
Industrial
programmable logic controller automation with
configuration management tools.
Research in Industrial Control Systems: The
configuration of PLC (Programmable Logic Controller)
devices in an ICS environment is not yet automated.
It would be interesting to research the feasibility
of automating this process using Siemens PLCs. The
solution would be based on Ansible, Chef or Puppet. |
Chandni Raghuraman
<craghuraman=>deloitte.nl>
Pavlos Lontorfos <plontorfos=>deloitte.nl>
Nathan Keyaerts
<nathan.keyaerts=>os3.nl> Mattijs
Blankesteijn <mblankesteijn=>os3.nl> |
|
1
|
75
|
Implementing
side channel resistance in QARMA.
QARMA is a tweakable lightweight block cipher. This
project explores the design of an implementation of
QARMA resistant against power analysis attacks.
|
Marco Brohet
<m.j.a.brohet@uva.nl>
Francesco Regazzoni <f.regazzoni=>uva.nl>
Joris Janssen
<Joris.Janssen=>os3.nl> |
|
2
|
76
|
Enriching
IDS detection on network protocols using
anomaly-based detection.
The growing cyberthreat has led to the rise of
Network IDS (NIDS). However, anomaly-based NIDS
suffer from high false positive rates and, if
Machine Learning (ML) based, from a lack of
explainability. In this research, an anomaly-based
ML solution for the Domain Name System (DNS) is
created on top of Zeek, with promising results. The
best performing model without hyper-parameters is
Local Outlier Factor, while the best model with
hyper-parameters is Isolation Forest. Overall, the
hyper-parameters appear to reduce performance.
Additionally, steps are made towards a cookbook for
other protocols. The discussion outlines future
work, such as examining other hyper-parameters,
other protocols and real-world performance.
|
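The idea of anomaly-based detection on a single DNS feature can be illustrated with a deliberately simple baseline. This z-score sketch (sample queries and threshold are my own illustration, not the Zeek/Local Outlier Factor/Isolation Forest pipeline used in the research) flags query names whose length deviates strongly from the observed baseline:

```python
import statistics

# Baseline traffic plus one suspiciously long query name (e.g. DNS tunneling).
queries = ["www.example.com", "mail.example.com", "ns1.example.com",
           "cdn.example.com", "a" * 60 + ".exfil.example.com"]

lengths = [len(q) for q in queries]
mean = statistics.mean(lengths)
stdev = statistics.stdev(lengths)

def is_anomalous(query, threshold=1.5):
    """Flag queries whose length deviates strongly from the baseline."""
    return abs(len(query) - mean) / stdev > threshold

print([q for q in queries if is_anomalous(q)])
```

Real detectors would of course use many features and a trained model; this only shows the shape of the approach.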
Francisco Dominguez
<francisco.dominguez=>huntandhackett.com>
Pim van Helvoirt
<Pim.VanHelvoirt=>os3.nl> |
|
1
|
77
|
Comparison
of state-of-the-art endpoint defence solutions
to (partially) open-source endpoint defence
Endpoint defence has evolved a lot in the last
decade, and the old anti-malware / anti-virus
software is now only a small sub-section of
state-of-the-art endpoint defence solutions. Instead
of anti-malware / anti-virus, we now talk about
Endpoint Detection and Response (EDR), Data Loss
Protection (DLP), File Integrity Monitoring (FIM)
and other fancy terms that suppliers have the
creativity to come up with. The biggest suppliers on
the market are busy expanding their software with
new features. This project gives the students access
to some vendor trial licences (1 or more) to compare
the functionality of the products with free and
open-source product offerings. Depending on student
ability, the project can result in the development
of new features for open-source products. The
minimum expected deliverable of the project is a
comparison report and a proposed development path to
improve the open-source or proprietary products. |
Peter Prjevara
<peter.prjevara=>solvinity.com>
Dennis van Wijk
<dwijk=>os3.nl>
|
|
|
78
|
Baruwa
mail security solution - is it good enough?
Baruwa is an open-source mail security solution
(https://pythonhosted.org/baruwa/introduction.html).
It builds on other open-source components, such as
SpamAssassin and ClamAV, to deliver protection
against malicious e-mail. The effectiveness of the
solution depends on the individual components.
SpamAssassin
(https://github.com/apache/spamassassin) for
instance promises that it "differentiates
successfully between spam and non-spam in between
95% and 100% of cases", but is this true? If so,
with what configuration? One research question could
be related to this: what is the ideal SpamAssassin
configuration to achieve this ratio? Does the
default configuration suffice? A student could,
however, also choose to focus on a different
component of Baruwa - the ultimate goal of this
project is to assess limitations of and improve on
the capabilities of this open-source product. A
broader research question could then be: could
additional components be added to Baruwa to increase
its capabilities? Another interesting question is
the assessment of Baruwa's enterprise licences: are
they worth the money compared to the open-source
product? |
Peter Prjevara
<peter.prjevara=>solvinity.com>
Bram Peters
<Bram.Peters=>os3.nl>
|
|
|
79
|
Investigate
hidden VNC methodologies for malware
During a Red Teaming engagement we simulate a
realistic cyber attack on an organisation. The end
goal of such a simulation is to achieve a real
impact within the organisation. These final "actions
on the objective" can require interaction with
applications that are specific to the client. For
example, a railroad company will use very specific
software to control the trains. The most practical,
and sometimes only, way to interact with these tools
is via a graphical user interface. However, common
methods of interacting with GUI applications, such
as the RDP protocol, are not as stealthy as an
attacker would like. One technique that is used in
the wild is the concept of a Hidden VNC service.
This technique provides an operator with a VNC-like
experience, while remaining hidden from the real
desktop. The goal of this research is to investigate
various methods to create a Hidden VNC service,
compare the pros and cons of each method and
implement the best technique. |
Huub van Wieren
<vanWieren.Huub=>kpmg.nl>
Antonio Macovei
<Antonio.Macovei=>os3.nl>
Shadi Alhakimi<shadi.alhakimi=>os3.nl>
|
|
|
80
|
Implement
lateral movement techniques in Beacon Object
File (BOF) format
Due to increasingly effective endpoint detection and
response (EDR) solutions, attackers are required to
move from living off the land binaries to bring your
own code techniques. In other words, they no longer
use tools located on the systems themselves, but
instead dynamically introduce new code into the
malware process when necessary. Various techniques
exist to execute code in-memory, each creating
different artifacts that can be detected. Currently,
the most stealthy method is inline execution, where
byte code is introduced in the current thread.
Existing tooling mostly relies on less stealthy
execution techniques. The goal of this research is
to create stealthy, inline implementations of the
most important lateral movement techniques. |
Huub van Wieren
<vanWieren.Huub=>kpmg.nl>
|
|
|
81
|
Does
the oscillation protection mechanism of a hard
disk drive provide enough vibration data to
recover low-quality, audible voice data from
their physical environment?
In 2014, Michalevsky et al. researched the ability
to use pattern recognition to recover voice data
from very low-quality oscillation signals. In 2017,
Ortega revealed at a conference that hard disks can
function as a basic gyrophone using their oscillation
protection feature. In 2018, Bolton et al. discussed
this demonstration in a paper. However, up until
now, no scientific research has been carried out to
show if it is possible to use this behavior to
recover low-quality voice data. We would like to see
someone researching this question. If there is
enough time and the student has the ability to do
it: a proof of concept would be a "nice-to-have",
but this is not a requirement. |
Maarten van der
Slik (NL) <maarten.van.der.slik=>pwc.com>
Wouter Otterspeer (NL)
<wouter.otterspeer=>pwc.com>
Floris Heringa
<floris.heringa=>os3.nl> |
|
|
82
|
Project
with SURF
At SURF, we plan on providing virtualized routers to
our constituents. Virtual routers give us
scalability and efficiency, and lend an on-demand
character to the services we offer to our constituents.
Such a service would be very interesting for our
international research partners that connect to
NetherLight, mostly with 100G connections, but also
for other use cases such as offering our EduVPN
service on the NFV platform.
While developing the virtual router "as a service"
several questions arise, such as:
- How can we give our constituents as much
functionality as possible?
- How do virtual routers perform?
- Which virtual router performs the best?
- How do we manage security?
- How do we 'slice' systems to cope with 'noisy
neighbors'?
For our NFV-infrastructure, we operate a heavily
tuned KVM/qemu setup, with VPP as a software
data-plane for acceleration configured on Lenovo
SR635 servers with AMD Epyc2 CPUs and Mellanox
Connect-X 5 100Gbit NICs.
How can we make virtual routing a success with this
set-up?
|
Marijke Kaat
<marijke.kaat=>surf.nl> Eyle Brinkhuis
<eyle.brinkhuis=>surf.nl>
Inigo Gonzalez de Galdeano
<Inigo.GonzalezdeGaldeano=>os3.nl>
Imre Fodi <Imre.Fodi=>os3.nl>
|
|
|
83
|
Researching
missing/incorrect (Linux) system calls in
Qiling Framework
Qiling is an advanced binary emulation framework
written in Python (and building on Unicorn Engine
and QEMU) that is cross-platform and
cross-architecture. It is possible to run binaries
for several
architectures and operating systems. Some uses are
instrumenting binaries for security research or live
patching of binaries. Not every binary can currently
be run successfully: several very standard Linux
binaries will fail because not every system call is
supported yet, or because some do not work correctly
for 64-bit binaries or for some big-endian platforms
(example: big-endian MIPS).
Your task is to identify the system calls that are
missing or incorrect when running x86_64 Linux
binaries in Qiling Framework without error, search
for any anti-patterns that the developers might have
used and ideally add/fix the missing/incorrect
system calls and contribute back to Qiling
Framework.
For this project you will need to know how to
program Python and read some basic C (C library,
Linux kernel). |
Armijn Hemel - Tjaldur Software
Governance Solutions <armijn=>tjaldur.nl>
|
|
|
84
|
Performance
of RSA in OpenSSL vs. libgcrypt
The goal of this project is to investigate why RSA
(for blind signatures) with OpenSSL seems to be
about 7x as fast (on AMD64) as the implementation in
libgcrypt. Once the cause has been identified, the
goal is to modify the libgcrypt implementation to
catch up with OpenSSL --- unless of course there is
a good security reason why libgcrypt is slower (but
that seems hard to believe and would be a surprising
result). It should be noted that while the
underlying bignum arithmetic is likely in assembler,
it is doubtful that this is the cause of the
performance difference (perhaps libgcrypt fails to
properly implement CRT optimizations?).
Any resulting code should ideally be provided in a
way that is suitable to be merged into libgcrypt
(clearly demonstrated performance improvement, clear
coding style and licensing under LGPLv3+). |
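The CRT speculation above is easy to check in miniature. A toy sketch (textbook-sized parameters of my own choosing, not libgcrypt or OpenSSL code) of the optimization in question: the private-key operation done as two half-size exponentiations mod p and mod q instead of one full-size exponentiation mod n, which is why a library missing it can be several times slower:

```python
# Toy RSA-CRT demonstration with classic textbook parameters.
p, q = 61, 53
n = p * q                  # 3233
e, d = 17, 2753            # e*d ≡ 1 (mod lcm(p-1, q-1))
dp, dq = d % (p - 1), d % (q - 1)
q_inv = pow(q, -1, p)      # precomputed CRT coefficient

def decrypt_plain(c):
    return pow(c, d, n)    # one full-size exponentiation

def decrypt_crt(c):
    m1 = pow(c, dp, p)     # half-size exponentiation mod p
    m2 = pow(c, dq, q)     # half-size exponentiation mod q
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q      # Garner recombination

c = pow(42, e, n)          # encrypt the message 42
print(decrypt_plain(c), decrypt_crt(c))  # 42 42
```

Profiling whether libgcrypt actually skips this (or something like it) would be the first step of the project.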
Christian Grothoff
<grothoff=>gnu.org>
|
|
|
85
|
Research
usefulness of running semgrep on pseudo C code
obtained from decompilation with angr
Semgrep[1] is a tool to perform static analysis on
source code. Very often when analysing programs,
source code is not available and only a binary file
is at hand. There are platforms, such as angr[2],
which make it possible to (partially) decompile the
code and generate pseudo C code that (to an extent)
resembles the original C code. Combining the two
might make it possible to apply the power of tools
such as semgrep to the domain of binary files. It
seems that (apart from Java) this has not yet been
tried (an Internet search did not reveal anything).
Your tasks:
1. decompile (Linux) ELF binaries for which source
code is available using angr and generate pseudo C
code
2. run semgrep on the generated pseudo C code (you
might need to write some semgrep rules for this
yourself)
3. run semgrep on the original source code
4. compare the results of 2. and 3. and report
You will need to know Linux and Python. Some
knowledge about the ELF format might be useful as
well.
[1] https://semgrep.dev/
[2] https://angr.io/ |
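Task 2 notes that semgrep rules may need to be written for the generated pseudo C. As a purely illustrative config fragment (the rule id and message are hypothetical, not taken from the semgrep registry), a minimal rule flagging unbounded `strcpy` calls in C-like code could look like:

```yaml
rules:
  - id: unbounded-strcpy
    pattern: strcpy($DST, $SRC)
    message: strcpy copies without a length bound; inspect the pseudo C context
    languages: [c]
    severity: WARNING
```

Whether such patterns still match after angr's decompilation renames variables and restructures control flow is exactly what the comparison in tasks 2-4 would reveal.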
Armijn Hemel - Tjaldur Software
Governance Solutions <armijn=>tjaldur.nl>
|
|
|
86
|
Efficient
secure neural network inference using
multi-party computation
Secure neural network inference assumes the
following setup. A service provider Bob offers
neural network inference as a service, and a client
Alice wants to use this service for a particular
input. The aim is that Alice gets the output of the
neural network for her input, but Alice should not
learn anything about the parameters (weights, biases
etc.) of the neural network, and Bob should not
learn anything about the input provided by Alice.
Secure multi-party computation is a family of
techniques that enable two or more actors to jointly
evaluate a function on their private inputs without
revealing anything but the function's output. Using
secure multi-party computation to achieve secure
neural network inference has been an active research
area in recent years. The main challenge is how to
make secure multi-party computation efficient
enough.
In this project, the student(s) will work with
software from Microsoft Research that implements
secure neural network inference based on secure
multi-party computation, available from
https://github.com/mpc-msri/EzPC and described in
[1-3]. While these papers showed that secure
multi-party computation has the potential to
efficiently perform secure neural network inference,
the evaluation setup in these papers was rather
impractical (for example, using a single client that
is equally strong as the server). The aim of this
project is to evaluate the efficiency of this
approach in more practical settings.
Specific questions to answer in the project may
include the following:
1. If the computational capacity of the client and
server machines differs, how is overall performance
(latency, throughput) impacted by the difference in
the machines' computational capacity?
2. If the same server serves multiple clients, how
does the number of clients influence the system's
performance?
3. How can server-side parallelization be used to
improve system performance?
References
[1] Nishant Kumar, Mayank Rathee, Nishanth Chandran,
Divya Gupta, Aseem Rastogi, Rahul Sharma. CrypTFlow:
Secure TensorFlow Inference. 2020 IEEE Symposium on
Security and Privacy (SP 2020), pp. 336-353, 2020
[2] Deevashwer Rathee, Mayank Rathee, Nishant Kumar,
Nishanth Chandran, Divya Gupta, Aseem Rastogi, Rahul
Sharma. CrypTFlow2: Practical 2-Party Secure
Inference. 2020 ACM SIGSAC Conference on Computer
and Communications Security (CCS '20), pp. 325-342,
2020
[3] Deevashwer Rathee, Mayank Rathee, Rahul Kranti
Kiran Goli, Divya Gupta, Rahul Sharma, Nishanth
Chandran, Aseem Rastogi. SiRnn: A math library for
secure RNN inference. 2021 IEEE Symposium on
Security and Privacy (SP 2021), pp. 1003-1020, 2021
|
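The core MPC building block behind such protocols can be shown in a few lines. A minimal sketch (my own toy construction with a trusted dealer, not the EzPC/CrypTFlow implementation) of secure multiplication on additive secret shares using a Beaver triple:

```python
import random

M = 2 ** 32                              # all arithmetic in Z_M

def share(x):
    """Split x into two additive shares."""
    s0 = random.randrange(M)
    return s0, (x - s0) % M

# Dealer: random triple with a * b = c, shared between the two parties.
a, b = random.randrange(M), random.randrange(M)
c = (a * b) % M
a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)

def beaver_mul(x, y):
    x0, x1 = share(x); y0, y1 = share(y)
    # The masked differences d = x - a and e = y - b are opened publicly;
    # they reveal nothing about x and y because a and b are uniform.
    d = (x0 - a0 + x1 - a1) % M
    e = (y0 - b0 + y1 - b1) % M
    # Local share computation; only party 0 adds the public d*e term.
    z0 = (c0 + d * b0 + e * a0 + d * e) % M
    z1 = (c1 + d * b1 + e * a1) % M
    return (z0 + z1) % M                 # reconstruct z = x * y mod M

print(beaver_mul(6, 7))  # 42
```

Neural-network inference chains millions of such multiplications, which is why the communication and triple-generation costs studied in [1-3] dominate performance.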
Zoltan Mann
<z.a.mann=>uva.nl>
Daphne Chabal <d.n.m.s.chabal=>uva.nl>
|
|
|
87
|
The
role of IXPs in a SCION ecosystem
SCION[1] is a promising future internet architecture
that guarantees secure end-to-end communication and
enhanced route control. Together with failure
isolation and explicit trust, it has attracted the
attention not only from multiple researchers and
institutions but also from the networking
industry. Although SCION largely targets the ISP
world, it is interesting to investigate how the
low-latency paths of an Internet Exchange Point can
be combined with the SCION architecture without
disrupting its added benefits and functionalities.
The students of this project are asked to identify
all the critical SCION functionalities that an IXP
needs to adopt to respect the new architecture.
Based on that outcome, the students can design a
SCION IXP and build a small PoC that can run on the
2STiC testbed[2]. To save time in the implementation
part, the students can utilize the SCION code [3] of
SIDN Labs and make the necessary modifications to
prove their theory. As a last step, the students
will compare their research against the real-life
scenario [4] of SwissIX, where a few SCION-enabled
networks are connected to it.
[1] https://scion-architecture.net/
[2] https://2stic.nl/testbed.html
[3] https://github.com/sidn/p4-scion
[4] https://www.swissix.ch/public/scion_flyer.pdf
|
Stavros
Konstantaras
<stavros.konstantaras=>ams-ix.net>
Krik van der Vinne
<krik.vandervinne=>os3.nl> Leroy van der
Steenhoven <leroy.vandersteenhoven=>os3.nl>
|
|
|
88
|
Command
and Control over Microsoft Teams
Command and Control (C2) servers are used by
attackers to control operations of compromised
systems. C2 servers are typically used to store
stolen data and disseminate commands. Establishing
C2 communications is a vital component for
adversaries. For that reason, adversaries commonly
attempt to mimic expected traffic to avoid detection
and/or adhere to network restrictions. There are
various ways with varying stealth levels to
establish communication that are dependent on the
victim's network structure, and defenses. Since
enterprises are increasingly opting for Microsoft
Teams, leveraging such a platform for malicious
traffic might make it indistinguishable
from legitimate traffic. This research intends to
develop a novel command and control architecture
that leverages Microsoft Teams to establish
communication and disseminate commands.
|
Stefan Broeder
<stefan.broeder=>nl.abnamro.com>
Rob Mouris <rob.muris=>nl.abnamro.com>
Jeroen van Saane
<Jeroen.vanSaane=>os3.nl> |
|
|
89
|
Secure
Multiparty Computation
There is a lot of confidential data that is
interesting for researchers. Different technologies
exist to make this data available for re-use without
sharing the data. The simplest technology is
algorithm-to-data, which works great for single
datasets or federated machine learning. We created a
demonstrator to show this in action (see
https://dataexchange-test.lab.surf.nl/). Another
approach is to use cryptographic techniques: secure
multi-party computation (MPC) or (fully) homomorphic
encryption. TNO has made a demo of the latter, using
Paillier encryption:
https://mhe.github.io/jspaillier/. The goal of this
RP is to dive a bit deeper into these cryptographic
techniques, and answer the questions: (1) how does
this work in more detail? (2) What are the
properties of the different solutions? (3) Is there
a particular technology that is well suited for
re-using data for research? (4) If SURF were to
create a convincing demonstrator for either secure
MPC or homomorphic encryption, what should it look
like?
|
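As a starting point for question (1), the additive homomorphism of Paillier encryption (the scheme behind the TNO demo) can be demonstrated in a few lines. This is a toy sketch with tiny, insecure parameters of my own choosing, not the jspaillier code; real keys use primes of around 1024 bits:

```python
import math
import random

p, q = 17, 19                     # toy primes, far too small for real use
n = p * q
n2 = n * n
g = n + 1                         # standard choice of generator
lam = math.lcm(p - 1, q - 1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:    # r must be invertible mod n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphism: E(a) * E(b) mod n^2 decrypts to a + b (mod n).
c = (encrypt(12) * encrypt(30)) % n2
print(decrypt(c))  # 42
```

This additive property is what lets a party aggregate encrypted values without ever seeing them, one of the "properties of the different solutions" question (2) asks about.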
Freek Dijkstra
<freek.dijkstra=>surf.nl>
Cesar Panaijo
<Cesar.Panaijo=>os3.nl>
|
|
|
90
|
Measuring
Route Origin Validation of authoritative name
servers
The Domain Name System (DNS) and Border Gateway
Protocol (BGP) are two fundamental building blocks
of the internet. However, these protocols were
initially not developed with security in mind. For
instance, malicious groups can perform prefix
hijacking and additionally spoof a DNS name server
IP address in the hijacked IP prefix. Additionally,
BGP is also prone to route leaks. In 2008, Resource
Public Key Infrastructure (RPKI) [RFC6480, 1] was
proposed to address this issue.
RPKI is a hierarchical Public Key Infrastructure
(PKI) that binds Internet Number Resources (INRs),
such as Autonomous System Numbers (ASNs) and IP
addresses, to public keys via certificates. With the
RPKI certificate scheme, AS owners can prove that
they are authorized to advertise certain IP
prefixes. To make this certificate scheme work, the
Regional Internet Registries (RIRs) control the
trust anchors for each region.
We have been measuring Route Origin Validation
(ROV) of DNS resolvers since the beginning of 2020,
starting with a research project performed by SNE
students [2]. RPKI has seen a rapid increase in
deployment since that project, which we were able to
monitor closely thanks to this research.
With this project we aim to measure the other side
of the DNS spectrum: the authoritative servers. For
this we have a so-called RPKI beacon at our disposal
[3]. The beacon announces RPKI-invalid prefixes on
purpose; overlapping, less specific prefixes are
validly announced from elsewhere. One approach to
measure the state of Route Origin Validation of an
authoritative name server is to send it a query from
an IP address out of the invalidly announced prefix.
The state of ROV can then be determined by detecting
where the responses arrive.
This project is for you if you are interested in
(internet) measurements of real-life security which
will help create better future standards. Knowledge
of programming is useful in this project but not a
requirement.
[1] https://rpki.readthedocs.io/en/latest/
[2] https://rp.os3.nl/2019-2020/p04/report.pdf
[3]
https://docs.google.com/presentation/d/1Qb-HkRo4qMRxIJBqR54Dz7KYY7ITsIyhlDAtRF6CvVs/edit?usp=sharing |
Tom Carpay
<tom@nlnetlabs.nl> Willem Toorop
<willem@nlnetlabs.nl>
Brice Habets
<bhabets=>os3.nl>
Sander Post <sander.post=>os3.nl> |
|
|
91
|
Analysing
a real-world malicious Network Implant
Network Implants are a well-known tool for Red
Teamers and attackers. They can be bought
off-the-shelf (such as the Packet Squirrel) but
often attackers are looking for a more custom device
that fully adheres to their needs. One of our
clients found such a device that was used during an
actual attack. The device is already a bit older (+-
8 years) and was sophisticated in some aspects and
rather simple in others. It contains, among other
things, a LAN port, a 3G dongle with SIM card and a
fan, and was disguised as an ordinary network
switch. It was found near a Point-of-Sale terminal,
so the assumption is that the goal was to steal
credit card data. The objective of this research is
to analyse the device and determine its
functionality, components, software and usage
history. You have full access to the device, but cannot use
destructive research methods. |
Huub van Wieren
<vanWieren.Huub@kpmg.nl>
Floris Heringa
<floris.heringa=>os3.nl> |
|
|
92
|
Decentralized
proactive data protection in edge computing
Connected devices in the Internet of Things (IoT)
produce large amounts of valuable data. Data from
IoT devices is increasingly processed in edge
servers, i.e., geographically distributed computing
resources offering cloud-like services with low
latency to nearby IoT devices. In such a setting,
the protection of data from unauthorized access, as
required by data protection legislation for personal
data and by business imperatives for valuable
non-personal data, is challenging for multiple
reasons [1]. First, edge computing systems change
dynamically while being used (e.g., new edge servers
may become available or existing ones may be
removed, new applications may be deployed to the
edge servers, etc.). Such changes may introduce new data
protection risks on the fly. Second, edge servers
typically have only a local view of a part of the
network, which may prohibit them from detecting data
protection risks that stem from the interplay of
multiple nodes. Third, successfully mitigating an
identified data protection risk may require changes
in multiple edge servers, thus requiring
coordination among otherwise independent entities
[2].
The aim of this project is to develop a software
framework which allows the simulation of an edge
computing system and the implementation of and
experimentation with different coordination schemes
to achieve decentralized proactive data protection.
In this software framework, every node should have
its own - dynamically updated - model, corresponding
to the node's local knowledge of the network. The
nodes should exchange information about their
knowledge, using a gossip protocol. In addition, the
nodes should inform each other about identified data
protection risks, and coordinate with each other
using an auction protocol to decide on the best
mitigation strategy for an identified data
protection risk.
This project involves (online) collaboration with
the University of Duisburg-Essen in Germany.
References
[1] Z. Á. Mann. Data protection in fog computing
through monitoring and adaptation. KuVS-Fachgespräch
Fog Computing 2018, Technical Report, Technische
Universität Wien, pp. 25-28, 2018
[2] Z. Á. Mann, F. Kunz, J. Laufer, J. Bellendorf,
A. Metzger, K. Pohl. RADAR: Data protection in
cloud-based computer systems at run time. IEEE
Access, 9:70816-70842, 2021 |
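The gossip-based knowledge exchange described above can be sketched as a toy simulation (a hypothetical minimal model of my own, not the intended framework or the RADAR system): each node starts with only local knowledge and repeatedly merges its view with a random peer's until every node knows every fact.

```python
import random

random.seed(1)  # deterministic run for the example

class Node:
    def __init__(self, node_id):
        self.view = {node_id}            # local knowledge: facts this node knows

    def gossip_with(self, peer):
        merged = self.view | peer.view   # exchange and merge knowledge
        self.view = peer.view = merged

nodes = [Node(i) for i in range(8)]
rounds = 0
while any(len(node.view) < len(nodes) for node in nodes):
    rounds += 1
    for node in nodes:
        node.gossip_with(random.choice(nodes))
print(rounds)  # converges in a handful of rounds
```

The framework asked for here would replace the fact sets with dynamically updated network models and layer an auction protocol on top for choosing mitigation strategies.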
Zoltan Mann
<z.a.mann=>uva.nl>
|
|
|
93
|
TCP-Prague
evaluation
Low Latency Low Loss Scalable Throughput (L4S) [1]
is a technology intended to reduce queue delay
problems, ensuring low latency to Internet Protocol
flows with a high throughput performance. TCP-Prague
is the reference implementation for the upcoming L4S
Internet service. Other congestion controls that
support L4S, such as Google's BBRv2, are already
available or will be released soon. The task of this
project is to compare the performance of TCP-Prague
against at least one of these other congestion
controls (like BBRv2), on at least one of the
following criteria: (i) in steady state: fairness,
RTT (in)dependence and convergence speed; (ii) in
dynamic behavior: fairness, responsiveness and
stability. Further fine-tuning of the open-source
implementation will be required to line up the
behavior of the congestion controls.
Supervisor: Chrysa Papagianni
(c.papagianni=>uva.nl), in collaboration with
Koen De Schepper
(koen.de_schepper=>nokia-bell-labs.com)
[1] B. Briscoe et al. Low Latency, Low Loss,
Scalable Throughput (L4S) Internet Service:
Architecture. Internet-Draft
draft-ietf-tsvwg-l4s-arch-09. Work in Progress.
Internet Engineering Task Force, March 2022.
https://datatracker.ietf.org/doc/draft-ietf-tsvwg-l4s-arch/
|
Chrysa Papagianni
<c.papagianni=>uva.nl>
Nathan Keyaerts
<nathan.keyaerts=>student.uva.nl>
|
|
|
94
|
TCP-Prague
enhancement
Low Latency Low Loss Scalable Throughput (L4S) [1]
is a technology intended to reduce queue delay
problems, ensuring low latency to Internet Protocol
flows with a high throughput performance. TCP-Prague
is the reference implementation for the upcoming L4S
Internet service. However, the reference
implementation could be further improved on aspects
such as an appropriate slow-start response to
achieve faster full link utilization and a faster
flow convergence time in case other flows are
active. The goal of the project is to modify the
Linux TCP-Prague kernel module in this direction
and validate possible improvements to the
code.
Supervisor: Chrysa Papagianni
(c.papagianni=>uva.nl), in collaboration with
Koen De Schepper
(koen.de_schepper=>nokia-bell-labs.com)
[1] B. Briscoe et al. Low Latency, Low Loss,
Scalable Throughput (L4S) Internet Service:
Architecture. Internet-Draft
draft-ietf-tsvwg-l4s-arch-09. Work in Progress.
Internet Engineering Task Force, March 2022.
https://datatracker.ietf.org/doc/draft-ietf-tsvwg-l4s-arch/ |
Chrysa Papagianni
<c.papagianni=>uva.nl>
|
|
|
95
|
Research
user initiated tracing with low overhead
At ASE2014 a paper about tracing builds to find out
what really goes into a binary was presented (
https://rebels.cs.uwaterloo.ca/confpaper/2014/09/14/tracing-software-build-processes-to-uncover-license-compliance-inconsistencies.html
). The method that was described uses the strace
tool to trace builds on Linux. While strace captures
all the necessary output it adds a lot of overhead,
creates a lot of output and slows down builds very
significantly. In-kernel tracing has become more
popular (for example: DTrace, SystemTap), but if
understood correctly, probes have to be defined in
advance, for example to monitor access to a certain
directory. Since during a build it is not known
which directories will be accessed (and finding out
is the actual goal of tracing builds), it seems that
in-kernel tracing is perhaps not suitable. User
events ( https://lwn.net/Articles/889607/ ) might or
might not change this.
The research question: is it actually possible to
use in-kernel tracing such as DTrace or SystemTap to
do something like strace?
Constraints:
1. no configuration of enabling certain probes
before running the build should be necessary (but
having a wrapper script that does set up work is
perfectly fine)
2. only system calls related to the build process
and all its children should be traced, as to not
pollute results
3. all relevant system calls related to I/O (which
might include some network calls for build systems
that automatically download programs during a build)
need to be captured |
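To illustrate the strace-based approach from the paper: regardless of which tracer produces the data, the post-processing boils down to extracting successfully opened paths from the trace. A minimal sketch (the trace lines below are illustrative examples in strace's output format, not captured real output):

```python
import re

TRACE = '''\
12345 openat(AT_FDCWD, "/usr/include/stdio.h", O_RDONLY|O_CLOEXEC) = 3
12345 openat(AT_FDCWD, "main.c", O_RDONLY) = 4
12346 openat(AT_FDCWD, "/tmp/ccX1.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
12346 openat(AT_FDCWD, "missing.h", O_RDONLY) = -1 ENOENT (No such file or directory)
'''

OPEN_RE = re.compile(r'openat\(AT_FDCWD, "([^"]+)",[^)]*\)\s*=\s*(-?\d+)')

def accessed_files(trace):
    """Return the set of paths successfully opened during the build."""
    files = set()
    for line in trace.splitlines():
        match = OPEN_RE.search(line)
        if match and int(match.group(2)) >= 0:   # skip failed opens
            files.add(match.group(1))
    return files

print(sorted(accessed_files(TRACE)))
# ['/tmp/ccX1.o', '/usr/include/stdio.h', 'main.c']
```

The open question of the project is whether DTrace/SystemTap/user events can produce equivalent input for this step at much lower overhead than strace.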
Armijn Hemel
<armijn=>tjaldur.nl>
|
|
|
96
|
Malware
Aquarium: A virtualized Infrastructure where
malware resides and is being monitored
This research project investigates the requirements
for an isolated virtual infrastructure with strong
monitoring capabilities that offers the possibility
of deploying multiple malware samples inside it and
analyzing them over a longer period of time. The
research tackles problems encountered in a sandbox
environment, such as limited analysis time, no
visibility into lateral movement and development,
and the restriction of deploying only one malware
sample at a time.
|
Roy Duisters
<roy.duisters=>shell.com>
Arjan Sturkenboom
Rares Bratean <Rares.Bratean=>os3.nl>
Rio Kierkels <rkierkels=>os3.nl>
|
|
|
97
|
Path
tracing to increase internet transparency
Due to the resiliency and redundancy of the
internet, changes in the underlying routing table
are mostly unnoticeable to users.
This opaqueness could cause their network traffic to
be redirected via potentially harmful networks or
jurisdictions.
One way of bringing transparency is to monitor the
path to an end destination using tools like
traceroute.
Traceroute has some limitations (e.g. one-way paths,
packets being dropped, limited information, etc.)
that affect the reliability of its output.
The goal of this work is to find innovative ways to
improve the reliability of such path measurement
output without changes to the workings of the
Internet.
Some initial ideas are: a different way of
tracerouting; combining UDP/ICMP/TCP probes to
improve results; or improving results using tools
like (RIPE) RIS, Atlas or BGP looking glasses.
|
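One of the initial ideas, combining UDP/ICMP/TCP probes, can be sketched as a simple per-hop consensus step. The hop data below is made up (RFC 5737 documentation addresses), purely to illustrate how multiple probe types can fill each other's gaps:

```python
from collections import Counter

# Per-TTL responders from three hypothetical traceroute runs; '*' = no reply.
runs = {
    "udp":  ["192.0.2.1", "*",          "198.51.100.7", "203.0.113.9"],
    "icmp": ["192.0.2.1", "192.0.2.54", "198.51.100.7", "*"],
    "tcp":  ["192.0.2.1", "192.0.2.54", "*",            "203.0.113.9"],
}

def consensus_path(runs):
    """Per TTL, keep the responder seen by most probe types, ignoring losses."""
    length = max(len(hops) for hops in runs.values())
    path = []
    for ttl in range(length):
        votes = Counter(hops[ttl] for hops in runs.values()
                        if ttl < len(hops) and hops[ttl] != "*")
        path.append(votes.most_common(1)[0][0] if votes else "*")
    return path

print(consensus_path(runs))
# ['192.0.2.1', '192.0.2.54', '198.51.100.7', '203.0.113.9']
```

A real system would also have to handle per-protocol load balancing (as Paris traceroute does) rather than assume the three runs saw the same path.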
Ralph Koning
<ralph.koning=>sidn.nl>
Gerlof Fokkema
<gerlof.fokkema@os3.nl> |
|
1
|
98
|
Hierarchical
Classification for side-channel analysis
Current side-channel attacks rely on the common
`flat' classifier. That is, finding the secret key
requires a single classifier decision. However, it
can be beneficial for the attack accuracy to
replace this single, flat decision with a
multi-layer approach that finds the secret key after
several hierarchical decisions. In this project we
will work towards combining various types of
classifiers across different decision layers, aiming
to improve the attack accuracy. |
Kostas
Papagiannopoulos
<k.papagiannopoulos=>uva.nl>
Gheorghe Pojoga
<Gheorghe.Pojoga=>os3.nl> |
|
|
99
|
Persistent
Fault Analysis -- attacks and countermeasures
PFA has been a very simple yet potent attack on
modern cryptography. With just a single fault it can
bypass redundancy protection and recover the secret
key. In this project we will work towards expanding
the attack, combining it with side-channel analysis
and fault sensitivity analysis and looking into
countermeasures. |
Kostas
Papagiannopoulos
<k.papagiannopoulos=>uva.nl>
Greg Charitonos
<gcharitonos=>os3.nl> |
|
|
100
|
Benefits
of applying machine learning to Cilium
Cilium is an open source project that enables cloud
native networking, security, and observability in
environments such as Kubernetes and other
containerized systems. Cilium makes use of a Linux
kernel feature known as eBPF (Extended Berkeley
Packet Filter). As a result, the Linux kernel may
incorporate security, visibility, and networking
control logic dynamically. Due to the volume of data
contained within, only a portion of it is exported
by default. This project investigates the Cilium
dataset in order to determine the possible benefits
of machine learning. There are a number of
advantages to combining machine learning and Cilium;
among them, anomaly detection may be used to assess
whether pods within a Kubernetes cluster are
misbehaving.
|
Serge van Namen
<serge.van.namen=>sue.nl>
Bart van Dongen
<Bart.vanDongen=>os3.nl> |
|
2
|
101 |
Efficient
identification of passwords in large
quantities of plaintext data
Password policies requiring passwords to contain at
least one capital letter, number, and special
character make such passwords relatively easy for
humans to distinguish from normal text. This does
not, however, hold for all passwords, and manual
identification does not scale when many passwords
need to be identified. The aim of this research is
to develop ways to perform efficient password
identification at a large scale. Multiple methods
will be investigated. The Bidirectional Encoder
Representations from Transformers (BERT) language
model for natural language processing (NLP) will be
used to identify text that may contain a password.
Next, password leak lists and their corresponding
passwords will be used to identify any passwords.
Finally, password generation algorithms such as OMEN
and PassGAN will be used to generate candidate
passwords from password leak lists, producing
additional passwords that can be used for
comparison.
|
Zeno Geradts
<zeno=>holmes.nl> Romke van Dijk
<romke=>holmes.nl>
Oscar Muller
<Oscar.Muller=>os3.nl> Tijmen van der
Spijk <tspijk=>os3.nl>
|
|
|
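The policy-based and leak-list methods described above can be sketched as a cheap pre-filter that might run before a heavier NLP model such as BERT. The policy rules and the two-entry leak list here are illustrative assumptions, not the project's actual implementation.

```python
# Hypothetical sketch: heuristic password-candidate identification.
# The leak list and policy thresholds are illustrative stand-ins.
import re

LEAKED = {"P@ssw0rd!", "Winter2022!"}  # stand-in for a real leak list

def looks_like_password(token):
    """Policy heuristic: >= 8 chars with upper, lower, digit, and symbol."""
    return (len(token) >= 8
            and re.search(r"[A-Z]", token)
            and re.search(r"[a-z]", token)
            and re.search(r"\d", token)
            and re.search(r"[^A-Za-z0-9]", token))

def find_candidates(text):
    """Flag tokens that appear in a leak list or satisfy the policy."""
    return [t for t in text.split()
            if t in LEAKED or looks_like_password(t)]

print(find_candidates("login with P@ssw0rd! or maybe Tr0ub4dor&3 please"))
# → ['P@ssw0rd!', 'Tr0ub4dor&3']
```

Such a filter only catches policy-conforming or already-leaked passwords; the point of the BERT and OMEN/PassGAN steps in the project is precisely to cover passwords this kind of heuristic misses.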
Tuesday
July 5 2022, SP C0.110, online using bigbluebutton |
Time |
#RP |
Title |
Name(s) |
RP |
10h00 |
|
Introduction and welcome
|
|
|
10h30
|
93
|
Work in Progress:
TCP-Prague evaluation
|
Nathan Keyaerts
|
2
|
10h55 |
|
Break
|
|
|
11h05 |
77
|
Comparison of state-of-the-art endpoint defence
solutions to (partially) open-source endpoint
defence
|
Dennis van Wijk
|
2
|
11h30 |
82 |
Virtualized Routers |
Inigo Gonzalez de Galdeano, Imre Fodi |
2
|
11h55 |
|
Lunch
|
|
|
13h05 |
37 |
Assessing data remnants in modern smartphones
after factory reset |
Mattijs Blankesteijn |
2
|
13h30 |
40 |
Web of Deepfakes |
Steef van Wooning, Danny Janssen |
2
|
13h55 |
|
Break |
|
|
14h05 |
101 |
Efficient identification of passwords in large
quantities of plaintext data |
Oscar Muller, Tijmen van der Spijk |
2
|
14h30 |
88 |
Command and Control over Microsoft Teams |
Jeroen van Saane |
2
|
14h55
|
|
Close |
|
|
Wednesday
July 6 2022, SP C0.110, online using bigbluebutton
|
Time |
#RP |
Title |
Name(s) |
RP |
10h00 |
|
Introduction |
|
|
10h05
|
99
|
Persistent Fault Analysis -- attacks and
countermeasures
|
Greg Charitonos
|
2
|
10h30
|
41
|
An analysis of the security of LiFi and WiFi
systems.
|
Carmen Veenker
|
2
|
10h55 |
|
Break
|
|
|
11h05 |
87
|
The role of IXPs in a SCION ecosystem
|
Krik van der Vinne, Leroy van der Steenhoven
|
2
|
11h30 |
100 |
Benefits of applying machine learning to Cilium |
Bart van Dongen |
2
|
11h55 |
|
Lunch
|
|
|
13h05 |
90
|
Measuring Route Origin Validation of authoritative
name servers
|
Brice Habets, Sander Post
|
2
|
13h30 |
96
|
Malware Aquarium: A virtualized Infrastructure
where malware resides and is being monitored
|
Rares Bratean, Rio Kierkels
|
2
|
13h55 |
|
Break |
|
|
14h05 |
79 |
Investigate hidden VNC methodologies for malware |
Antonio Macovei, Shadi Alhakimi |
2
|
14h30 |
89
|
Secure Multiparty Computation |
Cesar Panaijo |
2
|
14h55
|
11
|
Work in Progress: Contextual
information capture and analysis in data provenance |
Rik Janssen |
2
|
15h20
|
|
Close
|
|
|
Tuesday
Feb 8, 2022, hybrid, to
connect on line use bigbluebutton |
Time |
#RP |
Title |
Name(s) |
LOC |
RP |
10h25
|
|
Introduction |
Francesco Regazzoni |
|
|
10h30
|
74
|
Industrial programmable logic controller
automation with configuration management tools.
|
Nathan Keyaerts, Mattijs Blankesteijn |
|
|
10h55
|
|
Break
|
|
|
|
11h05 |
73
|
Detecting NTLM relay attacks using a honeypot
|
Maurits Maas, Freek Bax
|
|
2
|
11h30 |
26
|
Future tooling and cyber defense strategy for ICS
|
Leroy van der Steenhoven
|
|
1
|
11h55 |
|
Lunch
|
|
|
|
13h30
|
51
|
Modeling of medical data access logs for
understanding and detecting potential privacy breaches
|
Luc Vink |
online
|
1
|
13h55 |
|
Break |
|
|
|
14h05 |
43 |
High-speed implementation of lightweight
ciphers |
Gheorghe Pojoga |
|
1
|
14h30 |
53
|
Research cloud container evidence
|
Artemis Mytilinaios |
online
|
2
|
14h55 |
|
Break |
|
|
|
15h30 |
72
|
Misusage of vulnerable Wordpress websites by
malicious actors
|
Talha Uçar |
|
|
15h55 |
|
Break |
|
|
|
16h05 |
45
|
Researching efficiency of Trend Micro's HAC-T
algorithm
|
Tijmen van der Spijk, Imre Fodi
|
online
|
1
|
16h30 |
71
|
Development of an open source malicious network
traffic generator based on MITRE ATT&CK
|
Dennis van Wijk, Jeroen van Saane
|
|
|
16h55 |
|
Close
|
|
|
|
Wednesday
Feb 9 2022, hybrid, to connect on line use bigbluebutton
|
Time |
#RP |
Title |
Name(s) |
LOC |
RP |
10h00 |
|
Introduction |
Francesco Regazzoni |
|
|
10h05
|
69
|
Using sound to facilitate true random number
generation
|
Oscar Muller
|
|
1
|
10h30
|
44
|
Federated Authentication platform
|
Hilco de Lathouder
|
|
|
10h55 |
|
Break
|
|
|
|
11h05 |
67
|
Future proofing networks: On core routing and SRv6
|
Sander Post, Krik van der Vinne |
|
1
|
11h30 |
52
|
Cloud native IR automation
|
Antonio Macovei Rares Bratean
|
|
1
|
11h55 |
|
Lunch
|
|
|
|
13h05 |
56
|
Privacy by Design in Smart Cities
|
Babak Rashidi, Cesar Panaijo
|
|
1
|
13h30 |
36
|
Characteristics of Info Stealers in 2021
|
Tom van Gorkom |
|
|
13h55 |
|
Break |
|
|
|
14h05 |
54
|
Forensic analysis of Google Workspace evidence
|
Bert-Jan Pals, Greg Charitonos |
|
|
14h30 |
70 |
Bring your own Living off the Land binaries |
Vincent Denneman
|
|
|
14h55 |
|
Break |
|
|
|
15h05 |
76
|
Enriching IDS detection on network protocols using
anomaly-based detection
|
Pin van Helvoirt |
|
|
15h30
|
75
|
Implementing side channel resistance
in QARMA
|
Joris Janssen
|
|
2
|
15h55
|
|
Break |
|
|
|
16h05
|
68
|
Post-Exploitation Defence of Git
Repositories using Honey Tokens
|
Max van der Horst |
|
1
|
16h30
|
59
|
Deep Learning for Partial Image
Encryption |
Carmen Veenker, Danny Opdam
|
|
|
16h55
|
|
Close |
|
|
|
Out of
normal schedule presentations
Room B1.23 at Science Park
904 NL-1098XH Amsterdam. |
Date |
Time |
Place |
#RP |
Title |
Name(s) |
LOC |
RP |
2021-09-xx
|
10h00
|
online
|
|
|
|
|
|
|
11h00
|
online
|
|
|
|
|
|
|