SNE Master Research Projects 2010 - 2011



Research Projects 1 and 2 (RP1 and RP2)

Code: MSNRP1-6 and MSN2NRP6

The course objective is to ensure that students become acquainted with problems from the field of practice through two short projects, which require the development of non-trivial methods, concepts and solutions. After this course, students should be able to:

Intro slides this year: cdl-2010-09-22-a.pdf




Here is a list of student projects for January 2011 and/or June 2011. As a futile, lightweight way to prevent spamming, "@" has been replaced by "=>" in the addresses below.
The left-over projects from this year can be found here.

# title
supervisor contact


PersLink Security.

Perslink is a phone-directory website containing information that you won't find in your average white pages: it is the little black book of Dutch journalism. This is the place where journalists and producers find their colleagues, spokespersons, top-level managers, politicians and experts in all kinds of fields. The data covers about 10,000 organisations and 25,000 people. The need for a completely secure environment is evident.

A new version is now ready for deployment. Although the creators of the website are confident about its security, they have not actually performed a formal audit. They would love to be proven wrong and to use that proof to strengthen their security.
Jo Lahaye <Jo=>>, Martijn Stegeman <stegeman=>>

Eleonora Petridou <Eleonora.Petridou=>>
Pascal Cuylaerts <pascal.cuylaerts=>>


Comparing TCP connection performance over VPN-tunnels vs non-tunneled traffic.

A single TCP connection over a gigabit LAN can easily achieve a performance of over 900 Mbps without special TCP stack tuning. If a VPN tunnel is created on Linux using the 'tun' or 'tap' device, the performance drops to less than 400 Mbps, with or without encryption. The root cause of this performance drop is unknown, but a Linux kernel inefficiency is suspected. It could also be caused by the fact that TCP offloading is not available for the tun/tap drivers. The main goals of this project are:
  • determine the root cause of the aforementioned performance hit on Linux
  • determine what can be changed in the VPN client software (e.g. OpenVPN) to circumvent this behaviour
  • investigate whether the same performance hit is found on other platforms (e.g. Windows, NetBSD)
Note that over a 100 Mbps Ethernet connection this performance drop is NOT seen: both a non-tunneled connection and a VPN connection can easily saturate the network (~94 Mbps).
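Simple header arithmetic makes the kernel-inefficiency suspicion plausible. The sketch below estimates how much goodput the encapsulation overhead alone can cost; the 41-byte per-packet tunnel overhead is an assumed figure for UDP plus OpenVPN framing, not a measured one.

```python
# Standard IPv4/TCP header sizes; TUN_OVERHEAD is an assumption for
# UDP + OpenVPN framing and varies with cipher and configuration.
IP_HDR = 20
TCP_HDR = 20
TUN_OVERHEAD = 41

def payload_per_frame(mtu, tunnelled=False):
    """TCP payload bytes carried per on-wire frame of size `mtu`."""
    mss = mtu - IP_HDR - TCP_HDR
    if tunnelled:
        mss -= TUN_OVERHEAD  # inner packet must fit the tunnel's smaller MTU
    return mss

def relative_goodput(mtu=1500):
    """Tunnelled goodput as a fraction of plain-TCP goodput."""
    return payload_per_frame(mtu, True) / payload_per_frame(mtu, False)
```

At an MTU of 1500 this predicts only about a 3% loss, nowhere near the observed drop from over 900 to under 400 Mbps, which is consistent with the suspicion that per-packet processing in the kernel (and the missing TCP offloading), not header overhead, is the bottleneck.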
Jan Just Keijser <janjust=>>

Berry Hoekstra <berry.hoekstra=>>
Damir Musulin <Damir.Musulin=>>


Large scale GPU based password cracking.

Recent research (also by OS3 students) has shown a notable increase in password-cracking power when GPU cards are used compared to traditional CPU-based password cracking. However, putting this into practice reveals several significant downsides, of which the immaturity of the tools, the lack of scalability and the many unsupported hash types are the most important. Currently, KPMG has achieved great results with a CPU-based password-cracking cluster: 1 server + 30 desktops operated by an MPI-patched version of John the Ripper, combining dictionary attacks, rainbow tables and brute-force cracking via an easy-to-use interface. A GPU server is used on the side, but ideally the GPU power should be included in the cluster, and possibly the cluster members should be equipped with GPUs.
Research the possibilities for integrating the CPU-based password cluster with GPU capabilities and advise on the optimal password-cracking strategy per common hash type. The selection of hash types will focus on those widely used in enterprises and will be decided at the start of the research.
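For reference, the inner loop each cluster node runs is a plain dictionary attack. The sketch below is the CPU-only version (the hash algorithm and wordlist are illustrative); a GPU port would batch candidates and hash them on the card instead.

```python
import hashlib

def dictionary_attack(target_hash, wordlist, algo="md5"):
    """Return the candidate whose digest equals target_hash, else None.

    CPU-only sketch of one cluster node's work; the hash algorithm is
    illustrative - real targets vary per cracking job."""
    hashfn = getattr(hashlib, algo)
    for word in wordlist:
        if hashfn(word.encode()).hexdigest() == target_hash:
            return word
    return None
```

Integrating GPUs means replacing the per-word loop with batched kernel launches while keeping the same job-distribution interface the MPI cluster already provides.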
Marc Smeets <Smeets.Marc=>>

Jochem van Kerkwijk <jochem.vankerkwijk=>>, Aleksandar Kasabov <Aleksandar.Kasabov=>>


Synergy of social networks defeats online privacy.

Many papers and articles have been written about the dangers of online social networks like LinkedIn, FaceBook, Twitter, Flickr, etc. Although the general conclusion is that no info is safe, many papers offer no proof. It is also still unclear to what extent information about a person is revealed when you put the info from different networks together. What if we combine tags and GPS coordinates from Flickr and Twitpic, lists of connections present on LinkedIn but not on FaceBook (and vice versa), and the mapping of a person's different usernames across the net? Do we only learn who his friends are, who his girlfriend is and what food he buys for his cat? Or can we also reveal more detailed info on the location of his house, the route and timing of his daily commute and travels, his sleeping pattern and his nighttime life as a superhero?
Research the synergy of the info about a person on these social networks and illustrate what targeted intelligence on a person's online social life can reveal. It is preferred to make a proof of concept tool that can support in this structured search. Finally, recommendations should be made that can help in the safeguarding of your online identity.
Marc Smeets <Smeets.Marc=>>

Eleonora Petridou <Eleonora.Petridou=>>,
Marek Kuczynski <Marek.Kuczynski=>>


DNS anomaly detection.

Most Internet communication starts with one or more DNS lookups. Command and control traffic used by malware is no exception to this behavior.

Some types of C&C traffic, such as IRC, are often blocked by internal corporate firewalls, making them invisible to border IDS systems. In these cases infections could probably still be detected by analysing DNS requests originating from the corporate network.

Examine the feasibility of detecting malware-infected systems using DNS log data and develop a scheme for detecting these anomalies in DNS traffic. This could include anomalies in the actual content of the request (e.g. hostnames, RRs) and in meta-information such as timing and recurrence of specific lookups. Develop a simple proof of concept capable of processing text-based output from our DNS logger.
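One content heuristic such a scheme could apply to the logger's text output is label entropy: algorithmically generated C&C hostnames tend to look more random than human-chosen names. A minimal sketch; the 3.5-bit threshold is an assumption that would need calibrating against baseline traffic.

```python
import math
from collections import Counter

def label_entropy(hostname):
    """Shannon entropy (bits/char) of the first DNS label; randomly
    generated C&C domains tend to score higher than human-chosen ones."""
    label = hostname.split(".")[0].lower()
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def flag_anomalies(hostnames, threshold=3.5):
    """Return hostnames whose label entropy exceeds `threshold` (an
    assumed cut-off; a real deployment would calibrate it)."""
    return [h for h in hostnames if label_entropy(h) > threshold]
```

Content checks like this would complement the meta-information side (timing, recurrence) mentioned above rather than replace it.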
Bart Roos <roos=>>,
Sander Peters <peters=>>

Nick Barendregt <Nick.Barendregt=>>, Hidde van der Heide <hidde.vanderheide=>>


DNSSEC Certificate validation.

Remote trust on the web, but also in systems like mail servers, is mostly handled by so-called Certificate Authorities: companies, government bodies or other organisations that users go to to obtain their own certificates. There is a significant leap of faith involved: why should you blindly trust the hundreds of Certificate Authorities preloaded in your browser not to abuse their root certificates, when many of them are organisations you know nothing about? This means you might not want to trust them for all purposes, certainly not if you can avoid it. What if the websites and services you care about could publish the certificates they use safely and authoritatively through the DNS? Historically, the answer was that DNS itself was not safe enough. With DNSSEC you get a chain of trust from the signed root of the internet to the service you want to connect to. This opens the way for many exciting new opportunities for humans and computers to exchange information safely. You will implement a working proof of concept that shows the potential of an internet without traditional Certificate Authorities, through a browser, mail server or IM client plugin that checks for the availability of valid certificates over DNS using DNSSEC, and where relevant uses them and/or compares them with regularly available certificates.
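The core check such a plugin would perform is small. A minimal sketch, assuming the DNS record simply carries a SHA-256 digest of the server's DER-encoded certificate; real record formats for this purpose are richer.

```python
import hashlib

def cert_matches_dns(cert_der, published_digest_hex):
    """Compare the SHA-256 digest of the certificate offered in the TLS
    handshake against the digest retrieved (and DNSSEC-validated) from
    DNS. The bare-digest record format is an assumption for this sketch."""
    return hashlib.sha256(cert_der).hexdigest() == published_digest_hex
```

The hard part of the project is not this comparison but the plumbing around it: retrieving the record securely, checking the DNSSEC chain, and deciding what to do when the DNS-published and CA-issued certificates disagree.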
Michiel Leenaars <michiel=>>

Pieter Lange <pieter.lange=>>
Danny Groenewegen <danny.groenewegen=>>


Automatic SSH public key fingerprint retrieval and publication in DNSSEC.

When a user logs into a remote shell through the SSH protocol with a user name and password for the first time, he or she is offered a public key and a key fingerprint by the destination system (or a system posing as such). In such a situation there is no automatic way to check whether this public key or its fingerprint is actually legitimate, because most of the time there is no public record of them that a machine could retrieve by itself. Currently a user has to manually check the fingerprint of the key file using ssh-keygen, with the public key handed over separately by the system administrator of the machine. Other methods fail because an eavesdropper can send you a malicious server key together with the matching fingerprint of that malicious key. Of course this creates an unnecessary vulnerability in situations where a sysadmin is forced to change a key. In your project you will investigate and recommend a mechanism for system administrators to automatically publish public keys in the DNS, and you will write concept code for OpenSSH to automatically handle these keys once they are retrieved (of course using DNSSEC). This will allow 'unmanned' yet secure use of SSH.
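For illustration, the fingerprint that would go into such a DNS record can be computed directly from an OpenSSH public-key line; SSHFP records (RFC 4255) hash the raw base64-decoded key blob. The algorithm-number mapping below is abbreviated to a few common key types.

```python
import base64, hashlib

# SSHFP algorithm numbers: 1 and 2 from RFC 4255; ed25519's number 4
# was assigned later.
KEY_ALGO = {"ssh-rsa": 1, "ssh-dss": 2, "ssh-ed25519": 4}

def sshfp_rdata(pubkey_line, fptype="sha256"):
    """Fingerprint data for an SSHFP record, from an OpenSSH public-key
    line such as 'ssh-ed25519 AAAA... user@host'. The fingerprint is a
    hash over the raw base64-decoded key blob."""
    keytype, b64 = pubkey_line.split()[:2]
    blob = base64.b64decode(b64)
    h = hashlib.sha256 if fptype == "sha256" else hashlib.sha1
    return KEY_ALGO[keytype], h(blob).hexdigest()
```

The server side would publish this through its zone; the OpenSSH-side concept code would recompute the same digest from the offered key and trust the comparison only when the DNS answer survives DNSSEC validation.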
Michiel Leenaars <michiel=>>

Marc Buijsman <Marc.Buijsman=>>, Pascal Cuylaerts <pascal.cuylaerts=>>



Android Market monitoring.

Android is a popular open-source operating system that runs on mobile devices like cell phones and tablets. On most devices on the market, system-administration tasks through the so-called Android Market are handled over XMPP, through which a remote operator takes full control of a phone and remotely installs and updates software. This is a serious security concern for some users: the same invisible mechanism that may be desirable for use by 'trusted parties' might be abused by others, for instance for stealing GPG keys, browser certificates, passwords or other data. Cell phones are increasingly used beyond mere human communication, and business usage (like sysadmin tasks) demands more proactive monitoring of device security; otherwise end users cannot trust their device to the extent they need. During your assignment you will investigate the remote-admin technology used by Google and create a proposal for a prototype application ("Android Guardian"): an app that can monitor and (when required) block all activities of remote operators.
Michiel Leenaars <michiel=>>

Bastiaan Wissingh <Bastiaan.Wissingh=>>, Thorben Krueger <Thorben.Krueger=>>


Analysis of network measurement data.

RIPE Atlas is a prototype service for a new, large-scale internet measurement network, officially started in November 2010. It has the potential to organise thousands or even tens of thousands of measurement nodes distributed around the Internet.

In this project, students should analyse measurement data coming out of the system and work on one or more of the following tasks:
  • find interesting events, research their background and explain them; potentially using other information sources such as BGP information, known network events, etc.
  • develop methods for automatically recognising such events
  • evaluate correlations between clusters of similarly behaving nodes from the same provider, different providers in the same geographic area, or geographically diverse providers
  • explore similarities between different oddly behaving nodes; correlate these with their common geographical, topological or other properties
Robert Kisteleki <robert=>>
Emile Aben <emile.aben=>>

Roy Duisters <roy.duisters=>>, Damir Musulin <Damir.Musulin=>>


Exploiting Jailbreaks.

Recently, new 'bootrom'-level exploits have been published for iPhone and iPad hardware. These exploits allow all existing iPhones and iPads to be 'jailbroken', letting users install unofficial apps. In theory the same exploits could also be used to obtain information from a stolen or lost iPhone or iPad without knowing the security PIN code. We would like the student to verify whether these exploits can be used in this manner and which information can be obtained this way. Using this information we can better advise our clients on the security of iPhone and iPad systems in practice. We would also like the student(s) to evaluate the effectiveness of various countermeasures (such as using dedicated security software on the device).
Gijs Hollestelle <ghollestelle=>>, Derk Wieringa <dwieringa=>>

Jochem van Kerkwijk <jochem.vankerkwijk=>>


Effectiveness of Automated Application Penetration Testing Tools.

In the market we see that many different tools are used for automated penetration testing, and that companies rely more and more on automated tools for internal use. However, these tools generate a lot of false positives that need to be assessed manually afterwards. We would therefore like the students to analyse various tools available for testing web applications and the reliability of the analysis these tools perform.
The following should be performed:
  • Create or setup a realistic (web) application (including vulnerabilities) that can be used to compare the various tooling that is available;
  • Define and perform various tests that can be used to compare the results of the automated testing tools.
Coen Steenbeek <csteenbeek=>>,
Derk Wieringa <dwieringa=>>

Harald Kleppe <harald.kleppe=>>, Alexandre.Miguel Ferreira <Alexandre.MiguelFerreira=>>


Measuring DNSSEC Validation.

In July 2010 the root zone of the DNS was signed using DNSSEC. Since then, DNSSEC deployment has taken off on a larger scale, with many top-level domains signing their zones and making secure delegations available to their registrars and registrants. Many people are signing their zones, and measurements on the SURFnet resolver infrastructure show a steady climb in the validation rate (i.e. the percentage of queries that can be validated). This rate has now risen to about 3% (November 2010).

Problem statement:
We have a pretty good insight into the deployment of signed zones across the Internet. It is, however, much harder to determine whether or not people are actually validating queries on their resolvers. Resolvers that request DNSSEC information set the so-called 'DO' bit in a query. Unfortunately, this is not a good way to measure the number of validating resolvers; a lot of resolver software now sets this 'DO' bit by default even if it doesn't actually validate the signatures returned in the answer (BIND, for instance, has had the DO bit enabled in the default resolver configuration since version 9.6).
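Given parsed query data, tallying DO bits is the straightforward first step; as explained above it only measures who requests DNSSEC data, not who validates, so it serves as a baseline to refine. The query-dict shape below is an assumption about whatever capture parser feeds it.

```python
from collections import Counter

# In the EDNS0 OPT record the 32-bit TTL field is repurposed as
# EXTENDED-RCODE(8) | VERSION(8) | DO(1) | Z(15)  (RFC 6891),
# so the DO bit is the top bit of the low 16 bits.
DO_BIT = 0x8000

def classify_queries(queries):
    """Tally queries by DO-bit status. Each query is a dict with an
    'opt_ttl' key holding the OPT record's TTL field, or no such key
    if the query carried no EDNS0 record (an assumed parser output)."""
    tally = Counter()
    for q in queries:
        ttl = q.get("opt_ttl")
        if ttl is None:
            tally["no_edns"] += 1
        elif ttl & DO_BIT:
            tally["do_set"] += 1
        else:
            tally["do_clear"] += 1
    return tally
```

Distinguishing validators within the "do_set" group then requires behavioural signals, for example whether a resolver follows up with DNSKEY and DS queries along the chain of trust.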

The goal of this assignment is to find a good way to determine the number of clients of an authoritative name server that perform DNSSEC validation. If time allows, it would be very welcome if a simple tool/proof of concept could be developed as part of the assignment.

It is helpful to have knowledge of the DNS(SEC) protocol and wire format; some familiarity with tools like tcpdump and Wireshark will probably also be needed.
Roland van Rijswijk <roland.vanrijswijk=>>

Niels Monen <Niels.Monen=>>


Emulating network latency on high speed networks.

netem, or the Network Emulator, is a component of the Linux kernel that allows emulation of the properties of wide-area networks. While netem has been around for a while and has been proven to work reliably in relatively low-bandwidth setups, the emergence of new high-speed network interfaces requires more insight into its capabilities. We are interested in evaluating netem as a tool to emulate latency on very high-speed networks. This would allow us to research the effects of latency on various applications without using expensive long-distance links. The efficiency of netem greatly depends on the timer resolution provided by the Linux kernel, and the research should therefore focus on differences between the vanilla kernel and the real-time version (linux-rt) in terms of system throughput. Is linux-rt usable in such a setting, or is the performance hit unacceptable at 10 Gbit/s+ speeds?
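The timer-resolution concern can be quantified with simple arithmetic: if netem can only release packets once per timer tick, all traffic arriving within a tick must be buffered and then leaves as a burst.

```python
def bytes_per_tick(link_rate_gbps, timer_hz):
    """Traffic that accumulates between two kernel timer ticks.

    If netem's emulated delay is quantised to the tick, this is both
    the buffer it needs and the burst size it emits per tick."""
    return link_rate_gbps * 1e9 / 8 / timer_hz
```

At 10 Gbit/s with a 1000 Hz timer, 1.25 MB piles up per tick; high-resolution timers (as in linux-rt) shrink this proportionally, which is exactly why the vanilla-vs-rt comparison matters at these speeds.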
Cosmin Dumitru <c.dumitru=>>, Ralph Koning <r.koning=>>

Niels Monen <Niels.Monen=>>,
Berry Hoekstra <berry.hoekstra=>>


Power measurements in DAS4.

The GreenClouds project studies how to reduce the energy footprint of modern High Performance Computing systems (like Clouds) that are distributed, elastically scalable, and contain a variety of hardware (accelerators and hybrid networks). The project takes a system-level approach and studies the problem of how to map high-performance applications onto such distributed systems, taking both performance and energy consumption into account. We will explore three ideas to reduce energy:
  • Exploit the diversity of computing architectures (e.g. GPUs, multicores) to run computations on those architectures that perform them in the most energy-efficient way;
  • Dynamically adapt the number of resources to the application needs accounting for computational and energy efficiency;
  • Use optical and photonic networks to transport data and computations in a more energy-efficient way.
The project will create the GreenClouds Knowledge Base System (GKBS) based on semantic web technology, which will provide detailed information on the energy characteristics of various applications (e.g., obtained from previous execution runs) and the different parts of the distributed system, including the network. Also, the project will study a broad range of applications and determine which classes of applications can reduce their energy consumption using accelerators. Finally, it will study energy reductions through dynamic adaptation of computing and networking resources. The project will make extensive use of the DAS-4 infrastructure, which is a wide-area testbed for computer scientists, to be equipped with many types of accelerators, a photonic network, and energy sensors.

In this RP a study and baseline testing will be made of the power measurement capabilities of DAS4 and recommendations for more precise measurements will be made.
See also:
Ralph Koning <r.koning=>>, Paola Grosso <p.grosso=>>

Vesselin Hadjitodorov


Security of IPv6 and DNSSEC for penetration testers.

At the infrastructure level we have seen new, more secure technologies arise: DNSSEC and IPv6. For both technologies there are already known insecurities. But these new technologies also present penetration testers with challenges in the inventory-scanning and vulnerability phases of their work (e.g. scanning a /64 IPv6 range takes a while, and existing tools need to adapt to IPv6). We want you to:
  1. Provide an overview of known security errors in IPv6 and DNSSEC.
  2. Provide an overview for the penetration tester of different techniques that can be used to still perform the inventory scanning phase.
By the way: IPv6 and DNSSEC are far from fully researched. If there is time during the RP, we encourage further research into protocol weaknesses.
Jaap van Ginkel <J.A.vanGinkel=>>

Vesselin Hadjitodorov <Vesselin.Hadjitodorov=>>


Passive LAN information gathering.

On corporate LANs the local area network is full of broadcast traffic from various protocols (application protocols like HTTP, SSDP and SMB, but also network protocols like STP, HSRP and ARP). By simply listening to these 'free' packets one can gather a lot of information about the setup of the network and about the main systems in it.
Students are asked to:
  1. research the broad set of information that can be gathered by sniffing these protocols passively
  2. create proof-of-concept tooling that gathers the data, interprets it and presents it in a usable form.
The PoC can, for example, take the form of a Metasploit module or a stand-alone OS. This goes beyond merely sniffing clear-text protocols in search of credentials. Students should familiarise themselves with the many different protocols found on a LAN prior to the start of the project. We will help the students identify protocols to focus on.
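As a taste of the 'free' data involved: every broadcast ARP request leaks a live (MAC, IP) pair, and decoding one takes only a few lines. The field layout follows RFC 826.

```python
import struct
from ipaddress import IPv4Address

def parse_arp(payload):
    """Decode the sender fields of an ARP payload (RFC 826 layout:
    htype, ptype, hlen, plen, oper, then sender/target addresses).
    Each broadcast ARP on the wire leaks a live (MAC, IP) pair."""
    htype, ptype, hlen, plen, oper = struct.unpack("!HHBBH", payload[:8])
    sender_mac = payload[8:8 + hlen]
    sender_ip = payload[8 + hlen:8 + hlen + plen]
    return {
        "op": {1: "request", 2: "reply"}.get(oper, oper),
        "sender_mac": sender_mac.hex(":"),
        "sender_ip": str(IPv4Address(sender_ip)),
    }
```

The project's tooling would run dozens of such decoders (STP, HSRP, SSDP, SMB announcements, ...) over a passive capture and correlate the results into a map of the network.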
Marc Smeets <Smeets.Marc=>>

Roy Duisters <Roy.Duisters=>>


A mapping daemon for the Locator/ID Separation Protocol (LISP).

The Locator/ID Separation Protocol (LISP) is a protocol that is currently being developed by a Working Group in the Internet Engineering Task Force (IETF). LISP is a network architecture and a protocol that implements a new approach to addressing and routing on top of regular IP. In a nutshell: LISP separates the 'where' and the 'who' in networking and uses a mapping system to couple the routing location (where) and endpoint identifier (who).

A non-proprietary Linux implementation of LISP is in development, but there is currently no open-source control-plane software. During the RP, the candidate will work on the development of an open-source Map Server/Map Resolver that can become an essential part of the LISP ecosystem.
Job Snijders <job=>>
Rager Ossel <rager=>>

Marek Kuczynski <Marek.Kuczynski=>>


BufferBloat detection.

Bufferbloat, or the oversized-buffer problem, is caused by overly large buffers in all sorts of network devices. It is also argued that large buffers alone are not the cause of this phenomenon: large buffers are needed to absorb short, intensive bursts of traffic. But large buffers that are generally kept full and not managed by, for example, AQM will provoke bufferbloat.

We will look into identifying the symptoms of bufferbloat and research ways to identify the device (router or switch) that causes the problem.

Given a link,

Source ==== Buffer --?-- Buffer --?-- Buffer --?-- Destination

where bufferbloat is observed, we would look into methods of identifying which of the "unknown" links causes the problem.
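The reason an oversized buffer translates directly into latency is simple drain arithmetic: a packet arriving at the tail of a full FIFO waits for the entire buffer to empty at line rate.

```python
def queue_delay_ms(buffer_bytes, link_rate_mbps):
    """Worst-case latency added by a full FIFO: buffer size divided by
    the drain rate of the link."""
    return buffer_bytes * 8 / (link_rate_mbps * 1e6) * 1000
```

An (assumed, but realistic) 1 MB buffer in front of a 10 Mbit/s uplink adds roughly 800 ms, which is the kind of interactive-traffic pain bufferbloat is known for. Per-hop delay signatures like this are one handle for localising which buffer on the path is the culprit.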
Michiel Leenaars <michiel=>>

Harald Kleppe <harald.kleppe=>>
Danny Groenewegen <danny.groenewegen=>>


CurveCP protocol efficiency.

The author of CurveCP (Dan Bernstein) makes a number of claims about various features of his protocol. In general, these can be divided into the following topics:

Security:
  • mandatory server authentication (can't be disabled)
  • optional client authentication
  • no man-in-the-middle attacks possible
  • no replay attacks possible
  • forward secrecy (both active and passive)
Availability:
  • no RST attacks on TCP sessions (such as SSH) possible
  • protection against traffic prediction by randomness in transmission schedule
  • enforced packet lengths and other policies in client/server handshakes so as not to amplify the resources of an attacker
  • server does not allocate memory before the handshake completes, protecting against "SYN flooding"-inspired attacks
  • worst case cpu loads kept very small -> performance not CPU-bound but network-bound
Efficiency:
  • a CurveCP packet has more overhead than a TCP packet
  • for short connections, less traffic than HTTPS
  • for short connections, much less traffic than SSH
Decongestion:
  • CurveCP tries to minimize packet-loss AND significant increases in latencies (claims to decongest routers and minimize bufferbloat)
  • CurveCP scheduler keeps track of long-term congestion statistics
  • CurveCP scheduler achieves tolerable share of bandwidth even running alongside an aggressive competing (TCP) scheduler.
  • CurveCP distinguishes between, e.g., lossy wireless and congestion-induced packet loss, to use wifi connections with reasonable efficiency.
Addressing:
  1. multiple CurveCP servers can share a single IPv4 address and port
  2. CurveCP servers inherently anti-aliased from (IP) addresses
  3. Rapid failover to redundant server if original is down
  4. CurveCP session/connection not invalidated if client changes IP addresses
In this research project we concentrate on verifying claims about protocol efficiency and addressing. To this end, we propose to create a CurveCP-enabled remote shell/copy utility, which can then be used to benchmark CurveCP against SSH-based TCP sessions. In particular, we propose to pursue the following questions:
  1. In an ideal setting (no competing traffic), how does CPU usage relate to used bandwidth?
  2. How do SSH and CurveCP handshakes compare in terms of exchanged packets (and their sizes)?
  3. How large is the general (packet) overhead of CurveCP over TCP (theory vs practice)?
  4. How do latencies between CurveCP and TCP-based sessions compare?
  5. How much bandwidth can a fresh CurveCP session achieve alongside a (saturating) TCP file transfer (on the server/client side)?
  6. How much bandwidth can a fresh TCP session achieve alongside a (saturating) CurveCP file transfer (on the server/client side)?
  7. How feasible is the possibility of using a single IP address and port for multiple servers?
  8. How feasible are persistent CurveCP sessions when switching client IPs?
  9. In failure situations, how does the failover happen and what are the parameters involved?
And if time allows:
  1. Investigate how to reproducibly simulate a lossy network and perform above mentioned measurements under such conditions as well.
Jeroen Scheerder <js=>>
Jeroen Scheerder <jeroenscheerder=>>

Thorben Krueger <Thorben.Krueger=>>

Color of cell background: purple = currently chosen project; light blue = project plan received; light green = presentation received; dark green = report also received; light purple = confidentiality was requested.


Wednesday February 2nd in room 645.C1.112 at Science Park 904, NL-1098 XH Amsterdam.
#RP Title
Welcome, introduction. Cees de Laat
10h00 33 Power measurements in DAS4. Vesselin Hadjitodorov 1
10h20 22 Real Time Text. Ares Christou 1
10h40 10 Desktop Virtualisatie. Sudesh Jethoe 1

11h15 13 Synergy of social networks defeats online privacy. Eleonora Petridou, Marek Kuczynski 1
11h45 11 Large scale GPU based password cracking. Jochem van Kerkwijk, Aleksandar Kasabov 1
12h15 32 Emulating network latency on high speed networks. Niels Monen, Berry Hoekstra 1

13h30 27 Quality analysis of automated penetration testing tools. Harald Kleppe, Alexandre.Miguel Ferreira 1
14h00 25 Analysis of network measurement data. Roy Duisters, Damir Musulin 1
14h30 21 Android Market monitoring. Bastiaan Wissingh, Thorben Krueger 1

15h15 20 OpenSSH automatic public key retrieval. Marc Buijsman, Pascal Cuylaerts 1
15h45 19 DNSSEC Certificate validation. Pieter Lange, Danny Groenewegen 1
16h15 17 DNS anomaly detection. Nick Barendregt, Hidde van der Heide 1
Cees de Laat & OS3 team


Thursday June 30th, 2011 in room C1.110 at Science Park 904 NL-1098 XH Amsterdam.
Welcome, introduction. Cees de Laat
10h05 29 Measuring DNSSEC Validation. Niels Moonen 2
10h30 40 Security of IPv6 and DNSSEC for penetration testers. Vesselin Hadjitodorov 2
10h55 43 Passive LAN information gathering. Roy Duisters 2

11h40 8 PersLink Security. Eleonora Petridou, Pascal Cuylaerts 2
12h10 26 Exploiting Jailbreaks. Jochem van Kerkwijk 2

13h40 46 BufferBloat detection. Harald Kleppe, Danny Groenewegen 2
14h10 48 CurveCP protocol efficiency. Thorben Krueger 2
14h35 2 Experiment reproducibility in OneLab/PlanetLab. Sudesh Jethoe 1

Comparing TCP connection performance over VPN-tunnels vs non-tunneled traffic. Berry Hoekstra, Damir Musulin 2
15h50 44 A mapping daemon for the Locator/ID Separation Protocol (LISP). Marek Kuczynski 2
Closing Cees de Laat
Drinks ("borrel") in the SNE lab