The dissertation project is undoubtedly the largest piece of work a student has to produce during their time at university. When selecting the topic, I wanted a practical project that would allow me to develop new skills. I also wanted to work on something that relates to me personally and that I feel can be useful for society.
As I love Apple’s ecosystem, I have always been interested in iOS development. I started studying Swift when it was released back in 2014. In 2019, I became especially intrigued by the then newly announced front-end framework – SwiftUI. During my Placement Year, I managed to find time to do some iOS development and build upon the skills I had gained years earlier. I know that the best way to learn new things is through practical experience. However, I have always found it difficult to do a project just for the sake of doing something.
I was quite fortunate to be able to select a dissertation that would allow me to further develop my skills in mobile development. The original title, “Building an iOS application using ResearchKit”, was open-ended enough for me to be able to expand the idea and work on a medical condition I am interested in and have personal experience with.
I spent the Autumn semester developing the idea, doing research, communicating with third parties, as well as working on the system’s architecture and design. The Spring semester will be dedicated to development and testing.
ClimaFever is an experimental platform aimed at researching how climate change affects hay fever sufferers. As the prevalence of allergic diseases continues to grow, the project will prepare the ground for further research which can be of vital importance for improving people’s quality of life.
The iOS app will show the air quality index, as well as the pollen count for the different types of pollen (grass, tree, weed). Innovative technologies and machine learning will be used to keep this data up to date. By combining it with user-collected data, the severity of the symptoms will be associated with the air quality, as well as with the geographical location. The collected data will be accessible on a website where it can be studied by researchers. The application will also contain several additional features to provide users with important information about the condition, such as its diagnosis and treatment.
Available in 2021
A few experimental studies have shown that increased CO2 levels in the environment lead to an increase in pollen allergenicity. High ozone levels have also been linked to a worsening of hay fever symptoms. Rising temperature is another consequence of climate change that affects the duration and strength of the hay fever season. Asthma sufferers are at even greater risk.
None of the currently available services aimed at hay fever sufferers provides real-time data about the pollen count outside the established pollen season. As a result, patients are not receiving vital information that can be useful for managing symptoms and thus improving wellbeing. This gap can also affect how people are diagnosed. It is caused by outdated information being used around the world. Furthermore, the current practice of starting preventive medication in mid-March is no longer effective.
The few available studies on how climate change relates to hay fever are largely experimental. Moreover, there is a lack of evidence on how these changes translate to sufferers’ wellbeing, as quantitative research on the human population has not been conducted. The aims of the project are:
- to provide a way of collecting quantitative data to be used for studying the relation between climate change and the onset of hay fever symptoms
- to change the established beliefs about the hay fever season duration by raising awareness of the dynamic nature of the problem
- to make the platform publicly available
The aims will be achieved via a platform that will allow the correlation between the environment and the onset, duration, and strength of hay fever symptoms to be studied. The main objectives of the project are:
- to build an iOS application that allows users to take part in the research by taking advantage of Apple’s ResearchKit framework
- to build a website that displays the collected data in a concise manner
The module emphasised modern artificial intelligence (AI) techniques and their inspiration from biological systems. Some of the topics covered were evolution, neural systems, the immune system, swarms, and the counterpart concepts they have inspired – evolutionary and swarm-based optimisation algorithms, neural computing, as well as cellular automata and agent-based models. The lectures showed the application of these concepts to real-world problems. Python was used for the lab sessions, as well as for the group assessment.
Group Project: Modelling a forest fire with CA
A problem inspired by the real world was presented, together with a set of requirements. We were required to simulate a forest fire using Cellular Automata (CA). The simulation had to take into account the burnability of the terrain and had to be performed under different wind speeds and directions. In addition to the code, we wrote a scientific report addressing the requirements and summarising the results.
The simulations matched the expectations – with the fire spreading faster in the direction of the wind and accelerating with the wind speed. Various similar papers were reviewed for inspiration, as well as to compare the findings and the conclusions made. While challenging, the assignment showed that complex systems can be represented and simulated using simple models. However, the results and their interpretation can vary greatly based on the assumptions made and the factors taken into consideration when building the model.
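As an illustration only (this is not the submitted code), a single update step of such a cellular automaton could be sketched in Python as follows. The state encoding, the four-cell neighbourhood, and the way wind scales the ignition probability are all assumptions made for the sketch; the names `step`, `burnability`, and `wind_strength` are hypothetical.

```python
import random

# Hypothetical cell states, chosen for this sketch only
UNBURNT, BURNING, BURNT = 0, 1, 2

def step(grid, burnability, wind=(0, 1), wind_strength=0.5, rng=random):
    """Advance the fire by one time step and return a new grid.

    grid          -- 2D list of cell states
    burnability   -- 2D list of base ignition probabilities (0.0 to 1.0)
    wind          -- (d_row, d_col) unit vector the wind blows towards
    wind_strength -- assumed scaling factor for the wind effect
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == BURNING:
                new[r][c] = BURNT  # assumption: a cell burns for one step
            elif grid[r][c] == UNBURNT:
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == BURNING:
                        # Fire travels from the neighbour to this cell: (-dr, -dc).
                        alignment = -dr * wind[0] - dc * wind[1]
                        p = burnability[r][c] * (1 + wind_strength * alignment)
                        if rng.random() < min(max(p, 0.0), 1.0):
                            new[r][c] = BURNING
                            break
    return new
```

With these assumptions, a cell downwind of a burning neighbour ignites with a higher probability than one upwind, which is the qualitative behaviour described above. Changing any of the assumptions (for example, the neighbourhood or the burn duration) would change the simulated spread.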
The submission received the highest grade in the cohort — 89%.
The module presented the fundamental concepts and ideas in natural language text processing (NLP). Information Retrieval, Text Compression, Information Extraction, and Sentiment Analysis were the main topics introduced. The module focused on the challenges in NLP, as well as the state-of-the-art techniques. Python was used extensively for the various labs and assignments.
Assignment: Document Retrieval
The first assignment consisted of building and experimenting with a document retrieval system, as well as writing a report summarising the findings. Five different weighting schemes were implemented – binary, raw term frequency, logarithmic term frequency, tf-idf (using raw term frequency), and wf-idf (using logarithmic term frequency).
Special care was taken to ensure the high performance of the system. Operations needed for a specific scheme are only executed if that scheme is used. Sets are used whenever possible, as membership tests on sets are faster than on lists. The inverted index is filtered, so only terms that appear in both the query and the document are taken into account when calculating the document scores. Operations related to the whole document collection, such as calculating term frequencies and IDF, are executed only once.
As anticipated, binary weighting produced the lowest scores, followed by the more sophisticated methods, such as tf and tf-idf. The IR system built clearly showed the importance of pre-processing before performing the retrieval.
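The ideas above can be sketched in a few lines of Python. This is a simplified illustration rather than the submitted system: tokenisation is a plain whitespace split, document length normalisation is omitted, and the function and scheme names (`build_index`, `compute_idf`, `weight`, `rank`, `"log_tf"`, `"wfidf"`) are hypothetical.

```python
import math
from collections import Counter, defaultdict

def build_index(docs):
    """Build an inverted index: term -> {doc_id: raw term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index

def compute_idf(index, n_docs):
    """Collection-wide statistic, computed once and reused for every query."""
    return {term: math.log10(n_docs / len(postings))
            for term, postings in index.items()}

def weight(tf, idf, scheme):
    """Weight of one term in one document under the chosen scheme."""
    if scheme == "binary":
        return 1.0
    if scheme == "tf":
        return float(tf)
    if scheme == "log_tf":
        return 1 + math.log10(tf)
    if scheme == "tfidf":
        return tf * idf
    if scheme == "wfidf":
        return (1 + math.log10(tf)) * idf
    raise ValueError(f"unknown scheme: {scheme}")

def rank(query, index, idf, scheme="tfidf"):
    """Score only documents that share at least one term with the query."""
    scores = defaultdict(float)
    for term in set(query.lower().split()):
        if term not in index:
            continue  # skip terms that occur in no document
        for doc_id, tf in index[term].items():
            scores[doc_id] += weight(tf, idf[term], scheme)
    # Highest score first; ties broken by document id for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because the index maps each term straight to the documents containing it, a query never touches documents or terms it does not share, and the IDF table is built a single time per collection.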
The feedback complimented the “excellent performance” achieved with “recall, precision and f-measure scores at the level of the best-known scores”. The assignment was awarded a first-class grade.
Illustration by Freepik Storyset
Assignment: Sentiment Analysis
The second assignment required building a sentiment analysis system using a Naïve Bayes classifier, as well as producing a report summarising the extent of the implementation, the experiments carried out and the results obtained. The provided data set was based on movie reviews. The classification was performed on three and five classes. As expected, classification into three classes performed much better.
Different pre-processing and feature extraction steps were experimented with. Some of the strategies tried were lowercasing, spell checking, stop list removal, punctuation removal, stemming and lemmatisation, and the removal of single characters and digits. Additionally, the most and least frequent words were removed. Part-of-speech (POS) tagging was used in order to filter out features and perform classification on subsets.
The assignment demonstrated the difficulties related to sentiment analysis. First, the order of the pre-processing and feature extraction steps can lead to different results. The difference in accuracy after applying many of the steps was minimal – sometimes less than 2%. This makes it hard to predict which configuration will perform better on unseen examples. Second, each step can introduce errors of its own – for example, a spell checker might change a misspelled word to a word other than the intended one. Finally, most of the neutral sentiments were misclassified – this might be because these sentences contain few or no strong words that can be used for the classification. Furthermore, sentences with neutral sentiment might contain polar opinions (both positive and negative), which makes the classification task harder.
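For readers unfamiliar with the technique, a minimal multinomial Naïve Bayes classifier with add-one (Laplace) smoothing could look like the sketch below. It is not the submitted implementation: the smoothing choice, the handling of unseen words, and the names `train` and `classify` are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: iterable of (tokens, label) pairs.

    Returns (class_counts, word_counts, vocabulary).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def classify(tokens, model):
    """Return the label with the highest log-posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # log prior
        for tok in tokens:
            if tok in vocab:  # assumption: words never seen in training are ignored
                # Add-one smoothing avoids zero probabilities
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

The same code handles three or five classes unchanged, since the set of labels is taken from the training data.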
As part of the assignment, a program for evaluating the performance of the system was also developed. It calculates the accuracy and displays the confusion matrix, which is a useful way to show the results of the classification. The submission was awarded 94%.
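A rough sketch of such an evaluation helper is shown below. It is an assumption-based illustration rather than the submitted program; `evaluate` and `format_matrix` are hypothetical names, and the matrix is stored as a counter keyed by (gold label, predicted label).

```python
from collections import Counter

def evaluate(gold, predicted):
    """Return (accuracy, confusion matrix) for two equal-length label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    matrix = Counter(zip(gold, predicted))  # (gold, predicted) -> count
    return correct / len(gold), matrix

def format_matrix(matrix, labels):
    """Render the matrix as text: rows are gold labels, columns are predictions."""
    width = max(len(label) for label in labels) + 2
    header = " " * width + "".join(label.rjust(width) for label in labels)
    rows = [
        gold.ljust(width)
        + "".join(str(matrix[(gold, pred)]).rjust(width) for pred in labels)
        for gold in labels
    ]
    return "\n".join([header] + rows)
```

Off-diagonal cells show which classes are confused with each other, which is how a pattern such as neutral reviews being misclassified becomes visible.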
Love vector created by tartila - Freepik
Cyber Security Team Project
The module was about keeping an organisation secure from cybersecurity threats, as well as learning about how to react and what steps to take when a breach occurs. The topics covered were Threat Analysis & Modelling, Security Policies & Awareness, Physical & Technical Defences, Security Monitoring Strategies, and Incident Handling & Response. As the name of the module suggests, most of the work was done in teams.
Assignment: Threat Modelling
The individual assessment of the module consisted of carrying out threat modelling of a system, based on its description. The assignment was split into three tasks. The first one was about creating a data flow diagram (DFD). Twenty cybersecurity threats had to be identified as part of the second task. This had to be done in accordance with the STRIDE framework, which provides a mnemonic for security threats in six categories – spoofing, tampering, repudiation, information disclosure, denial of service and elevation of privilege. Finally, a mitigation had to be suggested for each of the threats identified.
The assignment made me think more carefully about threats I had heard of but had never spent enough time investigating or devising mitigation strategies for. As the system description was quite broad and included various entities, I identified a wide range of threats that affect the system’s security both directly and indirectly. The submission was awarded a first-class grade.
The team project was done in groups of four to five students. Similar to the individual assignment, it was split into several tasks. The first one was concerned with producing an incident response review of a detailed timeline of an incident and a company’s response to it. The strengths and weaknesses were identified and advice was provided on what could have been done differently.
The second part required assessing the company’s preparedness for Cyber Essentials Plus. The framework consists of five key controls – firewalls, secure configurations, access control, malware protection and patch management. The existing security controls and their appropriateness to the company were evaluated. As a result, new security controls were proposed to increase the company’s cyber preparedness.
The third task consisted of preparing a security awareness programme. It comprised a six-month schedule, a presentation for a classroom-based online session, as well as an awareness poster. Finally, two non-security technical controls were selected and proposed to further secure the company’s digital assets.
As the Cyber Essentials Plus framework has requirements with regard to many aspects of a company’s IT infrastructure, we came up with numerous short- and long-term interventions which we identified as appropriate and necessary based on the company’s description and its current security posture. Many of the suggestions were inspired by strategies and methods we had seen in the industry (during our Year in Industry), as well as by rigorous research on the latest techniques against cyber threats. A first-class grade was achieved.
Finance and Law for Engineers
The module covered a wide range of financial and legal issues likely to be encountered in the industry. The finance component focused on the practical issues of budgeting, raising finance, assessing financial risks and making financial decisions in the context of engineering projects and/or product development. This included preparing budgets and analysing financial plans, as well as determining the financial needs of an organisation and identifying appropriate sources of finance. The law component covered the law of contract, intellectual property law (including copyright), data protection, and tort law. Environmental law was also covered, outlining the environmental legislation one might have to adhere to.