Feedback from the SICSA Data Engineering Bootcamp

By Dr Andrei Petrovski, SGA Director
11 April 2023

The SICSA Data Engineering Bootcamp was held on 20 January 2023 in the beautiful ARC Building, University of Glasgow.

The main intent of the event was to raise awareness amongst members of the SICSA Graduate Academy of the goals, challenges, and tools used by the industry in designing, building and maintaining data products.

During the bootcamp, three practical workshops were run by representatives from various sectors of industry – Engineering, FinTech and Public. Experts from the industry explained and demonstrated how the modern data stack, data pipelining tools and different parts of the data engineering lifecycle work together. The bootcamp programme started with the topics of data acquisition, collection and pre-processing, presented by Steve Aitken from Intelligent Plant. Then, colleagues from Corporate Data Services in Barclays Bank used real-life examples to show the attendees the process of creating value from data. The final workshop of the day, presented by National Services Scotland, NHS, focused on maintaining and enhancing the quality of data products.

During the bootcamp, the SICSA PhD students who participated in the event had plenty of opportunities to speak to managers, engineers and researchers working in the Data Engineering field, who gave their advice on handling typical challenges and using specific tools, and even provided interviewing tips for potential candidates interested in working for their companies and organisations. The event received very positive feedback, including “Real-life examples, honest answers and openness”, “Loved the views of industrial experts” and “Very engaging!!”. The overall theme of the feedback was a desire to have more events like this and to include additional practical and interactive challenges and activities for the students. SICSA, in collaboration with DataLab, is now considering what other events could be organised in the near future. We will keep you posted!!

SICSA Education – Learning & Teaching Scholars Update

And that’s a wrap – the first cohort of SICSA Education L&T Scholars has now completed the programme!

Over the last nine months, 16 L&T-focused colleagues from eight different SICSA member institutions came together to develop scholarship projects and share their teaching practice.

Beginning with an online speed networking session, our Scholars then identified a number of scholarship ideas relating to CS Education that they might like to explore, and formed teams around the most popular ideas.

Online speed networking session

 

At our face-to-face meet-up in June, the teams then began to flesh out plans for undertaking these scholarship projects. Projects included looking at the challenges of teaching CS to students with no prior CS experience, dealing with plagiarism in programming assignments, hybrid teaching models, and determining what students struggle with the most when learning to program.

Each group was offered a support session with colleagues from the University of Glasgow’s Academic and Digital Development Unit, to discuss the shape of their projects, and possible routes to publication. For the remainder of the event, Scholars worked on developing their projects, with input from a facilitator. The Scholars have continued to work on these projects, with the aim of publishing papers based on the outputs.

The Scholars’ next engagement was an online workshop on Influencing and Leadership Skills, led by the highly experienced Dr Robin Henderson from MY Consultants. This is an area that many teaching staff are expected to evidence in promotion and fellowship applications, but it’s not always obvious how we develop these skills – that’s where Robin comes in!

Each Scholar was also allocated a mentor from another institution, with whom they were encouraged to meet at least twice over the course of the programme. Our mentors were all expert educators, who were able to share their experience and career insights with our Scholars.

At our final meet-up, Scholars provided an update on their scholarship projects and discussed their plans for publication. The intention is also for Scholars to present their work at this year’s SICSA Conference, which will be great to see.

This inaugural cohort of L&T Scholars has been amazing, and I hope that we will all stay in touch. The plan is to run the programme again next year, so watch this space!

— Matt Barr

Our final, online meet-up

 

HRI Winter school visit on embodied AI 2022

12 December 2022,

by Jacqueline Borgstedt, University of Glasgow

I am Jacqueline Borgstedt and currently I am completing an interdisciplinary PhD with the UKRI CDT for socially intelligent artificial agents at the University of Glasgow. My doctoral research on Human-Robot Interaction (HRI) explores how augmenting social robots with haptic modalities affects users’ perception of and relationship with a robot. Moreover, I assess the potential of such multi-modal social robots to aid users in regulating their emotions during stress-inducing situations.

As an interdisciplinary researcher, I aim to bridge methodologies and technologies used across multiple disciplines such as Psychology and Human-Computer Interaction. To achieve this, it is vital to network and collaborate with researchers across multiple disciplines. Furthermore, it is vital to expand my skillset and get a better understanding of technical concepts and methods relevant for Human-Robot Interaction. I was thus interested in participating in a winter school that would allow me to expand my network in the HRI community and to gain a better understanding of such technical concepts.

After looking for suitable winter schools, I decided to apply for the HRI winter school on embodied AI 2022 at Ghent University. The winter school was an excellent training opportunity as it attracted an international and diverse pool of researchers, which allowed me to establish new connections and to meet potential collaborators. Moreover, participation in the program allowed me to engage with influential researchers in the field, whose work I have been following since the start of my PhD.

The program of the winter school closely aligned with my research interests in designing meaningful Human-Robot Interactions that have a positive impact on the individual user as well as the broader society. Some of my personal highlights included a tutorial on participatory design, which helped me to improve the experimental design of my next study. Participating in the winter school has thus had a direct impact on my research practices. Furthermore, there were multiple talks discussing the ethical and societal implications of implementing embodied robots in society. Discussing the ethical considerations of integrating robots in society is vital as researchers have a responsibility to consider how interactions between humans and robots can affect the individual users but also society in general. It was an amazing opportunity to discuss such considerations with other early career researchers and experts in the field. Finally, the diverse program allowed me to gain new technical skills.

At the end of the winter school, I felt better connected with my research community, fueled with enthusiasm, and full of ideas on how to improve my research practices. Without SICSA’s support this would not have been possible. I would like to extend my sincere gratitude for SICSA’s support and would encourage all early career researchers to attend a summer or winter school in their field.

SICSA Virtual Conference Funding: 14th International Conference on Agents and Artificial Intelligence

24 October 2022,

by Adil Ibrahim, Heriot-Watt University

Hi there! I’m Adil, a PhD student studying at Heriot-Watt University – Edinburgh. My research focuses on binary classification using artificial immune systems, and I use one of the primary immune algorithms, the Negative Selection Algorithm. I have always been intrigued by the use of artificial immune systems in data classification, and in particular by how they can differentiate between self and nonself, mirroring one of the main features of our own biological immune system.

For that to be achieved, there must be an affinity function that works like the one in the natural immune system. Finding such a function has always been a challenge. In most cases, the suggested affinity functions did not work well, to the extent that many researchers concluded that Negative Selection is not a reasonable classification algorithm.

I searched for a suitable affinity function and looked at the techniques used in bioinformatics. Sequence alignment schemes are used there to measure the similarities between DNA, RNA, or protein sequences, in order to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. I wondered whether this could be the best affinity function to use. I therefore combined the Negative Selection algorithm with bioinformatics protein sequence alignment schemes and used their mathematical techniques to build the affinity function required for self and nonself discrimination in binary classification. The methods have been tested using datasets in the health domain to diagnose breast cancer, and on other datasets in different domains for comparison.
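As a rough illustration of the general idea (not the implementation used in this research), the sketch below pairs a basic negative selection loop with a global alignment score used as the affinity function. The Needleman-Wunsch scoring values, the string encoding of samples, the threshold, and all function names are assumptions made for this example.

```python
import random


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score between two strings.

    The match/mismatch/gap values are illustrative assumptions.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def generate_detectors(self_set, alphabet, length, threshold,
                       n_detectors, seed=0, max_tries=10000):
    """Negative selection: keep random candidates that match no self sample."""
    rng = random.Random(seed)
    detectors = []
    for _ in range(max_tries):
        if len(detectors) >= n_detectors:
            break
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if candidate in detectors:
            continue
        # A candidate "matches" a self sample when the affinity reaches the threshold.
        if all(alignment_score(candidate, s) < threshold for s in self_set):
            detectors.append(candidate)
    return detectors


def classify(sample, detectors, threshold):
    """Label a sample 'nonself' if any detector matches it, else 'self'."""
    matched = any(alignment_score(sample, d) >= threshold for d in detectors)
    return "nonself" if matched else "self"


# Hypothetical toy data: self samples encoded as short strings.
self_samples = ["AAAA", "AAAB"]
detectors = generate_detectors(self_samples, "AB", length=4,
                               threshold=3, n_detectors=3)
print(classify("AAAA", detectors, threshold=3))
```

In a real setting, the feature vectors of a dataset would first have to be encoded as sequences, and the scoring scheme and threshold would need tuning; both choices are left open here.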

My supervisor, Professor Nick Taylor, told me about SICSA and the possibility of getting support through their programs to attend a conference. So, as part of my research, I attended the 14th International Conference on Agents and Artificial Intelligence (ICAART 2022). Due to COVID-19, the conference was streamed online from 3 to 5 February 2022.

The ICAART was an excellent opportunity for me as it brought together researchers, engineers and practitioners interested in the theory and applications in the areas of Agents and Artificial Intelligence.

ICAART 2022 was the first conference I attended as a speaker, and I needed to learn how to make such a contribution. It introduced me to many other researchers in different fields. The conference events and the discussions I contributed to have expanded my experience considerably and have raised several new research questions that I have been exploring since.

It was a dream come true because the support I got from SICSA was highly fruitful, and I’m very grateful for that.

SICSA Saltire Exchange Scheme – Visit at Crysys Lab, Budapest University of Technology and Economics

10 October 2022,

by Rob Flood, University of Edinburgh

Hi there! I’m Rob, a Ph.D student studying at the University of Edinburgh. As part of my Saltire project, I spent one month in Budapest, Hungary.

My research is primarily concerned with synthetic data generation for use in security-related analysis. By its very nature, real-world security data is sensitive. Legislative frameworks such as GDPR or the DPA limit researchers’ abilities to capture and release such data, for good reason. It may contain personally identifiable information or provide malicious actors with important intelligence about an organisation’s security capabilities. As a result, researchers often rely on synthetic datasets. However, despite the explosion of research into machine learning applied to security, there has been little work discussing how these datasets should be designed, generated and released. My work argues that, because malicious data, network topologies and threat models are all highly variable, great care needs to be taken when building these datasets.

I undertook my visit with Prof. Levente Buttyan at the Crysys Lab at the Budapest University of Technology and Economics. During this time, we focused on the generation of synthetic data for Industrial Control Systems, using data from the lab’s data generation testbed. Specifically, we discussed ways in which we could adapt domain randomisation, a technique common in other areas of synthetic data generation, to the field of network security.

I had never visited Hungary before and had little idea of what to expect. My trip introduced me to Hungarian cuisine, architecture and history, all of which I enjoyed thoroughly. Moreover, my host treated me graciously, showing me his hometown of Szentendre and where to get the best Lángos in Hungary.

Despite the visit lasting only a month, the scope of this work has expanded considerably, bringing up a number of new research questions to explore in the coming months. This SICSA exchange was extremely fruitful for me and I’m very grateful for being given the opportunity.

SICSA Saltire Exchange Scheme – Visit to ICS-FORTH

30 September 2022,

by Lito Michala, University of Glasgow

I am Dr Lito Michala, an early career researcher who works as a lecturer at the School of Computing Science, University of Glasgow. For my visit I chose to travel to Greece to ICS-FORTH, and particularly the Computer Architecture and VLSI Systems Laboratory (CARV). My work is in the space of the Internet of Things, and the exchange enabled me to explore a new direction that will better align with the SICSA theme of generating a new, more sustainable computing landscape in the near future. CARV has lately gained traction with their research in FPGAs as low-power accelerators for data centres, and my interest was to explore moving this into smaller embedded devices.

During my visit I met and presented my work to several academics, PDRAs and PhDs who work in directly relevant areas but also in the wider systems stack from High Performance Computing to Systems Infrastructure and Engineering. The visit was very productive with 3 EU proposals submitted and at least one more in development. I read very interesting papers and got access to local facilities that could enable collaborative student projects and summer internships. I have already invited one of my collaborators to speak at our local Systems Seminars and hope to continue working with them for years to come.

Other than the work component, the visit was brilliant: the weather was nice, I stayed relatively close to the institute and was able to enjoy an office with a view of an olive grove. The food was excellent, and I was able to spend quite a few of the evenings with new-found and old colleagues. I would advise any ECR to take Exchange opportunities if they can, as there is no better way to build collaborations than breaking bread together!

 

SICSA Saltire Emerging Researcher Scheme – Visit at the University of Glasgow

27 September 2022,

by Maximilian Häring, University of Bonn

Do you trust the people writing the software you use not to have malicious intent? Or the company they work for? The answer is probably “yes” for most software. How do you judge that? Modern software is built with many software packages from other developers, so developers need to answer the same questions themselves. One part of my research focuses on dependencies between developers in the IT landscape. Thanks to SICSA, I had the chance to visit the University of Glasgow and extend my knowledge.

I visited the SIRIUS Lab led by Mohamed Khamis. I met Mohamed at CHI’19 (the biggest HCI conference), and early on, we talked about a research visit to his group. His group works on privacy and security research in AR/VR. My research visit allowed me to get a new perspective on the challenges that come with these technologies. I could discuss ideas and questions with members inside and outside the group at the University of Glasgow. I also made a few connections to other universities in the UK (Bristol and Edinburgh). The discussions with fellow researchers allowed me to learn more about the privacy and security research community. It was constructive to see a different research group’s inner workings and to profit from their university’s context, e.g., the university regularly hosts speakers from other institutes (Sameer Patil gave a talk about stopping the spread of misinformation).

During my time, I also could utilize the setting of Glasgow, e.g., I attended a meetup of open source enthusiasts and developers. In general, Glasgow and the population have a different culture of using technology in their everyday life than Bonn. The trip helped me sharpen my research profile and get a bigger picture of the academic world. I made very interesting contacts for collaboration in the further process of my projects.

The Open Source Summit happened at the end of my visit. On my way home, I stopped there (in Dublin) and had the chance to talk to more practitioners about their situations. Seeing where the industry is currently going gives new perspectives on the questions I try to answer in my research.

SICSA Saltire Emerging Researcher Scheme – Visit at the University of Glasgow

27 September 2022,

by Eva Gerlitz, University of Bonn

Hi! I am Eva, a Ph.D. student at the University of Bonn, and I spent the last one and a half months in Glasgow.

My research interests lie in the field of usable security, especially concerning authentication and expert users, such as developers and administrators. I find authentication interesting, as almost any person who uses a computer or the internet will come in contact with it (and let’s face it: Even though most are well aware that ‘123456’ or ‘qwerty’ are bad choices for a password, both still end up in the list of most commonly used passwords each year). Authentication can be annoying and frustrating, but as it is so important in a digitalized world, I believe that we should do our best to make it less of a burden for everyone. While expert users might be hard to study (more on that later on), I believe it is worth the effort. End users might make bad choices, probably creating a problem for themselves, but the decision of expert users can impact thousands of users, depending on what they are working on. Understanding their needs, issues and priorities can thus lead to increased overall security.

But now, to my actual stay: I have always liked Scotland, so my joy was immense when SICSA accepted my application for a research visit in Glasgow this summer. I completed my research visit with Dr. Mohamed Khamis and his SIRIUS Lab at the University of Glasgow, where I learned a lot about VR, AR, and security and privacy-related issues that might come with their use.

My initial plan for my stay was to recruit expert users responsible for a company’s authentication system. However, after sending out many emails and not receiving any feedback, I realized that I had to switch to something else. I thus concentrated on the end user perspective and took the opportunity to talk to employees of the University of Glasgow about 2-factor authentication, which was offered at the time but not enforced. This served as a first step in understanding the threat model that people have in their minds and against which they want to protect themselves, and also in identifying gaps in how people imagine an account could be used beyond simply accessing the data that is linked to it. Even though I concentrated on end users, the results are interesting for expert users as well, as those are the ones who might have to communicate the needs and risks of different aspects of authentication to non-experts.

This work has brought up further research questions that we will work on in future (online) collaboration.

Apart from the professional aspects of meeting excellent researchers, getting great feedback for my own work, and getting a chance to peek into related research areas, I had the opportunity to learn more about Scottish culture. I enjoyed Scottish food, Glasgow street art, the (as I have been told many times quite rare) summer days in beautiful parks, small and independent shops, and buildings (including the University) that evoke the memory of Harry Potter.

None of this would have been possible without SICSA’s Saltire Exchange Award, and I am highly grateful for this opportunity!

SICSA Remote Collaboration Activities – Recruiting Participants

13 September 2022,

by Isa Inuwa-Dutse from the University of Hertfordshire, and Salma ElSayed from Abertay University

Introduction/Overview

Crowdsourcing involves engaging a wide collection of diverse individuals to work on a paid or voluntary activity submitted by requesters. The individuals or participants are not constrained by geographical location. Prolific is an online crowdsourcing platform that enables the recruitment of participants to engage in a paid activity/task.

The Support for Remote Collaboration Activities fund provided the means to reward participants in our studies, which positively impacted the turnout. We would like to share our experiences with Prolific as a crowdsourcing platform for recruiting participants and how this facilitated our research projects. For a comprehensive understanding, see the quick guide offered by Prolific.

Depending on the nature of the study, engaging with participants varies. We share our experiences (as researchers) and relevant tips for recruiting participants. We refer to the standard approach as the classic way of recruiting N participants via Prolific to partake in a study over a given period. The multi-participant approach requires some degree of coordination because the study participants are required to engage in the study activity simultaneously.

Standard Study – Isa Inuwa-Dutse, University of Hertfordshire.

This study is within explainable AI (XAI) and the goal is to study the impact of explanation on social loafing. As a crowdsourcing platform, Prolific offers a rich array of features to enable the recruitment of a diverse range of participants. One of the commonly used approaches is the classic way of recruiting N participants to engage in an activity (asynchronously) over a given period. We refer to this method as the Standard approach in this report. The main areas to focus attention on include:

  • Task description

Typically a study or task is designed elsewhere, such as Qualtrics, and a link to the task is shared with recruited participants to partake in the study. On completion of the task/activity, participants are returned to Prolific for payments pending a successful review by the researcher(s). It is useful to succinctly describe the task to the participants and what is expected of them.

  • Task duration, bonus, review and making payment

The participants should be informed about the average completion time of the activity beforehand. This will be useful, especially during review of the responses from the participants. For instance, if the average completion time is 15 minutes and a participant spent less than 5 minutes, then you should carefully ascertain whether that participant heeded the instructions. The inclusion of attention check questions is also crucial in this regard.
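The review step described above could be supported by a small script along the following lines. This is only a sketch: the `Submission` fields, the function name, and the 5-minute cut-off are assumptions for illustration, and the data would have to be taken from whatever export the survey and recruitment platforms actually provide.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical fields; real exports will use different column names.
    participant_id: str
    minutes_taken: float
    attention_checks_passed: int
    attention_checks_total: int


def flag_for_review(submissions, min_minutes):
    """Return IDs of submissions that look rushed or failed an attention check."""
    flagged = []
    for s in submissions:
        too_fast = s.minutes_taken < min_minutes
        failed_check = s.attention_checks_passed < s.attention_checks_total
        if too_fast or failed_check:
            flagged.append(s.participant_id)
    return flagged


# Example with made-up data: a 15-minute task and a 5-minute cut-off.
submissions = [
    Submission("p1", 4.0, 2, 2),   # much faster than expected
    Submission("p2", 14.0, 2, 2),  # plausible
    Submission("p3", 16.0, 1, 2),  # failed one attention check
]
print(flag_for_review(submissions, min_minutes=5))
```

Flagged submissions are candidates for a closer manual look rather than automatic rejection.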

  • Restarting and adding participants to a study

Sometimes, a prolonged period after a study has been completed, the need to add more participants may arise. You may then receive no responses at all, simply because the task is not visible to the participants. The easiest way to solve the issue is to contact the researcher help centre to restart the activity and make it visible to the participants.

  • Miscellaneous

Multi-participant Study – Salma ElSayed, Abertay University, Dundee.

To study the impact of affective non-player characters in multi-player games, two participants had to play a web game slice at the same time and then fill in a questionnaire. We tried different approaches.

At first, we advertised the study using word of mouth and social media. This was before any funding, and we managed to recruit 12 testers in 3 months. It was well into the COVID pandemic and the lockdowns, which heavily impacted running the experiments. Also, finding two subjects who were available at the same time was cumbersome and involved several rounds of communication, even with the use of Doodle for booking. The collected data was not enough for analysis.

It was then decided to pursue a small grant and try crowdsourcing with Prolific. The SICSA Remote Collaboration Activities fund enabled us to pay participants, and the experiment design was altered to accommodate the use of the platform. We created two studies on Prolific; the first had general information about the research, consent statements, and a Calendly link to book a slot from the available times. Participants were then matched on time slots and sent a Zoom link for the day and time they selected. They would anonymously join the call to play the web game with another participant. The second study was released to them after the play session and included a Qualtrics link to the survey and debriefing information. The use of Prolific was an attempt to reach more participants and collect more responses; however, the drop-off rate was high, and too many candidates would complete the consent and booking but never show up to play the game, although confirmations were sent a day before the event. In two months, only 9 participants had properly completed the study.
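The slot-matching step described above was coordinated by hand in the study; the sketch below shows one way such matching could be automated. The function name, the booking format, and the example slots are all hypothetical.

```python
from collections import defaultdict


def pair_by_slot(bookings):
    """Pair participants who booked the same time slot.

    bookings: list of (participant_id, slot) tuples, in booking order.
    Returns (pairs, unmatched), where pairs holds (slot, id_a, id_b)
    and unmatched holds (slot, id) for anyone left without a partner.
    """
    by_slot = defaultdict(list)
    for participant_id, slot in bookings:
        by_slot[slot].append(participant_id)

    pairs, unmatched = [], []
    for slot in sorted(by_slot):
        ids = by_slot[slot]
        # Pair participants two at a time, in booking order.
        for i in range(0, len(ids) - 1, 2):
            pairs.append((slot, ids[i], ids[i + 1]))
        # An odd participant out needs a substitute or a rebooking.
        if len(ids) % 2 == 1:
            unmatched.append((slot, ids[-1]))
    return pairs, unmatched


# Made-up bookings for illustration only.
bookings = [
    ("p1", "2022-05-01T10:00"),
    ("p2", "2022-05-01T10:00"),
    ("p3", "2022-05-01T11:00"),
    ("p4", "2022-05-01T10:00"),
]
pairs, unmatched = pair_by_slot(bookings)
print(pairs)
print(unmatched)
```

The unmatched list makes explicit who still needs a partner, which is where most of the coordination effort arises.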

Given how slowly the process was going, and how this phase was exhausting its timeline, we decided it was best to pursue another path in parallel. We advertised again within the school but used part of the SICSA fund to offer Amazon vouchers as rewards. We managed to recruit 30 testers in a week and, since most were students or staff, the drop-off rate was marginal, and it was easier to verify the booking on Calendly and arrange the Zoom call. We finally have enough data to analyse!

Prolific provides integration with several survey platforms like Gorilla, Google Forms, and MS Forms. Our surveys were hosted on Qualtrics, and it was a smooth process to have participants navigate between the two platforms. We faced some issues during the lifetime of the two studies, ranging from technical blips and participant drop-off to processing refunds. Prolific’s customer service was an immense help. They respond very quickly and provide practical information and guidance. Being part of the Prolific Research Community also allowed for useful exchanges with peers around experiment design and Prolific features.

Overall, the Prolific experience was informative and provided good guidelines for designing and running studies and minimising the participant drop-off rate. The support pages are very organised and helpful. However, we feel that multi-participant scenarios require a non-standard approach. It was a tedious task to make sure the testers were available or to find a substitute tester on the fly. The game prototype currently accommodates two players, which facilitated arrangements, but we expect that for more than two players a more elaborate scheme would be needed to handle the communication and coordination burden.

The unique scenario of needing several participants simultaneously should be considered and a more solid framework proposed.

We would like to thank SICSA for the Support for Remote Collaboration Activities funding which facilitated recruiting participants and collecting data. We hope sharing our experience and reflection will inspire and motivate fellow researchers.

SICSA Research Scholar Funding: IADS Summer School

13 September 2022

by Ipshita Roy Chowdhury, University of Stirling

I am a PhD student in the Division of Computing Science at the University of Stirling. I attended the Analytics, Data Science & Decision Making Summer School at the Institute for Analytics and Data Science, University of Essex, from 25 to 29 July 2022, funded by SICSA.

I would like to express my sincere gratitude to the Scottish Informatics and Computer Science Alliance (SICSA) for their contribution to my summer school visit this year. The funding from SICSA enabled me to take part in the summer school and provided a great opportunity to experience a wide area of Computing and Data Science research. I got the chance to meet wonderful peers, exchange ideas, and attend courses led by well-known academics and researchers in the field.

My current research focuses on capturing malware behaviour using ontology-driven knowledge graphs and on malware detection using Deep Learning based models. During my PhD, I’ve explored 1) the use of image processing to describe malware and classify it using convolutional neural network models and 2) malware classification using behaviours as features for machine learning models.

The summer school, organized by the Institute for Analytics and Data Science at the University of Essex, was a great opportunity to gain knowledge about Big Data, Artificial Intelligence, Analytics and Decision Making. It was an immense help in learning about advanced techniques in Data Science and AI, in establishing connections with academics and industrial partners, and in connecting with peers to share knowledge.

As I am a PhD student in the field of Data Science and Artificial Intelligence, the activities of this program were closely aligned with my research.

As a PhD student in Computing Science, I want to explore new fields of research and connect with other researchers. The summer school organized by IADS, University of Essex, gave me the opportunity to attend cutting-edge courses in the Data Science field. It was a blend of taught courses and research techniques related to Deep Learning, Analytics, Security and Machine Learning.

From the schedule of course topics, I found that courses such as Data Protection and Security, Deep Learning for Images, Natural Language Processing and Machine Learning would be extremely beneficial for me, as I am working on capturing malware behaviour using API calls and on detection using malware image processing. The program also gave me practical experience through the coursework.

Attending all these sessions was very beneficial to me. It was a great pleasure to hear keynotes from experts in the field, to learn about the Data Science technology landscape, and to build a network with my peers.

I am immensely grateful to SICSA for the generous support to expand my knowledge and experience in Data Science and AI-related fields. The funding was very important to me, not only financially but also in strengthening my experience and giving me exposure to the wider research community.