Gaps and Challenges

Authors:
Marco Anisetti, Università degli studi di Milano
Claudio Ardagna, Università degli studi di Milano
Marco Cremonini, Università degli studi di Milano
Ernesto Damiani, Università degli studi di Milano
Jadran Sessa, Università degli studi di Milano

On this page, we present the gaps and challenges identified by the H2020 CONCORDIA project before and after the emergence of the COVID-19 pandemic.


Gaps and Challenges prior to the COVID-19 Era

The following list presents gaps that were driven by threats predating COVID-19 and that have remained relevant in the COVID-19 era.

  • G4.1 – Gaps on data protection. The major gaps on data protection include threats to the privacy and confidentiality of sensor data streams. Loss of information, interception of sensitive data, and unauthorized acquisition of information are the main threats, while phishing and identity fraud due to traffic capture and data mining have been exacerbated by COVID-19. To mitigate privacy intrusion, it is crucial to define solutions that go beyond the application of smart cryptographic techniques. Solutions such as real-time monitoring, assurance, and anonymization techniques provide fairly limited benefits. Emerging potential solutions include privacy-preserving data mining [1] and privacy-preserving machine learning [2]. To alleviate another critical issue affecting current systems, namely user identity falsification, it is essential to protect data confidentiality and privacy by means of advanced authentication, authorization, and access control solutions. Accordingly, streams of trustworthy data from sensors should be certified when possible (see G4.6).
  • G4.2 – Gaps on the use of cryptography in applications and back-end data intensive services. The adoption of cryptographic solutions in Big Data environments is often challenging, owing to the complexity, flexibility, performance, and scalability issues they raise in such environments. Key management is another important aspect that has to be considered in distributed scenarios like the cloud, as well as when data streams have to be verified and certified. Since integrity verification solutions are not appropriate due to the sheer size and collection rate of Big Data, alternative solutions, such as TPMs (see G4.3), the evaluation of sensor behaviors (see G4.5), and the monitoring of sensor configuration (see G4.6), have to be considered.
  • G4.3 – Gaps on computing and storage models and infrastructures. The lack of standard solutions, portability issues of security controls across open source projects (e.g., different Hadoop versions) and Big Data vendors, inadequate design and planning or incorrect adaptation of a Big Data platform, as well as the complexity of models and infrastructures can lead to a variety of problems, including data management threats, misconfigurations, and human errors. Moreover, the correctness of data collection and ingestion activities represents a challenge related to the data protection problem (G4.1), whereas the design and deployment of a trustworthy Big Data platform require in-depth testing and verification.
  • G4.4 – Gaps on roles (skill shortage). Nowadays, the difficult task of managing security in a landscape consisting of small services with high-rate deployment changes is performed by incorporating so-called DevOps methodologies [3], which allow early detection of bugs and potential security issues. However, these methodologies require a “security culture” in the development teams and the application of security in novel ways, which in turn requires knowledge beyond traditional cybersecurity skills. When it comes to data intensive applications, such as Big Data, there is a gap in terms of roles and skills. Positions in high demand, including data scientists, data engineers, and Big Data system administrators, are unlikely to be filled in the near future, while users may remain unaware of the legal implications of data storage. Awareness, education, and training are the keys to closing this gap. This gap directly reflects on gap G6.4, especially for universities, which have started offering Data Science degrees only recently.
  • G4.5 – Gaps on data trustworthiness. Distinguishing between correct and fake data is of critical importance and represents a major gap in systems. The implementation of safe autonomic and adaptive processes at the basis of ICT system functioning depends on the trustworthiness of data. Unfortunately, in the existing literature data trustworthiness is often neglected and taken for granted. As a result, current autonomic and adaptive systems make decisions without filtering data beforehand. To solve this problem, trustworthy data collection and ingestion should be implemented, and the data domain should be based on a standard and trustworthy data collection process that can differentiate between correct and fake data. Extending assurance verification to data collection and ingestion would further contribute to filling this gap. Detection of adversarial AI is another challenge encompassed by this gap.
  • G4.6 – Gaps on decision support systems. Decisions in autonomic and adaptive systems are made based on the collected field data. However, since humans are components of these systems, all human-related risks and unpredictability are present. Even though such traditional systems strive to maximize decision quality, it is often difficult to prove or audit the correctness of decisions, because untrusted or unverified data is accepted based on the provider's reputation. The solution to this problem is directly connected to data trustworthiness (G4.5), whereas solutions based on false positive reduction could help in managing false alarms.
  • G4.7 – Gaps on ethics. To maintain human rights, it is essential that emerging technologies, such as data analytics, machine learning, and artificial intelligence, be adopted properly. These technologies have raised many ethics-related concerns, including surveillance, manipulation of behavior, opacity of AI systems, bias in decision systems, human-robot interaction, artificial moral agents, as well as the question of fairness vs. bias in ML/AI [4][5]. In all of these cases, bias in decision systems plays an important role, which connects this gap to G4.6.
  • G5.1 – Gaps on microservices-aware security. Microservices and distributed systems follow an established communication pattern over the HTTP protocol, while traditional network security is based on firewalls. However, when dealing with easily scalable microservices, standard firewalls struggle to implement the necessary address- and port-based security rules. As a consequence, web application attacks are on the rise. Most existing and emerging solutions are bound to a specific platform; thus, Web Application Firewalls (WAFs) are being increasingly adopted by companies.
  • G5.2 – Gaps on authentication and authorization. The heterogeneity and the complexity of microservice-based deployment pose significant challenges to authentication and authorization. Moreover, since microservices enable writing applications in different languages/frameworks, the way authentication and authorization are handled introduces vulnerabilities. Concurrently, client applications that rely on password-based authentication (in which users have to select strong passwords) face the same issue.
  • G5.3 – Gaps on orchestration and composition. Managing security in applications composed of a multitude of small components is difficult, since it requires securing the microservices themselves (Gap G5.1), the external software being used (e.g., databases and message brokers), and the orchestration platform, whose deployment requires a number of concepts to be mastered beforehand. Also, some distributed architectures have single points of failure (e.g., API gateways), a problem that is further exacerbated by CI/CD methodologies. In such methodologies, software is deployed automatically, making it prone to bugs and security issues (see Gap G5.6).
  • G5.4 – Gaps on safety and security by default. Traditional desktop applications and operating systems are written in low-level programming languages, making them prone to security vulnerabilities, in particular memory bugs. Although this issue is not new, memory-safe programming languages were considered unfit for the performance requirements of critical systems until recently, when many vendors started adopting safer languages, such as Rust. However, there are still certain gaps that have to be filled. Since it is infeasible to rewrite the whole code base from scratch, automatic translation tools should be used to rewrite the most critical parts of the code, i.e., the modules dealing with inputs. Moreover, both academia and industry should promote the use of safer languages for the development of safe-by-design products. Even though there are ways of increasing the safety of intrinsically unsafe languages, such solutions are often not integrated with the main toolchains, hence requiring additional steps. Lastly, this gap is further exacerbated when considering IoT devices.
  • G5.5 – Gaps on the proper management of configurations. The management of configurations, and especially credential management, is one of the main challenges in the development and deployment of modern distributed systems. The best solutions, which involve the deployment of encrypted storage systems, are notoriously difficult to use. Some recent breaches have shown that configuration stores are often not properly secured and that the credentials within them are also insecure. Hence, stronger and more reliable credential management is required not only for API environments, but also for IoT devices, client-side configuration, and credential storage.
  • G5.6 – Gaps on supply-chain security. The security and safety of an ICT product are inherited from all of its components. Supply-chain security therefore usually implies that application security is not completely under the control of the developers. According to ENISA, supply-chain security represents a critical gap in the age of CPU vulnerabilities, fake mobile apps, and state-sponsored attacks.
  • G6.1 – Gaps on modelling user behavior. The question of how user behavior relevant to cybersecurity should be addressed with respect to reducing the frequency of errors, preventing cascade effects, and forecasting the likelihood of threats has remained unanswered for decades. Different efforts at developing behavioral analysis, such as those using Host-based Intrusion Detection Systems (HIDS) or AI, have so far not managed to produce robust solutions. On the other hand, despite being successful, the integration of research on human errors with cybersecurity issues is still in its early stages. Similarly, other than characterizing the so-called “hacker mindset”, behavioral studies, which have been intensively pursued in fields such as psychology, sociology, and economics, have never seen much success in converging with cybersecurity.
  • G6.2 – Gaps on the relation between user behavior and adverse security-related effects. Recently, there have been many studies on phishing, and testing employees’ likelihood of falling victim to social engineering has become a standard measure for security-conscious enterprises. Quantitative statistics indicate that the number of social engineering cases has dropped considerably over the years. On the other hand, social engineering, phishing, and ransomware attacks have become more relevant. In the past, there was a misconception that phishing, social engineering, and user-centered threats in general should be counted instead of weighted. Even though the total number of such cases has dropped significantly, difficult cases that can potentially wreak havoc have persisted. Such threats can halt the operations of critical health divisions or production lines, or deceive executives into making dreadful decisions. Legacy solutions are not able to alleviate the problem, rendering companies defenseless.
  • G6.3 – Gaps on security information. Data and knowledge about the threat landscape are limited in quantity and low in quality. Consequently, the same questions and uncertainties are raised repeatedly. Significant uncertainty also remains about the relative importance of internal and external sources of attacks, the type and nature of the main threats, and the ranking of threats and vulnerabilities. For those issues, and for the issues regarding data and knowledge sharing among different subjects, analyses are repeated over and over. Data and knowledge sharing flows with difficulty and has poor quality, resulting in slow and partial awareness build-up. Moreover, the assessment of the quality of reports and surveys is itself poor, as reflected in good and poor quality ones being classified equally.
  • G6.4 – Gaps on security training and education. There is as yet no general agreement as to what constitutes adequate cybersecurity education and who should be responsible for providing it. There are several proposals in place, ranging from enrolling teenagers in cybersecurity programs to advanced studies in cybersecurity. Most of the proposals in between, including cybersecurity programs offered by Computer Science/Engineering or Law Management programs, or hands-on vs. theoretical approaches, have achieved recognition to a certain degree. Other unanswered questions concern appropriate content, the amount of knowledge required for the cybersecurity workforce, and coherent curricula. In an attempt to answer those questions, the recent consensus has been leaning towards multidisciplinary education. However, this proposal has raised a plethora of doubts, the most prominent being that it could lead to diverse superficial programs that will cover many issues without teaching, ultimately leaving a sense of vagueness and lacking characterization.
  • G6.5 – Gaps in collaborative protocols for disclosure. Even though their relevance has remained uncertain, vulnerability disclosure procedures have been a point of discussion on which security researchers and software companies have not agreed for years. More recently, the rise of bug bounty programs and the standard ISO/IEC 29147:2018 Information technology — Security techniques — Vulnerability disclosure have started making progress in this debate. Even though the business-oriented approach introduced by bug bounties seemed to function flawlessly at the start, some old and novel problems surfaced, including reward problems, the degree of freedom for the researchers, and doubts surrounding the effectiveness of the approach. On the other hand, the ISO 29147 standard did not find support from the industry and thus did not manage to materialize. Consequently, the problem of governing the vulnerability disclosure process remains mostly unaddressed by the public, while at the same time it has become the central business of a shadowy industrial sector.
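
G4.1 above recommends that streams of sensor data be certified when possible. As a minimal sketch of what a per-reading integrity tag could look like, the example below uses HMAC-SHA256 from the Python standard library. The names sign_reading, verify_reading, and SENSOR_KEY, as well as the reading fields, are hypothetical and not part of the CONCORDIA work; the sketch also assumes a shared secret per sensor and does not address the key management and scalability concerns raised in G4.2.

```python
import hashlib
import hmac
import json

# Hypothetical per-sensor secret for illustration only; a real deployment
# would obtain it from a key management service, never from source code.
SENSOR_KEY = b"demo-key-not-for-production"


def _canonical(reading: dict) -> bytes:
    """Serialize a reading deterministically so both sides hash the same bytes."""
    return json.dumps(reading, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_reading(reading: dict, key: bytes = SENSOR_KEY) -> dict:
    """Attach an HMAC-SHA256 tag computed over the canonical form of the reading."""
    tag = hmac.new(key, _canonical(reading), hashlib.sha256).hexdigest()
    return {"reading": reading, "tag": tag}


def verify_reading(message: dict, key: bytes = SENSOR_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, _canonical(message["reading"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["tag"])
```

A receiver would call verify_reading on each incoming message and discard those for which it returns False; a reading altered in transit, or signed with a different key, fails the check.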

Emerging Gaps and Challenges in COVID-19 Era

The advent of COVID-19 enhanced:

  1. Gaps on data protection and on data trustworthiness (G4.1 and G4.5),
  2. Gaps on decision support systems (G4.6),
  3. Gaps on ethics (G4.7),
  4. Gaps on supply-chain security (G5.6).

Furthermore, COVID-19 also generated the following new gaps:

  • G4.8 – Gaps on videoconferencing tools. With the emergence of COVID-19, traditional meetings were completely replaced by virtual meetings hosted by videoconferencing tools. However, these tools were not designed to withstand the increasing resource demands, to achieve the required scalability, or to support the necessary security and identity management requirements. In turn, this resulted in an increased risk of unauthorized participants.
  • G4.9 – Gaps on data management across borders. The pandemic has affected the boundaries of IT systems and data centers, as well as users’ online behavior: users started connecting to private networks from hostile sites. Hence, new approaches for managing remote access and safeguarding data availability and integrity are mandatory to mitigate the risks of malware and ransomware.
  • G5.8 – Gaps on interoperability. COVID-19 demonstrated an urgent need for systems interoperability and revision, especially in public services, which are often obsolete and difficult to integrate with modern systems and apps. In the EU, systems interoperability across national borders is of the essence. Education has a critical role in enabling people to take advantage of digital services.
  • G5.9 – Gaps on education. This gap denotes the necessity of educating everyone to use digital technologies correctly and safely. Users should also be more aware of attacks relying on social engineering and phishing, such as the Twitter bitcoin scam. Even though user awareness is on the rise, another peril comes from increasingly personalized and complex attacks that are not easily recognizable.
  • G5.10 – Gaps on sophisticated protection. According to ENISA, the attack surface is continuously expanding. Concurrently, the boundaries of cybersecurity protection are also enlarging and becoming difficult to define due to remote and smart working. Therefore, novel and more sophisticated forms of protection that consider the human factor are essential for safeguarding trust boundaries, as well as AI, which has also been increasingly targeted recently.
  • G9.1 – Gaps on protection from online scammers. With the emergence of COVID-19, the notorious FUD triple (fear, uncertainty, and doubt) has resurfaced in society. As in previous periods of distress, scammers are again ready to exploit distraught, desperate, and depressed people. During the course of the pandemic, cybercriminals have carried out a wide range of well-known online scams, including phishing email campaigns, fake products, fraudulent advertising, and preposterous pseudoscientific theories.

[1] Lei Xu, Chunxiao Jiang, Jian Wang, Jian Yuan, and Yong Ren. Information security in big data: privacy and data mining. IEEE Access, 2:1149–1176, 2014.

[2] Kaihe Xu, Hao Yue, Linke Guo, Yuanxiong Guo, and Yuguang Fang. Privacy-preserving machine learning algorithms for big data systems. In 2015 IEEE 35th International Conference on Distributed Computing Systems, pages 318–327. IEEE, 2015.

[3] Akond Ashfaque Ur Rahman and Laurie Williams. Software security in DevOps: synthesizing practitioners’ perceptions and practices. In 2016 IEEE/ACM International Workshop on Continuous Software Evolution and Delivery (CSED), pages 70–76. IEEE, 2016.

[4] Reuben Binns. Fairness in machine learning: Lessons from political philosophy. In Conference on Fairness, Accountability and Transparency, pages 149–159. PMLR, 2018.

[5] Elizabeth Gibney. The battle for ethical AI at the world’s biggest machine-learning conference. Nature, 577(7791):609–610, 2020.