Authors and Affiliations
- Anna-Katharina Wickert: Technische Universität Darmstadt, Darmstadt, Germany
- Lars Baumgärtner: Technische Universität Darmstadt, Darmstadt, Germany
- Florian Breitfelder: Technische Universität Darmstadt, Darmstadt, Germany
- Mira Mezini: Technische Universität Darmstadt, Darmstadt, Germany
Table of Contents
- Abstract
- Introduction
- Background
- Design and Implementation of LICMA
  - Design
  - Implementation
- Methodology
  - Searching and Downloading Python Apps
  - Comparison with Previous Studies
- Evaluation
  - GitHub Python Projects
  - MicroPython
  - Comparison with Previous Studies
- Threats to Validity
- Related Work
- Conclusion, Acknowledgments, and References
Abstract
Cryptography, often referred to as "crypto," is essential in today’s digital world for keeping our online activities secure, from banking to shopping. However, misuse of cryptographic APIs can lead to significant security issues. Previous studies indicated that nearly all Java apps misused crypto APIs. But what about Python? Our study dives into this by analyzing real-world Python projects.
Aims
Our goal was to see whether Python projects misuse crypto APIs as often as Java and C projects do. To find out, we developed a tool that analyzes Python code for common crypto mistakes.
Method
We examined 895 popular Python projects on GitHub and 51 MicroPython projects for embedded devices. We compared the results with previous studies on Java and C.
Results
Surprisingly, only 52.26% of Python projects had at least one crypto misuse, far fewer than reported for Java. This suggests that Python’s API design might help developers avoid mistakes. However, many misuses were due to dependencies rather than the application code itself. Our study also highlighted the importance of checking how different languages interact in embedded systems.
Conclusion
Good API design seems to reduce crypto misuses in Python compared to Java and C. Our findings highlight the need for tools that can analyze code across multiple languages.
Introduction
Cryptography plays a vital role in securing our digital world. Without it, online transactions and communications would be vulnerable. However, research shows that cryptography is often used incorrectly, leading to security issues. Tools like CryptoREX, CryptoLint, CogniCryptSAST, and CryptoGuard help identify these problems, mainly in Java and C.
Studies suggest that Python’s crypto APIs might be less prone to misuse. One study found that 68.5% of developers could write secure Python code for crypto tasks. Another study highlighted that simple API designs, like those in Python, could aid developers in avoiding mistakes. Yet, no large-scale study had confirmed Python’s advantage until now.
To explore this, we developed LICMA, a tool that checks for crypto misuses in Python and Java. We used it to analyze numerous Python apps to see how often mistakes occur and how they compare to Java and C.
Background
Previous research highlighted that nearly every Java app misuses crypto APIs, posing security risks. These studies focused on Java and C, leaving a gap in understanding for languages like Python. With Python’s growing popularity, especially in machine learning and web development, understanding its crypto practices is crucial.
Simple, well-designed APIs can help developers avoid mistakes. Python’s libraries, like cryptography, aim to do just that. They offer straightforward, secure defaults, reducing the chances of errors. Our study builds on this by examining real-world Python projects to see if they live up to this promise.
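To make the idea of a crypto misuse concrete, here is a minimal sketch that contrasts a risky key derivation with a safer one. It uses only the standard library's hashlib.pbkdf2_hmac, not the cryptography package mentioned above. The function names and the iteration count are assumptions chosen for illustration, not values taken from the study.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative value only (an assumption); consult current guidance
# before choosing an iteration count for real use.
ITERATIONS = 600_000


def derive_key_misuse(password: bytes) -> bytes:
    """Misuse: a constant salt and a very low iteration count."""
    return hashlib.pbkdf2_hmac("sha256", password, b"static-salt", 100)


def derive_key(password: bytes, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Safer: a fresh random salt per password and many iterations."""
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)
    return salt, key


def verify(password: bytes, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key and compare it in constant time."""
    _, key = derive_key(password, salt)
    return hmac.compare_digest(key, expected_key)
```

With the misuse variant, the same password always yields the same key, so identical passwords are recognizable across users. The safer variant stores the random salt next to the derived key.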
Design and Implementation of LICMA
Design
LICMA is a multi-language analysis framework that supports Python and Java. It uses a set of rules to detect common crypto misuses across different APIs. For Python, it covers five widely used crypto libraries. For Java, it examines the standard JCA API.
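A rule-based check of this kind can be sketched with Python's built-in ast module. This is not LICMA's implementation. It is a hypothetical, much simplified checker that flags two patterns in calls to hashlib.pbkdf2_hmac: a literal salt and a literal iteration count below an assumed threshold of 1000.

```python
import ast

MIN_ITERATIONS = 1000  # assumed threshold for this sketch


def find_pbkdf2_misuses(source: str) -> list:
    """Return sorted (line, message) pairs for suspicious pbkdf2_hmac calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the attribute form `<module>.pbkdf2_hmac(...)`.
        if not (isinstance(func, ast.Attribute) and func.attr == "pbkdf2_hmac"):
            continue
        # Positional signature: (hash_name, password, salt, iterations).
        args = node.args
        if len(args) >= 3 and isinstance(args[2], ast.Constant):
            findings.append((node.lineno, "constant salt"))
        if (
            len(args) >= 4
            and isinstance(args[3], ast.Constant)
            and isinstance(args[3].value, int)
            and args[3].value < MIN_ITERATIONS
        ):
            findings.append(
                (node.lineno, f"fewer than {MIN_ITERATIONS} iterations")
            )
    return sorted(findings)
```

The sketch only inspects literals passed positionally at the call site. A practical analysis would also need to handle keyword arguments and follow values assigned to variables earlier in the program.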
Implementation
We implemented LICMA to scan codebases and identify crypto misuses efficiently. It analyzes both the application code and its dependencies, providing a comprehensive view of potential security issues. This approach is vital because many misuses arise from third-party libraries rather than from the application code itself.
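One simple way to separate application findings from dependency findings is to look at the file path of each finding. The sketch below assumes that third-party code lives in conventionally named directories; the directory names and the (path, message) finding format are hypothetical and do not reflect how LICMA makes this distinction.

```python
from pathlib import PurePosixPath

# Assumed directory names that typically hold third-party code.
DEPENDENCY_DIRS = {"site-packages", "vendor", "third_party", "venv", ".venv"}


def origin(path: str) -> str:
    """Classify a file path as 'dependency' or 'application'."""
    parts = PurePosixPath(path).parts
    return "dependency" if DEPENDENCY_DIRS.intersection(parts) else "application"


def split_by_origin(findings):
    """Group (path, message) findings by where the affected file lives."""
    grouped = {"application": [], "dependency": []}
    for path, message in findings:
        grouped[origin(path)].append((path, message))
    return grouped
```

A path heuristic like this is cheap but imprecise: vendored code copied into an unusual directory would be counted as application code.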
Methodology
Searching and Downloading Python Apps
To conduct our study, we selected 895 popular Python projects from GitHub. These projects span various domains, ensuring a diverse sample. Additionally, we included 51 MicroPython projects to understand crypto use in embedded systems.
Comparison with Previous Studies
We compared our findings with earlier studies on Java and C. This comparison helps highlight differences and similarities in crypto practices across languages. It also sheds light on Python’s potential advantages and areas for improvement.
Evaluation
GitHub Python Projects
Our analysis revealed that 52.26% of Python projects had at least one crypto misuse. While this is a significant number, it’s much lower than the 99.59% seen in Java studies. Most misuses in Python were due to dependencies, not the application code itself.
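The headline figure is a share of projects with at least one misuse, not a share of individual API calls. The following sketch shows how such a share could be computed from per-project finding counts. The project names and counts are toy data invented for the example, not data from the study.

```python
def misuse_rate(findings_per_project: dict) -> float:
    """Percentage of projects that have at least one misuse."""
    if not findings_per_project:
        return 0.0
    affected = sum(1 for count in findings_per_project.values() if count > 0)
    return 100.0 * affected / len(findings_per_project)


# Toy input (hypothetical): project name -> number of misuses found.
toy_counts = {"proj-a": 0, "proj-b": 3, "proj-c": 1, "proj-d": 0}
print(misuse_rate(toy_counts))  # 50.0
```

Note that a project with one misuse and a project with fifty count the same under this metric.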
MicroPython
In the embedded domain, developers often use C for crypto tasks, even in MicroPython projects. This practice underscores the need for tools that can analyze multi-language codebases.
Comparison with Previous Studies
Our study found that Python projects tend to have fewer crypto misuses than Java or C. This result supports the idea that Python’s API design helps developers avoid mistakes. However, the types of misuses varied, suggesting that each language has unique challenges.
Threats to Validity
Several factors could affect our study’s validity. The selection of projects, the accuracy of LICMA, and the evolving nature of software development are potential concerns. Despite these limitations, our findings provide valuable insights into crypto practices in Python.
Related Work
Previous studies focused primarily on Java and C, revealing widespread crypto misuses. Our work expands this by examining Python, providing a broader understanding of crypto practices across languages.
Conclusion, Acknowledgments, and References
In conclusion, Python’s API design seems to reduce crypto misuses compared to Java and C. However, dependencies still pose a risk, highlighting the need for comprehensive analysis tools. Our study emphasizes the importance of good API design and cross-language analysis in improving software security.
We thank everyone who contributed to this research and acknowledge the support from Technische Universität Darmstadt. For further reading and references, please consult the full study documentation and our replication package [1].
[1] dx.doi.org/10.6084/m9.figshare.16499085