CWE-94: Improper Control of Generation of Code ('Code Injection')
Learn about CWE-94 (Improper Control of Generation of Code ('Code Injection')), its security impact, exploitation methods, and prevention guidelines.
What is Improper Control of Generation of Code ('Code Injection')?
• Overview: CWE-94, Improper Control of Generation of Code ('Code Injection'), occurs when software constructs all or part of a code segment from externally influenced input without neutralizing the special elements that could change the syntax or behavior of that code. An attacker who controls the input can then make the application execute code of their choosing.
• Exploitation Methods:
- Attackers exploit this vulnerability by supplying input that contains code or syntax, which the application then passes to a construct that generates or evaluates code, such as eval() or exec() in Python.
- Common attack patterns include manipulating web application inputs, such as form fields, query parameters, or uploaded content, so that inserted scripts or commands end up in code the application executes.
• Security Impact:
- Direct consequences of successful exploitation include unauthorized code execution, which can lead to data theft, data corruption, or system compromise.
- Potential cascading effects involve the attacker gaining further access or control over other parts of the system.
- Business impact includes financial loss, reputational damage, and potential legal liabilities.
• Prevention Guidelines:
- Specific code-level fixes involve validating all inputs against an allowlist of expected values or patterns before they can influence code generation.
- Security best practices include strict input validation, keeping data separate from code (as parameterized queries do for SQL), and avoiding dynamic code execution whenever possible.
- Recommended tools include static code analysis tools that flag calls such as eval() and exec() on untrusted data, and secure coding frameworks that provide built-in protection against such attacks.
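One way to follow the "avoid dynamic code execution" guideline is to map user-supplied names to a fixed set of functions instead of building code from them. The sketch below is illustrative only: the operation names and the `run_operation` helper are hypothetical and do not come from any particular framework.

```python
# Hypothetical operations exposed to users; the names are illustrative only.
def add(a: float, b: float) -> float:
    return a + b


def multiply(a: float, b: float) -> float:
    return a * b


# Allowlist: the only code a request can ever reach.
ALLOWED_OPERATIONS = {"add": add, "multiply": multiply}


def run_operation(name: str, a: float, b: float) -> float:
    """Look the operation up by name instead of building code from the name."""
    try:
        operation = ALLOWED_OPERATIONS[name]
    except KeyError:
        raise ValueError(f"Unsupported operation: {name!r}") from None
    return operation(a, b)
```

Because the user-supplied string is only ever used as a dictionary key, a value such as `__import__('os').system('id')` is rejected as an unknown operation rather than evaluated.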
Technical Details
Likelihood of Exploit:
Affected Languages: Interpreted
Affected Technologies: AI/ML
Vulnerable Code Example
Python Example
import os

def execute_code(user_input):
    # WARNING: Directly executing user input can lead to code injection!
    exec(user_input)  # Dangerous: Executes any code input by the user

user_input = input("Enter your code: ")
execute_code(user_input)  # User input is passed without validation or sanitization
Explanation:
The above code demonstrates a classic CWE-94 vulnerability: user input is passed directly to Python's exec() function. This allows an attacker to execute arbitrary code with the privileges of the running process, potentially compromising the application and the system it runs on. For example, an attacker could input os.system('rm -rf /') to attempt to delete every file the process can write to, which shows how severe this vulnerability is.
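To see how far an injected string can reach without doing any damage, the sketch below feeds the same `exec()` pattern a harmless payload. The payload and the `CWE94_DEMO` variable name are made up for illustration. The payload imports `os` by itself through `__import__`, so removing the `import os` line from the vulnerable program would not help.

```python
import os


def execute_code(user_input: str) -> None:
    # Same pattern as the vulnerable example: the string runs as Python code.
    exec(user_input)


# Harmless stand-in for an attacker-supplied string (hypothetical payload).
# It changes process state that the program never intended to expose.
payload = "__import__('os').environ['CWE94_DEMO'] = 'injected'"
execute_code(payload)

print(os.environ.get("CWE94_DEMO"))  # prints: injected
```

Anything the process itself is allowed to do (reading files, opening network connections, spawning commands) is available to the injected string in the same way.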
How to fix Improper Control of Generation of Code ('Code Injection')?
To fix this vulnerability, never execute user input directly as code. Instead, use safer alternatives such as:
- Validating and sanitizing input to ensure it meets expected patterns.
- Using safer execution environments or sandboxing techniques.
- Employing restricted execution libraries or functions that limit what can be executed based on user input. For instance, ast.literal_eval() parses literal values (numbers, strings, lists, and so on) without running any code.
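When users genuinely need more than literals, for example basic arithmetic, one restricted-execution approach is to parse the input with `ast.parse()` and evaluate only an allowlisted set of node types. The following is a minimal sketch under stated assumptions: the `safe_arithmetic` function, the 200-character limit, and the choice of operators are illustrative, not a standard API.

```python
import ast
import operator

# Allowlisted operators; any other syntax is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

MAX_EXPRESSION_LENGTH = 200  # assumed limit to keep parsing cheap


def safe_arithmetic(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals; raise ValueError otherwise."""
    if len(expression) > MAX_EXPRESSION_LENGTH:
        raise ValueError("Expression too long")
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("Not a valid expression") from exc
    return _evaluate(tree.body)


def _evaluate(node: ast.AST) -> float:
    # Numeric literals only (bool is excluded by the exact type check).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left, right = _evaluate(node.left), _evaluate(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Names, calls, attribute access, etc. never reach an evaluator.
    raise ValueError(f"Disallowed syntax: {type(node).__name__}")
```

With this sketch, `safe_arithmetic("2 + 3 * 4")` returns 14, while `safe_arithmetic("__import__('os').system('id')")` raises `ValueError` because a function call is not on the allowlist. Exponentiation is deliberately left out, since very large powers could be used to exhaust CPU or memory, and division by zero still raises `ZeroDivisionError` for the caller to handle.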
Fixed Code Example
import ast

def evaluate_expression(user_input):
    # Use ast.literal_eval to safely evaluate the user input
    try:
        # Only evaluates strings that represent Python literal structures
        result = ast.literal_eval(user_input)  # Safe: Evaluates only literals, not arbitrary code
        print("Result:", result)
    except (ValueError, SyntaxError) as e:
        print("Invalid input:", e)  # Proper error handling for invalid inputs

user_input = input("Enter a Python literal (e.g., number, list): ")
evaluate_expression(user_input)  # User input is evaluated safely, avoiding code execution
Explanation:
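In the fixed code, ast.literal_eval() replaces exec(). This function accepts only strings that represent Python literal structures: numbers, strings, bytes, tuples, lists, dicts, sets, booleans, and None. Because it never calls functions, looks up names, or accesses attributes, input such as os.system('rm -rf /') is rejected with an error instead of being run. Error handling is also added so that invalid input is reported gracefully. Note that the Python documentation lists TypeError, MemoryError, and RecursionError alongside ValueError and SyntaxError as possible results of malformed input, and warns that very large or deeply nested input can still exhaust memory or the interpreter's stack, so input length should be limited as well. With those caveats, this approach ensures that only predefined data types are processed, which significantly reduces the risk of code injection attacks.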
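The behaviour described above can be checked with a short sketch. The `parse_literal` wrapper is a hypothetical helper that returns the parsed value instead of printing it and converts every documented failure mode into a single `ValueError`.

```python
import ast


def parse_literal(user_input: str):
    """Return the parsed literal, or raise ValueError for anything else."""
    try:
        return ast.literal_eval(user_input)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError) as exc:
        raise ValueError(f"Not a Python literal: {exc}") from exc


print(parse_literal("[1, 2, {'a': None}]"))  # a nested literal is accepted

# Strings that contain calls or operators are rejected, never executed.
for attempt in ("__import__('os').system('id')", "2 ** 8", "open('/etc/passwd')"):
    try:
        parse_literal(attempt)
    except ValueError:
        print("rejected:", attempt)
```

All three attempts in the loop are reported as rejected. None of them is evaluated, because function calls and the `**` operator are not literal syntax.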