CWE-20: Improper Input Validation

Structure: Simple

Abstraction: Class

Status: Stable

Likelihood of Exploit: High

Description

The product does not validate or incorrectly validates input that can affect the control flow or data flow of a program.

Extended Description

When software does not validate input properly, an attacker is able to craft the input in a form that is not expected by the rest of the application. This will lead to parts of the system receiving unintended input, which may result in altered control flow, arbitrary control of a resource, or arbitrary code execution.

Related Weaknesses

ChildOf CWE-693 (View 1000, Primary)
CanPrecede CWE-22 (View 1000)
CanPrecede CWE-41 (View 1000)
CanPrecede CWE-74 (View 1000)
CanPrecede CWE-119 (View 1000)

Applicable Platforms

Language: Language-Independent (Prevalence: Undetermined)

Common Consequences

Scope	Impact	Note
Availability	DoS: Crash, Exit, or Restart; DoS: Resource Consumption (CPU); DoS: Resource Consumption (Memory)	An attacker could provide unexpected values and cause a program crash or excessive consumption of resources, such as memory and CPU.
Confidentiality	Read Memory; Read Files or Directories	An attacker could read confidential data if they are able to control resource references.
Integrity, Confidentiality, Availability	Modify Memory; Execute Unauthorized Code or Commands	An attacker could use malicious input to modify data or possibly alter control flow in unexpected ways, including arbitrary command execution.

Detection Methods

DM-3 Automated Static Analysis

Some instances of improper input validation can be detected using automated static analysis.

A static analysis tool might allow the user to specify which application-specific methods or functions perform input validation; the tool might also have built-in knowledge of validation frameworks such as Struts. The tool may then suppress or de-prioritize any associated warnings. This allows the analyst to focus on areas of the software in which input validation does not appear to be present.

Except in the cases described in the previous paragraph, automated static analysis might not be able to recognize when proper input validation is being performed, leading to false positives - i.e., warnings that do not have any security consequences or require any code changes.

DM-4 Manual Static Analysis

When custom input validation is required, such as when enforcing business rules, manual analysis is necessary to ensure that the validation is properly implemented.

DM-5 Fuzzing

Fuzzing techniques can be useful for detecting input validation errors. When unexpected inputs are provided to the software, the software should not crash or otherwise become unstable, and it should generate application-controlled error messages. If exceptions or interpreter-generated error messages occur, this indicates that the input was not detected and handled within the application logic itself.
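As a rough illustration of the fuzzing criterion above, the sketch below feeds random strings to a small parser and counts how often something other than an application-controlled error escapes. The `QuantityParser`, its 1 to 1000 range, and the `InvalidInputException` type are hypothetical names invented for this sketch; a real fuzzer would use far more sophisticated input generation.

```java
import java.util.Random;

/** Hypothetical parser under test: accepts a decimal quantity between 1 and 1000. */
final class QuantityParser {
    /** Application-controlled error used to signal rejected input. */
    static final class InvalidInputException extends Exception {
        InvalidInputException(String message) {
            super(message);
        }
    }

    static int parse(String raw) throws InvalidInputException {
        // Syntax check first, so Integer.parseInt never sees malformed text.
        if (raw == null || !raw.matches("[0-9]{1,4}")) {
            throw new InvalidInputException("quantity must be 1 to 4 digits");
        }
        int value = Integer.parseInt(raw);
        if (value < 1 || value > 1000) {
            throw new InvalidInputException("quantity out of range");
        }
        return value;
    }
}

final class FuzzHarness {
    /** Feeds random strings to the parser; returns the number of uncontrolled failures. */
    static int run(long seed, int iterations) {
        Random random = new Random(seed);
        int uncontrolledFailures = 0;
        for (int i = 0; i < iterations; i++) {
            String input = randomString(random);
            try {
                QuantityParser.parse(input);
            } catch (QuantityParser.InvalidInputException expected) {
                // Application-controlled rejection: the desired outcome for bad input.
            } catch (RuntimeException unexpected) {
                // A runtime exception leaking out means the input was not handled.
                uncontrolledFailures++;
            }
        }
        return uncontrolledFailures;
    }

    private static String randomString(Random random) {
        int length = random.nextInt(12);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) random.nextInt(256));
        }
        return sb.toString();
    }
}
```

A non-zero result from `FuzzHarness.run` would suggest that some inputs reach code paths the validation logic did not anticipate.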

Automated Static Analysis - Binary or Bytecode

According to SOAR, the following detection techniques may be useful:

Cost effective for partial coverage:

Bytecode Weakness Analysis - including disassembler + source code weakness analysis
Binary Weakness Analysis - including disassembler + source code weakness analysis

Manual Static Analysis - Binary or Bytecode

According to SOAR, the following detection techniques may be useful:

Cost effective for partial coverage:

Binary / Bytecode disassembler - then use manual analysis for vulnerabilities & anomalies

Dynamic Analysis with Automated Results Interpretation

According to SOAR, the following detection techniques may be useful:

Highly cost effective:

Web Application Scanner
Web Services Scanner
Database Scanners

Dynamic Analysis with Manual Results Interpretation

According to SOAR, the following detection techniques may be useful:

Highly cost effective:

Fuzz Tester
Framework-based Fuzzer

Cost effective for partial coverage:

Host Application Interface Scanner
Monitored Virtual Environment - run potentially malicious code in sandbox / wrapper / virtual machine, see if it does anything suspicious

Manual Static Analysis - Source Code

According to SOAR, the following detection techniques may be useful:

Highly cost effective:

Focused Manual Spotcheck - Focused manual analysis of source
Manual Source Code Review (not inspections)

Automated Static Analysis - Source Code

According to SOAR, the following detection techniques may be useful:

Highly cost effective:

Source code Weakness Analyzer
Context-configured Source Code Weakness Analyzer

Architecture or Design Review

According to SOAR, the following detection techniques may be useful:

Highly cost effective:

Inspection (IEEE 1028 standard) (can apply to requirements, design, source code, etc.)
Formal Methods / Correct-By-Construction

Cost effective for partial coverage:

Attack Modeling

Potential Mitigations

MIT-7 Architecture and Design

Strategy: Input Validation; Libraries or Frameworks

Use an input validation framework such as Struts or the OWASP ESAPI Validation API. If you use Struts, be mindful of weaknesses covered by the CWE-101 category.

MIT-6 Architecture and Design; Implementation

Strategy: Attack Surface Reduction

Understand all the potential areas where untrusted inputs can enter your software: parameters or arguments, cookies, anything read from the network, environment variables, reverse DNS lookups, query results, request headers, URL components, e-mail, files, filenames, databases, and any external systems that provide data to the application. Remember that such inputs may be obtained indirectly through API calls.

MIT-5 Implementation

Strategy: Input Validation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a whitelist of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does.

When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue."

Do not rely exclusively on looking for malicious or malformed inputs (i.e., do not rely on a blacklist). A blacklist is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, blacklists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.
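A minimal sketch of the "accept known good" strategy, using the color example from the text. The set of allowed colors and the length limit are assumptions made for illustration, not values taken from any specification.

```java
import java.util.Set;

final class ColorValidator {
    // Hypothetical allowlist: the only values the business rule accepts.
    private static final Set<String> ALLOWED_COLORS = Set.of("red", "blue", "green");
    // Hypothetical upper bound on input length.
    private static final int MAX_LENGTH = 16;

    /** Returns true only when the input is exactly one of the known-good colors. */
    static boolean isValidColor(String input) {
        if (input == null || input.isEmpty() || input.length() > MAX_LENGTH) {
            return false;
        }
        // "boat" is syntactically fine but is not a color, so it is rejected here.
        return ALLOWED_COLORS.contains(input);
    }
}
```

Because the check enumerates what is acceptable rather than what is dangerous, new kinds of malicious input are rejected by default.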

Architecture and Design

For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server. Even though client-side checks provide minimal benefits with respect to server-side security, they are still useful. First, they can support intrusion detection. If the server receives input that should have been rejected by the client, then it may be an indication of an attack. Second, client-side error-checking can provide helpful feedback to the user about the expectations for valid input. Third, there may be a reduction in server-side processing time for accidental input errors, although this is typically a small savings.

Implementation

When your application combines data from multiple sources, perform the validation after the sources have been combined. The individual data elements may pass the validation step but violate the intended restrictions after they have been combined.
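The following sketch shows how two parts can each pass a check and still violate the rule once joined. The naming rule (lowercase labels separated by single dots) and the method names are hypothetical.

```java
import java.util.regex.Pattern;

final class CombinedNameValidator {
    // Hypothetical rule: lowercase labels separated by single dots, e.g. "reports.backup".
    private static final Pattern VALID_NAME = Pattern.compile("[a-z]+(\\.[a-z]+)*");

    /** Per-part check that is insufficient by itself: each part alone contains no "..". */
    static boolean partLooksSafe(String part) {
        return part != null && !part.contains("..");
    }

    /** Validation applied to the combined value, which is what actually gets used. */
    static boolean isValidCombined(String prefix, String suffix) {
        if (prefix == null || suffix == null) {
            return false;
        }
        String combined = prefix + suffix;
        return VALID_NAME.matcher(combined).matches();
    }
}
```

Here `"reports."` and `".backup"` both satisfy `partLooksSafe`, yet their concatenation contains `..` and is rejected by `isValidCombined`.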

MIT-35 Implementation

Be especially careful to validate all input when invoking code that crosses language boundaries, such as from an interpreted language to native code. This could create an unexpected interaction between the language boundaries. Ensure that you are not violating any of the expectations of the language with which you are interfacing. For example, even though Java may not be susceptible to buffer overflows, providing a large argument in a call to native code might trigger an overflow.

Implementation

Directly convert your input type into the expected data type, such as using a conversion function that translates a string into a number. After converting to the expected data type, ensure that the input's values fall within the expected range of allowable values and that multi-field consistencies are maintained.
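One way this could look in Java: convert the string to an `int` first, then check the numeric range. The field name and the 0 to 150 bounds are assumptions chosen for the sketch.

```java
import java.util.OptionalInt;

final class AgeField {
    private static final int MIN_AGE = 0;
    private static final int MAX_AGE = 150; // hypothetical upper bound

    /** Converts the raw text to an int, then range-checks the converted value. */
    static OptionalInt parseAge(String raw) {
        if (raw == null) {
            return OptionalInt.empty();
        }
        try {
            // Conversion rejects non-numeric text and values that overflow int.
            int value = Integer.parseInt(raw);
            if (value < MIN_AGE || value > MAX_AGE) {
                return OptionalInt.empty();
            }
            return OptionalInt.of(value);
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```

Range checks on the converted value are simpler to reason about than pattern checks on the original string.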

Implementation

Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180, CWE-181). Make sure that your application does not inadvertently decode the same input twice (CWE-174). Such errors could be used to bypass whitelist schemes by introducing dangerous inputs after they have been checked. Use libraries such as the OWASP ESAPI Canonicalization control. Consider performing repeated canonicalization until your input does not change any more. This will avoid double-decoding and similar scenarios, but it might inadvertently modify inputs that are allowed to contain properly-encoded dangerous content.
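A sketch of "canonicalize until the input stops changing", followed by validation of the canonical form. It uses the JDK's `URLDecoder` (form-style percent-decoding, which also turns `+` into a space) as a stand-in for a dedicated canonicalization library; the round limit and the file-name pattern are assumptions.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class Canonicalizer {
    private static final int MAX_ROUNDS = 5; // hypothetical cap on nested encodings

    /** Percent-decodes repeatedly until the value stops changing. */
    static String canonicalize(String input) {
        String current = input;
        for (int round = 0; round < MAX_ROUNDS; round++) {
            // Throws IllegalArgumentException on malformed escapes such as "%zz".
            String decoded = URLDecoder.decode(current, StandardCharsets.UTF_8);
            if (decoded.equals(current)) {
                return current;
            }
            current = decoded;
        }
        throw new IllegalArgumentException("input is encoded too many times");
    }

    /** Validates only the canonical form, never the raw input. */
    static boolean isSafeFileName(String rawInput) {
        if (rawInput == null) {
            return false;
        }
        String canonical;
        try {
            canonical = canonicalize(rawInput);
        } catch (IllegalArgumentException e) {
            return false;
        }
        return canonical.matches("[A-Za-z0-9_-]+(\\.[A-Za-z0-9]+)?");
    }
}
```

With this approach a doubly encoded `%252e%252e%252f` is reduced to `../` before the allowlist pattern is applied, so it cannot slip past the check and be decoded later. As the text notes, repeated decoding can also alter inputs that legitimately contain encoded content.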

Implementation

When exchanging data between components, ensure that both components are using the same character encoding. Ensure that the proper encoding is applied at each interface. Explicitly set the encoding you are using whenever the protocol allows you to do so.
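In Java, setting the encoding explicitly might look like the sketch below, which always names UTF-8 instead of relying on the platform default and rejects malformed byte sequences. The choice of UTF-8 and the class name are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8 {
    /** Encodes with an explicitly named charset rather than the platform default. */
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes as UTF-8 and fails on malformed bytes instead of silently replacing them. */
    static String decode(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not valid UTF-8", e);
        }
    }
}
```

Reporting malformed input, rather than substituting replacement characters, keeps the sender and receiver from silently disagreeing about what was transmitted.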

Testing

Use automated static analysis tools that target this type of weakness. Many modern techniques use data flow analysis to minimize the number of false positives. This is not a perfect solution, since 100% accuracy and coverage are not feasible.

Testing

Use dynamic tools and techniques that interact with the software using large test suites with many diverse inputs, such as fuzz testing (fuzzing), robustness testing, and fault injection. The software's operation may slow down, but it should not become unstable, crash, or generate incorrect results.

Demonstrative Examples

Example 1

This example demonstrates a shopping interaction in which the user is free to specify the quantity of items to be purchased and a total is calculated.

bad Java

...
public static final double price = 20.00;
int quantity = currentUser.getAttribute("quantity");
double total = price * quantity;
chargeUser(total);
...

The user has no control over the price variable; however, the code does not prevent a negative value from being specified for quantity. If an attacker were to provide a negative value, the total would be negative and the user's account would be credited instead of debited.
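One possible correction is to bound the quantity before computing the total. The sketch below is not part of the original example; `MAX_QUANTITY` is a hypothetical business limit and the method stands in for the charging logic.

```java
final class Checkout {
    static final double PRICE = 20.00;
    static final int MAX_QUANTITY = 100; // hypothetical business limit

    /** Computes the total only for quantities inside the accepted range. */
    static double totalFor(int quantity) {
        if (quantity < 1 || quantity > MAX_QUANTITY) {
            throw new IllegalArgumentException(
                    "quantity must be between 1 and " + MAX_QUANTITY);
        }
        return PRICE * quantity;
    }
}
```

Checking both bounds also prevents absurdly large orders, which the original code would accept as readily as negative ones.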

Example 2

This example asks the user for a height and width of an m X n game board with a maximum dimension of 100 squares.

bad C

...
#define MAX_DIM 100
...
/* board dimensions */

int m, n, error;
board_square_t *board;
printf("Please specify the board height: \n");
error = scanf("%d", &m);
if ( EOF == error ){
    die("No integer passed: Die evil hacker!\n");
}
printf("Please specify the board width: \n");
error = scanf("%d", &n);
if ( EOF == error ){
    die("No integer passed: Die evil hacker!\n");
}
if ( m > MAX_DIM || n > MAX_DIM ) {
    die("Value too large: Die evil hacker!\n");
}
board = (board_square_t*) malloc( m * n * sizeof(board_square_t));
...

While this code checks to make sure the user cannot specify large, positive integers and consume too much memory, it does not check for negative values supplied by the user. As a result, an attacker can perform a resource consumption (CWE-400) attack against this program by specifying two large negative values that will not overflow, resulting in a very large memory allocation (CWE-789) and possibly a system crash. Alternatively, an attacker can provide very large negative values which will cause an integer overflow (CWE-190), and unexpected behavior will follow depending on how the values are treated in the remainder of the program.
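The missing lower-bound check can be sketched as follows. This is written in Java rather than C to stay consistent with the other added examples, and it is only an illustration of the bounds logic, not a patch for the code above; the method name is hypothetical.

```java
final class BoardDimensions {
    static final int MAX_DIM = 100;

    /** Returns the number of squares, rejecting dimensions outside 1..MAX_DIM. */
    static int squareCount(int height, int width) {
        if (height < 1 || height > MAX_DIM || width < 1 || width > MAX_DIM) {
            throw new IllegalArgumentException(
                    "dimensions must be between 1 and " + MAX_DIM);
        }
        // multiplyExact throws ArithmeticException on overflow; with the bounds
        // above it cannot overflow, but it guards against a future change to MAX_DIM.
        return Math.multiplyExact(height, width);
    }
}
```

In the C version, the equivalent change would be to test `m < 1 || n < 1` alongside the existing upper-bound test before calling `malloc`.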

Example 3

The following example shows a PHP application in which the programmer attempts to display a user's birthday and homepage.

bad PHP

$birthday = $_GET['birthday'];
$homepage = $_GET['homepage'];
echo "Birthday: $birthday<br>Homepage: <a href=$homepage>click here</a>";

The programmer intended for $birthday to be in a date format and $homepage to be a valid URL. However, since the values are derived from an HTTP request, if an attacker can trick a victim into clicking a crafted URL with <script> tags providing the values for birthday and / or homepage, then the script will run on the client's browser when the web server echoes the content. Notice that even if the programmer were to defend the $birthday variable by restricting input to integers and dashes, it would still be possible for an attacker to provide a string of the form:

attack

2009-01-09--

If this data were used in a SQL statement, it would treat the remainder of the statement as a comment. The comment could disable other security-related logic in the statement. In this case, encoding combined with input validation would be a more useful protection mechanism.

Furthermore, XSS (CWE-79) and SQL injection (CWE-89) are just two of the potential consequences when input validation is not used. Depending on the context of the code, CRLF Injection (CWE-93), Argument Injection (CWE-88), or Command Injection (CWE-77) may also be possible.
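A sketch of combining validation with output encoding for the birthday and homepage values, written in Java for consistency with the other added examples. The class and method names are hypothetical, and restricting the homepage to `http` and `https` URLs with a host is an assumption about what the application intends to allow.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

final class ProfileRenderer {
    /** Encodes the characters that are significant in HTML text and attributes. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Validates both fields by converting them to typed values, then encodes the output. */
    static String render(String birthday, String homepage) {
        if (birthday == null || homepage == null) {
            throw new IllegalArgumentException("missing field");
        }
        LocalDate date;
        try {
            date = LocalDate.parse(birthday); // strict ISO date, e.g. 2009-01-09
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("birthday is not a valid date", e);
        }
        URI uri;
        try {
            uri = new URI(homepage);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("homepage is not a valid URI", e);
        }
        String scheme = uri.getScheme();
        boolean webScheme = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
        if (!webScheme || uri.getHost() == null) {
            throw new IllegalArgumentException("homepage must be an http(s) URL");
        }
        return "Birthday: " + escapeHtml(date.toString())
                + "<br>Homepage: <a href=\"" + escapeHtml(uri.toString()) + "\">click here</a>";
    }
}
```

Parsing the birthday as a date rejects `2009-01-09--`, which a character-class check on digits and dashes would accept, and the encoding step protects the HTML context even if a validation rule turns out to be too permissive.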

Example 4

This function attempts to extract a pair of numbers from a user-supplied string.

bad C

void parse_data(char *untrusted_input){
    int m, n, error;
    error = sscanf(untrusted_input, "%d:%d", &m, &n);
    if ( EOF == error ){
        die("Did not specify integer value. Die evil hacker!\n");
    }
    /* proceed assuming n and m are initialized correctly */
}

This code attempts to extract two integer values out of a formatted, user-supplied input. However, if an attacker were to provide an input of the form:

attack

123:

then only the m variable will be initialized. Subsequent use of n may result in the use of an uninitialized variable (CWE-457).
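The underlying fix is to confirm that both values were actually parsed. In the C code that would mean checking that `sscanf` returned 2 rather than only comparing against `EOF`. The Java sketch below illustrates the same idea with a hypothetical `PairParser` that accepts the input only when the whole string matches the expected shape; the 9-digit limit is an assumption that keeps each value within `int` range.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PairParser {
    // Two integers of at most 9 digits each, separated by a colon.
    private static final Pattern PAIR = Pattern.compile("(-?[0-9]{1,9}):(-?[0-9]{1,9})");

    /** Returns {m, n}, or throws if either value is missing or malformed. */
    static int[] parsePair(String input) {
        if (input == null) {
            throw new IllegalArgumentException("input is missing");
        }
        Matcher matcher = PAIR.matcher(input);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("expected <int>:<int>");
        }
        return new int[] {
            Integer.parseInt(matcher.group(1)),
            Integer.parseInt(matcher.group(2))
        };
    }
}
```

An input such as `123:` fails the match as a whole, so no partially initialized result can be used.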

Example 5

The following example takes a user-supplied value to allocate an array of objects and then operates on the array.

bad Java

private void buildList ( int untrustedListSize ){
    if ( 0 > untrustedListSize ){
        die("Negative value supplied for list size, die evil hacker!");
    }
    Widget[] list = new Widget [ untrustedListSize ];
    list[0] = new Widget();
}

This example attempts to build a list from a user-specified value, and even checks to ensure a non-negative value is supplied. If, however, a 0 value is provided, the code will build an array of size 0 and then try to store a new Widget in the first location, causing an exception to be thrown.
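A possible correction checks the full range of acceptable sizes, including zero. In this sketch `MAX_LIST_SIZE` is a hypothetical limit and `Widget` is an empty placeholder class.

```java
final class WidgetListBuilder {
    static final int MAX_LIST_SIZE = 1000; // hypothetical upper bound

    /** Placeholder for the Widget type used in the example above. */
    static final class Widget {
    }

    /** Builds the list only when the size allows the first element to be stored. */
    static Widget[] buildList(int untrustedListSize) {
        if (untrustedListSize < 1 || untrustedListSize > MAX_LIST_SIZE) {
            throw new IllegalArgumentException(
                    "list size must be between 1 and " + MAX_LIST_SIZE);
        }
        Widget[] list = new Widget[untrustedListSize];
        list[0] = new Widget();
        return list;
    }
}
```

The upper bound is included because an unchecked large size would trade the out-of-bounds exception for excessive memory allocation.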

Example 6

This application has registered to handle a URL when sent an intent:

bad Java

...
IntentFilter filter = new IntentFilter("com.example.URLHandler.openURL");
UrlHandlerReceiver receiver = new UrlHandlerReceiver();
registerReceiver(receiver, filter);
...

public class UrlHandlerReceiver extends BroadcastReceiver {
    @Override
    public void onReceive(Context context, Intent intent) {
        if ("com.example.URLHandler.openURL".equals(intent.getAction())) {
            String URL = intent.getStringExtra("URLToOpen");
            int length = URL.length();

            ...
        }
    }
}

The application assumes the URL will always be included in the intent. When the URL is not present, the call to getStringExtra() will return null, thus causing a null pointer exception when length() is called.
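The missing-value check can be sketched without the Android classes by treating the intent extras as a plain map. `UrlExtraReader` and the use of `Map` as a stand-in for the intent are assumptions made so the example is self-contained; in the receiver itself the same null test would be applied to the result of `getStringExtra()`.

```java
import java.util.Map;
import java.util.Optional;

final class UrlExtraReader {
    /** Returns the URL extra if present and non-empty; never returns null. */
    static Optional<String> readUrl(Map<String, String> extras) {
        if (extras == null) {
            return Optional.empty();
        }
        String url = extras.get("URLToOpen");
        if (url == null || url.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(url);
    }
}
```

Callers then have to handle the empty case explicitly instead of dereferencing a value that may be absent.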

Observed Examples

Reference	Description	Link
CVE-2008-5305	Eval injection in Perl program using an ID that should only contain hyphens and numbers.	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5305
CVE-2008-2223	SQL injection through an ID that was supposed to be numeric.	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-2223
CVE-2008-3477	lack of input validation in spreadsheet program leads to buffer overflows, integer overflows, array index errors, and memory corruption.	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3477
CVE-2008-3843	insufficient validation enables XSS	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3843
CVE-2008-3174	driver in security product allows code execution due to insufficient validation	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3174
CVE-2007-3409	infinite loop from DNS packet with a label that points to itself	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-3409
CVE-2006-6870	infinite loop from DNS packet with a label that points to itself	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-6870
CVE-2008-1303	missing parameter leads to crash	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1303
CVE-2007-5893	HTTP request with missing protocol version number leads to crash	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-5893
CVE-2006-6658	request with missing parameters leads to information exposure	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-6658
CVE-2008-4114	system crash with offset value that is inconsistent with packet size	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-4114
CVE-2006-3790	size field that is inconsistent with packet size leads to buffer over-read	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3790
CVE-2008-2309	product uses a blacklist to identify potentially dangerous content, allowing attacker to bypass a warning	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-2309
CVE-2008-3494	security bypass via an extra header	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3494
CVE-2006-5462	use of extra data in a signature allows certificate signature forging	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-5462
CVE-2008-3571	empty packet triggers reboot	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3571
CVE-2006-5525	incomplete blacklist allows SQL injection	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-5525
CVE-2008-1284	NUL byte in theme name causes directory traversal impact to be worse	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1284
CVE-2008-0600	kernel does not validate an incoming pointer before dereferencing it	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-0600
CVE-2008-1738	anti-virus product has insufficient input validation of hooked SSDT functions, allowing code execution	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1738
CVE-2008-1737	anti-virus product allows DoS via zero-length field	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1737
CVE-2008-3464	driver does not validate input from userland to the kernel	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3464
CVE-2008-2252	kernel does not validate parameters sent in from userland, allowing code execution	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-2252
CVE-2008-2374	lack of validation of string length fields allows memory consumption or buffer over-read	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-2374
CVE-2008-1440	lack of validation of length field leads to infinite loop	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1440
CVE-2008-1625	lack of validation of input to an IOCTL allows code execution	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-1625
CVE-2008-3177	zero-length attachment causes crash	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3177
CVE-2007-2442	zero-length input causes free of uninitialized pointer	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-2442
CVE-2008-5563	crash via a malformed frame structure	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5563
CVE-2008-5285	infinite loop from a long SMTP request	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-5285
CVE-2008-3812	router crashes with a malformed packet	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3812
CVE-2008-3680	packet with invalid version number leads to NULL pointer dereference	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3680
CVE-2008-3660	crash via multiple "." characters in file extension	https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3660

Notes

Maintenance

Input validation - whether missing or incorrect - is such an essential and widespread part of secure development that it is implicit in many different weaknesses. Traditionally, problems such as buffer overflows and XSS have been classified as input validation problems by many security professionals. However, input validation is not necessarily the only protection mechanism available for avoiding such problems, and in some cases it is not even sufficient. The CWE team has begun capturing these subtleties in chains within the Research Concepts view (CWE-1000), but more work is needed.

Research Gap

There is not much research into the classification of input validation techniques and their application. Many publicly-disclosed vulnerabilities simply characterize a problem as "input validation" without providing more specific details that might contribute to a deeper understanding of validation techniques and the weaknesses they can prevent or reduce. Validation is over-emphasized in contrast to other neutralization techniques such as filtering and enforcement by conversion. See the vulnerability theory paper.

Taxonomy Mappings

Mapped Taxonomy Name	Node ID	Fit	Mapped Node Name
7 Pernicious Kingdoms			Input validation and representation
OWASP Top Ten 2004	A1	CWE More Specific	Unvalidated Input
CERT C Secure Coding	ERR07-C		Prefer functions that support error checking over equivalent functions that don't
CERT C Secure Coding	FIO30-C	CWE More Abstract	Exclude user input from format strings
CERT C Secure Coding	MEM10-C		Define and use a pointer validation function
WASC	20		Improper Input Handling
Software Fault Patterns	SFP25		Tainted input to variable

Related Attack Patterns

References