Input Validation

/ˈin-pu̇t va-lə-ˈdā-shən/

noun — "trust nothing, verify everything."

Input Validation is the process of examining, filtering, and verifying data before it is accepted, processed, stored, or acted upon by a system. Its purpose is to ensure that incoming data conforms to expected rules, formats, ranges, and constraints, preventing errors, security vulnerabilities, and unexpected behavior.

Every piece of data entering a system originates from somewhere outside the code currently executing. It may come from a user, another application, a database, a network connection, a file, an API request, a sensor, or even another trusted internal service. Regardless of the source, software cannot safely assume that the data is correct, complete, or well-formed.

This simple principle forms the foundation of input validation: all input is potentially wrong until proven otherwise.

Validation can be as simple as checking that a required field is not empty, or as complex as verifying that a structured document complies with an entire specification.

// basic validation

if (username == "") {
    reject("username required");
}

Common validation checks include:

Required values
Minimum and maximum lengths
Numeric ranges
Character restrictions
Date and time formats
Email and URL formats
File size and file type restrictions
Protocol and schema compliance
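
Several of the checks listed above can be combined in one routine. The following Python sketch is illustrative only: the function name validate_signup, the length limits, and the deliberately loose email pattern are assumptions made for this example, not rules taken from any particular system.

```python
import re

# Hypothetical limits chosen for illustration only.
MIN_NAME_LEN = 3
MAX_NAME_LEN = 32
# Deliberately loose shape check; this is not a full email address validator.
EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_signup(username: str, email: str) -> list[str]:
    """Return a list of validation errors; an empty list means the input passed."""
    errors = []
    if not username:
        # Required-value check.
        errors.append("username required")
    elif not (MIN_NAME_LEN <= len(username) <= MAX_NAME_LEN):
        # Minimum and maximum length check.
        errors.append("username length out of range")
    if not EMAIL_SHAPE.fullmatch(email):
        # Format check.
        errors.append("invalid email format")
    return errors
```

Collecting every error rather than stopping at the first one is a design choice; it lets a caller report all problems with a submission at once.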

For example, if a system accepts ages from 0 to 120 inclusive, validation should reject any value outside that range.

// range validation

if (age < 0 || age > 120) {
    reject("invalid age");
}
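
In practice the value often arrives as text, so it has to be parsed before the range check can run. A minimal Python sketch of that two-step check follows; the function name parse_age and the error messages are assumptions for illustration.

```python
def parse_age(raw: str) -> int:
    """Parse raw text as an age and enforce the inclusive 0..120 range."""
    try:
        age = int(raw.strip())
    except ValueError:
        # Text that is not an integer fails before any range check runs.
        raise ValueError("age must be a whole number") from None
    if age < 0 or age > 120:
        raise ValueError("invalid age")
    return age
```

Returning the parsed integer, rather than a yes or no answer, means later code works with a value that has already been checked.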

This is where concepts such as borderline case and edge case become important. A value of 120 may be a borderline case because it sits directly on the allowed boundary. A value of 50,000 is an edge case because it lies far outside normal expectations.
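
One way to exercise these cases is to test values on, just inside, and just outside each boundary, along with one value far outside the range. The Python sketch below assumes a hypothetical helper named is_valid_age that mirrors the 0 to 120 rule.

```python
def is_valid_age(age: int) -> bool:
    """Accept ages in the inclusive range 0..120."""
    return 0 <= age <= 120


# Values on and around each boundary, plus one far outside the range.
cases = {
    -1: False,
    0: True,
    1: True,
    119: True,
    120: True,
    121: False,
    50_000: False,
}

for value, expected in cases.items():
    assert is_valid_age(value) == expected, value
```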

Input validation is also one of the most important defensive practices in computer security. Many vulnerabilities arise because software accepts data without properly validating it. Malicious or malformed input can trigger crashes, corrupt data, bypass authentication, or exploit programming errors.

Historically, numerous attacks have been rooted in insufficient validation:

Buffer overflows
SQL injection
Cross-site scripting (XSS)
Path traversal attacks
Command injection
Malformed protocol exploits
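
As one illustration from the list above, a path traversal check can be sketched in Python by resolving the requested path and confirming that it still lies under an allowed base directory. The function name resolve_inside is hypothetical, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks where they exist.
    candidate = (base / user_path).resolve()
    # An absolute user_path replaces base entirely, so this check catches it too.
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes base directory")
    return candidate
```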

Consider a search feature:

// unsafe: user input is concatenated directly into the SQL string

query = user_input;
database.execute(
    "SELECT * FROM users WHERE name='" + query + "'"
);

Without proper validation or sanitization, specially crafted input such as ' OR '1'='1 could change the meaning of the query, here turning the name filter into a condition that matches every row. Modern systems therefore combine validation with parameterization, escaping, and other security controls.
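
Parameterization can be sketched with Python's standard sqlite3 module, where a ? placeholder keeps the user-supplied value separate from the SQL text. The table layout and the function name find_users_by_name are assumptions for this example.

```python
import sqlite3


def find_users_by_name(conn: sqlite3.Connection, name: str) -> list[tuple]:
    """Look up users by exact name using a bound parameter, not concatenation."""
    # The ? placeholder makes the driver treat `name` strictly as data.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


# Usage with an in-memory database (schema assumed for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

print(find_users_by_name(conn, "alice"))         # [(1, 'alice')]
print(find_users_by_name(conn, "x' OR '1'='1"))  # []
```

The second call shows the crafted string being compared as an ordinary name, which matches no row, instead of being interpreted as SQL.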

A common design principle is to validate as early as possible. Rejecting invalid data near the entry point prevents errors from spreading deeper into the system where they become harder to detect and correct.
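
One way to apply this is to convert untrusted input into a validated type at the entry point, so that deeper code never handles raw data. The Python sketch below uses a hypothetical TransferRequest type and field names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRequest:
    """Validated form of an incoming request; internal code only sees this type."""
    account_id: str
    amount_cents: int


def parse_transfer(payload: dict) -> TransferRequest:
    """Validate an untrusted payload at the entry point and return a typed request."""
    account_id = payload.get("account_id")
    amount = payload.get("amount_cents")
    if not isinstance(account_id, str) or not account_id:
        raise ValueError("account_id required")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        raise ValueError("amount_cents must be a positive integer")
    return TransferRequest(account_id, amount)
```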

Another widely used approach is allow-listing rather than block-listing. Instead of attempting to identify every possible bad input, the system explicitly defines what is acceptable and rejects everything else.

// allow-list validation

if (!matches(username, "^[a-zA-Z0-9]+$")) {
    reject("invalid characters");
}
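
The same allow-list can be sketched in Python with the standard re module. The helper name is_allowed is an assumption. The sketch uses fullmatch because in Python a pattern ending in $ also matches just before a trailing newline, which would let a value such as "abc\n" slip through.

```python
import re

# Allow-list: one or more ASCII letters or digits, and nothing else.
ALLOWED = re.compile(r"[A-Za-z0-9]+")


def is_allowed(value: str) -> bool:
    """Return True only if the entire string consists of allowed characters."""
    return ALLOWED.fullmatch(value) is not None
```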

Good validation improves reliability as much as security. It prevents invalid states, reduces debugging time, and makes system behavior more predictable. Many production failures are ultimately traced back to assumptions about data that turned out to be incorrect.

Conceptually, Input Validation acts as a checkpoint between the outside world and the system's internal logic. It is where incoming data is asked to prove that it belongs before being allowed further inside.

A useful way to think about validation is that software rarely fails because of the data it expected to receive. More often, it fails because of the data nobody expected to receive.

Ultimately, Input Validation is one of the simplest and most effective techniques in software engineering. It protects systems from mistakes, misunderstandings, malformed data, and malicious input by ensuring that only acceptable information is allowed to proceed.

See edge case, borderline case, Software Design, SQL Injection, overflow, sanitization
