Every call to a vision AI API — GPT-4o Vision, Google Cloud Vision, AWS Rekognition, Azure Computer Vision — costs money and takes time. When you send a bad image, you still pay, you still wait, and you get a bad result. In high-volume pipelines, that waste adds up quickly.
This checklist covers everything you should validate before making the API call. It applies to any vision AI service and can be automated in Python in about 10 lines of code.
The checklist
- ✓ Image loads without error — Check that the file can be decoded. Corrupted files, truncated uploads, and wrong extensions are more common than you'd think. Use `cv2.imread()` or PIL and verify the result is not `None`.
- ✓ Correct format — Most APIs accept JPEG, PNG, and WebP. GIF, TIFF, BMP, and HEIC may or may not be supported. Convert to JPEG before the call if in doubt.
- ✓ File size within limit — GPT-4o Vision: 20 MB. Google Vision: 10 MB. AWS Rekognition: 15 MB. Larger files should be compressed or resized first.
- ✓ Minimum resolution — Anything below 100×100 will give poor results from most vision models. OCR needs at least 300×300. Face recognition needs at least 160×160.
- ✓ Not blurry — A blurry image gives blurry API results. OCR engines will miss characters; object detectors will miss objects; face detectors may not detect at all.
- ✓ Properly exposed — Overexposed (all white) and underexposed (all black) regions contain no usable information. Check that no more than 60% of the image is pure dark or pure light.
- ✓ No severe compression artefacts — Heavily re-compressed JPEGs carry artefacts a model sees even when the image looks fine on screen. For OCR especially, blockiness at 8×8 block boundaries confuses character segmentation.
- ✓ Aspect ratio is reasonable — Extreme panoramas (> 3:1 ratio) and tall slivers confuse many models. Crop or pad to a reasonable aspect ratio if needed.
Automate the checklist in Python
```python
from imageguard import validate
import os

def preprocess_for_api(image_path: str, api: str = "general") -> str | None:
    """Validate image and return path if ready, None if rejected."""
    # File-level checks
    if not os.path.exists(image_path):
        return None
    if os.path.getsize(image_path) > 10 * 1024 * 1024:  # 10 MB
        return None

    # Quality checks
    thresholds = {
        "ocr": {"blur_score": 60.0, "resolution_score": 70.0},
        "face": {"blur_score": 50.0, "resolution_score": 65.0},
        "general": {},  # use defaults
    }.get(api, {})

    result = validate(image_path, thresholds=thresholds)
    return image_path if result.ok else None

# Usage
ready = preprocess_for_api("scan.jpg", api="ocr")
if ready:
    response = ocr_api.call(ready)  # safe to proceed
else:
    log_skipped("scan.jpg", reason="quality_check_failed")
```
Per-API minimum requirements
| API | Min resolution | Max file size | Formats |
|---|---|---|---|
| GPT-4o Vision | No hard minimum | 20 MB | JPEG, PNG, WEBP, GIF |
| Google Cloud Vision | No hard minimum | 10 MB | JPEG, PNG, WEBP, BMP, GIF |
| AWS Rekognition | 80×80 (faces) | 5 MB (image bytes); 15 MB (S3) | JPEG, PNG |
| Azure Computer Vision | 50×50 | 4 MB (free); 20 MB (paid) | JPEG, PNG, BMP, TIFF |
| Tesseract OCR (local) | 300 DPI recommended | No limit | JPEG, PNG, TIFF, BMP |
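When an image exceeds one of the size limits above, converting to JPEG and stepping the quality down usually gets it under the cap without a visible loss. This Pillow sketch is one way to do it; the quality ladder and the fallback resize are arbitrary choices, not recommendations from any of the vendors.

```python
from io import BytesIO
from PIL import Image

def fit_to_limit(path: str, max_bytes: int = 10 * 1024 * 1024) -> bytes:
    """Re-encode an image as JPEG, lowering quality until it fits max_bytes."""
    img = Image.open(path)
    if img.mode != "RGB":                 # JPEG has no alpha channel
        img = img.convert("RGB")
    for quality in (95, 85, 75, 60):
        buf = BytesIO()
        img.save(buf, format="JPEG", quality=quality)
        if buf.tell() <= max_bytes:
            return buf.getvalue()
    # Still too large: halve the dimensions and encode once more
    img = img.resize((img.width // 2, img.height // 2))
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=60)
    return buf.getvalue()
```

Re-encoding in memory avoids writing temporary files when the bytes go straight into the API request body.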
The cost of skipping validation
In a pipeline processing 10,000 images per day at $0.001 per API call, a 10% bad-image rate costs $1/day in wasted credits — but worse, it costs you the time to debug the downstream failures those bad results create.
Validation with imageguard runs in ~20–50ms per image. At 10,000 images/day that's less than 10 minutes of compute — a negligible price to eliminate an entire class of production bugs.
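The back-of-envelope figure above is easy to reproduce; the rates are the article's own assumptions, not measured numbers.

```python
def daily_waste(images_per_day: int, cost_per_call: float, bad_rate: float) -> float:
    """Dollars per day spent on API calls that were doomed before they were made."""
    return images_per_day * cost_per_call * bad_rate

print(daily_waste(10_000, 0.001, 0.10))  # 1.0 -> $1/day in wasted credits
```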
imageguard — automated image validation in Python
One call checks blur, noise, resolution, exposure, compression, and pixelation. Returns a simple pass/fail with reason. Open-source on GitHub.
View on GitHub →