Model Scanner

model_scanner

Model security scanner for pickle-based model files.

Uses picklescan (the same library the Hugging Face Hub uses) to detect potentially malicious code in pickle-based model files such as .ckpt, .pt, .pth, and .pkl. Safetensors files do not contain pickled data, so they are treated as safe and skipped.
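To make the risk concrete, here is a minimal sketch of why a pickle can carry code and how a static scan can notice it without loading the file. It uses only the standard library (`pickle`, `pickletools`) and is not how picklescan is implemented. The `Payload` class and `imported_globals` helper are hypothetical names, and the helper only understands the `GLOBAL` opcode emitted by older pickle protocols, so it is far less thorough than a real scanner.

```python
import pickle
import pickletools


class Payload:
    """Hypothetical object whose unpickling would call eval()."""

    def __reduce__(self):
        # On load, pickle would call eval("1 + 1"). A real attack would
        # reference something like os.system instead.
        return (eval, ("1 + 1",))


def imported_globals(data: bytes) -> set[tuple[str, str]]:
    """List (module, name) pairs referenced by GLOBAL opcodes.

    The bytes are only disassembled, never unpickled, so nothing executes.
    """
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.add((module, name))
    return found


# Protocol 2 is chosen so the import appears as a single GLOBAL opcode;
# fix_imports=False keeps the Python 3 module name "builtins".
blob = pickle.dumps(Payload(), protocol=2, fix_imports=False)
found = imported_globals(blob)
print(found)
```

A scanner can then compare the referenced globals against a list of dangerous callables and flag the file before anything is loaded.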

ModelScanResult(path, is_safe, issues_count=0, scan_error=False, error_message='') dataclass

Result from scanning a single model file.

DirectoryScanSummary(total_scanned=0, safe_count=0, unsafe_count=0, error_count=0, skipped_safe_format=0, results=list()) dataclass

Aggregated results from scanning a models directory.
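The two signatures above suggest definitions along the following lines. This is a sketch reconstructed from the rendered signatures, not the module's source: the field types are assumptions, and `results=list()` is assumed to come from `field(default_factory=list)`.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ModelScanResult:
    """Outcome of scanning one model file (field types are assumed)."""

    path: Path
    is_safe: bool
    issues_count: int = 0
    scan_error: bool = False
    error_message: str = ""


@dataclass
class DirectoryScanSummary:
    """Aggregated counts plus per-file results (field types are assumed)."""

    total_scanned: int = 0
    safe_count: int = 0
    unsafe_count: int = 0
    error_count: int = 0
    skipped_safe_format: int = 0
    # A default_factory gives every summary its own list rather than a shared one.
    results: list[ModelScanResult] = field(default_factory=list)
```

Only `path` and `is_safe` are required when building a `ModelScanResult`; every other field falls back to the default shown in the signature.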

scan_model_file(filepath)

Scan a single model file for malicious pickle content.

Parameters:

Name | Type | Description | Default
filepath | Path | Path to the model file to scan. | required

Returns:

Type | Description
ModelScanResult | ModelScanResult with safety status and issue details.

Source code in src/utils/model_scanner.py
def scan_model_file(filepath: Path) -> ModelScanResult:
    """Scan a single model file for malicious pickle content.

    Args:
        filepath: Path to the model file to scan.

    Returns:
        ModelScanResult with safety status and issue details.
    """
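    try:
        # Importing inside the try block means a missing picklescan install
        # is returned as a scan error below instead of raising ImportError.
        from picklescan.scanner import scan_file_path

        result = scan_file_path(str(filepath))
        return ModelScanResult(
            path=filepath,
            # Safe only if nothing was flagged and the scan itself completed.
            is_safe=result.infected_files == 0 and not result.scan_err,
            issues_count=result.issues_count,
            scan_error=result.scan_err,
        )
    except Exception as e:  # noqa: BLE001
        # Fail closed: any failure marks the file as not safe.
        return ModelScanResult(
            path=filepath,
            is_safe=False,
            scan_error=True,
            error_message=str(e),
        )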
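The listing above fails closed: any exception, including a missing picklescan install, produces a result with `is_safe=False` and `scan_error=True`. The sketch below isolates that pattern so it can be run without picklescan. `ScanOutcome`, `scan_fail_closed`, and the injectable `scanner` callable are hypothetical names introduced for illustration and are not part of the module.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class ScanOutcome:
    """Hypothetical, reduced stand-in for ModelScanResult."""

    path: Path
    is_safe: bool
    scan_error: bool = False
    error_message: str = ""


def scan_fail_closed(filepath: Path, scanner: Callable[[str], bool]) -> ScanOutcome:
    """Run `scanner` and treat any exception as an unsafe result.

    `scanner` is a hypothetical callable returning True when the file is clean.
    """
    try:
        return ScanOutcome(path=filepath, is_safe=scanner(str(filepath)))
    except Exception as exc:  # deliberately broad: every failure fails closed
        return ScanOutcome(
            path=filepath, is_safe=False, scan_error=True, error_message=str(exc)
        )


def missing_dependency(_path: str) -> bool:
    # Simulates the scanner library not being installed.
    raise ImportError("No module named 'picklescan'")


outcome = scan_fail_closed(Path("model.ckpt"), missing_dependency)
print(outcome.is_safe, outcome.scan_error, outcome.error_message)
```

Callers should therefore check `scan_error` before concluding that a file is malicious: an unsafe result with `scan_error=True` means the scan did not complete, not that something was found.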

scan_models_directory(models_dir)

Recursively scan a models directory for unsafe pickle files.

Scans all .ckpt, .pt, .pth, and .pkl files. Skips .safetensors, .gguf, and .onnx files, which are not pickle-based; they are only counted in skipped_safe_format.

Parameters:

Name | Type | Description | Default
models_dir | Path | Root models directory to scan. | required

Returns:

Type | Description
DirectoryScanSummary | DirectoryScanSummary with aggregated results.
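One way a caller might consume the returned summary is to turn it into a one-line report and a process exit code, for example in a CI step. This is a sketch under assumptions: `exit_code_for` and `format_summary` are hypothetical helpers, and the `SimpleNamespace` stands in for a real `DirectoryScanSummary` so the example runs on its own.

```python
from types import SimpleNamespace


def exit_code_for(summary) -> int:
    """Return 0 only when nothing was flagged and nothing failed to scan."""
    if summary.unsafe_count or summary.error_count:
        return 1
    return 0


def format_summary(summary) -> str:
    """Build a one-line, human-readable report from the summary counters."""
    return (
        f"scanned={summary.total_scanned} safe={summary.safe_count} "
        f"unsafe={summary.unsafe_count} errors={summary.error_count} "
        f"skipped_safe_format={summary.skipped_safe_format}"
    )


# Stand-in values; a real summary would come from scan_models_directory().
example = SimpleNamespace(
    total_scanned=3,
    safe_count=2,
    unsafe_count=1,
    error_count=0,
    skipped_safe_format=5,
)
print(format_summary(example))
print(exit_code_for(example))
```

Treating scan errors as failures mirrors the fail-closed behavior of `scan_model_file`, where a file that could not be scanned is never reported as safe.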

Source code in src/utils/model_scanner.py
def scan_models_directory(models_dir: Path) -> DirectoryScanSummary:
    """Recursively scan a models directory for unsafe pickle files.

    Scans all ``.ckpt``, ``.pt``, ``.pth``, and ``.pkl`` files.
    Skips ``.safetensors``, ``.gguf``, and ``.onnx`` files (not pickle-based).

    Args:
        models_dir: Root models directory to scan.

    Returns:
        DirectoryScanSummary with aggregated results.
    """
    summary = DirectoryScanSummary()

    if not models_dir.exists():
        return summary

    # Count safe-format files for the summary
    for safe_ext in SAFE_EXTENSIONS:
        summary.skipped_safe_format += len(
            list(models_dir.rglob(f"*{safe_ext}"))
        )

    # Find and scan all potentially unsafe files
    unsafe_files: list[Path] = []
    for ext in UNSAFE_EXTENSIONS:
        unsafe_files.extend(models_dir.rglob(f"*{ext}"))

    for filepath in sorted(unsafe_files):
        result = scan_model_file(filepath)
        summary.results.append(result)
        summary.total_scanned += 1

        if result.scan_error:
            summary.error_count += 1
        elif result.is_safe:
            summary.safe_count += 1
        else:
            summary.unsafe_count += 1

    return summary
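The listing above walks the tree once per extension and matches with glob patterns, which on case-sensitive filesystems would miss a file named `MODEL.PT` and could match a directory named `backup.pt`. The sketch below is an alternative discovery step, not the module's code: it walks the tree once, compares lowercased suffixes, and considers regular files only. `partition_model_files` is a hypothetical helper, and the two extension sets are assumed from the docstring because the module's constants are not shown.

```python
from pathlib import Path

# Assumed values, taken from the docstring; the module's constants are not shown.
UNSAFE_EXTENSIONS = {".ckpt", ".pt", ".pth", ".pkl"}
SAFE_EXTENSIONS = {".safetensors", ".gguf", ".onnx"}


def partition_model_files(models_dir: Path) -> tuple[list[Path], int]:
    """Return (sorted files to scan, count of skipped safe-format files)."""
    to_scan: list[Path] = []
    skipped = 0
    for path in models_dir.rglob("*"):
        if not path.is_file():
            continue  # ignore directories, even ones named like "backup.pt"
        suffix = path.suffix.lower()  # "MODEL.PT" is treated like "model.pt"
        if suffix in UNSAFE_EXTENSIONS:
            to_scan.append(path)
        elif suffix in SAFE_EXTENSIONS:
            skipped += 1
    return sorted(to_scan), skipped
```

The returned list could be fed to `scan_model_file` in the same loop the original uses, with the second value assigned to `skipped_safe_format`.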