Can a general-purpose vision model read an aeronautical chart like a pilot?
In this article, I benchmark two vision-language models on a real visual approach chart (Chavenay, LFPX, my home airfield), asking each to extract the ICAO code, coordinates, frequencies, and chart metadata as structured JSON.
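For illustration, here is a minimal sketch of the kind of request involved, assuming an OpenAI-compatible endpoint; the base URL, API key, file name, and prompt wording below are placeholders, not the exact values used in the benchmark:

```python
# Minimal sketch: send the chart image to an OpenAI-compatible vision
# endpoint and ask for structured JSON. Base URL, API key, and file
# name are placeholders, not OVHcloud's actual values.
import base64
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-endpoint>/v1",  # assumption: OpenAI-compatible API
    api_key="YOUR_API_KEY",
)

# Encode the chart image for inline transmission.
with open("lfpx_visual_approach.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

prompt = (
    "Extract the ICAO code, airfield coordinates, radio frequencies, "
    "and chart metadata (title, edition date) from this visual approach "
    "chart. Answer with JSON only."
)

response = client.chat.completions.create(
    model="Qwen2.5-VL-72B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
    temperature=0,  # deterministic output makes runs comparable
)
print(response.choices[0].message.content)
```

Sent to both models with the same prompt and temperature, the results diverged sharply: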
- Qwen2.5-VL-72B-Instruct on OVHcloud AI Endpoints: surprisingly strong zero-shot; it nails the ICAO code, coordinates, and most frequencies, and even returns per-field confidence scores.
- Llava-next-mistral-7b: struggles with field confusion, misread dates, and unstable JSON output (handled with the tolerant parsing sketched after this list).
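Unstable JSON is a practical problem when you want to score both models the same way. A hedged sketch of a tolerant parser follows; the field names and values in the example reply are illustrative, not the models' actual output:

```python
# Tolerant parsing sketch: accept clean JSON when the model behaves,
# and fall back to the first {...} span when it wraps JSON in chatter.
import json
import re

def extract_first_json(text: str) -> dict | None:
    """Return the first parseable JSON object in a model reply, if any."""
    # The well-behaved case: the whole reply is valid JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fallback: grab the first brace-delimited span and retry.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    return None

# Placeholder reply; values are illustrative, not taken from the chart.
reply = 'Here is the data: {"icao": "LFPX", "frequencies": {"afis": "123.450"}}'
print(extract_first_json(reply))
```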
The article also argues for hybrid architectures (AI extraction, rule-based validation, human oversight) and explains why accessible cloud endpoints make iterative aerospace experimentation viable at reasonable cost.
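To make the hybrid idea concrete, here is a small sketch of the rule-based validation layer: the model extracts, deterministic checks validate, and anything that fails is routed to a human. The field names are assumptions matching the prompt above; the 118-137 MHz bounds are the standard VHF voice airband.

```python
# Rule-based validation sketch for the hybrid pipeline: empty result
# means "looks plausible", anything else goes to human review.
import re

VHF_AIRBAND = (118.0, 137.0)  # MHz, standard aeronautical voice band

def validate_extraction(data: dict) -> list[str]:
    """Return a list of rule violations found in an extraction result."""
    issues = []
    icao = data.get("icao", "")
    if not re.fullmatch(r"[A-Z]{4}", icao):
        issues.append(f"ICAO code {icao!r} is not four uppercase letters")
    for name, freq in data.get("frequencies", {}).items():
        try:
            mhz = float(freq)
        except (TypeError, ValueError):
            issues.append(f"{name}: frequency {freq!r} is not numeric")
            continue
        if not VHF_AIRBAND[0] <= mhz <= VHF_AIRBAND[1]:
            issues.append(f"{name}: {mhz} MHz outside VHF airband")
    return issues

# An out-of-band frequency fails the check and triggers human oversight.
print(validate_extraction({"icao": "LFPX", "frequencies": {"afis": "999.9"}}))
```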