A manufacturer was running a 2018-era CV defect detector that needed retraining for every new product variant. We replaced it with a fine-tuned Qwen-VL deployed on Jetson Orin at every inspection station - generalising across variants without retraining.
The client's existing defect-detection system was a classical CNN that needed weeks of relabelling and retraining for every new product variant. False-negative rate was 29% on edge cases. They were losing on warranty claims and customer trust. They wanted a system that could understand “what counts as a defect” the way a human inspector would - and ship in a form factor that bolts onto the line.
Built a multimodal dataset: image + text description of the defect, with severity tags. Pulled from 6 years of QA archives plus 12 weeks of fresh capture. ~40k labelled examples across 200 defect classes.
QLoRA fine-tune of Qwen-VL-7B on the defect corpus. Trained the model to output structured JSON: { has_defect, type, severity, location, confidence }.
INT4 AWQ quantization plus TensorRT compilation for the vision encoder. Fits in 12GB VRAM with headroom for the image-processing pipeline.
Each inspection station runs a Jetson Orin AGX with the model, a camera trigger, and a custom runtime. Sub-180ms per frame at 1080p. No internet required.
Edge devices send only their low-confidence cases (with PII-stripped frames) to a central retraining queue. Models redeploy OTA every 2 weeks with new variant coverage - no per-variant retraining cycle.
“We used to retrain for every new product. Now the model reads the defect description like an inspector would - and we ship new variants on day one.”
- Plant manager, industrial manufacturer (name withheld)
A 30-minute call. We'll tell you whether we can help - and if not, who can.