Guides/Video & streaming/09 Make a model callable over HTTP

Make a model callable over HTTP

1from lynx import LYNX, serve
2 
3serve(LYNX("lynx-basic"), port=8080)
4# POST images to http://localhost:8080/detect

serve() is the "I need this model accessible over HTTP and I don't want to write a Flask app" escape hatch. Spin it up, POST images, get JSON detections back. Three endpoints: POST /detect (multipart with an image field → JSON results), GET /health (alive check), GET /info (model metadata — class names, input size, version).

Use it when:

A teammate isn't in Python (Java / Rust / Go services calling via HTTP)
You're offloading GPU work from an edge device to a server
You want curl reproducibility for debugging
A throwaway internal service is cheaper than building a real one

Don't use it when you're shipping a production inference service. It's blocking (one Python process, one request at a time), there's no auth, no rate limiting, no batching. Wrap in Docker + systemd if you're going to run it longer than a day, or graduate to a real FastAPI / gRPC service when traffic justifies it.

GET /info is the polite contract: clients can validate compatibility (class names, input dimensions, version) before sending images they know will mismatch. Use it.

React to events in real time →

Fire actions when something specific happens — a class detected, a zone entered, a line crossed.