We're excited to share florence2-sharp, a C# library implementing the Florence-2 model for image understanding tasks. Florence-2 takes a prompt-based approach: a single model handles a variety of vision tasks, selected by the prompt, with strong zero-shot performance.
Our C# library supports:
- Image captioning (from concise to detailed)
- Optical Character Recognition (OCR)
- Region-based OCR
- Object detection
- Optional phrase grounding
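
To make the prompt-based approach concrete, here is a minimal sketch of how the tasks above map to Florence-2 task prompts. The task tokens are the ones listed on the Florence-2 model card; the `VisionTask` enum and `TaskPrompts` class are hypothetical names used only for illustration and are not necessarily the florence2-sharp API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names for illustration; not necessarily the florence2-sharp API.
public enum VisionTask
{
    Caption,
    DetailedCaption,
    MoreDetailedCaption,
    Ocr,
    OcrWithRegion,
    ObjectDetection,
    PhraseGrounding
}

public static class TaskPrompts
{
    // Task tokens as listed on the Florence-2 model card.
    private static readonly Dictionary<VisionTask, string> Tokens =
        new Dictionary<VisionTask, string>
        {
            { VisionTask.Caption, "<CAPTION>" },
            { VisionTask.DetailedCaption, "<DETAILED_CAPTION>" },
            { VisionTask.MoreDetailedCaption, "<MORE_DETAILED_CAPTION>" },
            { VisionTask.Ocr, "<OCR>" },
            { VisionTask.OcrWithRegion, "<OCR_WITH_REGION>" },
            { VisionTask.ObjectDetection, "<OD>" },
            { VisionTask.PhraseGrounding, "<CAPTION_TO_PHRASE_GROUNDING>" }
        };

    // Builds the text prompt for a task. Only phrase grounding takes extra
    // text: the caption whose phrases should be located in the image.
    public static string Build(VisionTask task, string text = null)
    {
        string token = Tokens[task];
        if (task == VisionTask.PhraseGrounding)
        {
            if (string.IsNullOrWhiteSpace(text))
                throw new ArgumentException(
                    "Phrase grounding needs a caption to ground.", nameof(text));
            return token + text;
        }
        return token;
    }
}
```

For example, `TaskPrompts.Build(VisionTask.ObjectDetection)` yields `<OD>`, while `TaskPrompts.Build(VisionTask.PhraseGrounding, "A green car")` yields `<CAPTION_TO_PHRASE_GROUNDING>A green car`. The same image and model are used in every case; only the prompt changes.
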
The library ports Microsoft's Florence-2 model (https://huggingface.co/microsoft/Florence-2-base) to C#, drawing on the original model and on the JS port by Frank Krueger (https://github.com/praeclarum/transformers-js).