• NuMind is a startup developing custom information extraction solutions.
• NuExtract is our zero-shot model for structured information extraction. See the blog posts for more info (NuExtract, NuExtract-v1.5).
• We have started to deploy NuMind Enterprise to customize, serve, and monitor NuExtract privately. If that interests you, let's chat 😊.
• Website: https://www.numind.ai/

NuExtract-v1.5

NuExtract-v1.5 is a fine-tuned version of Phi-3.5-mini-instruct, trained on a private high-quality dataset for structured information extraction. It supports long documents and several languages (English, French, Spanish, German, Portuguese, and Italian). To use the model, provide an input text and a JSON template describing the information you need to extract.
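As an illustration of the two inputs described above, here is a minimal sketch of a template and its accompanying text. The field names are hypothetical, and the use of empty strings and `[""]` as placeholders for values to be filled in is an assumption, not something specified in this section. Building the template as a Python dict and serializing it with `json.dumps` guarantees the result is valid JSON.

```python
import json

# Hypothetical template: keys name the fields to extract. Empty strings and
# [""] act as placeholders (an assumed convention, not stated in this section).
template = {
    "model": {
        "name": "",
        "base_model": "",
        "supported_languages": [""],
    },
    "company": {
        "name": "",
        "website": "",
    },
}

# Input text to extract from (taken from the description above).
text = (
    "NuExtract-v1.5 is a fine-tuned version of Phi-3.5-mini-instruct, "
    "trained on a private high-quality dataset for structured "
    "information extraction."
)

# Serializing a dict always yields well-formed JSON for the Template field.
template_text = json.dumps(template, indent=4)
print(template_text)
```

The printed string can be pasted into the Template field, with `text` as the Input Text.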

⚠️ In this Space, model inputs are limited to a maximum of 10k tokens, and inputs longer than 4k tokens are processed with a sliding window. For full model performance, self-host the model or contact us.
⚠️ The model is trained to assume a valid JSON template. Attempts to use invalid JSON could lead to unpredictable results.
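
Both caveats above can be checked on the client side before anything is sent to the model. The sketch below is only illustrative: `check_template` and `sliding_windows` are hypothetical helpers, the exact sizes (4000 and 10000 tokens), the zero overlap, and the choice to truncate rather than reject over-long inputs are all assumptions, and this is not the windowing logic the Space itself uses.

```python
import json


def check_template(template_text):
    """Return (ok, message); ok is False when the text is not valid JSON."""
    try:
        json.loads(template_text)
    except json.JSONDecodeError as err:
        return False, f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    return True, ""


def sliding_windows(tokens, window=4000, overlap=0, max_len=10000):
    """Split a token list into consecutive windows of at most `window` tokens.

    Assumptions: inputs beyond `max_len` are truncated, and consecutive
    windows share `overlap` tokens (0 by default).
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    tokens = tokens[:max_len]
    if len(tokens) <= window:
        return [tokens]
    stride = window - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the input
        start += stride
    return chunks


ok, msg = check_template('{"name": "", "languages": [""]}')
bad_ok, bad_msg = check_template('{"name": "",}')  # trailing comma is invalid JSON
windows = sliding_windows(list(range(12000)))  # stand-in for real token ids
print(ok, bad_ok, [len(w) for w in windows])
# prints: True False [4000, 4000, 2000]
```

Validating the template first avoids sending input the model was not trained to handle, and the window sizes show how a 12k-token input would be cut down and split under the stated assumptions.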