OSSS.ai.services.nlp_extraction_service

NLPExtractionService

Lightweight NLP-based extraction of:
  • entities (names, orgs, grades, etc.)
  • date_filters (raw date mentions, phrases like 'this year', etc.)
  • flags (simple booleans like has_date, has_grade, etc.)
This is intentionally:
  • best-effort
  • non-LLM
  • optional (fails open if spaCy isn't available)
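The "fails open" behavior in the list above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the helper names `load_nlp` and `empty_result`, and the default model name `en_core_web_sm`, are assumptions made for this example.

```python
from typing import Any, Dict, Optional


def load_nlp(model_name: str = "en_core_web_sm") -> Optional[Any]:
    """Return a spaCy pipeline, or None if spaCy or the model is unavailable.

    Hypothetical helper: the real service may load its pipeline differently.
    """
    try:
        import spacy  # optional dependency, imported lazily

        return spacy.load(model_name)
    except Exception:
        # Covers ImportError (spaCy not installed) and OSError (model not
        # installed). Catching broadly is deliberate here: extraction is
        # best-effort and must never break the caller.
        return None


def empty_result() -> Dict[str, Any]:
    """Result returned when no NLP pipeline is available (assumed shape)."""
    return {
        "entities": {"named_entities": {}, "subjects": [], "verbs": []},
        "date_filters": {"raw_mentions": []},
        "flags": {"has_date": False, "has_person": False, "has_org": False},
    }


nlp = load_nlp()
result = empty_result() if nlp is None else None  # real extraction would go here
```

Callers can then treat a missing pipeline like any input that produced no matches, which keeps the optional dependency out of their error handling.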

You can later specialize this for schools (grades, terms, etc.).
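One way such a school-specific specialization might look is a small regex pass for grade levels. The pattern and the `find_grades` helper below are hypothetical and only illustrate the idea; they are not part of the documented service.

```python
import re
from typing import List

# Hypothetical pattern: matches "grade 5", "3rd grade", and "kindergarten".
_GRADE_RE = re.compile(
    r"\b(?:grade\s+(\d{1,2})|(\d{1,2})(?:st|nd|rd|th)\s+grade|(kindergarten))\b",
    re.IGNORECASE,
)


def find_grades(text: str) -> List[str]:
    """Return grade levels mentioned in text, as strings ("K" for kindergarten)."""
    grades: List[str] = []
    for match in _GRADE_RE.finditer(text):
        number = match.group(1) or match.group(2)
        grades.append(number if number else "K")
    return grades


grades = find_grades("Students in grade 5 and 3rd grade, plus kindergarten")
has_grade = bool(grades)  # could feed a flag such as has_grade
```

A non-empty result could populate an `entities` entry and set a `has_grade` flag, in line with the flags described above.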

extract(text)

Main entrypoint.

Returns a dict like:

    {
        "entities": {
            "named_entities": {"PERSON": [...], "ORG": [...], ...},
            "subjects": [...],  # noun-ish candidates
            "verbs": [...],     # verb lemmas
        },
        "date_filters": {
            "raw_mentions": [...],  # "this year", "2024-25", "yesterday"
        },
        "flags": {
            "has_date": bool,
            "has_person": bool,
            "has_org": bool,
        },
    }
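To make the return shape concrete, here is a rough sketch of how the `date_filters` and `flags` parts might be assembled. The regex patterns and the `find_date_mentions` and `build_result` helpers are assumptions for illustration; the service's real rules are not shown on this page, and the named entities are passed in rather than computed so that the example runs without spaCy.

```python
import re
from typing import Any, Dict, List, Optional

# Hypothetical patterns for raw date mentions.
_DATE_PATTERNS = [
    r"\b(?:this|last|next)\s+(?:year|month|week|semester|term)\b",
    r"\b(?:today|yesterday|tomorrow)\b",
    r"\b\d{4}-\d{2}(?:-\d{2})?\b",  # "2024-25" or "2024-09-01"
]
_DATE_RE = re.compile("|".join(_DATE_PATTERNS), re.IGNORECASE)


def find_date_mentions(text: str) -> List[str]:
    """Return raw date-like substrings in the order they appear."""
    return [m.group(0) for m in _DATE_RE.finditer(text)]


def build_result(
    text: str, named_entities: Optional[Dict[str, List[str]]] = None
) -> Dict[str, Any]:
    """Assemble a dict in the documented shape.

    named_entities would normally come from an NER pipeline; here the caller
    supplies it. subjects and verbs are left empty in this sketch.
    """
    named_entities = named_entities or {}
    raw_mentions = find_date_mentions(text)
    return {
        "entities": {
            "named_entities": named_entities,
            "subjects": [],
            "verbs": [],
        },
        "date_filters": {"raw_mentions": raw_mentions},
        "flags": {
            "has_date": bool(raw_mentions),
            "has_person": bool(named_entities.get("PERSON")),
            "has_org": bool(named_entities.get("ORG")),
        },
    }


result = build_result(
    "Show grades for this year and 2024-25, updated yesterday",
    {"PERSON": ["Ada"]},
)
```

Deriving each flag from the extracted lists, rather than tracking it separately, keeps the flags consistent with the data they summarize.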