OSSS.ai.services.nlp_extraction_service
NLPExtractionService
Lightweight NLP-based extraction of:
- entities (names, orgs, grades, etc.)
- date_filters (raw date mentions, phrases like 'this year', etc.)
- flags (simple booleans like has_date, has_grade, etc.)
This is intentionally:
- best-effort
- non-LLM
- optional (fails open if spaCy isn't available)
You can later specialize this for schools (grades, terms, etc.).
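The "fails open" behavior described above could be sketched roughly as follows. This is a minimal illustration and not the service's actual code: the helper names `load_nlp` and `empty_result` and the default model name `en_core_web_sm` are assumptions made for the example.

```python
import importlib


def load_nlp(model_name="en_core_web_sm"):
    """Return a spaCy pipeline, or None if spaCy or the model is unavailable.

    Hypothetical helper: the name and the default model are assumptions.
    """
    try:
        spacy = importlib.import_module("spacy")
        return spacy.load(model_name)
    except Exception:
        # Typically ImportError (spaCy not installed) or OSError (model not
        # installed). Catching broadly keeps the caller working either way.
        return None


def empty_result():
    """Result with the documented shape and nothing extracted (hypothetical)."""
    return {
        "entities": {"named_entities": {}, "subjects": [], "verbs": []},
        "date_filters": {"raw_mentions": []},
        "flags": {"has_date": False, "has_person": False, "has_org": False},
    }
```

A caller could then return `empty_result()` whenever `load_nlp()` gives back `None`, so downstream code always sees the same keys whether or not spaCy is installed.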
extract(text)
Main entrypoint.
Returns a dict like:

    {
        "entities": {
            "named_entities": {"PERSON": [...], "ORG": [...], ...},
            "subjects": [...],  # noun-ish candidates
            "verbs": [...],     # verb lemmas
        },
        "date_filters": {
            "raw_mentions": [...],  # "this year", "2024-25", "yesterday"
        },
        "flags": {
            "has_date": bool,
            "has_person": bool,
            "has_org": bool,
        },
    }
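One way a dict of this shape might be assembled from a parsed spaCy document is sketched below. It relies only on standard spaCy attributes (`doc.ents`, `ent.label_`, `ent.text`, `token.pos_`, `token.lemma_`). The function name `build_result`, the choice of NOUN/PROPN tokens as "subjects", and the use of DATE entities as raw date mentions are assumptions made for illustration, not the service's confirmed logic.

```python
from collections import defaultdict


def build_result(doc):
    """Build the documented result dict from a spaCy Doc (hypothetical helper)."""
    # Group entity texts by label, e.g. {"PERSON": [...], "ORG": [...]}.
    named = defaultdict(list)
    for ent in doc.ents:
        named[ent.label_].append(ent.text)

    # Assumption: "noun-ish candidates" means NOUN and PROPN tokens.
    subjects = [tok.text for tok in doc if tok.pos_ in ("NOUN", "PROPN")]
    verbs = [tok.lemma_ for tok in doc if tok.pos_ == "VERB"]

    # Assumption: raw date mentions come from DATE entities.
    raw_mentions = list(named.get("DATE", []))

    return {
        "entities": {
            "named_entities": dict(named),
            "subjects": subjects,
            "verbs": verbs,
        },
        "date_filters": {"raw_mentions": raw_mentions},
        "flags": {
            "has_date": bool(raw_mentions),
            "has_person": "PERSON" in named,
            "has_org": "ORG" in named,
        },
    }
```

With spaCy and a model installed, this would be called as `build_result(nlp(text))`, where `nlp` is a loaded pipeline. Phrases such as "2024-25" may not be tagged as DATE by a general-purpose model, so a real implementation would likely need additional pattern matching for school-year formats.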