Document Processing Engineer
Email your resume and a short blurb about why you want to work at Orbii to na@orbii.ai
Location
Employment Type
Department
Responsibilities
- Build pipelines for extracting structured data from PDFs, scanned docs, and financial statements.
- Apply OCR/NLP/ML to improve accuracy and handle noisy inputs.
- Optimize for Arabic-language documents (right-to-left processing, language models).
- Continuously benchmark extraction performance and improve precision and recall.
Tech Skills
- OCR: Tesseract, ABBYY, PaddleOCR.
- NLP: spaCy, Hugging Face Transformers, BERT/ArabicBERT.
- Python stack: PyPDF2, pdfplumber, Camelot/Tabula.
- Cloud ML services (Azure Cognitive Services, AWS Textract).
- Strong regex/text parsing skills.
- Familiarity with information extraction evaluation metrics (F1, precision, recall).
Values Fit
- Excellence baseline: accuracy is not optional.
- Resilience reflex: messy documents don’t break you.
- Humility fuels growth: each edge case is a lesson.