PitchBook Data

Staff Machine Learning Engineer, Data Collections AI & ML at PitchBook Data

PitchBook Data Seattle, WA

Job Description

At PitchBook, a Morningstar company, we are always looking forward. We continue to innovate, evolve, and invest in ourselves to bring out the best in everyone. We're deeply collaborative and thrive on the excitement, energy, and fun that reverberates throughout the company.

Our extensive learning programs and mentorship opportunities help us create a culture of curiosity that pushes us to always find new solutions and better ways of doing things. The combination of a rapidly evolving industry and our high ambitions means there's going to be some ambiguity along the way, but we excel when we challenge ourselves. We're willing to take risks, fail fast, and do it all over again in the pursuit of excellence. If you have a good attitude and are willing to roll up your sleeves to get things done, PitchBook is the place for you.

About the Role:

The Data Collection AI/ML team builds intelligent systems that scale and improve PitchBook's data extraction, enrichment, and validation processes. The team applies advanced ML, including classification, entity/relationship extraction, LLM-based parsing, OCR, and anomaly detection, to ensure high accuracy, coverage, and timeliness of our proprietary datasets.

The Staff MLE role is a force multiplier for the team, partnering with technical leadership to set best practices and design reusable ML architectures that support rapid innovation and operational excellence.

As a Staff Machine Learning Engineer on the Data Collection AI/ML team, you will serve as the senior technical expert responsible for designing, architecting, and deploying advanced AI and machine learning systems that power PitchBook's data collection, extraction, and enrichment workflows.
You will play a pivotal role in elevating the technical bar of the organization by setting engineering standards, driving architectural decisions, and supporting teams to build scalable, production-grade ML systems.

Your work will focus on automating and enhancing PitchBook's ingestion and data quality pipelines across a wide variety of structured and unstructured sources, drawing from domain areas such as document understanding, OCR, natural language processing, entity resolution, multimodal modeling, retrieval systems, and LLM-driven extraction. You will collaborate closely with Engineering, Product, and Data Operations partners to translate business requirements into robust, high-impact AI solutions.

This role is ideal for someone who thrives as a deeply technical IC and wants to push the boundaries of document AI and data extraction technology, shape long-term architectural direction, and materially influence the future of data automation at PitchBook.

In addition to driving product impact, this role offers an opportunity to shape PitchBook's growing presence and technical reputation in the AI and ML space. We are looking for individuals who are active contributors to the broader AI community through peer-reviewed research, technical publications, or open-source initiatives.
Candidates who have authored conference papers or patents and who are excited to explore the frontiers of generative AI, LLMs, and applied NLP will be well-positioned to help us both advance our internal capabilities and deepen trust with our customers through thought leadership.

Primary Job Responsibilities:

Serve as the key technical leader shaping system design, ML architectures, model lifecycles, and scalable infrastructure for data extraction, document understanding, and structured data enrichment
Architect reusable frameworks and services for LLM-powered extraction, entity recognition and resolution models, and multimodal document processing
Partner with engineering leaders to ensure our systems meet the highest standards of reliability, performance, and cost efficiency
Design and build state-of-the-art ML models using transformers, LLMs, generative models, graph-based approaches, and OCR/Document AI frameworks
Identify opportunities to advance automation and accuracy across our ingestion stack, including entity linking, relationship inference, classification, and anomaly detection
Translate emerging research into practical, production-ready capabilities
Contribute to PitchBook's growing technical reputation through experimentation, publication, or open-source work
Work closely with Product, Engineering, and Data Operations to ensure AI systems integrate smoothly into human-in-the-loop workflows and downstream pipelines
Provide technical expertise during prioritization discussions, roadmap planning, and long-term strategic design
Elevate engineering excellence through code reviews, design reviews, and technical guidance for ML engineers and scientists
Act as a multiplier by shaping best practices for experimentation, model evaluation, responsible AI, and scalable ML engineering
Guide teams across the organization toward cohesive, reusable, and standards-aligned architectures
Own the lifecycle of mission-critical ML systems, from data preparation to deployment, monitoring, and continuous improvement
Ensure strong standards for model governance, explainability, and data integrity across the AI/ML stack
Partner with ML Ops and Platform Engineering teams, along with other partner engineering groups, to maintain high availability, reliability, and robustness for production ML systems

Skills and Qualifications:

Bachelor's or Master's degree in Computer Science, Mathematics, Data Science, or a related technical discipline (Master's degree preferred)
8 years of experience in machine learning, data science, or AI-focused engineering, with at least 4 years of experience leading technical teams
Proven success delivering AI-driven data extraction, enrichment, or document understanding systems at scale. Hands-on experience with parameter-efficient fine-tuning methods and expertise in document classification optimization preferred
Deep expertise in natural language processing, document AI, OCR, entity resolution, and large-scale data automation
Strong understanding of modern ML frameworks and infrastructure (e.g., PyTorch, TensorFlow, Hugging Face, LangChain, MLFlow)
Demonstrated ability to define and execute multi-year AI roadmaps with measurable business impact
Strong knowledge of cloud-native architecture, distributed computing, and scalable model deployment
Excellent communication, collaboration, and influencing skills, including experience presenting to executive and cross-functional leadership
A track record of fostering technical excellence and innovation across global, multidisciplinary teams
Experience in fintech, data platforms, or large-scale information extraction systems preferred
Contributions to the AI/ML research community (e.g., publications, patents, or open-source projects) are strongly preferred

Benefits and Compensation at PitchBook:

Physical Health:
Comprehensive health benefits
Additional medical wellness incentives
STD, LTD, AD&D, and life insurance

Emotional Health:
Paid sabbatical program after four years
Paid family and paternity leave
Annual educational stipend
Ability to apply for tuition reimbursement
CFA exam stipend
Robust training programs on industry and soft skills
Employee assistance program
Generous allotment of vacation days, sick days, and volunteer days

Social Health:
Matching gifts program
Employee resource groups
Subsidized emergency childcare
Dependent Care FSA
Company-wide events
Employee referral bonus program
Quarterly team building events

Financial Health:
401k match
Shared ownership employee stock program
Monthly transportation stipend

*Please be aware the above PitchBook benefit and perk offerings are subject to corresponding plan and policy documents and may change during the course of your employment.

Compensation:
Annual base salary: $260,000-$325,000
Target annual bonus percentage: 20%

Working Conditions:

At the heart of our company is a belief in the power of in-person collaboration. Being together in the office fuels our creativity, strengthens our connections, and drives the innovation that sets us apart. Our culture is built on spontaneous moments (those hallway conversations, whiteboard brainstorms, and shared celebrations in each of our global offices) that simply can't be replicated remotely. This role is expected to be in the office 5 days a week.

The job conditions for this position are in a standard office setting. Employees in this position use PC and phone on an ongoing basis throughout the day. Limited corporate travel may be required to remote offices or other business meetings and events.

Life At PB:

We are consistently recognized as a Best Place to Work, and our culture is at the heart of our success.
It's our fundamental belief that people do and create great things, and that people are the cornerstone of prosperity. We believe that proactively seeking out different points of view, listening to others, learning, and reflecting on what we've heard creates a sense of belonging within PitchBook and strengthens the PitchBook community.

We are excited to get to know you and your background. Concerned that you might not meet every requirement? We encourage you to still apply, as you might be the right candidate for the role or other roles at PitchBook.

Required Experience: Staff IC

Key Skills: Computer Science, Docker, Kubernetes, Python, VMware, C/C++, Go, System Architecture, gRPC, OS Kernels, Perl, Distributed Systems

Employment Type: Full-Time
Experience: years
Vacancy: 1
Salary: 260000 - 325000

Resume Suggestions

Highlight relevant experience and skills that match the job requirements to demonstrate your qualifications.

Quantify your achievements with specific metrics and results whenever possible to show impact.

Emphasize your proficiency in relevant technologies and tools mentioned in the job description.

Showcase your communication and collaboration skills through examples of successful projects and teamwork.
