Sr Data Scientist with GenAI & NLP - Hybrid on site at Simple Solutions
Simple Solutions
Washington, DC
Information Technology
Posted 0 days ago
Job Description
Sr Data Scientist with GenAI & NLP - Hybrid on site

Minimum Qualifications:
CLIENT PREFERS A PHD - Work or educational background in one or more of the following areas: machine learning, computational linguistics, deep learning, artificial intelligence, data science and/or data analytics, generative AI, symbolic AI, causal AI, operations research, computer science, mathematics, business analytics, or knowledge management.
8-12 years of demonstrated experience programming with R/Python, Linux, and Spark in an AWS cloud environment, or knowledge and algorithmic design experience in Python (3+ years).
Proficient with Amazon AWS SageMaker, Jupyter Notebook, Python scikit-learn, and deep learning and machine learning tools such as TensorFlow.
Experience with image processing models such as COCO, CLIP, ResNet, or comparable models.
Demonstrated experience with machine learning techniques including natural language processing and large language models (GPTv4-o1, o3, OpenAI APIs, Llama, Claude, etc.).
Experience developing AI agents and development proficiency using agentic programming.
Proficient in natural language processing (NLP) and natural language generation (NLG), including prior projects in any of the following categories: topic modeling of text, sentiment analysis of text, part-of-speech tagging, Named Entity Recognition (NER), Bag of Words, text extraction.
Experience building and working with any of these components: Vector DB, BERT, RoBERTa (or comparable tools), spaCy, LLM and GenAI tools.
Experience with LoRA, LangChain, RAG, LLM fine-tuning and PEFT, and knowledge graphs.
Strong skills in developing GraphRAG, Chain of Thought (CoT), Tree of Thought (ToT), reinforcement learning, and AI development architectures with Human-in-the-Loop (HITL).
Demonstrated experience with SQL and any relational database technologies, such as Oracle, PostgreSQL, MySQL, RDS, Redshift, Hadoop EMR, Hive, etc.
Demonstrated experience processing structured and unstructured data sources, data cleansing, data normalization, and prep for analysis.
Demonstrated experience with code repositories and build/deployment pipelines, specifically Jenkins and/or Git/GitHub/GitLab.
Demonstrated experience using Tableau, Kibana, QuickSight, or other similar data visualization tools.
Very comfortable working with ambiguity (e.g. imperfect data, loosely defined concepts, ideas, or goals).
Looking for a "hands on" Data Scientist with fraud detection and time series analysis experience.
At least a Master's degree (PhD highly preferred) in Computer Science or any field related to AI.
Experience working with big data in AWS and using libraries such as PySpark.
Experience in time series forecasting and machine learning models.
Experience working with generative AI.
Experience working with log file analysis and tools such as Splunk.

Qualifications & Requirements
Education: MS in Computer Science, Statistics, Math, Engineering, or related field, PhD preferred.
3+ years of relevant experience in building large-scale machine learning or deep learning models and/or systems.
1+ year of experience specifically with deep learning (e.g., CNN, RNN, LSTM).
1+ year of experience building NLP and NLG tools.
Experience with a wide range of LLMs (Llama, Claude, OpenAI, Cohere, etc.), LoRA, LangChain, RAG, LLM fine-tuning, and PEFT is preferred.
Demonstrated skills with Jupyter Notebook, AWS SageMaker, Domino Data Lab, or comparable environments.
Passion for solving complex data problems and generating cross-functional solutions in a fast-paced environment.
Knowledge of Python and SQL, object-oriented programming, and service-oriented architectures.
Strong scripting skills with shell script and SQL.
Strong coding skills and experience with Python (including SciPy, NumPy, and/or PySpark) and/or Scala.
Knowledge and implementation experience with NLP techniques (topic modeling, bag of words, text classification, TF-IDF, sentiment analysis) and NLP technologies such as Python NLTK, spaCy, or comparable technologies.
Knowledge and implementation experience with statistical and machine learning models (regression, classification, clustering, graph models, etc.).

Preferred Qualifications
Hands-on experience building models with deep learning frameworks like TensorFlow, Keras, Caffe, PyTorch, Theano, H2O, or similar.
Experience with LLM agents and agentic programming.
Experience with search architecture (for instance: Solr, Elasticsearch, AWS OpenSearch).
Experience with building and querying ontologies using Zeno, OWL, RDF, SPARQL, or comparable is preferred.
Knowledge of and experience with microservices, service mesh, API development, and test automation are preferred.
Demonstrated experience using Docker, Kubernetes, and/or other similar container frameworks is preferred.

Additional Job Qualifications:
Ability to translate business ideas into analytics models that have major business impact.
Demonstrated experience working with multiple stakeholders.
Demonstrated communication skills, e.g. explaining complex technical issues to more junior data scientists, in graphical, verbal, or written formats.
Demonstrated experience developing tested, reusable, and reproducible work.

Requirements
Interview Process/# of Rounds:
Direct manager contact
2 rounds
Resume Suggestions
Highlight relevant experience and skills that match the job requirements to demonstrate your qualifications.
Quantify your achievements with specific metrics and results whenever possible to show impact.
Emphasize your proficiency in relevant technologies and tools mentioned in the job description.
Showcase your communication and collaboration skills through examples of successful projects and teamwork.