
Data Engineer at RSM Solutions, Inc

RSM Solutions, Inc - Irvine, CA

Job Description

Thank you for stopping by to take a look at the Data Engineer role I posted here on LinkedIn; I appreciate it. If you have read my job descriptions in the past, you will recognize how I write them. If you are new, allow me to introduce myself. My name is Tom Welke. I am Partner & VP at RSM Solutions, Inc, I have been recruiting technical talent for more than 23 years, and I have been in the tech space since the 1990s. Because of this, I actually write JDs myself... no AI, no 'bots', just a real live human. I realized a while back that looking for work is about as fun as a root canal with no anesthesia... especially now. So, rather than saying 'must work well with others' and 'team mindset', I do away with that kind of nonsense and just tell it like it is.

As with every role I work on, social fit is almost as important as technical fit. For this one, technical fit is very, very important, but there are also some social fit characteristics that matter. This is the kind of place that requires people to dive in and learn. The hiring manager for this one is actually a very dear friend of mine, and he said something interesting to me not long ago: if you aren't spending at least an hour a day learning something new, you really are doing yourself a disservice. This is that classic environment where no one says 'this is not my job', so the ability to jump in and help is needed for success in this role.

This role is being done onsite in Irvine, California. I prefer working with candidates who are already local to the area. If you need to relocate, that is fine, but there are no relocation dollars available.

I can only work with US Citizens or Green Card Holders for this role. I cannot work with H1, OPT, EAD, F1, H4, or anyone who is not already a US Citizen or Green Card Holder.

The Data Engineer role is similar to the Data Integration role I posted. However, this one is more Ops focused: it centers on orchestrating deployment and MLflow, orchestrating and using data on the clusters, and managing how the models are performing. This role focuses on coding and configuring on the ML side of the house. You will be designing, automating, and observing end-to-end data pipelines that feed this client's Kubeflow-driven machine learning platform, ensuring models are trained, deployed, and monitored on trustworthy, well-governed data. You will build batch and streaming workflows, wire them into Azure DevOps CI/CD, and surface real-time health metrics in Prometheus + Grafana dashboards to guarantee data availability.
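To give you a concrete flavor of the observability side of this work, here is a minimal, purely illustrative sketch (mine, not the client's code) of how a Python batch job might expose health metrics for Prometheus to scrape and Grafana to chart. The metric names, the port, and the scheduling loop are assumptions made up for the example.

```python
# Illustrative sketch only: exposing pipeline health metrics from a Python
# batch job via the open-source prometheus_client library. Metric names and
# the port are hypothetical, not the client's conventions.
import time

from prometheus_client import Counter, Gauge, Histogram, start_http_server

ROWS_PROCESSED = Counter(
    "pipeline_rows_processed",
    "Rows successfully processed by the batch job",
)
LAST_SUCCESS = Gauge(
    "pipeline_last_success_timestamp_seconds",
    "Unix time of the last successful run, used for data-freshness SLAs",
)
RUN_DURATION = Histogram(
    "pipeline_run_duration_seconds",
    "End-to-end batch run latency",
)

def run_batch(batch):
    """Hypothetical batch step; in practice this would be a Spark stage."""
    with RUN_DURATION.time():  # records latency so Grafana can chart percentiles
        for _ in batch:
            ROWS_PROCESSED.inc()
    LAST_SUCCESS.set_to_current_time()  # lets an alert fire if data goes stale

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes http://<host>:9100/metrics
    while True:
        run_batch(range(1000))
        time.sleep(60)  # stand-in for a real scheduler or stream trigger
```

In a real deployment the exporter would run in the pipeline's container or as a sidecar on the cluster, and alert rules on the freshness gauge are what turn "guarantee data availability" into something enforceable.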
The role bridges Data Engineering and MLOps, allowing data scientists to focus on experimentation while the business sees rapid, reliable predictive insight.

Here are some of the main responsibilities:

- Design and implement batch and streaming pipelines in Apache Spark running on Kubernetes and Kubeflow Pipelines to hydrate feature stores and training datasets.
- Build high-throughput ETL/ELT jobs with SSIS, SSAS, and T-SQL against MS SQL Server, applying Data Vault style modeling patterns for auditability.
- Integrate source control, build, and release automation using GitHub Actions and Azure DevOps for every pipeline component.
- Instrument pipelines with Prometheus exporters and visualize SLA, latency, and error budget metrics to enable proactive alerting.
- Create automated data quality and schema drift checks; surface anomalies to support a rapid incident response process.
- Use MLflow Tracking and Model Registry to version artifacts, parameters, and metrics for reproducible experiments and safe rollbacks (there is a short sketch of what this looks like after these lists).
- Work with data scientists to automate model retraining and deployment triggers within Kubeflow based on data freshness or concept drift signals.
- Develop PowerShell and .NET utilities to orchestrate job dependencies, manage secrets, and publish telemetry to Azure Monitor.
- Optimize Spark and SQL workloads through indexing, partitioning, and cluster sizing strategies, benchmarking performance in CI pipelines.
- Document lineage, ownership, and retention policies; ensure pipelines conform to PCI/SOX and internal data governance standards.

Here is what we are seeking:

- At least 6 years of experience building data pipelines in Spark or equivalent.
- At least 2 years deploying workloads on Kubernetes/Kubeflow.
- At least 2 years of experience with MLflow or similar experiment-tracking tools.
- At least 6 years of experience with T-SQL and with Python/Scala for Spark.
- At least 6 years of PowerShell/.NET scripting.
- At least 6 years of experience with GitHub, Azure DevOps, Prometheus, Grafana, and SSIS/SSAS.
- The ability to mentor engineers on best practices in containerized data engineering and MLOps.

Kubernetes CKA/CKAD, Azure Data Engineer (DP-203), or MLOps-focused certifications (e.g., Kubeflow or MLflow) would be great to see.
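And here is the short sketch promised above: a minimal, purely illustrative Python example (mine, not the client's code) of logging a training run to MLflow and registering the model so it is versioned and can be rolled back. The tracking URI, experiment name, and registered model name are all hypothetical.

```python
# Illustrative sketch only: MLflow Tracking plus Model Registry basics.
# The tracking URI, experiment, and model names are made up for this example.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("http://mlflow.internal.example:5000")  # hypothetical server
mlflow.set_experiment("demand-forecast")                        # hypothetical name

X, y = make_classification(n_samples=1_000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 8}
    model = RandomForestClassifier(**params, random_state=42)
    model.fit(X_train, y_train)

    mlflow.log_params(params)  # logged parameters make the run reproducible
    mlflow.log_metric("test_accuracy", model.score(X_test, y_test))

    # Registering the model creates a new numbered version in the Model
    # Registry; promoting or demoting versions is what enables safe rollbacks.
    mlflow.sklearn.log_model(
        model, "model", registered_model_name="demand-forecast"
    )
```

Each run captures parameters, metrics, and the artifact together, so a retraining trigger in Kubeflow can compare a new model version against the current one before it is promoted.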

