Job description
Coders Connect is partnering with a cutting-edge data tech company that operates one of the largest real-time social data collection and processing platforms globally, handling tens of millions of posts daily across thousands of nodes worldwide.
They serve top-tier clients in AI, cyber defense, and investment through powerful data and analytics APIs.
As a Senior Data Engineer, you'll split your focus between:
Infrastructure (+/- 50%)
Stabilising and optimising a Kafka / S3 / ClickHouse stack running on Kubernetes.
Automating ETLs and pipelines for high-volume, low-cost ingestion.
Managing scalability, monitoring, and operational costs.
Analytics & AI/ML (+/- 50%)
Building advanced features: vector search, real-time clustering, predictive analytics, and outcomes-based alerts.
Integrating embeddings and ML models into product APIs (e.g., trend detection, summarisation).
Collaborating directly with founders to turn these features into scalable, API-first products.
Tech Environment
Data & Infra: Kafka (multi-topic, high throughput), PySpark, S3 (Parquet), ClickHouse (OLAP & vector search), Kubernetes.
Programming: Python-first (FastAPI, bytewax/stream processors, ML/DL libraries).
Ops & Monitoring: GitHub Actions/GitLab CI, Prometheus, Grafana.
AI/ML: Embeddings (Hugging Face), clustering (e.g., k-means/Scikit-learn), PyTorch/TensorFlow (a plus).
Requirements
5–7+ years in data engineering, with strong experience building pipelines and APIs in high-scale environments.
Practical experience with AI/ML workflows (embeddings, clustering, predictive analytics).
Hands-on with ClickHouse, Kafka, Kubernetes, and Python.
A pragmatic, results-driven mindset, equally comfortable optimising infra and experimenting with AI-driven analytics.
Previous scale-up/startup experience is a big plus.
Benefits
Location: Lyon, France (hybrid, with flexibility for remote)
Contract: CDI (permanent contract)
Salary: Base + attractive BSPCE/equity + benefits
Environment: Small, high-impact team (<5), close collaboration with the founders, fast decision-making, and genuine autonomy.
This is a high-impact role where you'll help stabilise and scale a global data backbone while building out the next generation of AI-powered data APIs. If you want to combine solid data engineering with hands-on AI/ML, and have a real say in technical direction and product evolution, this is the place for you.
5+ years
Required Skill Profession
Computer And Mathematical