Coders Connect is partnering with a cutting-edge data tech company that operates one of the largest real-time social data collection and processing platforms globally, handling tens of millions of posts daily across thousands of nodes worldwide.
They serve top-tier clients in AI, cyber defense, and investment through powerful data and analytics APIs.
As a Senior Data Engineer, you’ll split your focus between:
Infrastructure (approx. 50%)
- Stabilising and optimising a Kafka / S3 / ClickHouse stack running on Kubernetes.
- Automating ETLs and pipelines for high-volume, low-cost ingestion.
- Managing scalability, monitoring, and operational costs.
Analytics & AI/ML (approx. 50%)
- Building advanced features: vector search, real-time clustering, predictive analytics, and outcomes-based alerts.
- Integrating embeddings and ML models into product APIs (e.g., trend detection, summarisation).
- Collaborating directly with founders to turn these features into scalable, API-first products.
Tech Environment:
- Data & Infra: Kafka (multi-topic, high throughput), PySpark, S3 (Parquet), ClickHouse (OLAP & vector search), Kubernetes.
- Programming: Python-first (FastAPI, Bytewax/stream processors, ML/DL libraries).
- Ops & Monitoring: GitHub Actions/GitLab CI, Prometheus, Grafana.
- AI/ML: Embeddings (Hugging Face), clustering (e.g., k-means with scikit-learn), PyTorch/TensorFlow (a plus).
Requirements
- 5–7+ years in data engineering, with strong experience building pipelines and APIs in high-scale environments.
- Practical experience with AI/ML workflows (embeddings, clustering, predictive analytics).
- Hands-on experience with ClickHouse, Kafka, Kubernetes, and Python.
- A pragmatic, results-driven mindset: equally comfortable optimising infrastructure and experimenting with AI-driven analytics.
- Previous scale-up/startup experience is a big plus.
Benefits
Location: Lyon, France (hybrid, with flexibility for remote work)
Contract: CDI (permanent contract)
Salary: Base + attractive BSPCE/equity + benefits
Environment: Small, high-impact team (<5), close collaboration with the founders, fast decision-making, and genuine autonomy.
This is a high-impact role where you’ll help stabilise and scale a global data backbone while building out the next generation of AI-powered data APIs.
If you want to combine solid data engineering with hands-on AI/ML, and have a real say in technical direction and product evolution, this is the place for you.