You'll build and operate the data infrastructure that powers Singa's analytics, reporting, and business intelligence. You'll own AWS Glue pipelines that move data from production PostgreSQL databases to S3 and Redshift, orchestrate Airflow or Mage jobs that ingest data from external SaaS platforms, and implement data tests, monitoring, and alerting to ensure freshness and accuracy. You'll work directly with product, analytics, and leadership teams to turn raw data into actionable insights while maintaining reliable, performant pipelines that scale with our 2 million+ users and 1,600+ venues.
- Build and operate AWS Glue ETL pipelines for moving data from RDS/PostgreSQL to S3 and Redshift
- Design and implement Airflow or Mage DAGs to ingest data from external SaaS platforms (ads, analytics, CRM)
- Implement rETL (reverse ETL) pipelines to sync data back to operational systems
- Write SQL transformations and data models for analytics and reporting
- Build data quality tests, monitoring, and alerting for pipeline freshness and accuracy
- Optimize query performance and warehouse costs in Redshift
- Integrate APIs from common SaaS platforms (Google Ads, Facebook Ads, Stripe, etc.)
- Implement CDC (change data capture) for real-time data synchronization
- Debug data pipeline failures and resolve data quality issues
- Collaborate with analysts and engineers to define data requirements and schemas
- Document data pipelines, schemas, and operational runbooks
- Leverage AI coding assistants to accelerate pipeline development while maintaining data quality
Cloud Platform: AWS (Glue, Redshift, RDS, S3, Lambda, CloudWatch)
Orchestration: Apache Airflow or Mage (DAG-based workflow management)
Databases: PostgreSQL (production), Amazon Redshift (warehouse)
Languages: Python (data pipelines, orchestration), SQL (transformations, analytics)
Data Tools: dbt (transformation framework), Apache Iceberg (table format)
APIs: REST APIs for SaaS platform integration (Google, Facebook, Stripe, etc.)
Streaming: CDC tools for real-time data capture
Testing: Great Expectations or similar data quality frameworks
Monitoring: CloudWatch, custom alerting, data freshness checks
Version Control: Git, GitHub
We're building a modern data stack on AWS with a focus on reliability, cost efficiency, and maintainability. You'll have the opportunity to shape our data infrastructure as we scale.
Must have:
- Strong sense of ownership of data platforms and pipelines
- 3+ years as a data engineer running production pipelines
- Strong SQL skills for complex queries, transformations, and optimization
- Strong Python skills for data workloads (ETL, API integration, orchestration)
- Hands-on experience with Apache Airflow or similar orchestration tools (Prefect, Dagster, Mage)
- Experience with AWS data services (Glue, Redshift, RDS, S3)
- Experience integrating common SaaS and advertising platforms via API
Nice to have:
- Production experience with dbt (data build tool) for transformation workflows
- Experience with Apache Iceberg or other lakehouse table formats
- CDC (change data capture) implementation experience with tools like Debezium or AWS DMS
- Streaming data pipeline experience with Kafka, Kinesis, or similar
- rETL (reverse ETL) implementation experience
- Data quality framework experience (Great Expectations, Soda, dbt tests)
- Redshift optimization experience (sort keys, dist keys, query tuning)
- Cost optimization experience for cloud data warehouses
- Experience with data governance and lineage tools
- Experience with music, entertainment, or SaaS product data
- Experience with AI coding assistants (Claude Code preferred) or eagerness to adopt AI-enhanced workflows
Singa is transforming the global karaoke industry with a modern streaming platform that serves 2 million+ users and 1,600+ venues across 34 countries. With 100,000+ songs including original artist recordings through partnerships with major labels like Warner Music Group, we're building the digital future of karaoke.
Our Finland-based engineering team is building data infrastructure that powers product decisions, venue analytics, and business intelligence. We prioritize reliable pipelines, thoughtful architecture, and data quality that enables the business to move fast with confidence.
Hybrid: In the Helsinki Capital Region, we aim for a hybrid setup with 2–3 remote days per week. For those living outside the capital region, we typically aim for around five office days per month.
Fully Remote Option: We can accommodate fully remote work if the candidate can demonstrate a proven ability to work effectively while fully remote with a larger team that's in the office.
We're building a team that reflects the diversity of the European developer community. We evaluate candidates on technical skills and systems thinking, not academic pedigree or career path.
We provide interview accommodations for candidates who need them. Our collaborative approach ensures technical decisions are inclusive and transparent.
Don't meet every requirement? Research shows that candidates from underrepresented groups tend to apply only when they meet 100% of the qualifications. If you're excited about data infrastructure and building reliable pipelines, we encourage you to apply.
If this sounds like you, come join us in spreading the joy of singing! Please send your resume and GitHub portfolio, and answer the questions in the application form.
We review applications weekly throughout the application period and will fill the position as soon as we find the right person, so we encourage you to apply early. Our recruitment process includes interviews as well as a take-home assignment.
