For Koral, an early-stage startup, we are looking for a skilled web scraping engineer to help us accelerate our product and service: a one-stop second-hand search across multiple platforms that lets users spend less time and effort browsing for their next second-hand purchase.
We want to help more people shop second-hand by presenting the best alternatives at all times, and by being present as an integrated service in well-known shopping journeys.
We have a crawler setup running in production right now which needs both quality assurance and extensive scaling to cover more categories and to become as close to real-time as possible. Today we scrape around 1.5 million listings, and the end goal for our current market is around 20 million listings.
The platforms are listed along with the categories that need to be crawled, so the task is well defined and clear: bring our current scraping setup up to new quality standards and scale it substantially with more platforms and categories, so that it can be operated sensibly while also laying the foundation for further expansion into new markets and/or business areas.
The team consists of three founders with competencies in tech, operations, sustainability, and marketing. We all pitch in wherever it is needed, so you are not on your own: you will catch up with the team at daily standups and in meetings whenever needed.
We are 100% remote with bases in Copenhagen and Aarhus, but no official offices yet, so you need to be okay with working remotely. Meetups can of course be scheduled if the distance allows it. We are very flexible with working hours, and the team dynamic is based on mutual respect, humor, and open dialogue. This could be your new free space to help build something truly awesome while also having fun.
The ideal candidate will have a mix of programming expertise, web knowledge, and the ability to build reliable and scalable scrapers that generate high-quality data.
We need a person who is proactive and willing to present us with opportunities instead of limitations.
Communication is critical, as we are a small, fully remote team with much to prove in a short amount of time. If something isn’t working out, we have to react quickly and adapt. Most importantly, you cannot be afraid to speak up and share your thoughts and suggestions with the rest of the team.
Furthermore, we need someone with solid programming skills in languages like Python, JavaScript, etc., used for building scrapers.
Excellent understanding of how websites are structured and what is needed to locate and extract data.
Experience with tools like Selenium, Beautiful Soup, Scrapy, etc., for automating scraping at scale.
Knowledge of web scraping techniques like parsing, crawling, handling pagination, etc.
Familiarity with proxies, bots, and automation to overcome anti-scraping measures. Important for robust scraping.
Skill in cleaning and processing unstructured data from websites. Critical for usable datasets.
Knowledge of data analysis, machine learning, and AI is a plus. We clean the data for most listings, as the user-generated content is not very searchable. To do so we apply image tags and LLMs, and we are working on implementing image and vector-based search.
An understanding of the legal and ethical implications of web scraping, as we strive to be compliant at all times and to avoid degrading the performance of the websites we scrape.
Lastly, it is great if you have experience with REST APIs and JSON, as many sites offer APIs.
Please reach out with any questions, for a product demo, etc. We hope to hear from you.
Best regards,
Jesper, Nikolai & Lou