Distributed Systems Engineer

At Daltix (Full-time), in Lisbon, Portugal

Expires at: 2019-11-10

We are looking for talented profiles to help build and maintain the distributed data collection system that is at the heart of our business.

We are a data-driven company that collects and processes more than 500GB of raw data daily. We leverage big data technologies such as Serverless and Spark on AWS EMR to crunch these volumes of data and make them queryable.

In this role, you will ensure that our data collection engine, which consists of distributed web crawlers, is state of the art and ahead of our competition. You will ensure that we can scrape any webshop, no matter what ban detection has been put in place. You will also make sure that proper monitoring tools are in place. We are currently scraping 60 sites, and your goal is to at least triple that without losing completeness or quality.

Your responsibilities will include:

Creating and implementing distributed web crawling architectures

Implementing cost-effective data processing architectures

Creating advanced system monitoring solutions & dashboards

Designing advanced ways of interpreting scraped HTML

Managing advanced proxies

Main requirements

At least 5 years of experience in object-oriented software engineering & design, in any object-oriented programming language

Experience with and understanding of large-scale web crawling

Experience with databases, SQL

Experience with infrastructure such as load-balancers, caches

Highly proficient in spoken and written English

You never stop learning

Nice to have

Have experience building on top of Amazon Web Services

Have programming experience with Python

Expert knowledge of web-scraping & web-scraping architectures

Experience with GoLang & JavaScript (Node.js) is a plus

Experience with big data technologies (such as Hadoop, Spark, Airflow, Cassandra, Elasticsearch) is a plus

Have a deep understanding of cloud possibilities and limitations in the areas of distributed systems, load balancing and networking, massive data storage, and security

Get energy from working in a highly complex and challenging startup environment with a high-tech product

Knowledge of DevOps & automation (Terraform, Ansible)

Data analysis using Pandas (Python)

Perks

Work with the latest tech stack

We're a quickly growing company, so your personal growth can be huge too

Health Benefits – Comprehensive coverage for medical needs

Meal allowance – Monthly meal card, a fully stocked kitchen with plenty of coffee and fruit, and monthly team drinks and dinners

Work-Life Balance – We trust you to know your schedule and work when you feel most productive

Learning and Development – Attend meet-ups, conferences, and events that interest you and benefit your personal and career growth

Apply for this position