Backend Data Engineer

Job description

Why work as a Backend Data Engineer at Jexia?

To support our ambitious growth, we are now looking for an awesome Backend Data Engineer for our Brazilian office in São Paulo, starting as soon as possible.


What will you do?

In this role, you will support our Data Team in research and pre-development in the field of scalable systems, data science and pipeline architecture.

The team’s mission is to unlock business outcomes by solving data-intensive problems for our future products, services, and data-driven in-house processes, in close collaboration with internal customers across the company.


What does a day look like?

Your typical day starts at 9 am with a 15-30 minute meeting with the data team, where we discuss the progress of our projects and decide how to overcome difficulties.


Then you usually read the emails that arrived overnight (Jexians are distributed across different locations) and respond where necessary. After that, you work on your current project, which at the moment is a log extractor for customer activity.

The process consists of dumping the customers' endpoint consumption logs and then clustering the customers as active or non-active: in essence, you build pipelines to move data, perform transformations and cleanup, and deploy services to ensure the data is available.
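To give a concrete feel for that clustering step, here is a deliberately simplified Go sketch; the types, field names, and threshold are illustrative assumptions, not Jexia's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// RequestLog is one dumped log entry for a customer's endpoint call.
type RequestLog struct {
	CustomerID string
	Endpoint   string
	Timestamp  time.Time
}

// classify counts each customer's recent requests and labels the customer
// "active" or "non-active" based on a simple threshold.
func classify(logs []RequestLog, since time.Time, threshold int) map[string]string {
	counts := make(map[string]int)
	for _, l := range logs {
		if _, seen := counts[l.CustomerID]; !seen {
			counts[l.CustomerID] = 0
		}
		if l.Timestamp.After(since) {
			counts[l.CustomerID]++
		}
	}
	labels := make(map[string]string)
	for id, n := range counts {
		if n >= threshold {
			labels[id] = "active"
		} else {
			labels[id] = "non-active"
		}
	}
	return labels
}

func main() {
	now := time.Now()
	logs := []RequestLog{
		{CustomerID: "cust-1", Endpoint: "/records", Timestamp: now.Add(-2 * time.Hour)},
		{CustomerID: "cust-1", Endpoint: "/auth", Timestamp: now.Add(-1 * time.Hour)},
		{CustomerID: "cust-2", Endpoint: "/records", Timestamp: now.Add(-48 * time.Hour)},
	}
	// Customers with at least 2 requests in the last 24 hours count as active.
	fmt.Println(classify(logs, now.Add(-24*time.Hour), 2))
}
```

In the real pipeline this logic runs over far larger dumps and the results feed downstream services, but the shape of the work, moving data, transforming it, and making the result available, is the same.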


In the afternoon, you help your team members improve their models by testing the current model on real data, identifying false positives/negatives, and creating new training examples to fix the problems.

When to stop improving a model and deploy it to production depends on the project.
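As a rough illustration of that evaluation step (with made-up data, not our actual models), tallying false positives and false negatives from a batch of predictions could look like this in Go:

```go
package main

import "fmt"

func main() {
	// Hypothetical batch of model predictions checked against ground truth.
	predicted := []bool{true, false, true, true, false}
	actual := []bool{true, false, false, true, true}

	var falsePositives, falseNegatives int
	for i := range predicted {
		switch {
		case predicted[i] && !actual[i]:
			falsePositives++ // model flagged it, ground truth says no
		case !predicted[i] && actual[i]:
			falseNegatives++ // model missed it, ground truth says yes
		}
	}
	// The misclassified cases are the ones you would turn into new training examples.
	fmt.Printf("false positives: %d, false negatives: %d\n", falsePositives, falseNegatives)
}
```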


The day ends at around 17:30 with 30 minutes of catching up on tech news or blogging.

Requirements

What do we ask of you?

  • BS in Computer Science or equivalent relevant experience

  • 4+ years of experience working with and extracting value from large, disconnected, and/or unstructured datasets

  • You have real-world experience building Go applications with a focus on complex and distributed systems

  • Demonstrated ability to build processes that support data transformation, data structures, metadata, dependency and workload management

  • Strong interpersonal skills and the ability to manage projects and work with cross-functional teams

  • Advanced working knowledge of SQL, experience with relational databases and query authoring, and working familiarity with a variety of databases

  • Experience building and optimizing ‘big data’ data pipelines, architectures, and datasets

  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement

Bonus:

  • Hadoop, Spark, and a message broker

  • Data pipeline/workflow management tools such as Azkaban and Airflow

  • Stream-processing systems such as Storm and Spark-Streaming

  • Object-oriented/functional scripting languages such as Python, Java, C++, etc.