
Data Engineer

Sequoia Connect
Full-time
Remote

Our client represents the connected world, offering innovative and customer-centric information technology experiences, enabling Enterprises, Associates, and Society to Rise™.

They are a USD 6 billion company with 163,000+ professionals across 90 countries, helping 1,279 global customers, including Fortune 500 companies. They focus on leveraging next-generation technologies, including 5G, Blockchain, Metaverse, Quantum Computing, Cybersecurity, Artificial Intelligence, and more, to enable end-to-end digital transformation for global customers.

Our client is one of the fastest-growing brands and among the top 7 IT service providers globally. Our client has consistently emerged as a leader in sustainability and is recognized among the ‘2021 Global 100 Most Sustainable Corporations in the World’ by Corporate Knights.

We are currently searching for a Data Engineer:

Responsibilities

  • Responsible for the ETL (Extract-Transform-Load) procedures of current and historical data from multiple internal/external data sources, in particular:
  • E: Extract data from multiple data sources through various available data vendors. Most of the data is structured or semi-structured, while a small portion is unstructured.
  • T: Transform data to reconcile changes in file formats and schemas over time or across multiple data sources for the same type of information. Align data at various temporal resolutions (e.g., aggregation or disaggregation at various levels in time).
  • L: Store files in a structured way for future reference and develop a data schema; deposit data to a database for convenient querying and analysis.
  • Pipeline for current data: maintain automated pipelines to ingest data continuously.
  • Ingestion and curation of historical data are a one-time effort. Once the ingestion pipelines for most of the data are established and stabilized, support switches to maintenance mode for the existing pipelines, with occasional additions of new data sources.
  • Perform data cleaning to provide quick research turnaround, together with a data retrieval API supported in an Excel add-in and Python packages.
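The ETL flow described above can be sketched minimally in Python with pandas. This is an illustrative sketch only, not the client's actual pipeline: the column names, table name, and daily-mean aggregation are all hypothetical stand-ins for "reconcile schemas" and "align data at various temporal resolutions".

```python
# Minimal ETL sketch (illustrative; all column/table names are hypothetical).
import sqlite3
import pandas as pd

def run_etl(raw: pd.DataFrame, conn: sqlite3.Connection) -> pd.DataFrame:
    # Extract: in practice this step would pull from a data vendor or file
    # drop; here the raw frame is simply passed in.
    df = raw.copy()

    # Transform: reconcile schema differences (rename vendor columns) and
    # align temporal resolution by aggregating intraday rows to daily values.
    df = df.rename(columns={"ts": "timestamp", "px": "price"})
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    daily = (
        df.set_index("timestamp")
          .resample("D")["price"]
          .mean()
          .reset_index()
    )

    # Load: deposit into a database table for convenient querying.
    daily.to_sql("daily_prices", conn, if_exists="replace", index=False)
    return daily

# Usage with toy data: two intraday rows on day 1, one on day 2.
conn = sqlite3.connect(":memory:")
raw = pd.DataFrame({
    "ts": ["2024-01-01 09:00", "2024-01-01 15:00", "2024-01-02 10:00"],
    "px": [10.0, 12.0, 20.0],
})
result = run_etl(raw, conn)
```

In a production setting the in-memory SQLite connection would be replaced by the warehouse of choice (e.g., Azure SQL or Databricks), and the extract step by a vendor API client.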

Requirements

  • Fluent in Python and packages related to data processing, such as pandas.
  • Fluent in object-oriented programming with Python.
  • Familiar with database technologies/SQL, data warehouses, and related concepts.
  • Deploying cloud resources that support data engineering and ML Ops pipelines.
  • Setting up CI/CD pipelines for data engineering and ML Ops projects.
  • GitHub experience.
  • Fluent in working with large data sets and distributed computing architectures (e.g., Apache Spark).
  • Experience with various cloud-based data technologies, including Databricks, Azure Data Lake, and Azure Data Explorer (ADX).
  • Experience developing APIs on top of various data technologies.

Desired 

  • Fluent in front-end development and deployment of single-page applications in one or more JavaScript frameworks.
  • Relational databases (Azure SQL, Oracle) and NoSQL databases (MongoDB and/or Cosmos DB).
  • Microsoft Azure Fundamentals certification (Exam AZ-900).

Languages

  • Advanced Oral English.
  • Native Spanish.

Note:

  • Hybrid - Mexico City, Guadalajara, Monterrey or Saltillo

If you meet these qualifications and are pursuing new challenges, start your application to join an award-winning employer. Explore all our job openings on the Sequoia Careers page: https://www.sequoiags.com/careers/.