
DATABRICKS DATA ENGINEER (CI/CD IN AZURE)

Job description

About Us:

At Derevo, we are dedicated to helping businesses and individuals unlock the value of data within their organizations. We achieve this by implementing analytics processes and platforms with a comprehensive approach covering the entire cycle needed to get there. Derevo started in 2010 with a simple idea: to create more than a company, a community and a space where everyone has the opportunity to build a dream. At Derevo, we believe in human talent that is free and creative. Being human is our superpower!

Databricks Data Engineer Summary:

The ideal candidate has at least 5 years of hands-on experience designing, building, and maintaining data management and storage systems. They are skilled in collecting, processing, cleaning, and deploying large datasets, understand ER data models, and can integrate with multiple data sources. They are effective at analyzing, communicating, and proposing different approaches to building Data Warehouses, Data Lakes, end-to-end pipelines, and Big Data solutions for clients, using either batch or streaming strategies.
Technical Proficiencies:

SQL:
  • Data Definition Language and Data Manipulation Language
  • Intermediate/advanced analytical queries, subqueries, and CTEs
  • Data types and joins with business rules applied
  • Grouping and aggregates for business metrics
  • Indexing and query optimization for efficient ETL processes
  • Stored procedures for transforming and preparing data
  • Tools: SSMS, DBeaver

Python:
  • Experience with object-oriented programming
  • Managing and processing datasets
  • Use of variables, lists, dictionaries, and tuples
  • Conditional logic and iteration
  • Optimizing memory consumption
  • Data structures and data types
  • Data ingestion from various structured and semi-structured sources
  • Knowledge of libraries such as pandas, numpy, sqlalchemy
  • Good practices when writing code

Databricks / PySpark (intermediate knowledge):
  • Understanding of narrow and wide transformations, actions, and lazy evaluation
  • How DataFrames are transformed, executed, and optimized in Spark
  • Use the DataFrame API to explore, preprocess, join, and ingest data in Spark
  • Use Delta Lake to improve the quality and performance of data pipelines
  • Use SQL and Python to write production data pipelines that extract, transform, and load data into tables and views in the Lakehouse
  • Understand the most common performance problems associated with data ingestion and how to mitigate them
  • Monitor the Spark UI: Jobs, Stages, Tasks, Storage, Environment, Executors, and execution plans
  • Configure a Spark cluster for maximum performance given specific job requirements
  • Configure Databricks to access Blob Storage, ADLS, SAS, user tokens, Secret Scopes, and Azure Key Vault
  • Configure governance solutions through Unity Catalog and Delta Sharing
  • Use Delta Live Tables to manage an end-to-end pipeline with unit and integration tests

Azure (intermediate/advanced knowledge):

Azure Storage Account:
  • Provision Azure Blob Storage or Azure Data Lake instances
  • Build efficient file systems, storing data in folders with static or parametrized names while considering possible security rules and risks
  • Experience identifying use cases for open-source file formats like Parquet, Avro, and ORC
  • Understand optimized column-oriented vs. optimized row-oriented file formats
  • Implement security configurations through Access Keys, SAS, AAD, RBAC, and ACLs

Azure Data Factory:
  • Provision Azure Data Factory instances
  • Use Azure IR, Self-Hosted IR, and Azure-SSIS to establish connections to distinct data sources
  • Use Copy or PolyBase activities for loading data
  • Build efficient, optimized ADF pipelines using linked services, datasets, parameters, triggers, data movement activities, data transformation activities, control flow activities, and mapping data flows
  • Build incremental and reprocessing loads
  • Understand and apply best practices for source control with Azure Repos Git integration / CI/CD (desirable)

Process Automation:
  • Automate the deployment, scaling, and de-scaling of Azure Databricks clusters using tools like ARM Templates, Terraform, or Azure DevOps Pipelines

Version Control and Deployments:
  • Implement version control practices for notebooks and source code using Git repositories
  • Configure CI/CD pipelines for continuous delivery of notebooks and Spark applications

Monitoring and Performance Optimization:
  • Set up alerts and monitor key performance metrics in Azure Databricks using Azure Monitor and other monitoring tools
  • Optimize cluster and workload performance to ensure efficiency and scalability

Security and Compliance:
  • Implement security controls and compliance policies in Azure Databricks

Integration with Azure Services:
  • Integrate Azure Databricks with other Azure services such as Azure Data Lake Storage, Azure SQL Database, Azure Synapse Analytics, and Azure DevOps to create end-to-end data analytics solutions

Configuration and Secrets Management:
  • Manage configurations and sensitive secrets using Azure Key Vault or other secrets management solutions
  • Ensure the security of credentials and access keys

Training and Support:
  • Provide training and technical support to development and data analytics teams in the effective use of Azure Databricks
  • Document best practices and usage patterns to facilitate adoption and collaboration

What Benefits You'll Have?

WELLNESS: We prioritize your overall well-being through personal, professional, and economic balance. Our statutory and additional benefits will help you achieve it.

LET'S RELEASE YOUR POWER: You will have the opportunity to specialize comprehensively in different areas and technologies, achieving interdisciplinary development. We encourage you to set new challenges and surpass yourself.

WE CREATE NEW THINGS: Thinking outside the box is our forte. You'll have the space, trust, and freedom to create, along with the training needed to achieve it.

WE GROW TOGETHER: Participate in cutting-edge, multinational technology projects with international teams.

Where Will You Work?

We are a dynamic team operating remotely. We offer flexibility and structure, providing the equipment and internal communication tools needed to support our operation and that of our clients.

If you meet most of the requirements and find the profile interesting, do not hesitate to apply. Our Talent team will contact you!

Become a Derevian & develop your superpower!
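The ADF and Databricks requirements above both call for building incremental (watermark-based) loads. As a minimal, engine-agnostic sketch of that pattern, here is an illustration using Python's stdlib sqlite3; the table and column names are hypothetical, and in practice this logic would live in an ADF pipeline or a PySpark/Delta job rather than SQLite:

```python
import sqlite3

def incremental_load(conn):
    """Copy only rows newer than the target's current watermark from source to target."""
    cur = conn.cursor()
    # Current high-water mark in the target table (0 if the table is empty).
    cur.execute("SELECT COALESCE(MAX(updated_at), 0) FROM target_orders")
    watermark = cur.fetchone()[0]
    # Pull only the rows added or changed since the last run.
    cur.execute(
        "INSERT INTO target_orders SELECT id, amount, updated_at "
        "FROM source_orders WHERE updated_at > ?",
        (watermark,),
    )
    conn.commit()
    return cur.rowcount  # rows loaded in this run

# Demo with in-memory tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_orders (id INTEGER, amount REAL, updated_at INTEGER);
    CREATE TABLE target_orders (id INTEGER, amount REAL, updated_at INTEGER);
    INSERT INTO source_orders VALUES (1, 10.0, 100), (2, 20.0, 200);
""")
first = incremental_load(conn)   # initial run loads both existing rows
conn.execute("INSERT INTO source_orders VALUES (3, 30.0, 300)")
second = incremental_load(conn)  # subsequent run loads only the new row
print(first, second)
```

The same idea scales up directly: in ADF the watermark is typically stored in a control table and passed as a pipeline parameter, and in Delta Lake a `MERGE INTO` often replaces the plain insert to handle updates as well as appends.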

Job details

Company
  • Derevo SA de CV
Location
  • Throughout Mexico
Address
  • Not specified
Publication date
  • 05/04/2024
Expiration date
  • 04/07/2024