What you’ll do…
Position: Data Engineer III
Job Location: 702 SW 8th St, Bentonville, AR 72716
Duties:
Problem Formulation: Identifies possible options to address business problems within one's discipline through analytics, big data analytics, and automation.

Applied Business Acumen: Supports the development of business cases and recommendations. Owns delivery of project activity and tasks assigned by others. Supports process updates and changes. Solves business issues.

Data Governance: Supports the documentation of data governance processes and the implementation of data governance practices.

Data Strategy: Understands, articulates, and applies principles of the defined strategy to routine business problems that involve a single function.

Data Transformation and Integration: Extracts data from identified databases. Creates data pipelines and transforms data to a structure that is relevant to the problem by selecting appropriate techniques. Develops knowledge of current data science and analytics trends.
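As a concrete illustration of this duty, here is a minimal PySpark sketch of an extract-transform-load pipeline; the source table, connection details, and column names are hypothetical, not taken from the posting:

    # Hypothetical ETL sketch: extract a table from a relational source over
    # JDBC, reshape it, and land it in the data lake as partitioned Parquet.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

    # Extract: the "dbo.orders" table and connection options are assumptions.
    orders = (spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://example-host:1433;databaseName=sales")
        .option("dbtable", "dbo.orders")
        .option("user", "etl_user")
        .option("password", "***")
        .load())

    # Transform: normalize a column name, derive a partition column, and
    # drop rows that fail a basic quality rule.
    cleaned = (orders
        .withColumnRenamed("OrderID", "order_id")
        .withColumn("order_date", F.to_date("order_ts"))
        .filter(F.col("amount") >= 0))

    # Load: write to the lake partitioned by date for downstream pruning.
    cleaned.write.mode("overwrite").partitionBy("order_date").parquet("/data/lake/orders")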
Data Source Identification: Supports the understanding of the priority order of requirements and service level agreements. Helps identify the most suitable source for data that is fit for purpose. Performs initial data quality checks on extracted data.

Data Modeling: Analyzes complex data elements, systems, data flows, dependencies, and relationships to contribute to conceptual, logical, and physical data models. Develops logical and physical data models, including data warehouse and data mart designs. Defines relational tables, primary and foreign keys, and stored procedures to create a data model structure. Evaluates existing data models and physical databases for variances and discrepancies. Develops efficient data flows. Analyzes data-related system integration challenges and proposes appropriate solutions. Creates training documentation and trains end users on data modeling. Oversees the tasks of less experienced programmers and provides system troubleshooting support.
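One way the warehouse and dimensional designs above can take physical shape is as a star schema; the following Spark SQL sketch is illustrative only, and every table and column name in it is hypothetical:

    # Hypothetical star-schema sketch expressed as Spark SQL DDL.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
        .appName("dimensional-model-sketch")
        .enableHiveSupport()
        .getOrCreate())

    # Dimension table: one row per store, keyed by a surrogate key.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS dim_store (
            store_key BIGINT COMMENT 'surrogate primary key',
            store_id  STRING COMMENT 'natural key from the source system',
            city      STRING,
            state     STRING
        )
        STORED AS PARQUET
    """)

    # Fact table: store_key references dim_store. Hive/Spark do not enforce
    # primary or foreign keys, so the load process must guarantee them.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS fact_sales (
            store_key BIGINT COMMENT 'foreign key to dim_store',
            item_id   STRING,
            qty       INT,
            amount    DECIMAL(18, 2)
        )
        PARTITIONED BY (sale_date DATE)
        STORED AS PARQUET
    """)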
Code Development and Testing: Writes code to develop the required solution and application features by determining the appropriate programming language and leveraging business, technical, and data requirements. Creates test cases to review and validate the proposed solution design. Creates proofs of concept. Tests the code using the appropriate testing approach. Deploys software to production servers. Contributes code documentation, maintains playbooks, and provides timely progress updates.
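A small sketch of what such a test case might look like, assuming the quality rule from the pipeline sketch above and pytest as the test runner (both assumptions, not requirements stated in the posting):

    # Hypothetical test case: verify that rows with negative amounts are
    # dropped, mirroring the filter in the earlier pipeline sketch.
    import pytest
    from pyspark.sql import SparkSession

    @pytest.fixture(scope="module")
    def spark():
        return (SparkSession.builder
            .master("local[1]")
            .appName("etl-tests")
            .getOrCreate())

    def test_negative_amounts_are_dropped(spark):
        df = spark.createDataFrame([(1, 10.0), (2, -5.0)], ["order_id", "amount"])
        result = df.filter(df.amount >= 0)
        assert result.count() == 1
        assert result.first()["order_id"] == 1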
Minimum education and experience required:
Bachelor’s degree or the equivalent in Computer Science, Information Technology, Engineering, or a related field plus 2 years of experience in software engineering or related experience;
OR
Master’s degree or the equivalent in Computer Science, Information Technology, Engineering, or a related field.
Skills required:
Must have experience with:
- designing and building ETL workflows to load data from a variety of data sources using Spark, SQL, HQL, triggers, and Apache Sqoop to transfer data between database servers such as SQL Server, MySQL, or Oracle and a data lake;
- building pipelines to ingest data from on-premises clusters to cloud platforms in order to build scalable systems with high performance, reliability, and cost-effectiveness;
- using Spark SQL and DataFrames to write functional programs in Python and Scala for complex data transformations, leveraging Spark's in-memory computing for fast processing;
- writing solutions for scenarios such as file watchers and automated data quality validations using reusable scripting techniques;
- designing and developing scripts for creating and dropping tables and for extracting data from files with complex structures;
- evaluating the latest technologies through proofs of concept to find optimal solutions for big data processing in ETL jobs;
- optimizing SQL queries and fine-tuning data storage using partitioning and bucketing techniques (see the sketch after this list);
- working with in-memory database tools such as Druid for sub-second query results;
- performing architecture design using data warehouse concepts, including logical/physical and dimensional data modeling, with big data tools such as Apache Hadoop, Spark, Sqoop, MapReduce, Hive, or Parquet;
- designing, building, and supporting a platform that provides ad-hoc access to large datasets through APIs, using HDFS, Hive, BigQuery, Spark, Python, shell scripting, and Unix;
- developing analytical insights using SQL, reporting tools, and visualization by understanding the business and working with product owners and data stewards;
- participating in all phases of the product development cycle, from product definition and design through implementation and testing, using JIRA for Agile and Lean methodology;
- performing continuous integration and deployment (CI/CD) with tools such as Git or Jenkins to run test cases and build applications, measuring code coverage with JUnit and automating the acceptance-test framework with Java Spring Boot libraries;
- and monitoring cluster performance, setting up alerts, documenting designs and workflows, and providing production support, troubleshooting, and fixes by tracking the status of running applications as part of system administration tasks.
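As a brief illustration of the partitioning and bucketing techniques named above, here is a PySpark sketch with hypothetical table and column names:

    # Hypothetical sketch: partition for coarse date pruning and bucket by
    # customer_id so joins on that key can avoid a full shuffle.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
        .appName("storage-tuning-sketch")
        .enableHiveSupport()
        .getOrCreate())

    orders = spark.table("staging.orders")  # assumed staging table

    (orders.write
        .partitionBy("order_date")      # one directory per date
        .bucketBy(16, "customer_id")    # co-locate rows by customer
        .sortBy("customer_id")
        .format("parquet")
        .mode("overwrite")
        .saveAsTable("analytics.orders_bucketed"))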
Employer will accept any amount of experience with the required skills.
Wal-Mart is an Equal Opportunity Employer.