Do you want to be part of a team that values sustainable quality over velocity and encourages team members’ direct participation? Do you want to be part of a global organization in the continued development of a data pipeline using AI for predictive analytics and benchmarking?
Responsibilities:
- Define technology roadmap in support of product development
- Lead the design, architecture, and development of multiple real-time streaming data pipelines encompassing multiple product lines and edge devices
- Ensure data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.
- Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
- Conduct design and code reviews
- Provide technical leadership to onshore and offshore agile teams: mentor junior engineers and new team members, and apply technical expertise to challenging programming and design problems
- Analyze and improve efficiency, scalability, and stability of various system resources
- Foster an Agile development environment and apply its methodologies
- Monitor the risk of technical debt and ensure new technical debt is not introduced
Required Skills:
- Expert knowledge of data architectures, data pipelines, real-time processing, streaming, networking, and security
- Proficient understanding of distributed computing principles
- Advanced knowledge of Big Data querying tools, such as Pig or Hive
- Expert understanding of Lambda Architecture, along with its advantages and drawbacks
- Proficiency with MapReduce, HDFS
- Experience with integration of data from multiple data sources and multiple data types
- Bachelor’s Degree in Software Engineering
- 12+ years’ experience in software engineering with 2+ years using public cloud
- 6+ years’ experience developing ETL processing flows using MapReduce technologies such as Spark and Hadoop
- 4+ years’ experience developing with ingestion and clustering frameworks such as Kafka, ZooKeeper, and YARN
- 1+ years’ experience with Databricks
- 3+ years’ experience with:
  - NoSQL databases, such as HBase, Cassandra, or MongoDB
  - Big Data ML toolkits, such as Mahout, SparkML, or H2O
  - Scala or Java as it relates to product development
- 3+ years’ DevOps experience with cloud technologies such as AWS, CloudFront, Kubernetes, VPC, RDS, etc.
- Management of Spark or Hadoop clusters, with all included services
- 4+ years’ experience building stream-processing systems using solutions such as Storm or Spark Streaming
- 4+ years’ experience with various messaging systems
- 1+ years of DevOps experience
- 1+ years’ benchmarking experience