Company: Applab Systems Inc
Location: Pittsburgh
Closing Date: 30/11/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description
Job Title: Python PySpark Developer (Full-time role)
Location: Pittsburgh, PA (Onsite)
Experience: 8+ years required, with very strong PySpark skills.
Job Description:
4+ years of working experience in data integration and pipeline development.
2+ years of experience with AWS Cloud data integration using Apache Spark, EMR, Glue, Kafka, Kinesis, and Lambda across S3, Redshift, RDS, and MongoDB/DynamoDB ecosystems.
Strong hands-on experience in Python development, especially PySpark, in an AWS Cloud environment (a representative batch pipeline is sketched after this list).
Design, develop, test, deploy, maintain, and improve data integration pipelines.
Experience with Python and common Python libraries.
Strong analytical experience with databases, including writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.
Strong experience with source control systems such as Git and Bitbucket, and with Jenkins build and continuous integration tools.
Databricks or Apache Spark experience is a plus.
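For illustration only, here is a minimal sketch of the kind of PySpark data integration pipeline described above: reading raw data landed in S3, applying basic cleanup, and writing curated Parquet back to S3 for downstream consumers such as Redshift or Glue tables. The bucket names, paths, and column names are hypothetical placeholders, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Spark session; on EMR or Glue this is typically provided or preconfigured.
spark = SparkSession.builder.appName("orders-daily-ingest").getOrCreate()

# Read raw CSV files landed in S3 (bucket, prefix, and schema are illustrative).
orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-bucket/raw/orders/")
)

# Typical integration cleanup: deduplicate, derive a partition column, drop bad rows.
cleaned = (
    orders
    .dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("amount") > 0)
)

# Write partitioned Parquet back to S3 for downstream Redshift/Glue consumers.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/orders/")
)
```

On AWS Glue the session setup differs slightly (a GlueContext wraps the SparkContext), but the DataFrame transformations themselves are the same.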
Responsibilities
Design, develop, test, deploy, support, and enhance data integration solutions to seamlessly connect and integrate enterprise systems within our Enterprise Data Platform (a streaming-ingestion sketch follows this list).
Innovate on data integration within the Apache Spark-based platform to ensure technology solutions leverage cutting-edge integration capabilities.
Facilitate requirements-gathering and process-mapping workshops; review business/functional requirements documents; and author technical design documents, test plans, and scripts.
Assist with implementing standard operating procedures, facilitate review sessions with functional owners and end-user representatives, and leverage technical knowledge and expertise to drive improvements.
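As a sketch of the streaming side of such solutions (Kafka and Kinesis sources are named in the requirements above), the snippet below reads JSON events from a Kafka topic with Spark Structured Streaming and appends them to S3 as Parquet. The broker address, topic, schema, and paths are hypothetical, and the Kafka source requires the spark-sql-kafka connector on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("orders-stream-ingest").getOrCreate()

# Expected shape of each JSON event (illustrative schema).
event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("order_ts", StringType()),
])

# Read a stream of events from Kafka (broker and topic are placeholders).
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "orders")
    .load()
)

# Kafka values arrive as bytes; cast to string and parse the JSON payload.
events = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append parsed events to S3, checkpointing so the file sink stays exactly-once.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/streaming/orders/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/orders/")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```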