Company:
Abacus Service Corporation
Location: Tucson
Closing Date: 04/12/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description
Data Lake Database Administrator
Position Summary: The Database Administrator designs, implements, and optimizes our cloud-based data lake solution within the AWS ecosystem, including storage, databases, and compute warehousing. The position is responsible for ensuring efficient data ingestion, transformation, and security, and involves collaboration with cross-functional teams to support data accessibility, governance, and infrastructure while ensuring the scalability and reliability of cloud data services.
Duties Performed (percent of time; essential functions noted):

Design and Implementation (30%, essential):
- Design and implement Snowflake-based data lakes within the AWS ecosystem, ensuring a scalable and secure architecture.
- Develop data models, schemas, and storage solutions that optimize performance for structured and semi-structured data.
- Collaborate with data engineers and other DBAs to ensure alignment between data systems, business requirements, and organizational standards.

Enhancements, Migrations, and ETL/ELT Processes (20%, essential):
- Lead enhancements and migrations of data systems, including transitioning from legacy systems to modern cloud-based platforms (e.g., Snowflake, Redshift, Databricks).
- Design and optimize ETL/ELT processes to ensure seamless data ingestion, transformation, and loading from multiple sources.
- Continuously improve data pipeline efficiency by incorporating best practices and automating manual processes where applicable.

Troubleshooting and Support (20%, essential):
- Identify, diagnose, and resolve performance bottlenecks, data integration issues, and cloud infrastructure challenges.
- Collaborate with application and data teams to troubleshoot complex issues across AWS, Apache tools, and database systems.
- Address data governance and security concerns, ensuring compliance with organizational policies.

Monitoring, Performance Tuning, and Maintenance (15%, essential):
- Monitor Snowflake and related data environments to ensure system availability, performance, and reliability.
- Perform database tuning, query optimization, and capacity planning for cloud data systems.
- Manage backups, recovery strategies, and disaster recovery planning for mission-critical data systems.

Documentation (5%, essential):
- Create and maintain comprehensive documentation for data architectures, ETL/ELT workflows, and operational processes.
- Provide clear, detailed runbooks and operational guides to ensure a smooth handoff to other teams.
- Document system changes, enhancements, and troubleshooting procedures for knowledge sharing and compliance.

Professional Growth and Development (5%, non-essential):
- Stay informed of industry trends, emerging technologies, and best practices related to database management systems, cloud platforms, and data lake solutions, and incorporate them into departmental strategies and practices.

Other Duties as Assigned (5%, non-essential):
- Maintain knowledge of current developments and practices in information technology; perform other related projects or assignments as assigned or requested.
Preferred Qualifications:
Bachelor's degree in a related field and three (3) to five (5) years of experience as a Database Administrator or in a related role. Experience working with diverse technologies and databases. Strong understanding of cloud-based platforms. Expertise with Snowflake, Databricks, or other cloud-native data warehousing and analytics solutions, as well as expertise in automation and CI/CD pipelines and proficiency in Apache Spark, Kafka, Hadoop, or similar big data tools.
Knowledge of:
Data lake architectures and cloud-based data warehousing solutions, specifically Snowflake and AWS (S3, Glue, Redshift).
Relational Database Management Systems (RDBMS).
ETL/ELT processes, data integration techniques, and data pipeline optimization.
Big data technologies, including Apache Spark, Kafka, and Hadoop, and their role in cloud ecosystems.
Database performance tuning, query optimization, and cloud infrastructure management.
Data governance, security, and compliance frameworks (e.g., PCI, CJIS, HIPAA).
Cloud-based data ecosystems used to design and maintain scalable and secure data lakes (AWS preferred).
Database clustering, replication, and containerization technologies.
Skill in:
Linux and Windows environments.
Managing and optimizing Snowflake, Redshift, or Databricks environments for high availability and performance.
Designing, implementing, and troubleshooting complex ETL/ELT processes and data pipelines using Apache tools (e.g., Spark, Kafka).
Managing, creating, and optimizing ETL processes using AWS DMS and Glue services.
Automating cloud-based data workflows using scripting languages (e.g., Python, PowerShell, Bash) and orchestration and CI/CD tools (e.g., AWS Glue, Airflow).
Troubleshooting and resolving database performance issues and cloud infrastructure bottlenecks.
Effective communication to interact with technical and non-technical stakeholders and present findings and recommendations.
Collaboration to work effectively within cross-functional teams and across departments.
Developing comprehensive documentation and technical specifications.
Testing to ensure system functionality and reliability.
Continuous learning to stay updated with emerging technologies and industry trends.
Ability to:
Apply knowledge of cloud-based data ecosystems to design and maintain scalable and secure data lakes.
Analyze, diagnose, and resolve technical issues related to data architecture, ETL/ELT processes, and cloud services.
Adapt to changing technology landscapes and evolving business requirements.
Collaborate with cross-functional teams to meet data accessibility, governance, and reporting requirements.
Work under pressure and prioritize tasks effectively to meet project deadlines.
Learn quickly and apply new concepts and techniques.
Ensure accuracy in database administration, design, and implementation through attention to detail.
Solve problems by addressing technical issues and challenges efficiently.
Provide effective leadership and mentorship in areas of expertise.
Think critically to evaluate and improve existing systems and processes.
Provide quality support and meet user needs effectively.
Effectively document and communicate technical solutions, operational procedures, and system enhancements for operational continuity.