Company Overview:
Bayanat provides comprehensive, world-class AI-powered geospatial solutions to a growing number of sectors, such as Defense, Environment, Energy & Resources, Smart Cities, and Transportation. Bayanat’s solutions harness vast amounts of premium and unique data from a range of sources, including satellites, High Altitude Pseudo Satellites (HAPS), and Earth Observation, powered by AI to drive geospatial intelligence (gIQ).
Role Summary:
We are seeking an experienced Big Data DevOps Engineer to join our team. The successful candidate will be responsible for designing, building, and maintaining our company's big data infrastructure and applications. This includes setting up, deploying, and operating big data components such as Hadoop, Hive, Spark, and Presto, as well as Azure data services such as Databricks. The ideal candidate should have experience in managing and optimizing large-scale data processing systems, strong programming skills, and the ability to work collaboratively across teams.
Key Responsibilities:
Design, build, and maintain scalable and reliable big data infrastructure and applications using technologies such as Hadoop, Hive, Spark, Presto, and Databricks.
Collaborate with cross-functional teams to gather requirements, design solutions, and implement data pipelines that meet business needs.
Develop and maintain ETL processes, data warehousing, and data governance practices that ensure data quality, security, and compliance.
Optimize big data processing systems for performance, scalability, and reliability, including tuning cluster configurations, monitoring resource utilization, and troubleshooting issues.
Ensure data availability and integrity by implementing backup, disaster recovery, and data retention policies.
Implement data security best practices, including access controls, encryption, and auditing.
Work closely with development teams to integrate big data analytics into our software products and services.
Collaborate with data scientists and analysts to understand their data requirements and help them optimize their data workflows.
Stay up to date with emerging trends and technologies in big data and cloud computing, and recommend ways to improve our data infrastructure and processes.
To qualify, you must have:
Bachelor's degree in Computer Science or related field.
At least 3 years of experience working with big data technologies such as Hadoop, Hive, Spark, Presto, and Databricks.
Experience with containerization technologies such as Docker and Kubernetes.
Strong programming skills in languages such as Java, Python, Scala, or R.
Knowledge of data modeling, data warehousing, and ETL processes.
Familiarity with cloud computing platforms such as Azure and with cloud-native open-source components.
Knowledge of SQL and NoSQL databases.
Knowledge of data governance, security, and compliance best practices.
Excellent problem-solving and communication skills.
Ability to work independently and collaboratively as part of a distributed team.
Nice to Have:
Experience with other big data tools such as Apache Flink, Apache Storm, or Apache Kafka.
Familiarity with machine learning frameworks such as TensorFlow, PyTorch, or Scikit-learn.
Certification in big data or cloud computing technologies.