Total 3 rounds one technical , manager and hr round and onboarding went smoothly and now I am working here.
Design, develop, and maintain scalable data pipelines for batch and streaming data, handling both structured and unstructured formats effectively.
Utilize advanced analytics tools and methodologies to supplement and enhance the analytics platform, contributing to our journey towards advanced analytical capabilities.
Interview questions [1]
Question 1
1. Can you provide an overview of your experience working with PySpark and big data processing? 2. What motivated you to specialize in PySpark, and how have you applied it in your previous roles? 3. Explain the basic architecture of PySpark. 4. How does PySpark relate to Apache Spark, and what advantages does it offer in distributed data processing? 5. Describe the difference between a DataFrame and an RDD in PySpark.