
Cloudera CDP Data Engineer - Certification - CDP-3002 Exam Questions
QUESTION NO: 1
You need to process data stored in AWS S3 using SparkSQL. Which of the following options correctly reads a JSON file stored in S3 into a DataFrame and performs a SQL query on it?
Correct Answer: A
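For context on the pattern Question 1 asks about, here is a minimal sketch of reading JSON into a DataFrame, registering it as a temporary view, and querying it with SQL. It assumes an existing SparkSession with the S3A connector and credentials already configured. The helper names `query_json` and `run_example`, the bucket path, and the view name `events` are hypothetical and do not come from the exam material.

```python
def query_json(spark, path, view_name, sql):
    """Load JSON at `path`, expose it as a temp view, and run `sql` against it."""
    # spark.read.json infers the schema from the JSON records.
    df = spark.read.json(path)
    # A temporary view makes the DataFrame addressable by name in Spark SQL.
    df.createOrReplaceTempView(view_name)
    return spark.sql(sql)


def run_example():
    """Hypothetical usage; needs pyspark and S3A access, so it is not called here."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("s3-json-sql").getOrCreate()
    result = query_json(
        spark,
        "s3a://example-bucket/data/events.json",  # placeholder path
        "events",
        "SELECT COUNT(*) AS n FROM events",
    )
    result.show()
```

Keeping the session as a parameter means the same helper can be pointed at a local file path while testing, since only the path scheme changes.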
QUESTION NO: 2
Which Spark component is responsible for managing the execution of tasks on worker nodes?
Correct Answer: A
QUESTION NO: 3
What advanced technique can be used in Hive to optimize queries on bucketed tables by skipping unnecessary data?
Correct Answer: B
QUESTION NO: 4
You need to design your Airflow DAG for data quality checks to be easily monitored and visualized. How can you achieve this?
Correct Answer: A,B,D
QUESTION NO: 5
You're facing a schema mismatch between a Spark DataFrame and a Hive table when trying to write the DataFrame to the table. What are the potential causes and how can you address them?
Correct Answer: A
QUESTION NO: 6
How does Apache NiFi support schema inference during data flow management?
Correct Answer: B
QUESTION NO: 7
What challenge does schema inference aim to address when dealing with big data ecosystems?
Correct Answer: B
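To make the idea of schema inference in Questions 6 and 7 concrete, here is a small sketch that derives field names and observed types from sample JSON records instead of requiring a declared schema. It is an illustration in plain Python only, not how NiFi or Spark implement inference; the function name `infer_schema` and the sample records are made up for the example.

```python
import json


def infer_schema(records):
    """Map each field name to the sorted list of Python type names observed for it."""
    schema = {}
    for record in records:
        for field, value in record.items():
            # Collect every type seen so that optional or mixed fields stay visible.
            schema.setdefault(field, set()).add(type(value).__name__)
    return {field: sorted(types) for field, types in schema.items()}


# Two sample records: the second has a null name and an extra field.
lines = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": null, "score": 0.5}',
]
schema = infer_schema(json.loads(line) for line in lines)
print(schema)
```

Fields that appear with more than one type, or only in some records, are the cases where an inferred schema most often needs manual review.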
QUESTION NO: 8
When using Apache Airflow to schedule quality checks, which strategy helps ensure that checks are only run on the most recent data partition?
A. Use the LatestOnlyOperator to skip tasks that are not the latest in a series of executions.
Correct Answer: B
QUESTION NO: 9
In a meeting about Spark applications deployment on Kubernetes, your team discusses strategies for handling node failures. Which Kubernetes feature should be utilized to maintain high availability in case of node failures?
A. Horizontal Pod Autoscaler
Correct Answer: C
QUESTION NO: 10
Which Airflow feature allows you to template your tasks, enabling dynamic generation of task parameters such as table names for data quality checks?
Correct Answer: A
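As an illustration of the templating idea in Question 10, the sketch below renders a per-run table name into a data quality query. Airflow itself templates task fields with Jinja (for example `{{ ds_nodash }}`). To stay dependency-free, this sketch swaps in the standard library's `string.Template` as a stand-in. The table prefix `sales_`, the column `amount`, and the function `render_check` are hypothetical.

```python
from datetime import date
from string import Template

# Stand-in for a templated task parameter; in an Airflow task the placeholder
# would be written in Jinja syntax, e.g. sales_{{ ds_nodash }}.
query_template = Template(
    "SELECT COUNT(*) FROM sales_$ds_nodash WHERE amount IS NULL"
)


def render_check(logical_date):
    """Fill in the date-suffixed table name for one scheduled run."""
    return query_template.substitute(ds_nodash=logical_date.strftime("%Y%m%d"))


print(render_check(date(2024, 1, 5)))
```

Because the table name is derived from the run date, one task definition can cover every partition without being edited per run.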