About Course
What you’ll learn
- Apache Spark Foundation and Spark Architecture
- Data Engineering and Data Processing in Spark
- Working with Data Sources and Sinks
- Working with DataFrames and Spark SQL
- Using PyCharm IDE for Spark Development and Debugging
- Exclusive real practice problems using Spark for interviews
Course Content
PySpark: Why Do We Need It?
Setting Up PyCharm: A Step-by-Step Guide for Python Development
Handling DataFrames with CSV Files
Handling Other File Formats: JSON and Parquet
Handling DataFrame Structure: A Guide to withColumn(), withColumnRenamed(), and StructType
Exploring the split(), array(), and explode() Functions of PySpark
Comparing PySpark Built-in Functions: orderBy() vs sort(), distinct() vs dropDuplicates(), filter() vs where(), union() vs unionAll()
Aggregate Functions in PySpark: groupBy() and agg()
Joins in PySpark: Inner or Left, Which One to Choose?
Pivot Function in PySpark
UDFs in PySpark: Understanding and Implementation
coalesce() vs repartition(): A Data Engineering Concept
Window Functions in PySpark: rank() vs dense_rank() vs row_number(), with an Example
PySpark: Data Engineering Interview Questions
Earn a certificate
Add this certificate to your resume to demonstrate your skills & increase your chances of getting noticed.