• Pune : +91-20-2427 2383 / 2426 4291 / 2426 0308
  • Karad : 02164 - 225500 / 225800

Big Data - Apache Spark

Course Name : Big Data - Apache Spark

Batch Schedule : 02-Nov-2019   To   11-Jan-2020

Schedule : Saturday Only (8:00 am to 2:00 pm)

Duration : 65 hours - 11 Saturdays

Timings : 8:00 AM  To  2:00 PM

Fees : Rs. 14000/- (Incl 18% GST)

Target Audience

  • Students and Freshers.
  • Professionals willing to switch to Big Data / Spark developer stream.
Pre-requisites
  • Linux commands familiarity
  • Any RDBMS (like Oracle or MySQL)
  • Python3 programming skills
  • Java programming awareness (for Hadoop MR demos)
  • XML awareness
Testimonials
Spark lessons were great and Scala learning experience was really great.
[Spark 01 Batch] Pranjal Dutta (Senior Architect)

 

Excellent modular course & Excellent faculty of sunbeam! Those who want to gain knowledge rather than just information of Apache Spark should go to Sunbeam's Big Data Apache Spark batch.
[Spark 01 Batch] Ashwini P Patil (Project Engineer)

 

Weekend batch is good initiative. Professional can spare time for learning new things. Faculty's way of teaching really helps a lot to understand basic concepts very easily. Looking for more such courses. 
[Hadoop 03 Batch] Sushil Taskar (Software Engineer)

 

Appreciate Sir's tremendous efforts to explain the concepts again and again. The course was very detailed along with lot of hands-on experience. It surely helped me to understand the big data problems and how hadoop MR helps to solve it. When I started the course, I was absolutely naive on this topic but after the training I am confident on hadoop. Thanks a lot Sunbeam for helping associates by imparting such quality trainings.
[Hadoop 03 Batch] Anonymous
System Requirements
  • Core i3 (64-bit) and above
  • RAM: minimum 8 GB; 16 GB or more recommended.
  • 64-bit Linux – Ubuntu.
Topics Not Covered
  • Data science (math/stat) - However, implementing statistical formulae in a Spark job will be covered.
  • Machine Learning - However, a simple ML program using Spark MLlib will be demonstrated.
  • Hadoop administration - However, some basic and performance-related configuration will be discussed.
  • Spark administration - However, some basic and performance-related configuration will be discussed.
  • Spark cluster on cloud - However, a multi-node cluster with minimal configuration will be covered.
  • Python3 programming language - However, all Spark programming will be done in Python3.
  • Reporting and visualization tools.
Syllabus
  • Hadoop 2.x
    • Hadoop installation modes
    • Setting up Hadoop cluster
    • HDFS Java API
    • Implementing MR jobs
    • Parsing MR job args
    • Hadoop data types & custom writables
    • Job counters & configuration
    • Input Splits
    • Input/Output formats, Compression
    • Partitioner & Combiners
    • Hadoop Streaming
    • MR Job execution on YARN  
  • Hive
    • Hive introduction, architecture, installation
    • Hive CLI, Security, Beeline, Metastore & Derby
    • Hive managed & external tables
    • Hive QL: Loading, Filtering, Grouping, Joins
    • Hive simple & complex types, DDL, DML, DQL
    • Hive indexes, views, query optimizations
    • Hive serialization / deserialization, Loading data
    • Partitioning: static & dynamic – use cases
    • Bucketing, use cases of Partitions & Buckets
    • Hive functions, operators and Hive UDF impl.
    • Thrift server, Java/JDBC connectivity  
  • Apache Spark 2
    • Spark concepts
    • Distributed Computing Challenges
    • Spark Architecture & Components
    • Spark Installation & Deployment
    • Setting up Spark cluster
    • PySpark concepts
    • PySpark Shell
    • PySpark installation
    • Executing Spark Python programs
    • Spark Web UI
    • Spark in Pycharm IDE
    • Spark on Databricks cloud
  • Apache Spark 2 - Spark Core
    • Spark RDD, Transformations & Actions, Data Load & Save
    • RDD characteristics & execution
    • Types of RDD: Key-value, Two Pair, ...
    • Accumulators & Broadcast variables
    • RDD Internals: Distributed/Partitions, Lineage, Persistence
    • Implementing & Submitting Spark Job
    • Execution of Spark Job (RDD)
    • DAG visualization
  • Apache Spark 2 - Spark SQL
    • Spark SQL Introduction
    • Architecture
    • SQLContext & SparkSession
    • Data Frames & Datasets
    • Data Frame Columns & Expressions
    • Implementing & Executing Spark SQL job
    • Interoperating with RDDs
    • User Defined Functions
    • File Formats & Loading data
    • Spark SQL data types & schema
    • Spark SQL functions
    • UDFs & their execution
    • Global/Temporary views
    • Partitioning & Bucketing
    • SQLContext & HiveContext
    • Processing Hive data using Spark SQL
  • Apache Spark 2 - Spark Streaming
    • Streaming concepts
    • Microbatches vs Continuous job
    • Spark Streaming concepts
    • Streaming Context & DStreams
    • Transformations on DStreams
    • Windowing Concept, Windowed Operators: Slice, Window and ReduceByWindow, Stateful Operators
    • Twitter data processing
    • Spark Structured Streaming concepts
    • Triggers, Event time based processing & Watermark
    • Input sources & output sinks
    • Structured Streaming application execution
    • Apache Kafka Introduction
    • Kafka Architecture
    • Kafka Cluster Components & Configuration
    • Kafka Applications
    • Kafka Python client
    • Kafka Spark Source & Sink
  • Apache Spark 2 - Spark ML Introduction
    • Advanced Analytics concepts
    • Advanced Analytics workflow
    • Spark Machine Learning concepts
    • Transformers, Estimators & Models
    • Implement ML model using MLlib
    • Consuming Spark ML model
Batch Details
Sr.No  Batch Code  Start Date   End Date     Time
1      Spark03     02-Nov-2019  11-Jan-2020  8:00 AM To 2:00 PM



Contact us

Sunbeam Market Yard Pune

'Sunbeam Chambers', Plot No.R/2, Market Yard Road, Behind Hotel Fulora, Gultekdi, Pune - 411 037. MH-INDIA.

+91-20-2427 2383 / 2426 4291 / 2426 0308
Sunbeam Hinjawadi Pune

Authorized Training Center of C-DAC

"Sunbeam IT Park", Phase 2 of Rajiv Gandhi Infotech Park, Hinjawadi, Pune - 411057, MH-INDIA

+7410 071 951
Sunbeam Karad

Authorized Training Center of C-DAC

'Anuda Chambers', 203 Shaniwar Peth, Near Gujar Hospital, Karad - 415 110, Dist. Satara, MH-INDIA.

02164 - 225500 / 225800