# Realtime-Data-Analytics-Using-Spark
**Repository Path**: email4reg/Realtime-Data-Analytics-Using-Spark
## Basic Information
- **Project Name**: Realtime-Data-Analytics-Using-Spark
- **Description**: Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 1
- **Forks**: 0
- **Created**: 2020-02-16
- **Last Updated**: 2020-12-18
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Realtime Data Analytics Using Apache Spark
Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc
### Description
The project uses Apache Spark components (Spark SQL, Spark Streaming, MLlib) to build machine learning models with batch processing (the slow path), and then applies those models with Spark Streaming (the fast path) to predict outputs for new data.
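The slow/fast split described above can be illustrated without a cluster. The following is a minimal plain-Python sketch, not the project's Spark code: the function names (`train_batch`, `predict`, `apply_to_stream`), the sample posts, and the word-count "model" are all assumptions made for illustration. In a Spark setup, the training step would roughly correspond to fitting an MLlib model on historical data, and the scoring loop to processing Spark Streaming micro-batches.

```python
from collections import Counter


def train_batch(labeled_posts):
    """Slow path: build per-label word counts from historical (text, label) pairs."""
    counts = {}
    for text, label in labeled_posts:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(model, text):
    """Score a post against each label by summing the counts of its words."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in model.items()}
    return max(scores, key=scores.get)


def apply_to_stream(model, micro_batches):
    """Fast path: apply the already-trained model to each incoming micro-batch."""
    for batch in micro_batches:
        yield [(text, predict(model, text)) for text in batch]


# Hypothetical historical data used for the batch (slow) step.
history = [
    ("spark streaming release", "tech"),
    ("kafka cluster upgrade", "tech"),
    ("local meetup this weekend", "events"),
    ("concert tickets this weekend", "events"),
]
model = train_batch(history)

# Hypothetical micro-batches standing in for a live stream.
stream = [["new spark release"], ["meetup this friday", "kafka upgrade notes"]]
results = list(apply_to_stream(model, stream))
```

The model is trained once and reused for every micro-batch, which is the point of the split: the expensive step runs offline, and the streaming step only does cheap lookups.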
### Data MashUp
We combine historical and streaming data from several social media networks, retrieved through the APIs each network provides.
* Twitter - https://apps.twitter.com/
* MeetUp - https://secure.meetup.com/meetup_api
* GitHub - Guides: https://developer.github.com/v3/, API Calls: https://api.github.com/, API Keys: https://github.com/settings/developers, Tokens: https://github.com/settings/tokens
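
As an illustration of calling one of these APIs, here is a hedged sketch that builds an authenticated request against the GitHub REST API root listed above, using only the Python standard library rather than PyGithub. The helper names (`build_search_request`, `fetch_json`), the search query, the `User-Agent` value, and the `YOUR_TOKEN` placeholder are assumptions for illustration. The token would come from the Tokens page linked above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def build_search_request(query, token=None, per_page=10):
    """Build (but do not send) a repository search request."""
    params = urlencode({"q": query, "per_page": per_page})
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "realtime-analytics-sketch",  # hypothetical client name
    }
    if token:
        # Personal access token; unauthenticated calls have lower rate limits.
        headers["Authorization"] = "token " + token
    return Request(API_ROOT + "/search/repositories?" + params, headers=headers)


def fetch_json(request):
    """Send the request and decode the JSON response (requires network access)."""
    with urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder token; replace with a real one before calling fetch_json(request).
request = build_search_request("apache spark", token="YOUR_TOKEN")
```

Building the request separately from sending it makes it possible to inspect the URL and headers before any network call is made.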
### Tools
* DataBricks Community Edition
* Anaconda Python 2.7 Distro (Pandas, etc)
* Apache Spark (Spark SQL, Spark Streaming, Spark MLlib, GraphX)
* Apache Kafka (Realtime distributed message passing tool)
* Persistent Data Store (RDBMS: MySQL, Columnar: CSV, Cassandra, Document: MongoDB)
### Required Libraries
> pip install Twitter
> pip install PyGithub
> pip install
### Associated Project - R3levancy!
> Discovering what everyone is whispering about on social media. A tool for finding what is really trending across social media and for discovering hot topics.
- [x] Delivering realtime news, events, and alerts tailored to users' needs and interests.
- [x] Search Twitter, Facebook, Google+ for keywords.
- [x] Batch process with Spark
- [x] Present on web pages, send alerts and push to users.