# Realtime-Data-Analytics-Using-Spark

**Repository Path**: email4reg/Realtime-Data-Analytics-Using-Spark

## Basic Information

- **Project Name**: Realtime-Data-Analytics-Using-Spark
- **Description**: Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2020-02-16
- **Last Updated**: 2020-12-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Realtime Data Analytics Using Apache Spark

Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc.

### Description

The project uses Apache Spark components (Spark SQL, Spark Streaming, MLlib) to build machine learning models in a slow batch-processing path, then applies the trained model in a fast Spark Streaming path to predict outputs for new data.

### Data MashUp

We use historical and streaming data from several social media networks, retrieved through the APIs each network provides.

* Twitter - https://apps.twitter.com/
* MeetUp - https://secure.meetup.com/meetup_api
* GitHub - Guides: https://developer.github.com/v3/, API Calls: https://api.github.com/, API Keys: https://github.com/settings/developers, Tokens: https://github.com/settings/tokens

### Tools

* DataBricks Community Edition
* Anaconda Python 2.7 Distro (Pandas, etc.)
* Apache Spark (Spark SQL, Spark Streaming, Spark MLlib, GraphX)
* Apache Kafka (realtime distributed message passing tool)
* Persistent Data Store (RDBMS: MySQL, Columnar: CSV, Cassandra, Document: MongoDB)

### Required Libraries

> pip install Twitter
> pip install PyGithub
> pip install

### Associated Project - R3levancy!

> Discovering what everyone is whispering about on social media. A fantastic tool for discovering what's really trending across social media and for surfacing hot topics.

- [x] Delivering REALTIME news, events, and alerts tailored to users' needs and interests.
- [x] Search Twitter, Facebook, Google+ for keywords.
- [x] Batch process with Spark.
- [x] Present on web pages, send alerts, and push to users.
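
### Sketch: Batch Training, Streaming Prediction

The Description section outlines a two-speed flow: train a model on historical data in a slow batch path, then apply it to new data in a fast streaming path. The following is a minimal sketch of that pattern in plain Python so it runs without a Spark cluster. It is not the project's code: in the project these roles would fall to Spark MLlib and Spark Streaming, the naive Bayes classifier is an assumed stand-in for whatever model the project trains, and every function name, label, and sample post is hypothetical.

```python
from collections import Counter, defaultdict
import math
import re


def tokenize(text):
    """Lowercase a post and split it into word-like tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def train_batch(labeled_posts):
    """Slow path: fit a multinomial naive Bayes model on historical posts."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    doc_counts = Counter()              # label -> number of posts
    vocab = set()
    for text, label in labeled_posts:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return {"word_counts": dict(word_counts),
            "doc_counts": doc_counts,
            "vocab": vocab}


def predict(model, text):
    """Return the most likely label for one post (add-one smoothing)."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_tokens = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokenize(text):
            score += math.log((counts[token] + 1) / (total_tokens + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def score_stream(model, micro_batches):
    """Fast path: apply the already-trained model to each micro-batch."""
    for batch in micro_batches:
        yield [(post, predict(model, post)) for post in batch]


# Hypothetical toy data, for illustration only.
historical = [
    ("spark streaming release announced", "tech"),
    ("new kafka connector for spark", "tech"),
    ("python pandas tutorial published", "tech"),
    ("local meetup for hiking this weekend", "events"),
    ("music festival tickets on sale", "events"),
    ("weekend food market meetup", "events"),
]
incoming = [
    ["kafka and spark tutorial"],
    ["hiking meetup this weekend", "pandas release"],
]

model = train_batch(historical)          # run once, offline
for scored in score_stream(model, incoming):
    for post, label in scored:
        print(label, "<-", post)
```

The point of the split is that `train_batch` can be expensive and run rarely, while `score_stream` only does cheap lookups per post, which is what lets the fast path keep up with incoming data.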