# py-facebook-scraper

**Repository Path**: evoup/py-facebook-scraper

## Basic Information

- **Project Name**: py-facebook-scraper
- **Description**: Private fork that may pick up fixes before the upstream open-source code does. It will not usually be faster, but when the author has proposed a solution and no release containing it exists yet, this fork makes it possible to fix production issues quickly.
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2020-11-05
- **Last Updated**: 2020-12-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Facebook Scraper

Scrape Facebook public pages without an API key. Inspired by [twitter-scraper](https://github.com/kennethreitz/twitter-scraper).

## Install

```sh
pip install facebook-scraper
```

## Usage

Pass the unique **page name** as the first parameter and you're good to go:

```python
>>> from facebook_scraper import get_posts

>>> for post in get_posts('nintendo', pages=1):
...     print(post['text'][:50])
...
The final step on the road to the Super Smash Bros
We’re headed to PAX East 3/28-3/31 with new games
```

### CLI usage

```sh
$ facebook-scraper --filename nintendo_page_posts.csv --pages 1 nintendo
```

Run

```sh
$ facebook-scraper --help
```

for more details on CLI usage.

### Optional parameters

- **group**: group id, to scrape groups instead of pages. Default is `None`.
- **pages**: how many pages of posts to request. The first page usually has 2 posts and the rest have 4. Default is 10.
- **timeout**: how many seconds to wait before timing out. Default is 5.
- **credentials**: tuple of user and password used to log in before requesting the posts. Default is `None`.
- **extra_info**: bool; if true, the function makes an extra request to try to get the post reactions. Default is `False`.
- **youtube_dl**: bool; use Youtube-DL for (high-quality) video extraction. You need to have youtube-dl installed in your environment. Default is `False`.
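The optional parameters above are passed to `get_posts` as keyword arguments. The following is a minimal sketch of that pattern, not part of the library: `first_posts` and `fake_get_posts` are hypothetical helpers introduced here, and the stub only imitates the "a few posts per page" behaviour so the example runs offline. Dropping the `fetch=` argument makes the helper call the real `get_posts`.

```python
from itertools import islice


def first_posts(account, limit=5, fetch=None, **options):
    """Return at most `limit` posts for `account`, passing `options` through."""
    if fetch is None:
        # Imported lazily so the helper can also be driven by a stub.
        from facebook_scraper import get_posts as fetch
    return list(islice(fetch(account, **options), limit))


def fake_get_posts(account, **options):
    """Hypothetical offline stand-in for get_posts; yields made-up posts."""
    reactions = {"like": 1} if options.get("extra_info") else None
    for i in range(options.get("pages", 10) * 2):
        yield {"post_id": str(i), "text": f"{account} post {i}", "reactions": reactions}


# Offline demo with the stub; omit `fetch=` to request real pages instead.
posts = first_posts("nintendo", limit=3, fetch=fake_get_posts,
                    pages=2, timeout=10, extra_info=True)
for post in posts:
    print(post["post_id"], post["text"], post["reactions"])
```

Because `get_posts` is consumed as an iterator, `islice` stops after `limit` posts instead of collecting everything first.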
## Post example

```python
{'post_id': '2257188721032235',
 'text': 'Don’t let this diminutive version of the Hero of Time fool you, '
         'Young Link is just as heroic as his fully grown version! Young Link '
         'joins the Super Smash Bros. series of amiibo figures!',
 'time': datetime.datetime(2019, 4, 29, 12, 0, 1),
 'image': 'https://scontent.flim16-1.fna.fbcdn.net'
          '/v/t1.0-0/cp0/e15/q65/p320x320'
          '/58680860_2257182054366235_1985558733786185728_n.jpg'
          '?_nc_cat=1&_nc_ht=scontent.flim16-1.fna'
          '&oh=31b0ba32ec7886e95a5478c479ba1d38&oe=5D6CDEE4',
 'images': ['https://scontent.flim16-1.fna.fbcdn.net'
            '/v/t1.0-0/cp0/e15/q65/p320x320'
            '/58680860_2257182054366235_1985558733786185728_n.jpg'
            '?_nc_cat=1&_nc_ht=scontent.flim16-1.fna'
            '&oh=31b0ba32ec7886e95a5478c479ba1d38&oe=5D6CDEE4'],
 'likes': 2036,
 'comments': 214,
 'shares': 0,
 'reactions': {'like': 135, 'love': 64, 'haha': 10, 'wow': 4, 'anger': 1},  # if `extra_info` was set
 'post_url': 'https://m.facebook.com/story.php'
             '?story_fbid=2257188721032235&id=119240841493711',
 'link': 'https://bit.ly/something'}
```

### Notes

- There is no guarantee that every field will be extracted (they might be `None`).
- The `shares` count doesn't seem to work at the moment.
- Group posts may be missing some fields like `time` and `post_url`.
- Group scraping may return only one page and may not work on private groups.

## To-Do

- Async support
- Image galleries
- Profiles or post authors
- Comments

## Alternatives and related projects

- [facebook-post-scraper](https://github.com/brutalsavage/facebook-post-scraper). Has comments. Uses Selenium.
- [facebook-scraper-selenium](https://github.com/apurvmishra99/facebook-scraper-selenium). "Scrape posts from any group or user into a .csv file without needing to register for any API access".
- [Ultimate Facebook Scraper](https://github.com/harismuneer/Ultimate-Facebook-Scraper). "Scrapes almost everything about a Facebook user's profile". Uses Selenium.
- [Unofficial APIs](https://github.com/Rolstenhouse/unofficial-apis). List of unofficial APIs for various services. None for Facebook for now, but it might be worth checking in the future.
- [major-scrapy-spiders](https://github.com/talhashraf/major-scrapy-spiders). Has a profile spider for Scrapy.
- [facebook-page-post-scraper](https://github.com/minimaxir/facebook-page-post-scraper). Seems abandoned.
- [FBLYZE](https://github.com/isaacmg/fb_scraper). Fork (?).
- [RSSHub](https://github.com/DIYgod/RSSHub/blob/master/lib/routes/facebook/page.js). Generates an RSS feed from Facebook pages.
- [RSS-Bridge](https://github.com/RSS-Bridge/rss-bridge/blob/master/bridges/FacebookBridge.php). Also generates RSS feeds from Facebook pages.
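
## Example: handling missing fields

As the Notes section says, any field of a post may be `None`, and group posts may lack `time` and `post_url` entirely. The sketch below shows one way to flatten post dicts into CSV rows without failing on missing values. It is an illustration only and not how the bundled CLI writes its CSV: `FIELDS`, `post_to_row`, `posts_to_csv`, and the sample posts are all assumptions made for this example.

```python
import csv
import io
from datetime import datetime

# Hypothetical column selection; real posts carry more keys.
FIELDS = ["post_id", "time", "likes", "comments", "text"]


def post_to_row(post):
    """Flatten one post dict into CSV-safe values, tolerating None fields."""
    time = post.get("time")
    return {
        "post_id": post.get("post_id") or "",
        "time": time.isoformat() if time else "",
        "likes": post.get("likes") or 0,
        "comments": post.get("comments") or 0,
        "text": (post.get("text") or "")[:50],
    }


def posts_to_csv(posts):
    """Serialize an iterable of post dicts to a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for post in posts:
        writer.writerow(post_to_row(post))
    return buffer.getvalue()


# Made-up sample data: a complete page post and a sparse group post.
page_post = {"post_id": "2257188721032235", "text": "Young Link joins the series",
             "time": datetime(2019, 4, 29, 12, 0, 1), "likes": 2036, "comments": 214}
group_post = {"post_id": "1", "text": None, "time": None, "likes": None}

csv_text = posts_to_csv([page_post, group_post])
print(csv_text)
```

Using `post.get(...)` with a fallback covers both a key that is absent and a key whose value is `None`, which are the two cases the Notes describe.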