Brendonk139/bluesky-posts-scraper

Bluesky Posts Scraper

This project collects public post data from the Bluesky platform, giving you a straightforward way to analyze conversations, trends, and user activity. It focuses on clean, structured extraction so you can plug the results into any workflow with minimal effort. If you need reliable Bluesky post scraping for research, monitoring, or analytics, this tool has you covered.

Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for Bluesky Posts Scraper you've just found your team — Let's Chat. 👆👆

Introduction

The scraper collects detailed information from public posts and returns it in an organized dataset. It solves the challenge of manually gathering Bluesky content at scale and is ideal for analysts, developers, and teams tracking platform activity.

Why This Matters

  • Helps you analyze user interactions and engagement patterns.
  • Supports monitoring specific topics, hashtags, or users.
  • Offers structured, machine-friendly results for automation.
  • Reduces time spent manually searching and collecting data.
  • Works well for both one-off pulls and recurring research tasks.

Features

| Feature | Description |
| --- | --- |
| Query-based scraping | Pull posts using specific search terms, hashtags, or user queries. |
| Time-range filtering | Limit results using since and until parameters for historical research. |
| Language targeting | Extract only posts written in your preferred languages. |
| Engagement metrics | Fetch likes, replies, and repost counts for analysis. |
| Media extraction | Capture thumbnails, full images, and attachments. |
| Flexible sorting | Choose between latest or top-ranked posts. |
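
The options in the table above could be validated before a run along the following lines. This is only a sketch: the option names (`query`, `since`, `until`, `langs`, `sort`, `limit`), the default values, and the `normalizeOptions` helper are assumptions made for illustration, and the scraper's actual input schema may differ.

```javascript
// Hypothetical option names and defaults; the scraper's real input schema may differ.
const ALLOWED_SORTS = ["latest", "top"];

function normalizeOptions(input) {
  const { query, since, until, langs = [], sort = "latest", limit = 100 } = input;

  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }
  if (!ALLOWED_SORTS.includes(sort)) {
    throw new Error(`sort must be one of: ${ALLOWED_SORTS.join(", ")}`);
  }

  // Convert optional date bounds to ISO 8601 and reject unparseable values.
  const toIso = (value, name) => {
    if (value === undefined) return undefined;
    const date = new Date(value);
    if (Number.isNaN(date.getTime())) {
      throw new Error(`${name} is not a valid date: ${value}`);
    }
    return date.toISOString();
  };

  const sinceIso = toIso(since, "since");
  const untilIso = toIso(until, "until");
  // ISO 8601 strings in UTC sort chronologically, so string comparison is enough.
  if (sinceIso && untilIso && sinceIso > untilIso) {
    throw new Error("since must not be later than until");
  }

  return { query: query.trim(), since: sinceIso, until: untilIso, langs, sort, limit };
}

const options = normalizeOptions({
  query: "#eurovision",
  since: "2024-05-11T00:00:00Z",
  until: "2024-05-13T00:00:00Z",
  langs: ["en"],
  sort: "top",
  limit: 50,
});
console.log(options);
```

Checking the inputs up front means that a mistyped sort mode or an inverted date range fails immediately rather than producing an empty or misleading result set.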

What Data This Scraper Extracts

| Field Name | Field Description |
| --- | --- |
| id | Unique identifier for each post. |
| authorId | User ID of the post creator. |
| authorName | Display name of the author. |
| authorUsername | Username/handle associated with the user. |
| authorAvatar | URL to the author’s avatar image. |
| text | The full text of the post. |
| images | Array of image objects with URLs and metadata. |
| primaryImage | Full-size primary image when available. |
| link | Any attached external link. |
| createdAt | Timestamp of when the post was published. |
| langs | Languages detected in the post. |
| replyCount | Number of replies. |
| repostCount | Number of reposts. |
| likeCount | Number of likes. |
| url | Direct link to the post on Bluesky. |
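
When consuming the dataset, it can help to confirm that each record carries the fields listed above. The sketch below assumes a hypothetical `missingFields` helper that is not part of the project. It treats a field whose value is `null` (such as `link` on a post with no attachment) as present.

```javascript
// Field names taken from the table above.
const EXPECTED_FIELDS = [
  "id", "authorId", "authorName", "authorUsername", "authorAvatar",
  "text", "images", "primaryImage", "link", "createdAt", "langs",
  "replyCount", "repostCount", "likeCount", "url",
];

// Hypothetical helper: returns the expected field names absent from a record.
// A key whose value is null still counts as present.
function missingFields(post) {
  return EXPECTED_FIELDS.filter((field) => !(field in post));
}

// Illustrative partial record, not real scraper output.
const partialPost = { id: "example-id", text: "hello", link: null, likeCount: 3 };
console.log(missingFields(partialPost));
```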

Example Output

[
  {
    "id": "bafyreibqjfx2ejvxkd3okjtodoyqvoyk7wuberwonsakdjv4yahp2lrn4a",
    "authorId": "did:plc:pc2aiklrpzwgsiq3fuohbui4",
    "authorName": "Keri Warbis",
    "authorUsername": "keriwarbis.bsky.social",
    "authorAvatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:pc2aiklrpzwgsiq3fuohbui4/bafkreihgejbtckxrsgrba7ckx6mlsofe6nzvs4t2m54y2in6edcp65tlne@jpeg",
    "text": "Bit sunburnt from yesterday’s stint in the garden.\n\nBit hungover from Eurovision.\n\nAnother day of sun and entertaining ahead.\n\nSunday roast at The Grand will be happening to round of the day.",
    "images": [
      {
        "thumb": "https://cdn.bsky.app/img/feed_thumbnail/plain/did:plc:pc2aiklrpzwgsiq3fuohbui4/bafkreiejc6jc4z47urksbwn7owoyyi4o46ufi362e52bscl4bsjrj4izyq@jpeg",
        "fullsize": "https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:pc2aiklrpzwgsiq3fuohbui4/bafkreiejc6jc4z47urksbwn7owoyyi4o46ufi362e52bscl4bsjrj4izyq@jpeg",
        "alt": "",
        "aspectRatio": {
          "height": 820,
          "width": 828
        }
      }
    ],
    "link": null,
    "primaryImage": "https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:pc2aiklrpzwgsiq3fuohbui4/bafkreiejc6jc4z47urksbwn7owoyyi4o46ufi362e52bscl4bsjrj4izyq@jpeg",
    "createdAt": "2024-05-12T08:36:29.345Z",
    "langs": ["en"],
    "replyCount": 0,
    "repostCount": 0,
    "likeCount": 0,
    "url": "https://bsky.app/profile/keriwarbis.bsky.social/post/3kxkwdhu77o23"
  }
]
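
Records shaped like the example above can be ranked by engagement with a few lines of JavaScript. This is a sketch rather than part of the project: `engagementScore` and `topPosts` are hypothetical helpers, the unweighted sum of likes, reposts, and replies is just one possible scoring choice, and the sample records are illustrative.

```javascript
// Hypothetical scoring: unweighted sum of the three engagement counters.
function engagementScore(post) {
  return (post.likeCount ?? 0) + (post.repostCount ?? 0) + (post.replyCount ?? 0);
}

// Optionally keep only posts tagged with `lang`, then return the `count`
// highest-scoring posts as compact summaries.
function topPosts(posts, { lang, count = 10 } = {}) {
  return posts
    .filter((post) => !lang || (post.langs ?? []).includes(lang))
    .map((post) => ({
      url: post.url,
      author: post.authorUsername,
      score: engagementScore(post),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, count);
}

// Illustrative records, not real scraper output.
const samplePosts = [
  { url: "https://bsky.app/profile/a.example/post/1", authorUsername: "a.example", langs: ["en"], likeCount: 4, repostCount: 1, replyCount: 0 },
  { url: "https://bsky.app/profile/b.example/post/2", authorUsername: "b.example", langs: ["de"], likeCount: 9, repostCount: 2, replyCount: 3 },
  { url: "https://bsky.app/profile/c.example/post/3", authorUsername: "c.example", langs: ["en"], likeCount: 7, repostCount: 0, replyCount: 2 },
];

console.log(topPosts(samplePosts, { lang: "en", count: 2 }));
```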

Directory Structure Tree

Bluesky Posts Scraper/
├── src/
│   ├── runner.js
│   ├── extractors/
│   │   ├── bluesky_parser.js
│   │   └── utils_time.js
│   ├── outputs/
│   │   └── exporters.js
│   └── config/
│       └── settings.example.json
├── data/
│   ├── queries.sample.txt
│   └── sample-output.json
├── package.json
└── README.md

Use Cases

  • Researchers track topic trends to understand how discussions evolve over time, helping them identify emerging patterns.
  • Marketing teams follow influencers or brand mentions to refine campaigns and messaging.
  • Analysts monitor user engagement on specific themes so they can assess audience reactions.
  • Developers integrate Bluesky post data into dashboards to power real-time insights.
  • Journalists keep an eye on public conversations to support data-driven reporting.

FAQs

How do I limit the number of posts returned? Use the limit parameter to cap the number of posts per query.

Can I filter posts by language? Yes, setting the language option restricts results to posts in specific languages.

Does it support date-range filtering? You can use since and until fields to control the time window for extraction.

What formats can I export the data to? Results can be converted into JSON, CSV, or any format supported by your processing pipeline.
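
As one way to go from JSON to CSV, the sketch below flattens a chosen set of columns into comma-separated rows. The `toCsv` and `csvCell` helpers and the column selection are assumptions for illustration and do not describe the project's `exporters.js`. Quoting follows the common RFC 4180 convention.

```javascript
// Hypothetical column selection; nested fields such as images are left out.
const COLUMNS = [
  "id", "authorUsername", "createdAt", "text",
  "likeCount", "repostCount", "replyCount", "url",
];

// Quote a cell if it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes.
function csvCell(value) {
  if (value === null || value === undefined) return "";
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build a CSV string with a header row followed by one row per post.
function toCsv(posts, columns = COLUMNS) {
  const header = columns.join(",");
  const rows = posts.map((post) =>
    columns.map((column) => csvCell(post[column])).join(",")
  );
  return [header, ...rows].join("\r\n");
}

// Illustrative record, not real scraper output.
const csv = toCsv(
  [{ id: "a", text: 'He said "hi", ok', likeCount: 2 }],
  ["id", "text", "likeCount"]
);
console.log(csv);
```

Post text often contains commas and line breaks, so quoting each cell correctly matters more here than in a typical numeric export.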


Performance Benchmarks and Results

Primary Metric: Processes several hundred posts per minute, depending on query complexity and network conditions.

Reliability Metric: Maintains a high completion rate with stable extraction across varying content types.

Efficiency Metric: Handles large batches with minimal overhead, allowing scalable automation.

Quality Metric: Extracted data reaches a strong completeness level with detailed metadata, accurate timestamps, and well-structured media fields.
