Raw CAR/CBOR format parsing
Parses the raw binary content of the realtime firehose from a WebSocket, and CAR account repository snapshot data.
Self-contained, zero dependencies.
npm install bski
import { firehose } from 'bski'; // import from npm
const chunk = [];
for await (const msg of firehose.each()) {
  chunk.push(msg);
  if (chunk.length === 1000) break;
}
import { readCAR } from 'bski'; // import from npm
const did = 'did:plc:z72i7hdynmk6r22z27h6tvur';
const car = await fetch('https://puffball.us-east.host.bsky.network/xrpc/com.atproto.sync.getRepo?did=' + did)
  .then(x => x.arrayBuffer());
const records = readCAR(did, car);
firehose(address = 'wss://bsky.network/xrpc/com.atproto.sync.subscribeRepos'):
AsyncIterable<FirehoseRecord[]>
Connects to a firehose via WebSocket. By default it uses the central server https://bsky.network/; pass the address parameter to connect elsewhere, e.g. to a local server such as a PDS.
Yields records in batches: FirehoseRecord[].
Batching lets you consume records at your own speed. If you iterate and process each batch immediately, the batches will contain a single record. If your code stalls while processing, records queue up and the next iteration delivers the whole pile at once.
Exiting the iterator loop disconnects from the WebSocket and discards any unprocessed records.
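The batching behaviour described above can be pictured with a small queue. This is only a sketch of the general idea, not bski's actual implementation: createBatchQueue, push, close and batches are hypothetical names invented for illustration.

```javascript
// Illustrative only: a hypothetical batching queue, not bski's code.
function createBatchQueue() {
  let buffer = [];     // records that arrived since the last batch was taken
  let wake = null;     // resolver for a consumer waiting on an empty buffer
  let closed = false;

  return {
    // Producer side: called for every incoming record
    // (for example from a WebSocket message handler).
    push(record) {
      buffer.push(record);
      if (wake) { wake(); wake = null; }
    },
    // Producer side: signals that no more records will arrive.
    close() {
      closed = true;
      if (wake) { wake(); wake = null; }
    },
    // Consumer side: every iteration yields everything queued so far.
    async *batches() {
      while (true) {
        if (buffer.length === 0) {
          if (closed) return;
          // Nothing queued: sleep until the producer pushes or closes.
          await new Promise(resolve => { wake = resolve; });
          continue;
        }
        const batch = buffer;
        buffer = [];
        yield batch;
      }
    }
  };
}
```

A fast consumer drains the buffer after every push and sees batches of one. A slow consumer finds several records waiting and receives them together.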
firehose.each(address?): AsyncIterable<FirehoseRecord>
Same as firehose() above, but always reports records one by one.
The queueing still happens behind the scenes, but if your code stalls it will still receive each record separately. That comes with a small performance penalty.
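Conceptually, record-by-record delivery is a flattening of the batched stream. A minimal sketch, assuming any async iterable of arrays as input; eachRecord is a hypothetical helper and not how firehose.each is necessarily implemented:

```javascript
// Illustrative only: flattens an async stream of batches into single records.
async function* eachRecord(batches) {
  for await (const batch of batches) {
    // Hand out the records of this batch one at a time.
    yield* batch;
  }
}
```

Breaking out of a for await loop over eachRecord(...) also ends the underlying batch iterator, which matches the disconnect-on-exit behaviour described for firehose().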
readCAR(messageBuf: ArrayBuffer | Uint8Array, did: string): FirehoseRepositoryRecord[]
Parses the binary CAR/DAG-CBOR format, which is the archive/database format for Bluesky account history.
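For orientation, a CAR v1 file is a sequence of length-prefixed sections: each section starts with an unsigned varint giving its byte length, the first section holds a DAG-CBOR header, and the following sections each hold a CID followed by block data. The sketch below only splits a buffer into those sections; it does not decode CIDs or CBOR, and readVarint and carSections are hypothetical names, not part of bski.

```javascript
// Illustrative only: splits CAR v1 bytes into length-prefixed sections.

// Reads an unsigned LEB128 varint starting at `offset`.
function readVarint(bytes, offset) {
  let value = 0;
  let shift = 0;
  let pos = offset;
  while (true) {
    if (pos >= bytes.length) throw new Error('Truncated varint');
    const byte = bytes[pos++];
    // Multiplication instead of << avoids 32-bit overflow on large values.
    value += (byte & 0x7f) * 2 ** shift;
    if ((byte & 0x80) === 0) break;
    shift += 7;
  }
  return { value, next: pos };
}

// Yields each section as a view over the original bytes (no copying).
function* carSections(input) {
  const bytes = input instanceof Uint8Array ? input : new Uint8Array(input);
  let offset = 0;
  while (offset < bytes.length) {
    const { value: length, next } = readVarint(bytes, offset);
    if (next + length > bytes.length) throw new Error('Truncated section');
    yield bytes.subarray(next, next + length);
    offset = next + length;
  }
}
```

Both ArrayBuffer and Uint8Array inputs are accepted here, mirroring the messageBuf parameter type in the signature above.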
Parsing is pretty fast: a 50 MB repository takes 1-2 seconds. However, for a web app that delay could be jarring. Enter sequenceReadCAR:
sequenceReadCAR(messageBuf: ArrayBuffer | Uint8Array, did: string):
Iterable<FirehoseRepositoryRecord | undefined>
Parses the same binary, yielding the parsed records in implementation-defined batches.
This lets your code parse CAR even on the main thread incrementally, without freezing the app.
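One way to use such a synchronous iterable without blocking is to drain it in short time slices and hand control back to the event loop between slices. A minimal sketch, assuming any iterable as input; consumeInSlices, onRecord and sliceMs are hypothetical names, and the 8 ms default is an arbitrary choice:

```javascript
// Illustrative only: drains a synchronous iterable in short time slices.
async function consumeInSlices(iterable, onRecord, sliceMs = 8) {
  let sliceStart = Date.now();
  let count = 0;
  for (const record of iterable) {
    // The signature above allows undefined entries; this sketch skips them.
    if (record !== undefined) {
      onRecord(record);
      count++;
    }
    if (Date.now() - sliceStart >= sliceMs) {
      // Give the event loop a turn (rendering, input) before continuing.
      await new Promise(resolve => setTimeout(resolve, 0));
      sliceStart = Date.now();
    }
  }
  return count;
}
```

With bski this would presumably be called as consumeInSlices(sequenceReadCAR(...), record => ...), with the arguments to sequenceReadCAR as documented above.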
Apart from capturing the built-in Bluesky fields, both firehose and readCAR collect a couple of extras:
at://<did>/<type>/<hash>: way of referring to events in ATProto
create, but can also be delete or update (think updating user profile)
The firehose functionality existed in the colds.ky codebase for a while, using some of the packages referenced by the official @atproto/api:
But those are complex, broader-purpose libraries. Later @mary.my.id created a leaner, more focused set of libraries to transcode some of the same formats: @atcute/* (MIT license).
This library takes in only the few necessary bits, focusing on a single use case: parsing the realtime firehose and account repository CAR.
MIT Oleg Mihailik