<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Building in Public</title>
        <link>https://paragraph.com/@0x330a</link>
        <description>Building Farcaster and Web 3 applications and tools</description>
        <lastBuildDate>Wed, 08 Apr 2026 19:50:38 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <copyright>All rights reserved</copyright>
        <item>
            <title><![CDATA[Mythical Data Pulls Pt. 2]]></title>
            <link>https://paragraph.com/@0x330a/mythical-data-pulls-2</link>
            <guid>Xc2ry4mnZVQv8xJiHbRO</guid>
            <pubDate>Tue, 25 Jun 2024 09:28:33 GMT</pubDate>
            <description><![CDATA[We explore links and how to track them, starting to build out our social graph of user interactions]]></description>
            <content:encoded><![CDATA[<div class="relative header-and-anchor"><h3 id="h-previously-on-mythical-data-pulls">Previously, on Mythical Data Pulls</h3></div><p>This is a continuation of a previous post, <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://paragraph.xyz/@0x330a/mythical-data-pulls-pt-1">which you can find here</a>. Some of the content, or where to find certain information about building Farcaster apps using your own Hub data, may be skipped over or implied. I'll try to re-introduce topics where appropriate.</p><div class="relative header-and-anchor"><h3 id="h-if-you-wish-to-make-a-follow-list-from-scratch-you-must-first-iterate-through-the-universe-carl-sagan-probably">"If you wish to make a follow list from scratch, you must first iterate through the universe" - Carl Sagan, probably</h3></div><p>Previously we fetched a user's profile information: username, display name, bio, profile pic, and URL, all things that could be considered essential displayable information for a user in a Farcaster application. A slightly more difficult metric to track, but one that is just as expected on a user's profile, is the people they follow and, more importantly, the people who follow them. 
If you recall the Hub's perspective on data and the available RPC calls we have you might remember we have the ability to query against every link message created by a fid:</p><figure float="none" data-type="figure" class="img-center" style="max-width: null;"><img src="https://storage.googleapis.com/papyrus_images/46afefa641c131aaa0f2194acb66c26b.png" blurdataurl="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAJCAIAAADcu7ldAAAACXBIWXMAAAsTAAALEwEAmpwYAAACXklEQVR4nHXSMUgcQRQA0CkDEbsF5aoVg03EGI9YXHGuhazHud00ekTIQcADIaxw8cBLhhDIptp0S2w2RWQbb7EaOE4GhDArF1xQ4gQJTKWLabZdE4QfbgcOE8grhl/MzP/z5yMppaZpo6OjCCHHcTjn9P9YjlLKOY/jWEqZZRkAxDkhhJRyGKg9yHXdIAgIIbZtO47juq6XGwZDvu+7OcdxPM+jlPq+H8cxAGCMH8/N6bo+NTWl50qlUpIkjuMg3/cBQBUCAGmaqsNhGKqVMTaM0zSFOxhjx1EEAPW11XfN5u/b2yRJsixLcwAQhuGgLepeSqkqvNVqNRoN27Yty6rX67VaDeeKxaJt257nYYyDIAAA1QQAWH/2/P2Hj3dzq4oHCdQVpmkahpGmqRCiUq1WqtXqygoASCkty8IYm6YphEiSpLK8vGgsFItzWZbxXBzHL7e3NzYaGGMrpwqybZsQgkzT1HV9YnJybHz868nJ5dXVmKYtlctjmvbt/FxKiRC6PzKC0KCZUsqJyQcPHxURQsdRxDlnjAFArVYzDKNQKExPT+u6XigUSqWSejHCGBNC7iG0VC5/2t39cXFB2u12s0na7eMoOomi+dnZxXJ5dW2Nc/7z+rr9mmy1Xm2+2FJzohLUc0IIxpgQQv1cHMee5w2mCACOKD2i9Mvh4eHBAdnctJ7Mv7Xts37/++npWb9/fXkJAL9ubobNVStjjHMOAIQQFfyDUorCXLfXo91ut9fb73Te7OwszMxsPF0PgmC/09nvdD7v7fm+H9yhTg3nynXdJEnE36SUYRj+AfxrIHcpZJDXAAAAAElFTkSuQmCC" nextheight="449" nextwidth="1557" class="image-node embed"><figcaption htmlattributes="[object Object]" class="hide-figcaption"></figcaption></figure><p>This works fine for getting the list of people that user <strong>follows</strong> but now we are faced with a new problem - we don't know who <strong>follows them</strong>. Instead, we <em>could</em> use the <code>GetLinksByTarget</code> RPC, although I am unsure on whether Hubs will store this data properly in the future as they are keyed based on inserting LinkAdd/LinkRemoves, which may not be used via the LinkCompactStateBody messages and lead to new hubs having incomplete data in the future(?). 
Either way, it probably pays to just iterate over every fid to get all their data anyway, arriving at a reduced state of the parts of the network we care about, such as links, reactions, casts, etc.</p><div class="relative header-and-anchor"><h3 id="h-implementing-a-crude-follow-index">Implementing a crude follow index</h3></div><p>Since we are already iterating through every known fid, we could just as easily add some other query information in there as we go. Picture the following flow: get the total fid count <strong>N</strong> from our hub, then for each fid <strong>i</strong> from <strong>1</strong> to <strong>N</strong> we would simply query that fid's follow list, store it in a database (or however we want to keep it), and while we're at it fetch the current user profile information (from before) alongside their follow list. After iterating through every fid we should have the current state of the network mapped out, showing everyone's follows and their profile information. That gives us insights like who the most popular user is, and lets us build up social score metrics by weighting followers based on some other value. 
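</p><p>As a rough illustration of the kind of insight such an index gives, here is a minimal sketch that counts followers from a list of (fid, target) pairs and derives a naive weighted score. The <code>follower_counts</code> and <code>weighted_scores</code> functions, and the weighting rule itself (each follow is worth one plus the follower count of the account doing the following), are assumptions made up for this example, not something defined by Farcaster or fatline-rs.</p><pre data-type="codeBlock" language="rust"><code>use std::collections::HashMap;

/// Count how many followers each target fid has, given (follower, target) pairs.
fn follower_counts(links: &amp;[(u64, u64)]) -&gt; HashMap&lt;u64, u64&gt; {
    let mut counts = HashMap::new();
    for &amp;(_follower, target) in links {
        *counts.entry(target).or_insert(0) += 1;
    }
    counts
}

/// Naive weighted score (an assumption for this sketch):
/// each follow is worth 1 + the follower count of the account that follows.
fn weighted_scores(links: &amp;[(u64, u64)]) -&gt; HashMap&lt;u64, u64&gt; {
    let counts = follower_counts(links);
    let mut scores = HashMap::new();
    for &amp;(follower, target) in links {
        let weight = 1 + counts.get(&amp;follower).copied().unwrap_or(0);
        *scores.entry(target).or_insert(0) += weight;
    }
    scores
}

fn main() {
    // fid 1 and fid 2 both follow fid 3; fid 3 follows fid 4
    let links: Vec&lt;(u64, u64)&gt; = vec![(1, 3), (2, 3), (3, 4)];
    let counts = follower_counts(&amp;links);
    let scores = weighted_scores(&amp;links);
    println!("followers of fid 3: {}", counts[&amp;3]);
    println!("weighted score of fid 4: {}", scores[&amp;4]);
}</code></pre><p>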
We could imagine, for example, that a follow from a user who is themselves followed by a lot of other users is weighted higher than a follow from a fresh account.</p><div class="relative header-and-anchor"><h3 id="h-the-virgin-linkbody-vs-the-chad-linkcompactstatebody">The Virgin LinkBody vs the Chad LinkCompactStateBody</h3></div><p>It's probably worth exploring the actual messages, how Hubs communicate them between each other, and what it means for a consumer to receive certain types of follows. The link compact messages recently threw me for a loop, and they might not be as intuitive as the normal link add / remove message types:</p><p><code>MessageData</code> is a protobuf 'message' or type which contains a <code>MessageType</code>, the fid sending it, the timestamp it was sent at (in seconds since the Farcaster epoch, 1 January 2021 UTC, which is different from the Unix epoch), and a body which can be one of a few different types. Each body type has its own fields that make sense for it: for example, a <code>ReactionBody</code> might have the <code>target</code> cast or URL(?) the user is reacting to, while a <code>LinkBody</code> has the <code>target_fid</code> that the link add/remove applies to.</p><pre data-type="codeBlock"><code>  <span class="hljs-comment">/**
 * A MessageData object contains properties common to all messages and wraps a body object which
 * contains properties specific to the MessageType.
 */</span>
message MessageData {
  MessageType <span class="hljs-keyword">type</span> <span class="hljs-operator">=</span> <span class="hljs-number">1</span>; <span class="hljs-comment">// Type of message contained in the body</span>
  <span class="hljs-keyword">uint64</span> fid <span class="hljs-operator">=</span> <span class="hljs-number">2</span>; <span class="hljs-comment">// Farcaster ID of the user producing the message</span>
  <span class="hljs-keyword">uint32</span> timestamp <span class="hljs-operator">=</span> <span class="hljs-number">3</span>; <span class="hljs-comment">// Farcaster epoch timestamp in seconds</span>
  FarcasterNetwork network <span class="hljs-operator">=</span> <span class="hljs-number">4</span>; <span class="hljs-comment">// Farcaster network the message is intended for</span>
  oneof body {
    UserDataBody user_data_body <span class="hljs-operator">=</span> <span class="hljs-number">12</span>;
    LinkBody link_body <span class="hljs-operator">=</span> <span class="hljs-number">14</span>;
    <span class="hljs-comment">/* other message body types etc ... */</span>
    <span class="hljs-comment">// Compaction messages</span>
    LinkCompactStateBody link_compact_state_body <span class="hljs-operator">=</span> <span class="hljs-number">17</span>;
  } <span class="hljs-comment">// Properties specific to the MessageType</span>
}</code></pre><figure float="none" data-type="figure" class="img-center" style="max-width: null;"><img src="https://storage.googleapis.com/papyrus_images/f66d907420fb8da556398d7267958e06.png" blurdataurl="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAASCAIAAAC1qksFAAAACXBIWXMAAAsTAAALEwEAmpwYAAADpElEQVR4nJWU32vbVhSA7z+Qh8yDMGY20RH2B5iBoTEDQ+pl6MEPwdBme3ADGaPKVsOYyibB0ENMH9RSNBoLGmvE8gxX6+QHxRBMKViDJYUqbUnsYpv0dm6UFSGTopggh0bFusQzzY81H+dBoHvvd86VzgG2bcdiMZIkWZaVJCkSiSSTyVQq5bqudwKnvDoKQAiFQiEAAEEQAIBAIBAKhcLhMITQMIxKpbLiYxyysrLiOM4ZBCzLGoZh2zZCqD4AhFCSJNkHHlIqlURRlGX53esAkiRFo9FgMBj1iUQio6OjBEEghI7dYJqmIAhnEHAcFwqFhoaGRFFMJBIQQpqmk8lkvV7HpwzieR5C6GwClmWTPvV6fX19/X93IoTS6fSxy/a7XRxvV0DTdCAQAAAEg0EAAEmSp2gQQizLnp7HoAnM+AAAKIryPA9/bdd1O7u7ONwjV0TT9FsCZ2en1WyWFWUpn9c1ba/T+a8CiqJs2yZJMhwO53K5+Uzmt8VFHPOZzK/z87yPcIhhGKlUqi/AaW4jND0+/gkABACfAiAwzNPHj3VNa1sWwKt5nt969qxtWa1ms/rwYavZNHR9s1bbrNVwIvg4WZYlSRq8IixYhrB896718mXbslCj0XjyZLNW0zVNYJiewLZtuVBQFxaixLnp8fFvvpiYjcfHPvjw8+BHUeKcZZqvDw46u7tYIMsyvkzTNG3bNk3zxdZWWVFIkrw0NfUxQQwPv4c7cb/bVRcWAE3T7XZbLhTuq+rVycnPht+fjccFhsErdtrtwQoghBzHjYyMSJIkCALP89OXLwMApJs3Jy5cAADEYrHqxsa/29v73e5ep1NWlJ7AcRy5UCgrytTY2LcTX87G47PxePb69bkr1C8zM78Lwo8XL16dnLwzN1fI54vFIsdxfaXrujuvXt1XVVEUw+fPa0tLnue9PjjA+f2RyfT6wHGcXC6na9qtn34u3L4tMEz6u+8Fhrl17Rp++CGRoL/6eu4Klctm/1RVLHAcp/9rGbpeVpRWs/n38vI9VdU17Z6qLuXzizdu9PrAtu072WzbstqWddKv7foHlUolCOHRTt7vdlGjgRqNbYT60Wo29zqdnsDzPIqi/tL1p9VqdWPj0drao7W1B6ureI4+WF3F8c/z5yzLQghFUTwqOKnjAO77SqUiCAJOUJIkURR5nud80uk0z/OST6lUMk3zpFFxLEAURY7j+of2EX36ExsP7WKxmE6nIYTvLngDmgI2zZQPsS0AAAAASUVORK5CYII=" nextheight="485" nextwidth="854" class="image-node embed"><figcaption htmlattributes="[object Object]" class="hide-figcaption"></figcaption></figure><p>Keen-eyed readers will notice the <code>LinkCompactStateBody</code> type, which was recently added as part of the <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://github.com/farcasterxyz/protocol/discussions/169">FIP-15 Link 
Defragmentation</a> improvement discussion. Essentially, this new message should be handled differently by both Hubs and applications consuming hub messages. The improvement took a little bit of re-reading to understand, but the crux of the problem it addresses is that hubs only store a limited number of messages per user, and older messages are pruned once the user's storage limit for that message type is exceeded. As a refresher, you can check out the <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://docs.farcaster.xyz/learn/what-is-farcaster/messages#storage">Farcaster protocol architecture</a> for how storage works and the limits for each type of data.</p><p>Presently, at <strong>2500</strong> Link messages, including removes and adds, your historical messages will start getting pruned from Hubs until your Link message count is back inside that limit. The <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://github.com/farcasterxyz/protocol/blob/main/docs/SPECIFICATION.md#316-link-crdt">Link CRDT</a> resolves merging link messages in a specific way: for any given user, older messages with the same <strong>link type</strong> and <strong>target fid</strong> are removed and replaced by the message with the later timestamp. This means that fid 1 storing a LinkAdd with a <strong>type</strong> of <strong>"follow"</strong> and a <strong>target_fid</strong> of <strong>2</strong> will be overwritten by a LinkRemove with a <strong>type</strong> of <strong>"follow"</strong> and a <strong>target_fid</strong> of <strong>2</strong> if the timestamp of the Remove is later than that of the Add. 
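</p><p>To make that last-write-wins behaviour concrete, here is a minimal sketch of the merge rule using a <code>HashMap</code> keyed by link type and target fid. The <code>LinkState</code> and <code>LinkSet</code> names are hypothetical, and the sketch deliberately ignores the tie-breaking rules the real Link CRDT applies when two timestamps are equal.</p><pre data-type="codeBlock" language="rust"><code>use std::collections::HashMap;

/// Latest known state of one (link type, target fid) pair.
#[derive(Debug, Clone, Copy)]
struct LinkState {
    is_add: bool,
    timestamp: u32,
}

/// Hypothetical per-user link set, keyed by (link type, target fid).
#[derive(Default)]
struct LinkSet {
    entries: HashMap&lt;(String, u64), LinkState&gt;,
}

impl LinkSet {
    /// Merge a message, keeping it only if it is strictly newer than what we hold.
    /// Equal timestamps are simply ignored here; the real CRDT has extra rules.
    fn merge(&amp;mut self, link_type: &amp;str, target_fid: u64, incoming: LinkState) {
        let key = (link_type.to_string(), target_fid);
        let is_newer = self
            .entries
            .get(&amp;key)
            .map_or(true, |existing| incoming.timestamp &gt; existing.timestamp);
        if is_newer {
            self.entries.insert(key, incoming);
        }
    }

    /// A target counts as followed only if the latest message for it is an add.
    fn is_following(&amp;self, link_type: &amp;str, target_fid: u64) -&gt; bool {
        self.entries
            .get(&amp;(link_type.to_string(), target_fid))
            .map_or(false, |state| state.is_add)
    }
}

fn main() {
    let mut fid_1 = LinkSet::default();
    fid_1.merge("follow", 2, LinkState { is_add: true, timestamp: 100 });
    // a later remove overwrites the earlier add
    fid_1.merge("follow", 2, LinkState { is_add: false, timestamp: 200 });
    // an older add arriving late is dropped
    fid_1.merge("follow", 2, LinkState { is_add: true, timestamp: 150 });
    println!("fid 1 follows fid 2: {}", fid_1.is_following("follow", 2));
}</code></pre><p>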
The <code>LinkCompactStateBody</code> is a kind of "signalling" message in this way, and it is also useful for storing a larger number of "currently followed" fids. A <code>LinkCompactStateBody</code> can have as many as <code>10 * STORAGE_UNITS * LINKS_PER_STORAGE_UNIT</code> fids specified as "currently following", meaning that a user with a single storage unit (the minimum requirement) would be able to express <code>10 * 1 * 2500 = 25,000</code> "follows" in a single message.</p><figure float="none" data-type="figure" class="img-center" style="max-width: null;"><img src="https://storage.googleapis.com/papyrus_images/09ac138d8d2fdd30b8f33019e9bc4863.png" blurdataurl="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAOCAIAAADBvonlAAAACXBIWXMAAAsTAAALEwEAmpwYAAACbklEQVR4nK2UwUsbQRSHB1pBL/0DerGgUfQwPS6UxKVBJFBRJ14kh2jZJFXoNgilk1pTGC8Ne04POUUrNGyhDYFFyGWgIdAhJyV7yEGmt8oepJchXipTktFliaEo9DssuzAz37z3fiyQd8eyLCHELReDgZ0QQoQQhNAwDNM0Q6EQ59xfoM7FGEspOeeoj6Zpuq5rmhaLxSCEhJAhArWTc84Y45xTSm3bdhynXC7za/xbY4xFH8aY28cOQCnlnHueN6QCIYTrupTScrmMEMIYbzzf8K9ZLBYxxpRSQogvY4wdn5wYhqFfAyEMhUK6rhNCMMY9gRDCNE3O+cLCAgAgkUgghO6N3DdNs7cCAISQlNLzPFWHqkAJPh0cTE9Nvc3lhBCe5wWfqu4rgfr4UCg8W0L5nZ3UiwwAoFgsErI3MvogHo8HC1UzYIzl3+cTiQQAYHpm9lZDllIeffu6++plp93+WCjUa7Xf5+fqBsHYqAr+XF42G429XG4zmUytran23kwXUJO0LIsx1hNUq9VK5aharddqh6VSp92WUh63Wl/293+eniqf36JOu70yN7cSiaRWV2PRKOij63oweIBS6jjO50qFMXZYKuWz2Xw2+2ZzM5/Nri8vH7dav87O1pPJ+Ujk9dbWj2ZTSmkYBu5jGIbruuqg+WhU0zTTNC3LopQOb1FycfEpfKxNTGoTkw8BWAqHv9fr77a3H42Pz8zOPgmHVfcxxiqIwewSQiCEmXTacRw/o4OCi273otv136+C2GgAAEbHxjLptG3bwRbdZGBaQ4YcPN33McYopazPvwU3GS4Y6vP5/4IB7vSz+wvlgBmkwUOP7wAAAABJRU5ErkJggg==" nextheight="991" nextwidth="2262" class="image-node embed"><figcaption htmlattributes="[object Object]" class="">Hubs will store a number of LinkBody messages based on their timestamps and on when a compact state message is merged; however, only one LinkCompactStateBody will be stored per user.</figcaption></figure><p>The only complication to understand for 
a developer is how a Hub should treat this message, and how to interpret receiving a message of this type. Jumping back to the Rust context, you could essentially treat all link messages as a <code>Vec&lt;LinkAction&gt;</code> here, where a single <code>LinkBody</code> with <code>MessageType == MessageType::MESSAGE_TYPE_LINK_ADD</code> is a single-element collection with an <strong>add</strong> link type. A single <code>LinkBody</code> with a <code>MessageType == MessageType::MESSAGE_TYPE_LINK_REMOVE</code> could be thought of as a single-element collection with a <strong>remove</strong> link type. The <code>LinkCompactStateBody</code> can then be considered a 1..25,000 element collection (for a single storage unit) with a Link<strong>Add</strong> type, as any fid in this list should be considered "currently followed" by the application. An important thing to note here is that the <code>LinkCompactStateBody</code> body should come with a <code>MessageType == MessageType::MESSAGE_TYPE_LINK_COMPACT_STATE</code>, so if you are filtering by <code>MessageType</code> in a subscription of <strong>merge</strong> messages and expecting only link adds or removes, you might miss it.</p><p>Another thing to note is that the <code>LinkCompactStateBody</code> inherently cannot preserve the original follow times for each of the users in the list, so we could assume that the follow time is the compact message's timestamp if we haven't already merged in a <code>LinkAdd</code> with the same target fid and link type for that user. We might therefore want to merge all the links we can first, then request the user's <code>LinkCompactStateBody</code> (of which there should only be one message anyway) and only add in any missing LinkAdds, without overwriting any timestamped links we have already added for those types and targets. 
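</p><p>The "only fill in what is missing" merge described above can be sketched like this. The <code>apply_compact_state</code> function and the <code>HashMap</code> of target fid to follow timestamp are assumptions for illustration only; the timestamps are Farcaster epoch seconds as carried by the messages.</p><pre data-type="codeBlock" language="rust"><code>use std::collections::HashMap;

/// Merge a compact state list into the follows we already know about.
/// `follows` maps target fid -&gt; follow timestamp (Farcaster epoch seconds).
/// Existing entries keep their original timestamp; missing ones fall back
/// to the timestamp of the compact state message itself.
fn apply_compact_state(
    follows: &amp;mut HashMap&lt;u64, u32&gt;,
    target_fids: &amp;[u64],
    compact_timestamp: u32,
) {
    for &amp;target in target_fids {
        follows.entry(target).or_insert(compact_timestamp);
    }
}

fn main() {
    let mut follows: HashMap&lt;u64, u32&gt; = HashMap::new();
    // learned from an individual LinkAdd, so we know the real follow time
    follows.insert(2, 1_000);
    // the compact state lists fid 2 again plus fid 3, which we had not seen
    apply_compact_state(&amp;mut follows, &amp;[2, 3], 5_000);
    println!("{:?}", follows.get(&amp;2)); // original timestamp is kept
    println!("{:?}", follows.get(&amp;3)); // falls back to the compact timestamp
}</code></pre><p>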
The <code>LinkCompactStateBody</code> also doesn't have a link type, so you might assume it is always going to be a "follow" type, even if more link types are added in the future.</p><div class="relative header-and-anchor"><h3 id="h-show-me-the-code-already">Show me the code already!</h3></div><p>Let's start by defining the overall interaction we want to be able to facilitate, a user might want a know their follow list and who follows them, let's say we keep this as a simple index that keeps updated from a subscription to a hub built with an initial index of every fid as well. The end user could be anything from a custom client, a terminal UI, a website dashboard, a frame or even just exported to run some analysis on for building a social graph or user power score.</p><figure float="none" data-type="figure" class="img-center" style="max-width: null;"><img src="https://storage.googleapis.com/papyrus_images/486eb9397cfd148ef3f8a5b664f8adbe.png" blurdataurl="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAdCAIAAABE/PnQAAAACXBIWXMAAAsTAAALEwEAmpwYAAAFYUlEQVR4nK1WT2zTVhh3b0w95t4TaHAYOeXAegqXnNLLLA7zYRhpw4fJQpMsUZEDWEitBwxZ2zCCrQ82LSkTSUvrRHQOXTonIXEVSB7Q1QgWUwZ1KW29QhNgNG+qv+Imod00bT/l8Py9977f9/+FIoTIskxRVCQSoSgqFAqR/xWUoigsy/p8vnC4h3IhCAL/BhzHeWue5wUXkTcQRVFyIbtQXCCEJEnSNG2dQJIkQkjdBYhg3SwhrvBt6+pbQ1EUuLIWGUKI4zhtKuLxuKIoURdgF0JIVVVZluPxOEJIlmWM8VaR2SDgOM6zRVVVhBAoZVk2HO5hGIam6VAoxLIswzAcx4XDPSzL0jTt9/tFUVRVNRqNqqoqSRLYCno3CGiabibQXOi6blkWxthwYdu2ZVmO48ACu7AsyzAMXdcNw9A0TVEUWZY9D2RZXieAsvHiA+mCTNq2/TdCRVEgz4IgWJbVFqItCaCQfD4fRa07RzYT1ut1EHZ1dVEU1d3dDYn8BwKMsSzLNE2DsVBgmus+JECSJEVRQBdN0wzDhMM9oVBIFMVm7S0EDMN45Bjj5tqHmOq6LggC44LjOFEUCSG2bcPJoItIJAJ16BX3RpIFQWgOkeOiXq83F67tArY8S+EMbL3dJS0EUPKQrua+5VywLBuJRGALPgHgqCfneR6iCn2jqup6/hzHgXLUdV3TNHUzdHZ2xuNxqE4oXO88XNE0zTAMqF2MsWmaG1Wz6QxoA+2mHdIATRcO94BbwWAwEAgIgiBJEsMwLMu23aWaP9rmj4dgcO++ffve3bmTdsGybMAFpH3P+3v6+vssy9JdgJ7Xq683ISCErNRq/fK5gejlWGL0LIqdVtDhY9K2d7aRf4nNCSbykwPRywcORT78+LO9H3y0n+89cOjIfr53OJW+Op69Op4dTqUzuSIh
Dc/RfOn6xR+/U7XREW30m9jA8a/6z8cGTpz7Ijo82E6wuOTEEurjWds0zbm5JwuLCxjfsiyrXq+XK7hQNApFo1y59UN8ZGr6Hvg3kZ/Ufsmd/z529IR8ZWzs+coze36uPFUZz2WWnz9rJ6haM1dS1wyj4Pf7LcuSZbmjo0OW5QcPZg4e/ISiqB07doyMqig2lMkVDxw6EkuMlsp3rAePrhvlfPHmrD0PeuYXng6OXPJ6a4PgfnXm0sjaVPD7/dDrPp8PtjDGFEVJUv9qo9Ernlxccu5XZwghy8+XH84+ulu9++Tp3PzC0+rD6g18o/qwurTseONvg+D3x7Onz17UdR2GgWma27dvh5fPNM1AIJDLZmu1unB0/QUkhJSnKkNjQ+dj30aHB9VrKXTpwvEv+3I384qihMM9MMpaCE6dQTDvwOpoNApbtm0LgoAxrtXqveLJQqmSSGqJpHZZHRM///pK6loiqb148XKltpIvXX/1+k+e5x3HEQTBcZx2AsMwwAOMMULII2BZtjQ5CQRT0/c+PXysUKosLjn4zt1S+XblztSbKl+B8MLobfHg0eyTM2hwevpX8MA0zXg8DluO43Ac53mw2mg4zh8QqKWlRcdZizgEfbXRgCu7du0CSYsHQ8m0YRR4nieEFIrFvv5+uPNbtfre7t2apq02GqfODBBCDMMghCCEGIZRVdWzF96PYDDY2dkpCIKqqi0eDKfShJBXL1++3ZmNxlpnrtRqF2JrbnV3d9M0Df+FEELwIPM839HRYZomIQQSsFZ+zVpiCTVXKBVKlfFs8aefc8l0JpnODKfSyXTm6ng2VyhlckUwYlPACwGLrq4umNgtBF7SJvKTE/lJGA/wG88WC6VKqXzbi/5WHJBk70FsH3b/EUAA8UEI2bb9Fxw5T+YaBH31AAAAAElFTkSuQmCC" nextheight="1019" nextwidth="1113" class="image-node embed"><figcaption htmlattributes="[object Object]" class="">Our application is the blue, literally just a SQLite table with maybe a rest API in front of it</figcaption></figure><p>To start off we are going to bring in the usual suspects from the previous exercise, this time also adding in some sort of data persistence library because we want to actually keep this follow list to query against in the future to save having to make RPC calls or re-indexing every time a user talks to it. Let's start by setting up a simple table to track the link state, here I'm going to use SQLite and Diesel in Rust to make it easier:</p><pre data-type="codeBlock"><code><span class="hljs-operator">~</span><span class="hljs-operator">/</span>git<span class="hljs-operator">/</span>fid<span class="hljs-operator">-</span>playground$ echo DATABASE_URL<span class="hljs-operator">=</span>sqlite:<span class="hljs-comment">//data.db &gt; .env</span>
<span class="hljs-operator">~</span><span class="hljs-operator">/</span>git<span class="hljs-operator">/</span>fid<span class="hljs-operator">-</span>playground$ diesel setup
<span class="hljs-operator">~</span><span class="hljs-operator">/</span>git<span class="hljs-operator">/</span>fid<span class="hljs-operator">-</span>playground$ diesel migration generate create_links</code></pre><p>Now we can fill in a super, super, super basic SQL up and down migration to create the required table. We won't worry about other link types for now; we can filter on only "follow" links in our code:</p><pre data-type="codeBlock" language="sql"><code><span class="hljs-comment">-- up.sql</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> IF <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> links (
    fid <span class="hljs-type">INTEGER</span>,
    target <span class="hljs-type">INTEGER</span>,
    <span class="hljs-type">timestamp</span> <span class="hljs-type">INTEGER</span>,
    <span class="hljs-keyword">PRIMARY</span> KEY (fid, target)
);</code></pre><pre data-type="codeBlock" language="sql"><code><span class="hljs-comment">-- down.sql</span>
<span class="hljs-keyword">DROP</span> <span class="hljs-keyword">TABLE</span> links;</code></pre><p>Running <code>diesel migration run</code> should work without any issues and give us a brand new links table to track the from&lt;-&gt;to follow link relationship. I'm going to first define some <code>Message</code> to <code>LinkAction</code> conversions to interpret the <code>LinkBody</code> messages with <code>MESSAGE_TYPE_LINK_ADD</code> and <code>MESSAGE_TYPE_LINK_REMOVE</code> message types. Recall the protobuf structure from above if any of this seems like we're speeding through it (this implementation is pretty much all taken from fatline-rs). Recall also that we want to create a <code>Vec</code> of <code>LinkAction</code>s from the stream of every user's links via the <code>GetAllLinkMessagesByFid</code> request, which takes a <code>FidRequest</code>. This in turn will give us back a list of <code>Message</code>s, which should be all of the user's link adds and removes.</p><p>We'll start by mapping out our <code>LinkAction</code>, which represents the possible actions that someone can perform via a link add or remove message. Each variant wraps a small <code>LinkInfo</code> struct carrying the <code>target_fid</code> and <code>timestamp</code>:</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-meta">#[derive(Debug, Serialize, Deserialize, Clone)]</span>
<span class="hljs-keyword">pub</span> <span class="hljs-keyword">enum</span> <span class="hljs-title class_">LinkAction</span> {
    <span class="hljs-title function_ invoke__">AddFollow</span>(LinkInfo),
    <span class="hljs-title function_ invoke__">RemoveFollow</span>(LinkInfo)
}</code></pre><p>Next, we'll create two functions: one to map the message type and the link body's information into the previous enum, and one to handle the unwrapping of the protobuf <code>Message</code> itself:</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-keyword">fn</span> <span class="hljs-title function_">map_link_action</span>(message_type: &amp;MessageType, target: <span class="hljs-type">u64</span>, timestamp: <span class="hljs-type">u32</span>) <span class="hljs-punctuation">-&gt;</span> <span class="hljs-type">Option</span>&lt;LinkAction&gt; {
    <span class="hljs-keyword">let</span> <span class="hljs-variable">info</span> = LinkInfo {
        target_fid: target,
        timestamp
    };
    <span class="hljs-keyword">match</span> message_type {
        MessageType::LinkAdd =&gt; <span class="hljs-title function_ invoke__">Some</span>(LinkAction::<span class="hljs-title function_ invoke__">AddFollow</span>(info)),
        MessageType::LinkRemove =&gt; <span class="hljs-title function_ invoke__">Some</span>(LinkAction::<span class="hljs-title function_ invoke__">RemoveFollow</span>(info)),
        MessageType::LinkCompactState =&gt; <span class="hljs-title function_ invoke__">Some</span>(LinkAction::<span class="hljs-title function_ invoke__">AddFollow</span>(info)),
        _ =&gt; <span class="hljs-literal">None</span>
    }
}

<span class="hljs-title function_ invoke__">pub</span>(<span class="hljs-keyword">crate</span>) <span class="hljs-keyword">fn</span> <span class="hljs-title function_">link_from_message</span>(message: Message) <span class="hljs-punctuation">-&gt;</span> <span class="hljs-type">Option</span>&lt;<span class="hljs-type">Vec</span>&lt;LinkAction&gt;&gt; {
    <span class="hljs-keyword">let</span> <span class="hljs-variable">data</span> = message.data?;
    <span class="hljs-keyword">let</span> <span class="hljs-variable">data_type</span> = data.r#<span class="hljs-title function_ invoke__">type</span>();

    <span class="hljs-keyword">match</span> data.body? {
        Body::<span class="hljs-title function_ invoke__">LinkBody</span>(link) =&gt; {
            <span class="hljs-keyword">let</span> <span class="hljs-variable">TargetFid</span>(target) = link.target?;
            <span class="hljs-keyword">let</span> <span class="hljs-variable">mapped</span> = <span class="hljs-title function_ invoke__">map_link_action</span>(&amp;data_type, target, data.timestamp)?;
            <span class="hljs-title function_ invoke__">Some</span>(<span class="hljs-built_in">vec!</span>[mapped])
        },
        Body::<span class="hljs-title function_ invoke__">LinkCompactStateBody</span>(compaction) =&gt; {
            <span class="hljs-keyword">let</span> <span class="hljs-variable">mapped</span> = compaction.target_fids
                .<span class="hljs-title function_ invoke__">iter</span>().<span class="hljs-title function_ invoke__">copied</span>()
                .<span class="hljs-title function_ invoke__">filter_map</span>(|target| <span class="hljs-title function_ invoke__">map_link_action</span>(&amp;data_type, target, data.timestamp))
                .collect::&lt;<span class="hljs-type">Vec</span>&lt;_&gt;&gt;();
            <span class="hljs-keyword">if</span> mapped.<span class="hljs-title function_ invoke__">is_empty</span>() {
                <span class="hljs-literal">None</span>
            } <span class="hljs-keyword">else</span> {
                <span class="hljs-title function_ invoke__">Some</span>(mapped)
            }
        },
        _ =&gt; <span class="hljs-literal">None</span>
    }
}</code></pre><p>Here we simply add a way to get a <code>Vec</code> of <code>LinkAction</code>s from any message, whether it's a <code>LinkCompactStateBody</code> or a <code>LinkBody</code>. We just use the <code>MessageType</code> to figure out whether it will be an add or a remove, treating <code>MessageType::LinkCompactState</code> as a list of adds as well. We return an <code>Option</code> in case the data body isn't what we expect, as we are dealing with the proto <code>Message</code>, which could contain any of the valid body types as its inner body. This also lets us extend the handled types at a future date if other link message types are added.</p><p>fatline-rs gives us a way to receive all of these messages as a <code>Stream</code>, which is useful for iterating over them or collecting them all after some intermediate steps. Here we are going to use that property to iterate over all the link add and remove actions and run a database query for each of them, either creating or deleting a row in the table we defined earlier. Because the primary key is composed of both the source and target fids, we should only ever have one entry per person per target, which is fine for our needs. We need to define the model struct that generates the Diesel database types for inserting and querying rows that match the previously defined SQLite table:</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-comment">// model.rs</span>
<span class="hljs-keyword">use</span> diesel::prelude::*;

<span class="hljs-meta">#[derive(Queryable,Selectable,Insertable,Debug, Clone)]</span>
<span class="hljs-meta">#[diesel(table_name = crate::schema::links)]</span>
<span class="hljs-meta">#[diesel(check_for_backend(diesel::sqlite::Sqlite))]</span>
<span class="hljs-title function_ invoke__">pub</span>(<span class="hljs-keyword">crate</span>) <span class="hljs-keyword">struct</span> <span class="hljs-title class_">LinkEntry</span> {
    <span class="hljs-keyword">pub</span> fid: <span class="hljs-type">i32</span>,
    <span class="hljs-keyword">pub</span> target: <span class="hljs-type">i32</span>,
    <span class="hljs-keyword">pub</span> timestamp: <span class="hljs-type">Option</span>&lt;<span class="hljs-type">i32</span>&gt;
}</code></pre><p>Now we can wire it all up with a very simple <code>main.rs</code> file to consume the hub messages transformed into <code>LinkAction</code>s:</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-keyword">use</span> std::pin::pin;

<span class="hljs-keyword">use</span> diesel::{BoolExpressionMethods, Connection, ExpressionMethods, QueryDsl, RunQueryDsl, SqliteConnection};
<span class="hljs-keyword">use</span> diesel::associations::HasTable;
<span class="hljs-keyword">use</span> dotenvy_macro::dotenv;
<span class="hljs-keyword">use</span> fatline_rs::hub::HubInfoService;
<span class="hljs-keyword">use</span> fatline_rs::HubService;
<span class="hljs-keyword">use</span> fatline_rs::stream::{LinkAction, StreamService};
<span class="hljs-keyword">use</span> tokio_stream::StreamExt;

<span class="hljs-keyword">use</span> crate::model::LinkEntry;
<span class="hljs-keyword">use</span> crate::schema::links::dsl::{fid <span class="hljs-keyword">as</span> link_fid, links, target <span class="hljs-keyword">as</span> link_target};

<span class="hljs-keyword">mod</span> schema;
<span class="hljs-keyword">mod</span> model;

<span class="hljs-keyword">const</span> DB_URL: &amp;<span class="hljs-symbol">'static</span> <span class="hljs-type">str</span> = dotenv!(<span class="hljs-string">"DATABASE_URL"</span>);
<span class="hljs-keyword">const</span> HUB_URL: &amp;<span class="hljs-symbol">'static</span> <span class="hljs-type">str</span> = dotenv!(<span class="hljs-string">"HUB_URL"</span>);

<span class="hljs-meta">#[tokio::main]</span>
<span class="hljs-keyword">async</span> <span class="hljs-keyword">fn</span> <span class="hljs-title function_">main</span>() <span class="hljs-punctuation">-&gt;</span> <span class="hljs-type">Result</span>&lt;(), <span class="hljs-type">Box</span>&lt;<span class="hljs-keyword">dyn</span> std::error::Error&gt;&gt; {
    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut </span><span class="hljs-variable">connection</span> = SqliteConnection::<span class="hljs-title function_ invoke__">establish</span>(DB_URL)?;
    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut </span><span class="hljs-variable">client</span> = HubService::<span class="hljs-title function_ invoke__">connect</span>(HUB_URL).<span class="hljs-keyword">await</span>?;

    <span class="hljs-comment">// for iterating</span>
    <span class="hljs-keyword">let</span> <span class="hljs-variable">total_fid_count</span> = client.<span class="hljs-title function_ invoke__">get_current_fid_count</span>().<span class="hljs-keyword">await</span>?;

    <span class="hljs-keyword">for</span> <span class="hljs-variable">fid</span> <span class="hljs-keyword">in</span> <span class="hljs-number">1</span>..=total_fid_count {
        <span class="hljs-keyword">let</span> <span class="hljs-variable">page_size</span> = <span class="hljs-literal">None</span>;
        <span class="hljs-keyword">let</span> <span class="hljs-variable">reverse</span> = <span class="hljs-literal">false</span>;
        <span class="hljs-keyword">let</span> <span class="hljs-variable">fid_links</span> = client.<span class="hljs-title function_ invoke__">get_all_link_messages</span>(fid, reverse, page_size);
        <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut </span><span class="hljs-variable">fid_links</span> = pin!(fid_links);

        <span class="hljs-comment">// our actual stream iteration happens here</span>
        <span class="hljs-keyword">while</span> <span class="hljs-keyword">let</span> <span class="hljs-variable">Some</span>(link_action) = fid_links.<span class="hljs-title function_ invoke__">next</span>().<span class="hljs-keyword">await</span> {
            <span class="hljs-keyword">match</span> link_action {
                LinkAction::<span class="hljs-title function_ invoke__">AddFollow</span>(info) =&gt; {
                    <span class="hljs-comment">// when it's an add then insert values</span>
                    <span class="hljs-keyword">let</span> <span class="hljs-variable">model</span> = LinkEntry {
                        fid: fid <span class="hljs-keyword">as</span> <span class="hljs-type">i32</span>,
                        target: info.target_fid <span class="hljs-keyword">as</span> <span class="hljs-type">i32</span>,
                        timestamp: <span class="hljs-title function_ invoke__">Some</span>(info.timestamp <span class="hljs-keyword">as</span> <span class="hljs-type">i32</span>),
                    };
                    <span class="hljs-comment">// Database insertion</span>
                    diesel::<span class="hljs-title function_ invoke__">insert_into</span>(links::<span class="hljs-title function_ invoke__">table</span>())
                        .<span class="hljs-title function_ invoke__">values</span>(model)
                        .<span class="hljs-title function_ invoke__">on_conflict_do_nothing</span>() <span class="hljs-comment">// my hub has some weird duplicate adds with differing timestamps, maybe a hub sync issue?</span>
                        .<span class="hljs-title function_ invoke__">execute</span>(&amp;<span class="hljs-keyword">mut</span> connection)?;
                },
                LinkAction::<span class="hljs-title function_ invoke__">RemoveFollow</span>(info) =&gt; {
                    <span class="hljs-comment">// when it's a remove then remove values (if any exist)</span>
                    <span class="hljs-keyword">let</span> <span class="hljs-variable">source</span> = fid <span class="hljs-keyword">as</span> <span class="hljs-type">i32</span>;
                    <span class="hljs-keyword">let</span> <span class="hljs-variable">target</span> = info.target_fid <span class="hljs-keyword">as</span> <span class="hljs-type">i32</span>;
                    <span class="hljs-comment">// Database deletion</span>
                    diesel::<span class="hljs-title function_ invoke__">delete</span>(links.<span class="hljs-title function_ invoke__">filter</span>(link_target.<span class="hljs-title function_ invoke__">eq</span>(target).<span class="hljs-title function_ invoke__">and</span>(link_fid.<span class="hljs-title function_ invoke__">eq</span>(source))))
                        .<span class="hljs-title function_ invoke__">execute</span>(&amp;<span class="hljs-keyword">mut</span> connection)?;
                }
            }
        }
        <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Mapped fid {fid}"</span>);
    }
    <span class="hljs-comment">// we now have every fid link mapped</span>
    <span class="hljs-title function_ invoke__">Ok</span>(())
}</code></pre><p>Running this application should ingest all the link information, printing each fid once all of its current link events have been consumed. It doesn't handle the compact state list events yet, but it should get us started. The output looks something like this:</p><pre data-type="codeBlock"><code><span class="hljs-meta prompt_">...</span> <span class="python">etc</span>
Mapped fid 14
Mapped fid 15
Mapped fid 16
Mapped fid 17
Mapped fid 18
Mapped fid 19
Mapped fid 20
<span class="hljs-meta prompt_">...</span> <span class="python">etc</span></code></pre><p>And output a table that looks something like this:</p><figure float="none" data-type="figure" class="img-center" style="max-width: null;"><img src="https://storage.googleapis.com/papyrus_images/2c7cbd1113a5e9eb3fbe1eecf4e970f1.png" blurdataurl="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAB8AAAAgCAIAAABl4DQWAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAIXRFWHRDcmVhdGlvbgBUdWUgMjUgSnVuIDIwMjQgMTk6MDQ6MjW7eqE/AAAAGXRFWHRTb2Z0d2FyZQBnbm9tZS1zY3JlZW5zaG907wO/PgAABVNJREFUeJydln1ME3cYxy+x0kRLW9urtQza0VYKiMVhWwryIlhK7Y4XK63Iu7BswKxboWxd0ZT61hpEA9aYwDL9x5lsIy7aZAFDaBAKEloQSoAezhaC1mSSLZF/dInL2c503NXkdvn99TzJJ988L9/nAPlhSKGEoNLjxo5O42lz++lzOn373lTJ3n3pYknW8TptZ5et7ZszFyxXmpp1oVSqZL/4wFFNdWeXzXKp23z20gXLFYv1al1D04mGLznxAkHSPknmQS5fAKjUFceOVdQ1aO/cuXvz1m2b7fvmU21iSbZYkq2E1F1X+37+pf+efejFy1c6fftXXxvEkux9n0iVkFrXdqbH1nf7zt3pWe9P/fa1wJ9nz13W6duPaqqLj1SYzFaROAOIZcezOVyQwQIAAhD6CGQKCAAEhbKIx0/4N4jEgynC1u0icUa+TA6gPqPpvBf2+1YC6+sbhYVKgMtL4PEFsex4kMFisuKYrDiQwQIZLEJUtEJZlCJMo9B3smLYrBh2MA4yWGQKKM3MzS9QkCkgkxUXzMay4wlR0Td6b649f9nfb3fPLhSVqBA6f3dikB7+CFHRxUc0wtQ0MgVEp7IPFhSXqglR0eFxMgU81WKYnp6f88Bj45MKJfQhen1Dk0gsRdPJFLD4iEalwqC3tBmd4y4v7H/iW0NKx+UlCJL2xnH4wbK8f0QSrbyyLk2UTqUzN6WodOYPt34sL68mkmib4o3alpGRCffs4sOxR/tFkvd1/xhXZUTijPwCBVp7bX3TPfvAkMP5673f8vJkCD0hcQ8mXaWuiERv1rYeVR/DpA88cAS1h+hJyULMuqs1VZh0MgW0WK8WlajQ9GZt68jIhMvtedfVIoDN4XF5CXjp3bbe8vJqTPqwY8w9u+gcd0FQCcDlC1KE+2PZ8eiu1pz4Ik0kRXcV3PWRyWyFoJJo2n9SVDrzVIvB5fYswU+fB9bLNJUfnPdSNaZ2IonW2XUNU3ttfdN9++CwY2xoeDS/QIHQg9pxddVg7DheVYOmV59ovG8fdM8uDg2Pyg9/CvD4gszsQ7jqToi8q83aVue4y+X2TLkeF5eqgT3Jye+045vIg7LCMk0lml7/+cn79kGX2zPnWUIqk5SUnJUjw+UEVDrTZL6oUEJousHYMeeBff5nS/BTZCJ5fEGaKD2Si6UIsSfyvKULgkowtQ8MDk9OzUxOzbynS3F1lUwBTeaL6F0lkqh6g8m3Ehh1up741kIemZ6RhUkv01RGop+3dGE6gcFomp9fds8urq9vhOgicQYu7SCDZTB2yOUKIom2ia5t+XZ5ecUL+1dXXyB0Hl9wICcPr7/f6L2pUpVt0k6IirZe7nm18Wbt+cs3f78NuVhWjozN3Y12ApW6CtPfiSRat623sUlL2L5jsxO0fufzP3s49mjOA4e6KkwV4e2q4V1ltkZRNjlES5tx4IGjt+/W9et
9oastzczF6wTdtt6amgbMyzfngZ/41rywH9H+AXpZ5Jkxn7VEcmDfSmDB6/OtBBDt/7urENY2mcwXV1dfBP7469XGa4SekLgn99BhXFebSKJZL/fky+Toq603mJaXVxa8vtdv3iI+w9+dKEwVhf8MhTuBEKsyVDrTYOwoLFSi573xpG5iwj05NTPsGMvJzUMqg3ebiCRapF3V6dvnPPDk1MzC0u+hXRVLs3bFcqh0ZvgDtmxTa6pShGlEEg2dslivIIdty7bwOGH7Dr3BND09P+p0LcFPEe1JycIDOXlsDo8Vwwl/ZApDrakSiTMYO2PQqR5bX23tZ2QKIzxOBVntp895Yf+Qwznlepwvk/8DHaCRb4EW1UkAAAAASUVORK5CYII=" nextheight="537" nextwidth="520" class="image-node embed"><figcaption htmlattributes="[object Object]" class="hide-figcaption"></figcaption></figure><p>You should be able to query against this and serve this data in... several days' time... or less, I'm guessing, depending on your latency to the hub and transfer speed, as well as any SQL tuning we do to optimise the writes. Another optimisation could be to run the initial sync breadth-first: if you have a specific fid in mind, queue up that fid, followed by all the fids they follow, then the fids those follow, and so on, instead of brute-forcing 1 to 700,000+. At the end of the day these are just trade-offs you might want to make in your own application to get to a useful result sooner. You might even fall back to using the get links by target RPC for the specific fids you want, and branch out that way to at least build two-way links from an initial fid.</p><p>As of writing this I don't have a checkpointed data source for people to analyze, but in theory it would be pretty easy to spin one up if you left the application running, although you might want to add some extra data to that export (reactions, profile info and casts) before using it for anything useful.</p><p>Thanks for getting this far, and let me know if you have any other data questions you want to explore in a future post, or how you use the link data from the Farcaster network in your own applications!</p>]]></content:encoded>
            <author>0x330a@newsletter.paragraph.com (Building in Public)</author>
            <category>rust</category>
            <category>farcaster</category>
            <category>data</category>
        </item>
        <item>
            <title><![CDATA[Mythical Data Pulls Pt. 1]]></title>
            <link>https://paragraph.com/@0x330a/mythical-data-pulls-pt-1</link>
            <guid>RrTJp5swz475rwiHLH6g</guid>
            <pubDate>Fri, 21 Jun 2024 13:45:05 GMT</pubDate>
            <description><![CDATA[Mythical Data Pulls Pt. 1
Exploring Farcaster data from Hub RPCs]]></description>
            <content:encoded><![CDATA[<div class="relative header-and-anchor"><h3 id="h-some-background">Some Background</h3></div><p>As a developer particularly focused on building the future of France, the idea of an open, permissionless social network protocol as an alternative to the Big Ones is appealing. I played around with previous iterations of social media data endpoints back when you were allowed to do so without paying, and even the most restrictive API limits were enough to generate some ideas and validate them. That's why the idea of being able to run your own hardware and contribute to the sufficient decentralisation of the network appealed greatly to me, and why I wanted to play around with it and understand how it works from the ground up.</p><p>What kind of design and architecture choices were made? What are the primitives for communication, message storage, etc.? What are the trade-offs or potential hazards to look out for when writing applications that consume or broadcast this data? Are there any improvements you can suggest, to the protocol overall or to a specific implementation, to make your life a little easier? 
The documentation on the official Farcaster website for the protocol, along with the message type reference for the Hubble APIs, gives us enough information to start building and is a pretty easy way to pick up the basics on your own as well:<br><a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://docs.farcaster.xyz/learn/architecture/overview">https://docs.farcaster.xyz/learn/architecture/overview</a><br><a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://docs.farcaster.xyz/reference/hubble/datatypes/messages">https://docs.farcaster.xyz/reference/hubble/datatypes/messages</a></p><div class="relative header-and-anchor"><h3 id="h-conceptual-view-for-the-first-steps">Conceptual view for the first steps</h3></div><p>A good way to conceptualise all of this, much like any other distributed and shared data source, is as an event stream that every node resolves to arrive at an overall view of the state. In the Hub's case, though, you can't assume you can query a "reduced" view of this state for any given user or post. Instead, that stream's state reduction is done in an app- or indexer-specific process: a service you pay for, an open source implementation, or your own application-specific implementation using libraries or clients to query against your own hub (my personal favourite, and fitting with the open and decentralised theme of the protocol). 
You can visualise this separation conceptually for a single user's "Profile Update" messages like this:</p><figure float="none" data-type="figure" class="img-center" style="max-width: null;"><img src="https://storage.googleapis.com/papyrus_images/c310933462ab6ff18a90655b411b8763.png" blurdataurl="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAdCAIAAABE/PnQAAAACXBIWXMAAAsTAAALEwEAmpwYAAAFQUlEQVR4nK2VT2wUVRzH363HeuK4BrahRlBsY9jLxhgProcaDRgGA6OGJZBp6QOtE+FJ2UfQWUMeCkyFcVt4JM1Q0qEpU9Gl1EwMnYBhPDgYOhc6pw49zIFkIGRA6Jjurx22LRajfA6btzPvve/8/rzvQ0mSuK5LKTUMgzGmaVoYhsnzA7muK0kSqiOXyxFCJEnC/xq5BqVUURRVVTnneo1qtYoopWGNoEYYhnYdlmXZS3Ac56mvnHm8eQghSNO0JEniOK6PK4qiIAiiKIrjOAiCOI7DMIzmgXEcx/A3nmdpflRVfSJQP0/XdVEUMcaKogiCoKqqIAiKokAqZFmGahWLRUVRKKWiKCqKslSAMYZUVU0jsG1bqcEYU1UVxpqm0YWwGjAmhMiyjDEmhDDGBEHI5/OQgyRJKKVzAr7v35yYcF3XcRyoh+u61WrVNE3DMCzL8jwvCALf9y3LMk1T13XTNG3b9n0/CALP86AkhmFcGR+H3ZMkma2BruvFYhEh1NTUhBBasWKFVuPmxMTq5mZRFF9raWloaIA+DoKAUrru1XWiKDY0NORyOcuyOOdJkrS2tmazqzKZDEKoUChomqbruizLsxEQQl5cmd0qivk33mxa3SyKW03TdBwnk8m0t7czxtra2jKZDMbYtm1CyMpVKw+USrs6d61Zu7ZQKDDGPM/btGlTS+vrX32ttL23Yc3aVyRJMgxjVoDS0rHjatfBw3jvQVI+2imTHvU41OPt9euljRu7MS51dHzQ1nbv/v0kSTDG0he7trR/tE/p3t65Y2BgoDZ5vL3z052fUVo+9mX5+M6OPZdHR+dSxDm3bZtz3lupnD596gznrusmSXJrcvKbcrmvt/dET4969GhvpQKZ5Zz3nOjpO9VX6evVNA0mh2H47ZEjkNsfR8zeSuXW5GSSJIqiPP0cPC9UVUWiKIKyoiiEEIxxsVgUF1KsIUkSNOVSF5EkKV21efOHoihSSqG6CJov/fU8z6+RHne31ru2bXueJ4pia2trEATwZJE3wELf98F7YIDqk+P7fhRFMDsIgkWekSSJYRiapvm+zxirVqsQOnQLvGKM6bpOKdV1HTKPYGtY39jYWG+rCKHGxsZsNosQyufzQRAIgpDL5arVKjhH6p2UUoyxWgNOuGVZsPOcQBRFlFLOuSAIjDFFUaAYsiwTQgRBEEVRlmVINCEEdkl9RdM0sGjDMKo1wCCeCHieV29K/5M4jlPvmxNwXZcxBqGktpr6c7TQk+FVvCzQ/U8EHMeRZbk+gnQQ1m7Q1G5lWfY8D3phmSAWR2DbtiRJ6cUAJxYhhDFGCGWzWZABj8tms5B0QRAwxtBFiqJwzqFmpmk+W4AQYhhGGIZ/3rjx2/Xrfz18CBcRxjiTyRQKhXw+LwgCDHK5XHPzS/l8HprVcRxCyD8KpGVPbf3e3buPZ2biOJ6+fTsMQ9M0D5RKwxcupPVIr9I0RRjj5QTge63hYbpj+w+HDo0ODu7f9jGTP4/u3Jl/OwOTHz1+tKh/gGcIPHgwK9DHz7274RPz/HDX/vJ2qWvvXsoHzJ/Gxo2RX07ywbFfry1T5MUCjuPAIxB4PDP7gUM/X9yye8/Q
pUsd3d1bOrt6zxm7S/T9bdKat9554eWWfeWy7dh/TNywf786NT3rK/VATy44B1D3tAbXrl7rH+r/Xj8xNT3V/V2pf/jMyNjIybMnh0aHhi8PX7pS7b9w9nDlyODF86cGz0xNTz2jTX3fh6s1FeCcg9/9NxZbBZw98BNeQ9d1TdNUVQWPXB6wVYBzDqtA4G/aguJatIUewwAAAABJRU5ErkJggg==" nextheight="931" nextwidth="1033" class="image-node embed"><figcaption htmlattributes="[object Object]" class="hide-figcaption"></figcaption></figure><p>This diagram shows the separation between what you could consider the data streams of the network (real-time and historical, covering every like and recast, every follow and unfollow, every cast and the data associated with them) and the reduced state your application derives from them. Today I want to go through the process of creating a simple script that gives you the "up to date" view of a specific user's profile information, as that's probably one of the earliest and easiest tasks you might want to tackle in your own Farcaster application.</p><div class="relative header-and-anchor"><h3 id="h-implementation-choices">Implementation choices</h3></div><p>Implementation choices matter, but at the end of the day it is always better to work with languages and tooling you are familiar with. Most of the Farcaster work I've done outside of frames or browser-facing applications is in Rust, just because I like it and there wasn't a huge amount of existing Rust tooling for building Farcaster applications (a good learning opportunity compared to using existing libraries I wouldn't understand at a fundamental level). I started writing a library to make a lot of these hub-to-app translations easier and to help import the historical data. I'll continue to build it in a semi-leftcurve way because I'm not an S-tier Rust or library developer, but I'm Doing My Part<span data-name="tm" class="emoji" data-type="emoji">™</span> and it's been fun, which is the most important thing I think. 
:)</p><div class="relative header-and-anchor"><h3 id="h-talking-to-hub-grpc-endpoints">Talking to Hub gRPC endpoints</h3></div><p>The Hub endpoints we'll be communicating with use gRPC and protobufs, so the first step in creating a library or application to script communications with the Hub is to compile the protobufs <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://github.com/farcasterxyz/hub-monorepo/tree/main/protobufs">from the official Farcaster hub monorepo</a>. I started by just cloning it as a submodule and creating a build step to compile the RPC services and types and make them available. You can usually get a gRPC library for this in your language of choice; I used <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://github.com/hyperium/tonic/">tonic</a> for the Rust code. Note that you need a running Hub that is synced and up to date with the OP and mainnet data as well as the Farcaster network messages you care about querying. You can use a hosted hub from a provider or run your own on your own machine. The current DB requirement is probably 100-200+GB of storage.</p><div class="relative header-and-anchor"><h3 id="h-userdatabody">UserDataBody</h3></div><p>The data we'll be querying comes from the Hub's <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://docs.farcaster.xyz/reference/hubble/grpcapi/userdata">UserData endpoints</a> in the RPC service (<a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://github.com/farcasterxyz/hub-monorepo/blob/main/protobufs/schemas/rpc.proto">protos</a>), and we could crudely query them a couple of different ways depending on what specific information the application wants. 
If we didn't want historical data we could just use the <code>rpc GetUserData(UserDataRequest) returns (Message);</code> function, which returns a single <code>Message</code> per request: the latest value for one field, so we have to make a separate request for each type we care about. Alternatively we could iterate through all of the messages using <code>rpc GetAllUserDataMessagesByFid(FidRequest) returns (MessagesResponse);</code> or the paged version <code>rpc GetUserDataByFid(FidRequest) returns (MessagesResponse);</code>, which return all the currently stored user data messages for that fid (minus any that have been pruned due to storage limits or deletions, I guess).</p><p>To write this as a Rust script, you can start by adding my <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://github.com/0x330a-public/fatline-rs">fatline-rs</a> library, or just pull in the protobufs and add the tonic dependency and build.rs stuff yourself (bear in mind some code may reference utility functions or extra library code I've written for fatline-rs).</p><p>To add the library to your existing <code>Cargo.toml</code>:</p><pre data-type="codeBlock"><code><span class="hljs-comment"># Cargo.toml file</span>
<span class="hljs-comment"># other dependencies you might want and Cargo metadata etc ...</span>
<span class="hljs-section">[dependencies]</span>
<span class="hljs-attr">eyre</span> = <span class="hljs-string">"0.6.12"</span>
<span class="hljs-attr">tokio</span> = { version = <span class="hljs-string">"1"</span>, features = [<span class="hljs-string">"full"</span>] } <span class="hljs-comment"># probably useful as the tonic responses will be async</span>

<span class="hljs-section">[dependencies.fatline-rs]</span>
<span class="hljs-attr">git</span> = <span class="hljs-string">"https://github.com/0x330a-public/fatline-rs.git"</span>
<span class="hljs-attr">rev</span> = <span class="hljs-string">"c155d9f862c56e94ecf508d1185a114e1c5bc1a4"</span></code></pre><p>In <code>src/main.rs</code> we're just going to add a constant for our Hub's publicly accessible IP/URL. You could just as easily load this from a dotenv file using dotenvy or an equivalent library:</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-keyword">const</span> HUB_URL: &amp;<span class="hljs-symbol">'static</span> <span class="hljs-type">str</span> = <span class="hljs-string">"http://somethingsomething:2283"</span>;</code></pre><p>We're going to use the application to query and print out the current state of a specific user by their fid. We'll use the Farcaster account for this example (fid #1). The information we expect to see should match the Warpcast display for <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://warpcast.com/farcaster">Farcaster's user profile page</a>, which (presently) looks something like this:</p><figure float="none" width="100%" data-type="figure" class="img-center" style="max-width: 100%;"><img src="https://storage.googleapis.com/papyrus_images/f37c7048693151efcb6711fd736602a3.png" 
blurdataurl="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAHCAIAAADmsdgtAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAIXRFWHRDcmVhdGlvbgBGcmkgMjEgSnVuIDIwMjQgMjE6NDI6MTUhv9EaAAAAGXRFWHRTb2Z0d2FyZQBnbm9tZS1zY3JlZW5zaG907wO/PgAAAbBJREFUeJxjEOGU42cVZWfkFWKTFGGX5WAQ52EQ52aS4GIQ42EQF2KTFmGXpQQx8LKIqonbOWrmSfHrinBKmenZG6hZaKqYmunbG6hbSAkoUWgHg4qI3cHFf/5//T+r+qoEj5qvZ0R6Yk5sWEpOaklCVIaqlAY/i5QIpwz5Fjhq5bw89f/Nlf9XNv5XFLDUUjPWU7dRkTNSUzHTUjET45QDqYOQIIaMMJyNgYTZ5eAIYYGMoNaU0vObu/7XJK7mZ5ES51MQYoOZyCYrxCYtyCrFzyIlyApCAiySfKxSQiwgBlSEWUqAFcQWYpMWYgOLgLmC7FJC7FIgC/hZJUTZlVTFbfhZpIQ5Zc31HWxMXf08wt3tfd3tfZ1sfOws3D2dAlztfJxsfDycg3xdA+wt3G3NXd3sfWzN3dwdAl3tfNwd/c007YwEPUzFPCzEAizEvAxFnA1FXETZZRnA3pHmZxUV5gQ5XEpEVU5MQ1ZEVU5cXV5cXU5cQ05MTVFCU15cXRbMUJLUlAbLKohryIqoKUhoKIhryEmoSwmrSHGrSfNqyPBpSvOqy/BpSPNqiLBLAwAgQl2qHyMxGwAAAABJRU5ErkJggg==" nextheight="102" nextwidth="496" class="image-node embed"><figcaption htmlattributes="[object Object]" class="hide-figcaption"></figcaption></figure><p>We can see the user's username <code>@farcaster</code>, the display name <code>Farcaster</code>, the profile picture (Farcaster logo), and the bio <code>A sufficiently decentralized social network. farcaster.xyz</code>. To pull all this data we need to construct a client using our Hub URL, surrounded by some other async Rust program boilerplate and imports:</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-keyword">use</span> eyre::<span class="hljs-type">Result</span>;
<span class="hljs-keyword">use</span> fatline_rs::HubService;

<span class="hljs-keyword">const</span> HUB_URL: &amp;<span class="hljs-symbol">'static</span> <span class="hljs-type">str</span> = <span class="hljs-string">"http://somethingsomething:2283"</span>;

<span class="hljs-meta">#[tokio::main]</span>
<span class="hljs-keyword">async</span> <span class="hljs-keyword">fn</span> <span class="hljs-title function_">main</span>() <span class="hljs-punctuation">-&gt;</span> <span class="hljs-type">Result</span>&lt;()&gt; {

	<span class="hljs-comment">// HubService here is a simplified, re-exposed type from fatline-rs, HubServiceClient&lt;Channel&gt; is the original type</span>
	<span class="hljs-keyword">let</span> <span class="hljs-keyword">mut </span><span class="hljs-variable">service</span> = HubService::<span class="hljs-title function_ invoke__">connect</span>(HUB_URL).<span class="hljs-keyword">await</span>?;

	<span class="hljs-comment">// return Result::Ok at the end of main</span>
	<span class="hljs-title function_ invoke__">Ok</span>(())
}</code></pre><p>Running this won't give us any useful information, except I guess that connecting to our Hub didn't result in an error. We can expand the code to fetch each specific piece of information for the Farcaster account (fid #1):</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-keyword">const</span> FC_FID: <span class="hljs-type">u64</span> = <span class="hljs-number">1</span>;</code></pre><p>In fatline-rs there is a function to simplify getting a user's current profile information, along with a custom Profile type I use that enables serialization/deserialization from an API as well. The implementation just calls the client's <code>get_user_data</code> endpoint for each type, with a shorthand function to build the specific <code>UserDataRequest</code>, as well as some utility functions to get the message out of the proto RPC response.</p><p>The basic gist is that we want to query every UserDataType for a given fid as individual requests, which should give us the latest message for every type of user data. Once we get back an optional body, we check that the body is a UserDataBody, which indicates it's a message containing an update to this user's user data (pfp, username, display name, etc.). After this check we just have to read the UserDataBody's type and value to see which field was updated and what the new value is. Alternatively, we could stream through or subscribe to all realtime messages and filter on this user's updates to set or update user fields in a DB for indexing purposes, so we can always return the current state in some API or use it for whatever purpose we want.</p><p>We can recreate the logic for getting the profile like so:</p><pre data-type="codeBlock"><code><span class="hljs-keyword">use</span> eyre::<span class="hljs-type">Result</span>;
<span class="hljs-keyword">use</span> fatline_rs::proto::{UserDataType, UserDataRequest, UserDataBody, Message, MessageData};
<span class="hljs-keyword">use</span> fatline_rs::proto::message_data::Body;
<span class="hljs-keyword">use</span> fatline_rs::HubService;
<span class="hljs-keyword">use</span> tonic::{Response, Status};

<span class="hljs-comment">/// The combined user's profile, holding values from all user update types</span>
<span class="hljs-meta">#[derive(Debug)]</span>
<span class="hljs-keyword">pub</span> <span class="hljs-keyword">struct</span> <span class="hljs-title class_">Profile</span> {
    <span class="hljs-keyword">pub</span> fid: <span class="hljs-type">u64</span>,
    <span class="hljs-keyword">pub</span> username: <span class="hljs-type">Option</span>&lt;<span class="hljs-type">String</span>&gt;,
    <span class="hljs-keyword">pub</span> display_name: <span class="hljs-type">Option</span>&lt;<span class="hljs-type">String</span>&gt;,
    <span class="hljs-keyword">pub</span> profile_picture: <span class="hljs-type">Option</span>&lt;<span class="hljs-type">String</span>&gt;,
    <span class="hljs-keyword">pub</span> bio: <span class="hljs-type">Option</span>&lt;<span class="hljs-type">String</span>&gt;,
    <span class="hljs-keyword">pub</span> url: <span class="hljs-type">Option</span>&lt;<span class="hljs-type">String</span>&gt;
}

<span class="hljs-comment">// shorthand so we can call this for each type</span>
<span class="hljs-keyword">fn</span> <span class="hljs-title function_">get_user_data_request</span>(fid: <span class="hljs-type">u64</span>, data_type: UserDataType) <span class="hljs-punctuation">-&gt;</span> UserDataRequest {
	<span class="hljs-comment">// the RPC method is expecting this as the "request"</span>
    UserDataRequest {
        fid,
        user_data_type: data_type <span class="hljs-keyword">as</span> <span class="hljs-type">i32</span>
    }
}
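
<span class="hljs-comment">// A possible helper (hypothetical, not part of fatline-rs): pull the String value out of a</span>
<span class="hljs-comment">// user data Message, assuming the same data/body layout that get_user_profile relies on.</span>
<span class="hljs-comment">// Returns None when the message has no data or carries some other body type.</span>
<span class="hljs-keyword">fn</span> <span class="hljs-title function_">user_data_value</span>(message: Message) <span class="hljs-punctuation">-&gt;</span> <span class="hljs-type">Option</span>&lt;<span class="hljs-type">String</span>&gt; {
    <span class="hljs-keyword">match</span> message.data.<span class="hljs-title function_ invoke__">and_then</span>(|data| data.body) {
        <span class="hljs-title function_ invoke__">Some</span>(Body::<span class="hljs-title function_ invoke__">UserDataBody</span>(body)) =&gt; <span class="hljs-title function_ invoke__">Some</span>(body.value),
        _ =&gt; <span class="hljs-literal">None</span>,
    }
}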

<span class="hljs-comment">// Actual function to get the profile</span>
<span class="hljs-keyword">async</span> <span class="hljs-keyword">fn</span> <span class="hljs-title function_">get_user_profile</span>(client: &amp;<span class="hljs-keyword">mut</span> HubService, fid: <span class="hljs-type">u64</span>) <span class="hljs-punctuation">-&gt;</span> <span class="hljs-type">Result</span>&lt;Profile&gt; {
	<span class="hljs-comment">// let's start by getting the username as an example:</span>
	<span class="hljs-keyword">let</span> <span class="hljs-variable">username_response</span>: Response&lt;Message&gt; = client.<span class="hljs-title function_ invoke__">get_user_data</span>(
		<span class="hljs-title function_ invoke__">get_user_data_request</span>(fid, UserDataType::Username)
	).<span class="hljs-keyword">await</span>?;

	<span class="hljs-comment">// Now that we have a response, we can take the inner message and extract its data and body,</span>
	<span class="hljs-comment">// and finally see the client-published "profile update" message body</span>
	<span class="hljs-keyword">let</span> <span class="hljs-variable">message</span>: Message = username_response.<span class="hljs-title function_ invoke__">into_inner</span>();
	<span class="hljs-keyword">let</span> <span class="hljs-variable">body</span>: <span class="hljs-type">Option</span>&lt;Body&gt; = message.data.<span class="hljs-title function_ invoke__">and_then</span>(| data: MessageData | data.body);
	<span class="hljs-comment">// The UserDataBody contains the type (UserDataType) as well as the value (String)</span>
	<span class="hljs-comment">// We *should* expect the type to match the type we requested, in this case the username</span>
	<span class="hljs-keyword">let</span> <span class="hljs-variable">username_data_body</span>: <span class="hljs-type">Option</span>&lt;UserDataBody&gt; =  <span class="hljs-keyword">match</span> body {
		<span class="hljs-title function_ invoke__">Some</span>(Body::<span class="hljs-title function_ invoke__">UserDataBody</span>(body)) =&gt; <span class="hljs-title function_ invoke__">Some</span>(body),
		_ =&gt; <span class="hljs-literal">None</span>
	}; 
	<span class="hljs-comment">// ignore the type since we requested it explicitly, map the body and pull the value out</span>
	<span class="hljs-keyword">let</span> <span class="hljs-variable">username_value</span>: <span class="hljs-type">Option</span>&lt;<span class="hljs-type">String</span>&gt; = username_data_body.<span class="hljs-title function_ invoke__">map</span>(|body| body.value);
	<span class="hljs-comment">// technically usernames can be optional, so this is all we need for the profile for now</span>
	<span class="hljs-keyword">let</span> <span class="hljs-variable">profile</span> = Profile {
		fid,
		username: username_value,
		display_name: <span class="hljs-literal">None</span>,
		profile_picture: <span class="hljs-literal">None</span>,
		bio: <span class="hljs-literal">None</span>,
		url: <span class="hljs-literal">None</span>
	};
	<span class="hljs-comment">// return our poorly populated profile</span>
	<span class="hljs-title function_ invoke__">Ok</span>(profile)
}</code></pre><p>At this point we would just populate the remaining fields, maybe write some useful helpers to extract the other user data values and shortcut all of the body/optional/message unwrapping, and fill out the rest of the <code>get_user_profile</code> function.</p><p>Returning to our main function, we add the call to this new profile function and display the data however we want, like logging it to the terminal:</p><pre data-type="codeBlock" language="rust"><code><span class="hljs-keyword">use</span> eyre::<span class="hljs-type">Result</span>;
<span class="hljs-keyword">use</span> fatline_rs::HubService;

<span class="hljs-keyword">const</span> HUB_URL: &amp;<span class="hljs-symbol">'static</span> <span class="hljs-type">str</span> = <span class="hljs-string">"http://somethingsomething:2283"</span>;
<span class="hljs-keyword">const</span> FC_FID: <span class="hljs-type">u64</span> = <span class="hljs-number">1</span>;

<span class="hljs-meta">#[tokio::main]</span>
<span class="hljs-keyword">async</span> <span class="hljs-keyword">fn</span> <span class="hljs-title function_">main</span>() <span class="hljs-punctuation">-&gt;</span> <span class="hljs-type">Result</span>&lt;()&gt; {

	<span class="hljs-comment">// HubService here is a simplified, re-exposed type from fatline-rs, HubServiceClient&lt;Channel&gt; is the original type</span>
	<span class="hljs-keyword">let</span> <span class="hljs-keyword">mut </span><span class="hljs-variable">service</span> = HubService::<span class="hljs-title function_ invoke__">connect</span>(HUB_URL).<span class="hljs-keyword">await</span>?;
	
	<span class="hljs-comment">// call the new profile function from wherever we implemented it, assuming here it's in the main.rs</span>
	<span class="hljs-keyword">let</span> <span class="hljs-variable">profile</span>: Profile = <span class="hljs-title function_ invoke__">get_user_profile</span>(&amp;<span class="hljs-keyword">mut</span> service, FC_FID).<span class="hljs-keyword">await</span>?;
	<span class="hljs-comment">// as username is optional, get a default if the user doesn't have one set</span>
    <span class="hljs-keyword">let</span> <span class="hljs-variable">username</span> = <span class="hljs-keyword">match</span> profile.username {
        <span class="hljs-title function_ invoke__">Some</span>(value) =&gt; value,
        _ =&gt; <span class="hljs-string">"actually nothing"</span>.<span class="hljs-title function_ invoke__">to_string</span>()
    };

    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"FC profile's username is: {}"</span>, username);

	<span class="hljs-comment">// return Result::Ok at the end of main</span>
	<span class="hljs-title function_ invoke__">Ok</span>(())
}</code></pre><p>Running this should give us an output something like this:</p><pre data-type="codeBlock"><code>FC profile<span class="hljs-symbol">'s</span> username is: farcaster</code></pre><p>I'll leave implementing the helpers and other user profile fields as an exercise for the reader. Alternatively, read through the library implementation; mine isn't perfect or ideal, but it seems to work for me!</p><div class="relative header-and-anchor"><h3 id="h-conclusion">Conclusion</h3></div><p>Reading data from your own Hub is easy and fun. It can help to diversify client applications and reduce dependence on hosted or paid services, which also helps decentralize the overall Farcaster network and keeps your personal costs down. If you have the hardware to run one, it provides very low latency access to the entire current state of the network. You can think of this as similar to running your own Eth node to query against rather than relying on something like Infura. I'll try to write more content like this alongside writing a library and alternate client, both as a learning tool and to help diversify the open source Farcaster community. Any tips or feedback are welcome and appreciated.</p><p>Thanks for reading until the end! Hit me up on Farcaster <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://warpcast.com/harris-">@harris-</a> with the kind of content you want to see next. I would appreciate any stars, follows and feature requests on my <a target="_blank" rel="noopener noreferrer nofollow ugc" class="dont-break-out" href="https://www.github.com/0x330a-public/fatline-rs">fatline client and libraries</a>.</p>]]></content:encoded>
            <author>0x330a@newsletter.paragraph.com (Building in Public)</author>
            <category>rust</category>
            <category>farcaster</category>
            <category>data</category>
        </item>
    </channel>
</rss>