Remix & Relay

I've been hyper-critical of Remix since I first saw it. I think a lot of this criticism comes from my bias towards Relay and how well it handles the colocation of data requirements with your component's source code. In reality, I was conflating the problems each technology is trying to solve. There are still valid criticisms of Remix, but overall, Remix is an awesome meta-framework with a Rails / Laravel touch of magic.

Fetching Models

There are a few different mechanisms to fetch data in a React app. Let's touch on two models (assuming you're using Suspense). You can read React's take on these here.

Render-as-you-Fetch

If you want to avoid network waterfalls, you need to ensure that all of the network requests needed to render a page can be determined by the browser's URL. This is a cold hard fact. This method will always give us the fastest page response time. We'll always be able to start fetching data from our API or database as soon as the request starts.
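As a rough sketch of what "determined by the URL" means, here is a tiny route table in plain JavaScript. The names `routeRequests`, `requestsForPath`, and `startFetching` are made up for illustration, and the endpoints are the same hypothetical REST URLs used in the examples in this post.

```javascript
// Hypothetical route table: each URL pattern lists the endpoints its page needs.
const routeRequests = [
  {
    pattern: /^\/post\/([^/]+)$/,
    endpoints: (postId) => [
      `/post/${postId}/body`,
      `/post/${postId}/comments`,
      `/post/${postId}/likes`,
    ],
  },
];

// Given only the URL path, return every request the page will need.
function requestsForPath(pathname) {
  for (const { pattern, endpoints } of routeRequests) {
    const match = pattern.exec(pathname);
    if (match) return endpoints(match[1]);
  }
  return [];
}

// Start every request at once, before any component has rendered.
// fetchFn is injected so the sketch runs without a network.
function startFetching(pathname, fetchFn) {
  return requestsForPath(pathname).map((url) => fetchFn(url));
}
```

Nothing here depends on rendering a component, which is the whole point: the URL alone is enough to know what to fetch.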

Fetch-on-Render

Fetching data as you render lets you put your data requirements in your component function but may lead to unnecessary network waterfalls.

Imagine we have a PostCard component like the one below. Right now this component does not have any network waterfalls. We have a new product requirement. We need to show a "Like Count" below the post card.

function PostCard({ postId }) {
+   const likeCount = useFetch(`/post/${postId}/likes`);

    return (
        <>
+           <PostLikeCount count={likeCount} />
            <PostBody postId={postId} />
            <CommentsSection postId={postId} />
        </>
    )
}

Here is where we've accidentally introduced an unnecessary network waterfall. Now the useFetch calls inside of PostBody and CommentsSection cannot start until the like count is fetched, even though the like count is not a dependency of the post body or the comments section.

The waterfall in the above example is not a necessary one. We can easily fix it by pushing the data requirements down to the leaf components like so.

function PostLikeCount({ postId }) {
+   const likeCount = useFetch(`/post/${postId}/likes`);

    // other stuff here...
}

function PostCard({ postId }) {
-   const likeCount = useFetch(`/post/${postId}/likes`);

    return (
        <>
-           <PostLikeCount count={likeCount} />
+           <PostLikeCount postId={postId} />
            <PostBody postId={postId} />
            <CommentsSection postId={postId} />
        </>
    )
}

Now the three children can load in parallel.
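To make the difference concrete, here is a small model of the two orderings outside of React. This is only a sketch: `makeFakeFetch`, `loadWithWaterfall`, and `loadInParallel` are hypothetical helpers, and the "network" is a resolved promise that records when each request starts and ends.

```javascript
// Fake network call: logs when a request starts and when it finishes.
function makeFakeFetch(log) {
  return async function fakeFetch(url) {
    log.push(`start ${url}`);
    await Promise.resolve(); // stand-in for network latency
    log.push(`end ${url}`);
    return url;
  };
}

// Waterfall: the parent waits for its own data before the children can start.
async function loadWithWaterfall(postId, fetchFn) {
  await fetchFn(`/post/${postId}/likes`);
  await Promise.all([
    fetchFn(`/post/${postId}/body`),
    fetchFn(`/post/${postId}/comments`),
  ]);
}

// Parallel: every leaf starts its own request at the same time.
async function loadInParallel(postId, fetchFn) {
  await Promise.all([
    fetchFn(`/post/${postId}/likes`),
    fetchFn(`/post/${postId}/body`),
    fetchFn(`/post/${postId}/comments`),
  ]);
}
```

In the waterfall version the log shows the likes request ending before the other two start. In the parallel version all three "start" entries appear before any "end" entry.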

The Tradeoff of Colocation

Fetch-on-Render (FOR) is fundamentally at odds with Render-as-you-Fetch (RAYF).

If you want to fetch data as soon as possible, you need to know all of the data your components will need.
If you want to define your data requirements in your component function, you need to render that component to know what data it needs.

Colocation is something that developers value. React is a great example of this: the behavior (JavaScript) and the markup (HTML) of a component live together in the same file. The community took this one step further with CSS-in-JS so you can colocate your styles with your React component. Why shouldn't we colocate our network data requirements with our component? Because it introduces network waterfalls.

If you want colocation, and zero network waterfalls, you have to duplicate your component's data requirements at the route level to ensure the page loads as fast as possible.

function PostPage({ postId }) {
  usePrefetch([`/post/${postId}/body`, `/post/${postId}/comments`, `/post/${postId}/likes`])

  return <PostCard postId={postId} />
}

In the above snippet, we maintain a list of REST URLs to be fetched when the PostPage component loads. This fires the requests as soon as possible to make the downstream waterfalls disappear.

Why is this a problem?

Ensuring that your route's pre-fetcher includes all of its children's data requirements is a real nuisance. It's too easy for an engineer to add a data requirement to a component and forget to update every route that uses that component. This problem is amplified as your app gets larger and more complex.

In our example, we didn't have a necessary network waterfall, so it was pretty easy to fix by pushing our data requirements down to the leaf components. However, if your REST API has a necessary network waterfall, e.g. we must first get the post id by a slug to start fetching the /post/{id}/... requests, the waterfalls get worse.

Potential Solutions

Use Remix Loaders

Remix loaders let you fetch all the data you need for a component hierarchy on the server, where latency is low. This is great, but you still have to duplicate your component's data requirements: once in the component, and again in the loader.

Fat REST APIs

A fat REST API would be a single REST endpoint that gives you all of the data for a given page. This is similar to a Remix loader without all the baked-in goodies from Remix.
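Here is a minimal sketch of what the handler behind such an endpoint might do. Everything in it is an assumption made for illustration: the `fakeDb` data layer, its method names, and the `getPostPageData` function are hypothetical.

```javascript
// Hypothetical data layer; a real one would query a database or other services.
const fakeDb = {
  getBody: async (postId) => `Body of post ${postId}`,
  getComments: async (postId) => [`First comment on post ${postId}`],
  getLikeCount: async () => 3,
};

// Logic for a hypothetical single "post page" endpoint.
// It gathers everything the page needs and returns it as one payload.
async function getPostPageData(postId, db = fakeDb) {
  // The lookups run concurrently on the server, close to the data.
  const [body, comments, likeCount] = await Promise.all([
    db.getBody(postId),
    db.getComments(postId),
    db.getLikeCount(postId),
  ]);
  return { postId, body, comments, likeCount };
}
```

The client makes one request and gets one response, but the shape of that response is now another place where the page's data requirements have to be kept in sync by hand.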

Use Relay / GraphQL

GraphQL fundamentally solves the network waterfall problem by letting an engineer declaratively define their data requirements in a singular query. This singular declarative GraphQL query is sent to the server to choreograph the upstream service calls / db queries to ensure the lowest possible latency.

All of these are fine solutions, but let's see how we can make things even better with Relay & GraphQL.

Colocation & Zero Waterfalls w/ Relay

Relay lets you define your data requirements at the component level. Relay also ensures that you will "never" have a network waterfall on the client. How do they achieve this?

They achieve this by statically analyzing your source code at compile time to create a single query for your entire component tree. Let's reconsider the above example, but this time with Relay.

function PostLikeCount({ postRef }) {
  const { likeCount } = useFragment(
    graphql`
      fragment PostLikeCountFragment on Post {
        likeCount
      }
    `,
    postRef
  )

  // some cool content here...
}

function PostCard({ postRef }) {
  const post = useFragment(
    graphql`
      fragment PostCardFragment on Post {
        ...PostBodyFragment
        ...PostLikeCountFragment
        ...CommentsSectionFragment
      }
    `,
    postRef
  )

  return (
    <>
      <PostLikeCount postRef={post} />
      <PostBody postRef={post} />
      <CommentsSection postRef={post} />
    </>
  )
}

Relay will automatically create a tree of data requirements from your component definitions at compile time, letting us fetch all of our data in one go without having to maintain a separate source of truth! Let's fetch all of our data for the entire hierarchy with a single request.

function PostPage({ postId }) {
  const { post } = useLazyLoadQuery(
    graphql`
      query PostPageQuery($id: ID!) {
        post(id: $id) {
          ...PostCardFragment
        }
      }
    `,
    { id: postId }
  )

  return <PostCard postRef={post} />
}

Boom! All of the data requirements for the entire tree are now encapsulated in a single query!
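To get an intuition for how fragment spreads roll up into one set of requirements, here is a toy model in plain JavaScript. This is not how the Relay compiler is actually implemented. The `fragments` table and `collectFields` are hypothetical, and the `body` and `comments` fields are assumed, since only `likeCount` appears in the snippets above.

```javascript
// Toy model: each fragment lists its own fields and the fragments it spreads.
const fragments = {
  PostCardFragment: {
    fields: [],
    spreads: ["PostBodyFragment", "PostLikeCountFragment", "CommentsSectionFragment"],
  },
  PostBodyFragment: { fields: ["body"], spreads: [] },
  PostLikeCountFragment: { fields: ["likeCount"], spreads: [] },
  CommentsSectionFragment: { fields: ["comments"], spreads: [] },
};

// Recursively collect every field reachable from a fragment.
// The seen set guards against visiting the same fragment twice.
function collectFields(name, seen = new Set()) {
  if (seen.has(name)) return [];
  seen.add(name);
  const { fields, spreads } = fragments[name];
  return [...fields, ...spreads.flatMap((spread) => collectFields(spread, seen))];
}

const postPageFields = collectFields("PostCardFragment");
```

Each component only declares its own fields, yet walking the spreads from the root recovers everything the page needs without anyone maintaining that list by hand.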

Where Remix comes in!

Typically with Relay, we have one query per page. This model breaks down pretty fast when you have nested routes. Nested routes are awesome and almost every app makes use of them whether they know it or not.

Let's consider the following Sales page from the Remix homepage.

Sales page

Let's imagine that the React for this page looks something like this. For example's sake, let's pretend we also need to show the logged in user's username next to the page title.

function SalesPage() {
  return (
    <>
      <div>
        <h1>Sales</h1>

        <LoggedInUsername />
      </div>

      <SalesRoutes />
    </>
  )
}

function SalesRoutes() {
  return (
    <Tabs>
      <Tab title="Overview">
        <OverviewTab />
      </Tab>
      <Tab title="Subscriptions">
        <SubscriptionsTab />
      </Tab>
      <Tab title="Invoices">
        <InvoicesTab />
      </Tab>
    </Tabs>
  )
}

We want to load only the data for the selected tab. We also have a common data requirement between all three tabs (the logged in user's username). We have a couple of options here.

Use a single query and load the data required for all three tabs.
Use four queries. One for the SalesPage's username. Three more queries to load the data for each tab.

Single Query

Using a single query would lead us to something like this.

function SalesPage() {
  const query = useLazyLoadQuery(graphql`
    query SalesPageQuery {
      ...LoggedInUsernameFragment
      ...SalesRoutesFragment
    }
  `, {})

  return (
    <>
      <div>
        <h1>Sales</h1>

        <LoggedInUsername queryRef={query} />
      </div>

      <SalesRoutes queryRef={query} />
    </>
  )
}

function SalesRoutes({ queryRef }) {
  const query = useFragment(
    graphql`
      fragment SalesRoutesFragment on Query {
        ...OverviewTabFragment
        ...SubscriptionsTabFragment
        ...InvoicesTabFragment
      }
    `,
    queryRef
  )

  return (
    <Tabs>
      <Tab title="Overview">
        <OverviewTab queryRef={query} />
      </Tab>
      <Tab title="Subscriptions">
        <SubscriptionsTab queryRef={query} />
      </Tab>
      <Tab title="Invoices">
        <InvoicesTab queryRef={query} />
      </Tab>
    </Tabs>
  )
}

Now we have a single query, but we're loading the data for EVERY tab when we're only looking at one 🤮.

Multiple Queries

Using multiple queries would look something like this.

function SalesPage() {
  const query = useLazyLoadQuery(graphql`
    query SalesPageQuery {
      ...LoggedInUsernameFragment
    }
  `, {})

  return (
    <>
      <div>
        <h1>Sales</h1>

        <LoggedInUsername queryRef={query} />
      </div>

      <SalesRoutes />
    </>
  )
}

function SalesRoutes() {
  return (
    <Tabs>
      <Tab title="Overview">
        {/* OverviewTab has its own useLazyLoadQuery */}
        <OverviewTab />
      </Tab>
      <Tab title="Subscriptions">
        {/* SubscriptionsTab has its own useLazyLoadQuery */}
        <SubscriptionsTab />
      </Tab>
      <Tab title="Invoices">
        {/* InvoicesTab has its own useLazyLoadQuery */}
        <InvoicesTab />
      </Tab>
    </Tabs>
  )
}

Now we're only loading the minimum amount of data for a particular tab, but we've introduced a network waterfall since we have to wait for the SalesPage useLazyLoadQuery to finish before we can start the queries defined in the tab components 🤮.

Back to square one

We can solve this problem by maintaining a list of queries which need to load for a given route so we can preload all of the queries. Relay even has something built in for this (useQueryLoader & usePreloadedQuery).

This is less than ideal and is antithetical to what makes Relay so great in the first place.

Solving this w/ Route Loaders

This problem is exactly what Remix was made to solve. Remix lets us define nested routes and preload the required data in parallel! Now, I haven't gotten Remix working w/ Relay, but the solution is quite simple: kick off your queries inside of your loader function!

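// dashboard.js

// preloadQuery is a placeholder for whatever starts the Relay request.
export function loader() {
  preloadQuery(dashboardQuery)
  return null // nothing to hand to the component; the query is already in flight
}

// dashboard/[tab].js

export function loader() {
  preloadQuery(selectedTabQuery)
  return null
}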

This feature is so useful that the Remix team brought it back into React Router. You can read about this in their blog post.
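Here is a toy model of why nested route loaders remove the waterfall, applied to the Sales page tabs. It is only a sketch and not Remix's or React Router's real matching code. The `routes` table, `matchRoutes`, `startLoaders`, and this version of `preloadQuery` are hypothetical, as are the tab query names.

```javascript
// Names of the queries that have been kicked off, in order.
const started = [];

// Stand-in for starting a Relay query; a real app would call into Relay here.
function preloadQuery(queryName) {
  started.push(queryName);
  return Promise.resolve({ queryName });
}

// Hypothetical nested route config: the parent route plus one child per tab.
const routes = [
  { path: "/sales", loader: () => preloadQuery("SalesPageQuery") },
  { path: "/sales/overview", loader: () => preloadQuery("OverviewTabQuery") },
  { path: "/sales/invoices", loader: () => preloadQuery("InvoicesTabQuery") },
];

// A route matches when the URL equals its path or is nested beneath it.
function matchRoutes(pathname) {
  return routes.filter(
    (route) => pathname === route.path || pathname.startsWith(route.path + "/")
  );
}

// Call every matched loader right away, so no loader waits on another.
function startLoaders(pathname) {
  return Promise.all(matchRoutes(pathname).map((route) => route.loader()));
}
```

Calling `startLoaders("/sales/invoices")` starts the page-level query and the Invoices query together, and never touches the Overview query. The URL decides which queries run, while each query's contents can still be built from colocated fragments.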

Recap

Relay is great, but it needs a great router to go alongside it.

Remix is great, but doesn't let you define your data requirements with extreme granularity.