Rob Pruzan

Next.js @vercel

What is kyju

Kyju is a JavaScript library for building web-based devtools. Specifically, it helps with:

  • communicating with different processes (iframes/local servers) from your devtool
  • accessing messy, platform-dependent browser APIs
  • exposing the devtools to LLMs
  • consistently building devtools that live on top of websites

The web has become a great place to build applications, the result of years of investment. Most people take for granted how easy it is to create a React project with tools like Next.js or Vite. Without them, you need to solve hot reloading, bundling, build tooling, code splitting, scalable UI patterns, and 1000 other things yourself.

Unfortunately, web applications and devtools don't have as much overlap as you would expect. To list a few differences (with some generalizations made):

  • devtools are published to registries, not served over HTTPS
  • devtools run on top of an existing website
  • devtools with servers are local and don't need to consider a network boundary
  • a devtool's primary data source is the website it's monitoring, not a database

These differences are large enough that we have to re-solve many problems that applications once had, and solve new problems that applications have never encountered.

Features

Boilerplate for creating a devtool

  • React
    • no need to worry about configuring shadow roots, setting up the TypeScript/JSX build, or breaking the user's app by running a second React instance
  • build tools
    • esbuild, Tailwind, TypeScript paths, directives
  • hot reloading
    • React Refresh
  • instantly publishable to npm

Multi process communication

Many devtools are separated from their data source by some boundary. For example, if your devtool is a browser extension, needs data from a local server, or needs data from a cross-origin iframe, you have to share state through a message API (postMessage, WebSocket messages, etc.).

The boundary you are sending messages across is arbitrary, and only exists because there's some data in another process that doesn't exist in the devtool's process.

To see that this boundary is arbitrary, imagine you needed to get the user's CPU usage for a performance devtool that runs in a website. Assume the browser the devtool is running in has no sandboxing or permissions at all.

Would you rather:

  • write a local server, exposed via HTTP, that internally calls functions to get the user's CPU usage, and then access it from the browser by calling fetch
  • just call window.getCpuUsage() directly in the browser?

Most people would pick window.getCpuUsage, as semantically that's exactly what you're doing. The boundary between your browser and the operating system isn't important in this context.
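To make the comparison concrete, here is a minimal sketch of hiding a message boundary behind an ordinary-looking function call. Everything in it is a hypothetical illustration rather than kyju's API: `createChannel` is an in-memory stand-in for postMessage or a WebSocket, `serve` and `connect` are made-up helper names, and `getCpuUsage` returns a fixed placeholder value.

```typescript
// Messages that cross the boundary: a call, and the matching result.
type CallMessage = { id: number; method: string; args: unknown[] };
type ResultMessage = { id: number; result: unknown };

// Minimal postMessage-style endpoint (hypothetical; stands in for any transport).
interface Endpoint<In, Out> {
  send(message: Out): void;
  onMessage(handler: (message: In) => void): void;
}

// Two connected in-memory endpoints. Messages are JSON-cloned and delivered
// asynchronously, the way a real transport would deliver them.
function createChannel<A, B>(): [Endpoint<A, B>, Endpoint<B, A>] {
  let toLeft: (m: A) => void = () => {};
  let toRight: (m: B) => void = () => {};
  const clone = <T>(m: T): T => JSON.parse(JSON.stringify(m)) as T;
  const left: Endpoint<A, B> = {
    send: (m) => { const c = clone(m); Promise.resolve().then(() => toRight(c)); },
    onMessage: (h) => { toLeft = h; },
  };
  const right: Endpoint<B, A> = {
    send: (m) => { const c = clone(m); Promise.resolve().then(() => toLeft(c)); },
    onMessage: (h) => { toRight = h; },
  };
  return [left, right];
}

type Api = Record<string, (...args: any[]) => unknown>;
type Remote<T extends Api> = {
  [K in keyof T]: (...args: Parameters<T[K]>) => Promise<ReturnType<T[K]>>;
};

// Side that owns the data: answer each call message with a result message.
function serve(port: Endpoint<CallMessage, ResultMessage>, api: Api): void {
  port.onMessage(({ id, method, args }) => {
    port.send({ id, result: api[method](...args) });
  });
}

// Devtool side: a proxy whose property accesses turn into call messages.
function connect<T extends Api>(port: Endpoint<ResultMessage, CallMessage>): Remote<T> {
  let nextId = 0;
  const pending = new Map<number, (result: unknown) => void>();
  port.onMessage(({ id, result }) => {
    pending.get(id)?.(result);
    pending.delete(id);
  });
  return new Proxy({}, {
    get: (_target, method) => (...args: unknown[]) =>
      new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        port.send({ id, method: String(method), args });
      }),
  }) as unknown as Remote<T>;
}

// Usage: the caller writes a function call; the messages are an implementation detail.
const [devtoolPort, hostPort] = createChannel<ResultMessage, CallMessage>();
const hostApi = { getCpuUsage: () => 0.42 }; // placeholder value, not a real reading
serve(hostPort, hostApi);
const host = connect<typeof hostApi>(devtoolPort);
host.getCpuUsage().then((usage) => console.log("cpu usage:", usage));
```

The only visible trace of the boundary is that the call returns a promise; the caller never constructs or parses a message.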

Enough people agree that there are several ways to use RPCs on the web:

  • tRPC
  • Next.js server functions
  • many many more

But a key fact about these RPCs is that they are unidirectional in nature (when they use HTTP), and they still treat the server as a distinct entity. It may look like you are calling a local function, but the semantics are significantly different: all these functions run outside of React! You can't run a React hook inside the procedure, and you can't send messages back to the browser outside of the response (if you are using HTTP).

This means that when you write a remote procedure, you frequently have to break out of the React state lifecycle and then use "escape hatches" to sync state with that external process.

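For example, what if you wanted to display the stdout of a given process on the user's machine in the browser? You might think you could use Next.js server functions: pass the component's setStdout setter to a server function, and have the server call it whenever the process writes output.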

But of course you can't. The setStdout in the remote procedure has no way to send that data back to the real setStdout in the main website. That means you will need a WebSocket connection, which forces you to break out of the React lifecycle.
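A small framework-free sketch of why the setter cannot make the trip. It uses JSON as a stand-in for the RPC layer's wire format, which is an assumption for illustration and not how Next.js actually serializes arguments; the `pid` value and the variable names are likewise made up.

```typescript
// Website side: some state and a React-style setter for it.
let stdoutText = "";
const setStdout = (next: string): void => { stdoutText = next; };

// What we would like to hand to the remote procedure.
const callArgs = { pid: 1234, setStdout };

// Crossing the boundary means turning the arguments into bytes.
const wire = JSON.stringify(callArgs);
console.log(wire); // {"pid":1234}

// Remote side: the data arrived, the function did not.
const received = JSON.parse(wire) as {
  pid: number;
  setStdout?: (next: string) => void;
};
console.log(typeof received.setStdout); // undefined
```

A function is a reference into the sender's memory, so a format that carries only data has nothing to put on the wire for it.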

But what if this arbitrary limitation did not exist? What if the remote function behaved exactly like a local one, meaning you could write React hooks, access context from parent components, and set state?

Then we would have an app that feels single-threaded, with all state controlled inside React. We could throw away 90% of the useEffects and event listeners that exist only to access trivial data we are forced to send over a message API.

But is this possible? How could you run the remote procedure in the context of the website's React instance? These seem mutually exclusive.

If we drew a React tree, it might look like:

[Figure: a simple React component tree]

It has nodes (the component instances) and pointers to other component instances.

But, what are the pointers?

Each one is a JavaScript object reference, which is an abstraction over a pointer to some location in memory. This means the React component tree is already separated by some distance. But as programmers we don't really care, because the API is designed so that the data structure feels continuous.

So, can we use other boundaries than "separation in RAM/cache"?

Why not a network boundary instead, where a pointer is just a URL?

[Figure: two React component trees connected by a network boundary]

Inside <ServerApp/> would be a valid instance of react running. Any functions called by <ServerApp/> that originated in a parent component would just be a remote procedure call to that function on the original website (all abstracted away from the caller).

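What this means is you could write the stdout example from earlier as an ordinary component: <ServerApp/> reads the process's stdout in the local process and calls the setStdout it was handed by its parent on the website.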

And... it would just work?
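One way to picture the mechanism, as a rough sketch and not kyju's actual implementation: a function is sent across the boundary as a small handle (the "pointer"), and the receiving side wraps that handle in a stub that sends a call message back. All names here (`toHandle`, `toStub`, `receiveCall`, `overTheWire`) are hypothetical, and the boundary is simulated with a JSON round trip in a single process.

```typescript
type Callback = (...args: unknown[]) => void;
type Handle = { $ref: number };
type CallByRef = { $ref: number; args: unknown[] };

// Website side: a table of live functions, addressed by number.
const liveCallbacks = new Map<number, Callback>();
let nextRef = 0;

function toHandle(fn: Callback): Handle {
  const ref = nextRef++;
  liveCallbacks.set(ref, fn);
  return { $ref: ref };
}

// Website side: runs when a "call this handle" message arrives.
function receiveCall(message: CallByRef): void {
  liveCallbacks.get(message.$ref)?.(...message.args);
}

// Simulated boundary: only JSON text gets across.
function overTheWire<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Local-process side: turn a handle back into something callable.
function toStub(handle: Handle): Callback {
  return (...args) => receiveCall(overTheWire({ $ref: handle.$ref, args }));
}

// Website: state plus its setter, standing in for React state.
let stdoutLines: string[] = [];
const appendStdout: Callback = (line) => {
  stdoutLines = [...stdoutLines, String(line)];
};

// The setter crosses the boundary as a handle, not as a function.
const handle = overTheWire(toHandle(appendStdout));

// Local process: call it as if it were local.
const remoteAppendStdout = toStub(handle);
remoteAppendStdout("build started");
remoteAppendStdout("build finished");

console.log(stdoutLines); // [ 'build started', 'build finished' ]
```

The caller on the local-process side never sees the handle or the message; it only sees a function, which is what lets the two trees feel like one.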

I want to emphasize that this is designed for communicating with a stateful local process, where client:server is 1:1 (the arbitrary boundary described earlier). This allows us to not be burdened by UX concerns from communicating over a large network boundary. Crossing the boundary only costs us on the order of microseconds.

Abstracting messy platform APIs

Most web APIs used for building web applications are highly standardized and available in all browsers. This was not always the case. Libraries like jQuery standardized and simplified many browser APIs, and jQuery's past adoption shows how useful that was.

While browser standardization of many APIs has made jQuery less useful, that standardization has not come for what I call "devtool" APIs: APIs meant for accessing debug data in the browser. This ends up forcing you to become a browser expert to implement cross-platform devtools.

The APIs that are standardized are riddled with footguns. For example, if you wanted to make a devtool that overlaid something on HTML elements on the user's website, how would you measure an element's position to lay out the overlay?

The most common answer is element.getBoundingClientRect(), but if anything has invalidated layout since it was last computed, each call forces the browser to synchronously recalculate layout before it can return.

This becomes a massive bottleneck very fast, as forced relayouts can be extremely expensive. The most performant solution is unintuitive: use IntersectionObserver and asynchronously wait for an entry from the API, which contains the element's position in its boundingClientRect. The browser computes that rectangle during its regular rendering update, when layout is already current, so no extra layout pass is forced.
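A minimal sketch of that technique follows. `measureRect` is a hypothetical helper name, not a kyju export. The small structural types exist only so the snippet compiles without DOM typings; in a browser the second parameter defaults to the real IntersectionObserver, whose first callback after observe() delivers an entry with boundingClientRect.

```typescript
// Structural stand-ins for the parts of the DOM API this sketch touches.
interface RectLike { x: number; y: number; width: number; height: number }
interface EntryLike { boundingClientRect: RectLike }
interface ObserverLike { observe(target: unknown): void; disconnect(): void }
type ObserverCtor = new (callback: (entries: EntryLike[]) => void) => ObserverLike;

// Resolve with the element's rectangle without calling getBoundingClientRect().
function measureRect(
  element: unknown,
  Observer: ObserverCtor = (globalThis as any).IntersectionObserver,
): Promise<RectLike> {
  return new Promise((resolve) => {
    const observer = new Observer((entries) => {
      // One measurement is enough, so stop observing right away.
      observer.disconnect();
      resolve(entries[0].boundingClientRect);
    });
    observer.observe(element);
  });
}

// Browser usage (assumes an element with id "target" exists):
//   const rect = await measureRect(document.querySelector("#target"));
//   overlay.style.transform = `translate(${rect.x}px, ${rect.y}px)`;
```

The trade-off is latency: the rectangle arrives on a later rendering update instead of immediately, which is usually acceptable for an overlay that is redrawn every frame anyway.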

Once again, you are forced to become a browser expert to implement a fairly standard feature. These problems are everywhere when you approach web APIs from the perspective of a devtool.

Kyju's utilities solve these problems for you internally.