Deploy Next.js APIs Effortlessly! πŸš€

I Found a Faster Way to Build Next.js APIs

Estimated read time: 1:20

    Summary

    In the video, the creator discusses a method for deploying Next.js APIs using Cloudflare Workers, focusing on performance and cost-effectiveness. By keeping the API server close to the data source, reducing dependencies, and using a lightweight runtime, he achieves impressive results. The speaker compares three deployment approaches, notably a new experiment with JStack, and highlights how splitting the API into routers, each deployed to its own Cloudflare Worker, boosts efficiency, enabling faster and cheaper deployments of Next.js applications.

      Highlights

      • Discovered a method to efficiently deploy entire Next.js backends. πŸ§‘β€πŸ’»βœ¨
      • Evaluated various approaches for optimizing API speed and cost. πŸ’ΈπŸ”
      • Highlighted three key factors affecting API latency: location, dependencies, and runtime. πŸ“πŸ”—β±οΈ
      • Compared three deployment methods, revealing interesting performance findings. πŸ“ŠπŸ”
      • Experimented with JStack for improved API deployment speed and simplicity. πŸš€πŸ†•

      Key Takeaways

      • Deploy Next.js APIs piece by piece using Cloudflare Workers for high performance and low cost. πŸš€
      • Ensuring geographical proximity between API servers and data sources boosts API speed. πŸŒπŸ’‘
      • Minimizing dependencies can significantly enhance API performance. πŸƒβ€β™‚οΈβš™οΈ
      • Cloudflare Workers offer a lightweight runtime for efficient serverless functions. ☁️⚑
      • Separating API components using routers increases deployment efficiency. 🧩✨

      Overview

      Yesterday, I stumbled upon an innovative approach for deploying Next.js backends, piecing them together with Cloudflare Workers for both high performance and low running costs. This whole endeavor started as an experiment, and the benchmark tests showed that it yielded promising results. I delved into key factors like the region of deployment, the number of dependencies, and the choice of runtime and framework to achieve optimal API efficiency.
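      To make the region factor concrete with illustrative numbers (my own example, not figures from the video): if serving one user request triggers four round trips between the API and the database, an API sitting 2 ms away from its database adds only about 8 ms of internal latency, whereas one sitting 40 ms away adds about 160 ms, far more than the single user-to-API hop costs. That is why the API should live next to the data source even if the user is on another continent.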

      In exploring how to make APIs faster, I weighed the importance of geographical proximity between the API and its data source, the necessity of minimizing dependencies, and the value of a lightweight runtime such as Cloudflare Workers that processes requests quickly. Through repeated testing, I examined an empty Cloudflare Worker, a JStack Cloudflare Worker, and a Vercel Edge Function, each offering unique insights into speed and efficiency.
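      To picture the "empty Cloudflare Worker" baseline mentioned above, here is a minimal sketch of what such a worker could look like, assuming Hono on the Workers runtime; the route path and response text are illustrative, not taken from the video:

```ts
import { Hono } from "hono";

// Minimal "empty" worker: one route, no database, no extra dependencies.
const app = new Hono();

app.get("/", (c) => {
  // Send back a tiny cached text response and nothing else.
  c.header("Cache-Control", "public, max-age=60");
  return c.text("ok");
});

// Cloudflare Workers expect a default export with a fetch handler;
// a Hono app satisfies that interface directly.
export default app;
```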

      One particularly interesting outcome was discovering that segmenting API components into routers, each deployed to a separate Cloudflare Worker, improved deployment efficiency immensely. As it stands, my method showed that it's possible to boost speed and reduce costs drastically while maintaining or even improving developer experience with tools like Hono, a lightweight web application framework. I'm eager to continue refining this stack for future Next.js projects!
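      The router idea can be sketched roughly like this; note this is an illustrative approximation and not JStack's actual API, and the postRouter name, route path, and client call are hypothetical:

```ts
import { Hono } from "hono";

// One router per concern; each router is deployed to its own Cloudflare Worker,
// so a request for posts never loads dependencies used only by other routers.
const postRouter = new Hono();

postRouter.get("/recent", async (c) => {
  // Stand-in for the database query the video describes.
  const recentPost = { id: 1, title: "Hello world" };
  return c.json(recentPost);
});

export default postRouter;

// On the front end, a type-safe client exposes the router's methods in an
// RPC style, roughly in the spirit of what the video shows (illustrative only):
//   const post = await client.post.recent.get();
```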

      I Found a Faster Way to Build Next.js APIs (Transcription)

      • 00:00 - 00:30 So yesterday I found a way to deploy your entire Next.js back end piece by piece, in a super performant and extremely cheap-to-run way. It was all just fun and games, me experimenting with different approaches, until in the benchmarks I realized how well this experiment actually turned out to work. So the implementation finally works, the third benchmark is running right there in the background, and if this works, this is looking very promising. Now, to understand which three approaches I just benchmarked there, we're going to get to
      • 00:30 - 01:00 the results. Let's understand what makes a, in parentheses, Next.js, but this could be any other service as well, API fast. What increases performance, for example in Next.js? I would argue the first thing is region: where is your API deployed? That's one parameter we need to get right if we want to make a very fast API that serves user requests in milliseconds, because one caveat that you need to keep in mind when you do region is that a global API is not always the best
      • 01:00 - 01:30 option. What you always try to do in a fast API is keep your data source and the thing that fetches from the data source, which is your API, geographically as close together as possible, and in that you actually accept that the distance between your user and the API can be much larger. So why is this important, this kind of architecture? Right, if your API and your database are really close together: what usually happens in a web app or any app is that you make multiple requests from your API
      • 01:30 - 02:00 to your database and back to serve one request, whereas the user oftentimes just requests one thing from your API and then gets back the response. So the closer you can have your API and your database, the faster your API is going to be in the end, the less latency you'll have. And if the user is even closer to your API, like everything here happens in, let's say for example, Europe, that's great. But even if the user is in the US and your database and API are both in Europe, then the latency is still going
      • 02:00 - 02:30 to be pretty, pretty quick for your API. The second thing that makes an API, for example in Next.js, really fast is the dependencies, and I don't mean a lot of dependencies, I mean the opposite: it's having as few dependencies in your code as possible. I don't think a whole lot of explanation is needed here; you only want to load the dependencies per request that you actually need to serve that request, right? How do we achieve that? We're going to get to that part, it's pretty cool. And then the last thing, of course this is not an exhaustive list, but you get the point, these are really important factors when it comes to
      • 02:30 - 03:00 API latency, is having a lightweight runtime and also a framework that supports that lightweight runtime. So in my experiment here, that lightweight runtime is called Cloudflare Workers. It's a super cheap, super performant, right here, "exceptional performance, reliability and scale", way to run your serverless functions, very similar to AWS Lambda that Vercel deploys to under the hood, or they also deploy to Cloudflare Workers under the hood if you opt into that in Next.js. Your code runs within
      • 03:00 - 03:30 milliseconds of your users worldwide. That ties into the region argument here: where does your code run? Close to the user, good, but it should also run as close to the data source as possible. And secondly, you want a framework that supports that lightweight runtime, which ideally doesn't add any overhead itself, and I found that with Hono, a web application framework built on web standards that supports any JavaScript runtime, for example the Cloudflare runtime, with a super lightweight regular
      • 03:30 - 04:00 router. Because not only does Hono have first-class TypeScript support, but the router, so basically the system behind serving a request, right, imagine the user has an incoming request, routing that request to the handler that's going to serve it and then sending back the answer, this should happen as quickly and as lightweight as possible, right, and Hono lets us achieve that. So that's the bigger picture of what makes an API fast, and now for the experiment that I made: I wanted to implement a way that ties into all three points, that has region control,
      • 04:00 - 04:30 fewer dependencies for each request served, and a super lightweight runtime and router to serve the request, right? I experimented with all three and did a benchmark of three different approaches. The middle one is my experiment where I wanted to try out how fast I can make an API, and I compared it to two things. First off, an empty Cloudflare Worker, a super simple, stupid worker that literally only does one thing, and that's this right here: it's an app.get, whenever
      • 04:30 - 05:00 we make a request to this API, we simply send back a text that is even cached, and that's it, right? This should be the fastest possible way to run in Cloudflare Workers; it should not get faster than this, because there is no logic involved, no database, nothing, right? So this is the perfect scenario, the ideal speed we could achieve at the maximum, no matter what we do. I compared that to my approach, the JStack Cloudflare Worker. If you're wondering, JStack is basically a way to run cheap, fast Next.js projects,
      • 05:00 - 05:30 like a boilerplate, starter code that you can simply use to deploy your Next.js projects to Cloudflare Workers. And I compared that to a Vercel Edge Function; if you don't know what that is, basically it's a regular Next.js API endpoint that is also deployed to Cloudflare Workers, so I argue we are comparing apples to apples here, this all runs on the Cloudflare lightweight edge runtime. And now let's see how the performance compares. So I compared three different things right here. First off, it's the empty
      • 05:30 - 06:00 worker, so the best possible speed we could possibly achieve. Second off, it's my JStack approach that I'm experimenting with right here, and then lastly it's the Vercel Edge Function. There's also a Vercel Node function, but that's not comparing apples to apples, because that actually runs in the US by default and I'm in Europe, and it also runs in the Node.js runtime and not Cloudflare Workers, so I'm excluding that from the comparisons. And because making like one request here is not very scientific, I prepared a script that simply makes 500 requests and then takes the median, the
      • 06:00 - 06:30 average, the percentiles and so on. I'm open-sourcing the script, so, I might not be a benchmark expert, but you can look at the script, it's completely open source, fully transparent, I have nothing to hide. I want you to see the actual code that I'm running here to compare these three functions together, um, so you can run your own benchmarks if you want, that's completely cool (a rough sketch of such a script appears after the transcript). We're going to start with the Vercel Edge runtime. I'm going to paste the link in here, and that's going to make, right here, 500 iterations
      • 06:30 - 07:00 and also give us the progress. So right now it's actually spamming the Vercel API and noting down each of the results that we're getting back, and once that's done, we're going to get a table of results. Here we go: benchmark complete, 100% progress, 500 total requests, zero failed requests, the average latency being 48.3 milliseconds. That's pretty good; all of these are stupid API routes just sending back a single payload, no database interaction whatsoever, with a median latency of 43 milliseconds,
      • 07:00 - 07:30 minimum 32, max 270. These are pretty good statistics. Now let's compare that, I'm very curious how my new experiment will do, right, the JStack Cloudflare Worker. And now, like, to be fully transparent, I've already run the benchmarks once, so I kind of expect something to happen here, but let's try this together: how does this new approach compare to the standard Vercel function that you get if you don't really put a lot of thought into this? Let's see, let's copy over the API route that I have right here and
      • 07:30 - 08:00 paste it in, and that's going to run 500 iterations, just on a different API route that looks a bit more complicated because, you know, I haven't gotten to that part yet, but it does the exact same thing. 500 requests, and it's going to give us the results here in just a second. So, dude, let's see how we compare: 500 total requests, average latency 36.37 milliseconds, and where were we on Vercel? At 48.3 milliseconds. Okay, so that's like, what, average
      • 08:00 - 08:30 36.4, so that's a 12 millisecond improvement, that's like, what, 20%, 25% or something, that's pretty cool. Why is there such a difference? For example, because Vercel adds logging themselves; they want to bill you for these requests, so they need to know how many requests you make, they wrap your function, um, at least that's how I assume it works, and then add some logging overhead to it, which is about 12 milliseconds in this case. Is this the most scientific research method? I don't know, you tell me,
      • 08:30 - 09:00 500 requests I think are a reasonable amount to test with, a minimum latency of 25 and a max of 224. So this is already faster than just deploying to Vercel, but how fast could we actually get? What about the completely stupid empty worker, that is like the ideal scenario, it doesn't get dumber and faster than that, how does that compare? Well, let's try it, let's run the benchmarks on that super stupid demo empty worker and see how it
      • 09:00 - 09:30 turns out. All right, 450 requests and it's about to finish, let's see: average latency 40.3 milliseconds, so that is higher than the JStack one. I'm surprised, I'm surprised, it should be lower. Now I'm not going to cherry-pick this, I want you to see the actual first time I'm running this together with you; now, I ran these like an hour ago, but I'm not going to cherry-pick this, I'm going to leave this in the video because I don't
      • 09:30 - 10:00 want to distort anything. This actually seems slower than my JStack, that is unusual, I just want to run this again to verify, because usually this should be, like, this should be the fastest one, right, because it's the most simple one; the JStack one also adds a little bit of stuff, but not as much as Vercel does. So what we really expect to happen, let's run, let's let this run in the background, is the following order, right: we expect this to be the fastest, this to be the second fastest, because I implemented an architecture that I'm about to show you,
      • 10:00 - 10:30 and this to be the slowest, because Vercel adds logging overhead that JStack doesn't, right? And let's verify that, let's look at this example: average latency 38.55, and how much did we get on JStack? 36.37. I find this very interesting, I wonder why this could happen, um, because network latency or network bandwidth shouldn't be a big issue. So the order that we actually get right now is this: we get my approach as the fastest, which,
      • 10:30 - 11:00 I mean, nice, but I'm a bit confused, we get the empty Cloudflare Worker as the second, and then the Vercel Edge Function as the third. Okay, anyways, millisecond difference or not, the bottom line is JStack and this architecture that we're using is doing pretty good in this example specifically. Why, why do we get these fast results? I want to show you that in code, because the architecture from here on is really, really interesting. I implemented this experiment yesterday and it works surprisingly well, and I
      • 11:00 - 11:30 want you to have some takeaways and learn this with me. So how it works is, basically, our back end is split into something called routers. This is a standard Next.js project, with the difference being we have a type-safe router system in here, which sounds abstract, but all that allows us to do is, let's go into a front-end component here, get our data type-safely from the back end using a type-safe API, like client.post., and then we always can see the available methods from our actual back end here on the front end with full
      • 11:30 - 12:00 type safety. This is pretty, pretty cool, and for example we can get the recent post by saying recent.get, and what this actually does is it does an RPC, a remote procedure call, to our post router, and it gets our most recent blog article from our database and simply sends that back to the front end, right? That's all that's happening here, it's really, really simple stuff. And the router system allows us to now deploy each router to a separate Cloudflare Worker, so if a request comes
      • 12:00 - 12:30 in for a post, we only ever load the dependencies that are needed in the post router, and not, for example, the ones that are used in other routers, like the test router. These dependencies in the test router, these are pretty tame dependencies, but if there was, for example, Stripe in here, there's no need to load Stripe in the post router when we only need it inside of another router. You get the point: it ties into loading as few dependencies as possible, that's how we make this performance possible. And now comes the super cool
      • 12:30 - 13:00 thing that I implemented yesterday that I want to show you: this deploy command. As soon as I hit enter now, what's going to happen is our entire back end, all of our routers that we have in the stack, are deployed to separate Cloudflare Workers. Let me zoom out here a bit and show you the entirety of the console: we found two routers, the test and the post router in this case, we generate code dynamically for each router and then deploy each router to a separate worker by building it and then deploying it. We can see the size that each router has,
      • 13:00 - 13:30 the gzip size that's actually put onto Cloudflare, as well as the startup time, the cold start time, which is really, really low on Cloudflare, that this API now has, same as the other router, right? Each router now maps to a separate worker in Cloudflare, and that's how we make this so, so fast: if we get a request for a post from our database, no dependencies that live in any other router need to be loaded, and all of this is served from a super fast, super cheap
      • 13:30 - 14:00 runtime. How cool is that? So just to show you the behind-the-scenes, right: what happens when we run the deploy command is there's a dist folder being created with the actual router content that uses Hono, the lightweight framework for Cloudflare Workers, under the hood, and registers a single API route mapping to the router, right (a rough sketch of this bundling step appears after the transcript). We build this into JavaScript using esbuild, so in the post.js we can see the actual minified JavaScript that's output. This is, you can see, 26 lines of code, but really
      • 14:00 - 14:30 it's only eight; the entire thing is completely inlined and minified for, you know, the best performance on Cloudflare, so it looks really ugly here, we have these weird, like, names, because that's minified code that is actually pushed to Cloudflare for maximum possible performance. And that's all the code that this router needs to run, right, the post router, that was the code for the post router: all type-safe middlewares, the regular expression router from Hono, the actual
      • 14:30 - 15:00 logic that handles your requests and serves them back to the front end, all of that is now in this bundle deployed to the workers, right, in 90 kilobytes here or 85 kilobytes here, with a startup time in the millisecond range. I'm having a lot of fun just tinkering with this and, you know, pushing the question of really, realistically, how fast can we make an API, like, what's possible, what can I work on, how can I pack it into an open-source tool that you guys enjoy using and then publish it as my own stack,
      • 15:00 - 15:30 because I think that's a really cool thing, right, we can all just have faster APIs. So during the experiment I realized two things: first off, this approach of deploying code is pretty nice, it works, and the second thing is API latency does not have to correlate one-to-one with developer experience. Usually, the better the developer experience, the worse the performance, because there are more abstractions on top of the code. In the stack that I'm developing, I genuinely aim for this to be the nicest way to start a Next.js project.
      • 15:30 - 16:00 Is it currently? Probably not, but I want it to be one day, that's my goal for this, and I think this approach of deploying your code takes us a big step in the right direction. Now, if you want to roast my benchmarking approach, or you have comments to leave on this stack, what you think about it, if you like it, maybe you don't like it, then please do so in the comments. And that's going to be it for this video, I'm going to see you in the next one, until then, have a good one and bye-bye.
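      The benchmark script referenced in the transcript is open source; the following is only a rough, independent sketch of the same idea (N sequential requests, then average, median, and p95), not the creator's actual script. The URL is a placeholder:

```ts
// Rough benchmark sketch: fire N requests at one URL and report latency stats.
const url = "https://example.com/api/test"; // placeholder, replace with the endpoint to test
const iterations = 500;

async function run(): Promise<void> {
  const latencies: number[] = [];
  let failed = 0;

  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    const res = await fetch(url);
    await res.text(); // consume the body so timing covers the full response
    if (!res.ok) failed++;
    latencies.push(performance.now() - start);
  }

  latencies.sort((a, b) => a - b);
  const avg = latencies.reduce((sum, v) => sum + v, 0) / latencies.length;
  const median = latencies[Math.floor(latencies.length / 2)];
  const p95 = latencies[Math.floor(latencies.length * 0.95)];

  console.log({
    total: iterations,
    failed,
    avg: avg.toFixed(2),
    median: median.toFixed(2),
    p95: p95.toFixed(2),
    min: latencies[0].toFixed(2),
    max: latencies[latencies.length - 1].toFixed(2),
  });
}

run();
```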
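      And to give a feel for the per-router bundling step described near the end of the transcript, here is an independent sketch of how each router could be bundled into its own minified file with esbuild before being uploaded as a separate Cloudflare Worker. This is not JStack's actual deploy code; the router list and file paths are hypothetical, and the actual upload (for example via Wrangler) is omitted:

```ts
import { build } from "esbuild";

// Hypothetical list of routers discovered in the project.
const routers = ["post", "test"];

for (const name of routers) {
  // Bundle each router into one minified, inlined file so the resulting
  // Cloudflare Worker ships only the dependencies that this router uses.
  await build({
    entryPoints: [`src/server/routers/${name}.ts`],
    bundle: true,
    minify: true,
    format: "esm",
    outfile: `dist/${name}.js`,
    // Workers run on V8 with web-standard APIs rather than Node built-ins.
    platform: "browser",
    conditions: ["worker"],
  });
}
```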