The BEST AI setup for students and faculty is way cheaper than you think
Estimated read time: 1:20
Summary
In this video, Noah from Learn Meta-Analysis shares his perspective on AI tools and hardware for students and faculty as of April 2025. He draws on his experience with both local and web-based AI, emphasizing the value of using existing hardware and tools like Open WebUI to manage local models and connect to APIs. Noah argues that the hardware most people already own is sufficient for local AI applications and that more expensive upgrades might not offer substantial benefits. He also highlights ChatLLM as a low-cost way to access a variety of AI models, suggesting that flexibility and cost-effectiveness should guide decisions for resource-constrained users.
Highlights
Noah praises Open WebUI for easily managing local and web-based AI models. 🌟
He debates whether upgrading his GPU offers enough benefits for the cost. 💵
Most people's current computers are adequate for handling local AI models. 💻
Noah uses AI to assist with emails, rewriting, and other tasks. 📧
Free APIs help avoid the need for expensive AI model subscriptions. 🚫💸
ChatLLM lets users access a variety of models from one platform. 🔎
Key Takeaways
You don't need the latest high-end hardware for effective AI use. 😊
Open WebUI is an excellent tool for managing local AI models and connecting to APIs. 🌐
Existing hardware is often sufficient for running local AI applications. 🛠️
Connecting to free APIs can provide access to a wide range of AI models without cost. 💰
ChatLLM offers flexibility by providing access to multiple AI models. 🔄
Overview
Noah from Learn Meta-Analysis dives into the best AI setups currently available for faculty and students, emphasizing his love for Open WebUI. This tool allows seamless interaction with both local AI models and external APIs without needing top-of-the-line hardware. Instead of chasing high-performance GPUs, Noah suggests leveraging the computing power you already possess, which is often more than adequate for typical academic needs.
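To make the "local models and external APIs behind one interface" idea concrete, here is a minimal Python sketch of the pattern Open WebUI builds on: the same OpenAI-compatible chat call can point at a local Ollama server or at a hosted endpoint just by swapping the base URL and key. The specific model names, URLs, and the locally pulled model are illustrative assumptions, not taken from the video.

```python
# Minimal sketch: one OpenAI-compatible client, two backends (local vs. hosted).
# Assumes Ollama is running locally with a model already pulled (e.g. "granite3.3:8b")
# and that the hosted provider exposes an OpenAI-compatible endpoint; both are assumptions.
from openai import OpenAI

def ask(base_url: str, api_key: str, model: str, prompt: str) -> str:
    client = OpenAI(base_url=base_url, api_key=api_key)
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Local, private: Ollama's OpenAI-compatible endpoint (no real key needed).
print(ask("http://localhost:11434/v1", "ollama", "granite3.3:8b",
          "Rewrite this sentence more formally: gotta finish the draft tonight."))

# Hosted: same code, different endpoint and key (hypothetical placeholders).
# print(ask("https://api.mistral.ai/v1", "YOUR_API_KEY", "mistral-small-latest",
#           "Summarize this abstract in two sentences."))
```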
He discusses whether to upgrade his current GPU but concludes that the cost may not justify the performance gains. Larger GPUs like the RTX 3090 can handle more substantial models, but their necessity is questionable for his regular tasks, which involve running 8 to 16 billion parameter models. For users who need AI for academic activities like rewriting papers or answering emails, existing hardware should suffice.
Moreover, Noah shares that ChatLLM is an essential tool in his AI arsenal, offering a way to explore various large language models without locking into an expensive subscription. In the rapidly evolving landscape of AI, flexibility and economic considerations become crucial, making ChatLLM a good choice for those with budget constraints.
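Since the overview leans on cost-effectiveness, here is the back-of-the-envelope comparison the transcript later walks through, written as a tiny Python sketch: a one-time GPU purchase expressed in months of a flat-rate subscription. The prices are the ones quoted in the video (roughly $1,000 for a used RTX 3090 and $10/month for ChatLLM); swap in your own numbers as needed.

```python
# Back-of-the-envelope: how many months of a subscription equal a one-time GPU purchase?
# Figures quoted in the video: ~$1,000 for a used RTX 3090, $10/month for ChatLLM.
gpu_price_usd = 1000.0
subscription_usd_per_month = 10.0

months_of_subscription = gpu_price_usd / subscription_usd_per_month
print(f"${gpu_price_usd:.0f} buys {months_of_subscription:.0f} months "
      f"(~{months_of_subscription / 12:.1f} years) of the subscription.")
# -> $1000 buys 100 months (~8.3 years) of the subscription.
```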
Chapters
00:00 - 00:30: Introduction to Video Topic In this introductory chapter, Noah discusses what he considers the best setup for utilizing AI in 2025, focusing on both local and web-based options for students and faculty. He emphasizes his preference for Open WebUI, highlighting its ability to integrate with various external and local AI models, making it a versatile and powerful tool for private and educational purposes.
00:30 - 01:30: Discussion on Local Models and Hardware The chapter delves into the practicality of using local models and their dependency on hardware capabilities. It highlights a common question about whether you need a powerful GPU like a 3090, 4090, or even 5090, the cards often showcased by YouTubers. However, it emphasizes that an 8 GB GPU can also be effective, suggesting that extreme hardware isn't always required for running local models.
01:30 - 02:30: Hardware Considerations for AI The chapter "Hardware Considerations for AI" explores whether advanced hardware is actually necessary for AI tasks, focusing particularly on students and faculty. The key argument is that the best hardware for working with large language models (LLMs) is what users currently possess, and that this is usually sufficient for learning and applying AI effectively. The chapter emphasizes that investing in newer and more powerful components isn't always necessary for educational purposes.
02:30 - 04:00: NVIDIA GPU Choices and Costs The chapter discusses NVIDIA GPU choices, focusing on the host's recent decision-making about upgrading his hardware. He had reserved the NVIDIA Spark (formerly Project DIGITS) but expresses uncertainty about proceeding with the purchase, highlighting the difficulty of choosing the right GPU amid changing product names and offerings.
04:00 - 05:00: Private AI and Software Use Cases The chapter discusses the narrator's consideration of a refurbished 4060 Ti graphics card with 16 GB of VRAM, described as "cheap" at a little under $500 despite being refurbished. He currently owns a standard 4060 with 8 GB of VRAM. He contemplates doubling his VRAM capacity for less than $500 and potentially offsetting some of the cost by selling the existing card for $200 to $250.
05:00 - 06:30: Exploration of Different AI Models The chapter discusses the practical limitations and considerations of adopting different AI models, particularly in terms of hardware requirements and cost. The narrator reflects on the utility of upgrading to 16 GB of VRAM, weighing the benefits against the financial implications. He browses specific model sizes on Ollama and notes the constraints on reaching the compute needed for substantial models, like the mentioned 70 billion parameter model, within a reasonable budget.
06:30 - 08:00: Changing Perspectives on AI Services The chapter explores the varying capabilities and performance of AI models, particularly focusing on DeepSeek. It discusses how different parameter sizes fit on personal hardware, such as an 8 GB graphics card. While smaller models run efficiently, the 14 billion parameter version begins to push the limits of this hardware, spilling slightly into RAM, and the 32 billion parameter version is impractical for regular use on such a setup. This highlights the challenges of scalability and performance when running advanced models on consumer-grade machines. (A rough version of this VRAM arithmetic is sketched after the chapter list.)
08:00 - 09:30: Conclusion and Final Thoughts In the concluding chapter, the author compares different models and specifically highlights Mistral Small, one of his favorites despite its larger size of 15 GB. He notes that Mistral Small should, in theory, fit within 16 GB of VRAM at Q4 quantization, which would let him run a model he likes locally. He then reflects on whether Mistral Small is notably better than newer, smaller models.
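As a rough companion to the chapters above, here is a small Python sketch of the rule-of-thumb arithmetic behind "does this model fit in my VRAM": a Q4-quantized model weighs very roughly 0.6 GB per billion parameters, plus some headroom for the KV cache and runtime. Those two factors are assumptions for illustration, not numbers from the video; the download sizes Noah reads off the Ollama site are the better guide when you have them.

```python
# Rough rule of thumb: a Q4-quantized model takes ~0.6 GB per billion parameters,
# plus a little headroom for the KV cache and runtime. These factors are assumptions
# for illustration; the file sizes listed on the Ollama site are more reliable.
GB_PER_BILLION_PARAMS_Q4 = 0.6
OVERHEAD_GB = 1.5

def fits_in_vram(params_billion: float, vram_gb: float) -> str:
    est_gb = params_billion * GB_PER_BILLION_PARAMS_Q4 + OVERHEAD_GB
    verdict = "fits" if est_gb <= vram_gb else "spills into system RAM"
    return f"{params_billion:>5.1f}B ~= {est_gb:4.1f} GB at Q4 -> {verdict} on a {vram_gb:.0f} GB card"

for size in (2, 8, 14, 24, 32, 70):
    print(fits_in_vram(size, vram_gb=8))  # the 8 GB card discussed in the video
```

With these assumed factors, the 8B and 2B models fit comfortably on an 8 GB card, the 14B model spills slightly into system RAM, and the 32B and 70B models do not come close, which matches the sizes quoted in the transcript.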
The BEST AI setup for students and faculty is way cheaper than you think Transcription
00:00 - 00:30 Hey everybody, this is Noah, and I want to talk to you guys today about what I think is the absolute best setup for AI, local and web-based, for students and faculty right now, at the end of April 2025. As you guys know, I am a huge fan of Open WebUI. I have it hooked up to a number of external models, and I also have a whole bunch of local models for private AI use. So, what do I think about this? I think Open WebUI is an absolutely phenomenal way to use
00:30 - 01:00 local models. I think it's a great way to connect to APIs, but let's talk about hardware for a minute. You probably know that with local models, you're constrained by the hardware you have. So the question that often comes up is: do I need a 3090, or a 4090, maybe even a 5090? Because that's what you see all the YouTubers have, right? That's what you see everybody have. Well, here's the thing. Here's what I want to show you. I have a GPU with 8 gigabytes of VRAM. I've been thinking about this a lot. I've
01:00 - 01:30 been thinking about: do I need more? Do I need more VRAM? Do I need more compute? Do I need more power? What will it actually get me? And here's where I've landed. The absolute best hardware for LLMs is what you have right now. That is my opinion: what you have right now. So let me walk you through why I think this is the case. I'm going to talk about this from a use perspective for students and faculty, and I'm also going to talk about this from a hardware
01:30 - 02:00 technology and LLM perspective. Let's talk about hardware technology first. As you guys know, a few weeks ago I did a video on how I reserved the NVIDIA Spark, or Digits, or whichever one of those things they're calling it now. It was Project DIGITS; now it's the Spark. So, you saw a video on how I reserved the Spark. The reality is, I don't think I'm going to buy it. So I've been thinking: should I upgrade my GPU? Let me give you an example of what I've been thinking about. One of the cards
02:00 - 02:30 that I've been thinking about is this 4060 Ti. As you know, I have a regular 4060, and it has 8 GB of VRAM. This one here has 16, and it's a little less than 500 bucks. Yes, it's refurbished. It's probably not the best thing out there, but it's quote-unquote cheap, right, at less than 500 bucks. So I was thinking, I can double my VRAM for 500 bucks, and I can probably sell my current 4060 for maybe 200 or 250 bucks. So, in the end, I'd probably only
02:30 - 03:00 end up spending like $250. But in reality, when I started thinking about it, what does 16 gigabytes get me that I don't already have, right? And the reality is, I don't think it actually gets me that much, because when you look at some of these models... let me pull up Ollama. So when you look at some of these models, right? Let's see what it's got. Llama 3.3, that's the 70 billion. I'm never going to be able to reach that, right, for a realistic amount of money. 43 gigs. No,
03:00 - 03:30 just not happening. DeepSeek R1. Lots of people love DeepSeek. Well, let's look at some of these tags here, right? The seven or eight billion parameter versions, those fit within my 8 GB graphics card. Look, four or five gigs. As soon as we bump up to 14B, we're now exceeding it, but this will still run well on my machine, because it's only 9 gigs. So, only one gig spills over into RAM. But then when we get up to 32, no, it's just going to be really, really slow, right? So, in reality, what does this get me? I can already run the 14. I can't
03:30 - 04:00 run the 32. Now, let's go look at a different model. What about Mistral? Well, Mistral Small, as you know, is one of my all-time favorite models, but it's 15 GB here. So I could technically barely squeeze this into that 16 GB of VRAM at Q4. It should, in theory, fit. So, that's good. I would be able to run Mistral Small. That would make me happy. But is Mistral Small notably better than... let me pull up some of the new models here. Is it notably better
04:00 - 04:30 than Granite 3.3 8B, which is a 5 gigabyte file? I don't know. It might be. But I can tell you this right now: if I ask myself whether I want to spend 500 bucks to run Mistral Small locally instead of using the 100% free API, the answer is no. That was my answer today. Today my answer was no. So, the biggest model that I run on a regular basis on my personal machine is Phi-4. So
04:30 - 05:00 let's jump over to that real quick. Okay, that's annoying. Let's go to popular. There we go. So Phi-4 here is a 14 billion parameter model. It runs pretty well on my machine, and it's only 9 gigs. So I'm pretty happy with how Phi-4 runs. I don't feel that I actually need a bigger GPU to run Phi-4 well on my machine. I'm quite happy with what I have. So where does this land me? It lands me in what I was thinking about today: do I want to spend 500 bucks to essentially be able to run only like 24 billion parameter models
05:00 - 05:30 locally, when I can currently run 16 billion parameter models locally? I won't realistically be able to reach that 32B, I don't think. So, let's just take a look real quick. Let's look at some of these Llamas, because... oh, no, they don't even come in 32B. I thought they did. So, QwQ, that's one that's popular. QwQ comes in 32B. Let's check the size. 20 gigs. Yeah, see, that's not even going to fit in that 16 GB card. So, basically what this $500 card
05:30 - 06:00 would let me do is run one model that I really like locally, even though I can currently run it on the web. So, the other option is something like an RTX 3090. Let's just see what they're priced at right now on Newegg. So, 3090, we're looking at like two grand. That's a lot of money. So let's look at the lowest price here. And that's also an old card. Okay, so here we go, we're at 16. And yeah, I know some of you out there are going to say, "Yeah, you can get it on eBay for like a thousand." Yeah, you can get used ones on eBay for
06:00 - 06:30 like a thousand, but that's still $1,000. It's twice as much. And these are 24 GB cards. What can we fit on there? Well, we can fit QwQ on that card at 32B, but personally, I don't like reasoning models, so that doesn't really offer me any advantages here. So, long story short, where did I end up? Where I ended up was, I started thinking about my use case. What do I like private AI for? I like private AI for helping me with emails. I like private AI for helping me with rewriting things on my own personal
06:30 - 07:00 papers and such. And I like private AI for doing RAG. But here's the thing: I don't think I actually need much bigger models than what I run now. Think about that for a minute. There's a big gap, right? There are a couple different sizes of models. We have these tiny models, and I think they're generally terrible, so I don't use them. At least they're terrible for my use case. I'm sure they're good at specific things, but for my use case they're not good. I'm talking about the 1 billion, 2 billion, 3 billion parameter models. Most of those don't function
07:00 - 07:30 very well for me. The exception is Granite at 2 billion parameters, which does work pretty well for me. Outside of that, I haven't had the best luck with those. So the sweet spot for me, from what I've found, for small local models is between 8 and 14 billion parameters, because there are a number of models that come in those sizes. They tend to run well on my 8 gigabyte graphics card, and I don't have a current Mac, but I'd imagine that models that size probably run pretty well on a Mac, too. And that is why my thoughts right now are that the hardware
07:30 - 08:00 you own is probably better than the hardware you don't for most people's average use cases. Now, I know there are people who do fine-tuning. That's different. I know there are people who have to run these big models locally. That's different. So, what are my current thoughts? What am I currently doing? I am using Open WebUI for my local models, as I showed you before. I like it because I have a whole bunch of local models here, right? You can see I have a really nice list, and I have some of my custom models written here as
08:00 - 08:30 well. And then for external models, I have it hooked up to Mistral and I have it hooked up to Gemini. And then I have my own custom models based on those two things, right? So that gives me access to web-based ones right here in Open WebUI, and I don't pay for those. Those are free. However, they do use your data as training data. So I think that is a wonderful solution: the hardware you have to run your local small models, and then hook up these free APIs. You can also hook up paid APIs, right? So if you want to use Claude or
08:30 - 09:00 something, whatever your favorite flavor of LLM is, you can do the pay-as-you-go API and connect it to Open WebUI so you can have all your stuff in one place. Now, I don't currently do that. And here's the reason I don't currently do that and have a paid one: I know in a video a little while ago I was talking about how I was thinking about switching away from ChatLLM. In reality, what I realized was that I'm not going to cancel it yet. Okay? And why? That's because in my use case, I use all these
09:00 - 09:30 different models for different things, right? So, I really like Perplexity Pro when I do a search. I really like to use Claude for coding. I really like Gemini 2.5 Pro for a lot of different things. I don't know if you guys have tried that model yet, but it is sweet and it is really effective. So, I've been really, really happy with Gemini 2.5 Pro so far. And then the new GPT-4.1, honestly, I haven't tried it out that much yet. But my typical workflow, when I'm using these really big models, has traditionally gone from Claude to GPT, and now I'm finding I use Gemini
09:30 - 10:00 2.5 Pro. So, what's the reality of this? The reality is that I personally feel LLMs are accelerating at such a rate right now that I don't want to commit to one particular LLM. I don't want to commit to paying only for Claude, or only GPT, or only Gemini. I really like the flexibility that ChatLLM offers me. And I know you're probably thinking, "Noah, you kind of said the opposite in your last video about ChatLLM." Well, the reality is I want you guys to
10:00 - 10:30 know my honest thoughts about these things. And my honest thoughts about these things change over time, right? They change as I use things. They change as new models come out. They change as new hardware requirements come out. And I just think that right now we're at this point where so much is changing with LLMs. These things, you've got to remember, they're only about three years old, right? I think it was November of 2022 when ChatGPT first got released to the world. Think about where we are now compared to then, two and a half years later. Think about the innovation that's taken place. I don't know if you guys used the original ChatGPT, but personally I
10:30 - 11:00 thought it was really bad compared to most models that we have now. Take a lot of the local models, like Granite, for example: I personally feel that in a lot of ways Granite 3.3 8B is as good, if not better, than what I remember of the early versions of ChatGPT. Right? We are seeing amazing advancements in LLMs. I mean, even think about Gemini. I did not like Gemini personally. I didn't like it very much until Gemini 2.0, and at 2.0 they made some really great advancements, and now at 2.5
11:00 - 11:30 it's even better, and I didn't know that was even going to be possible. So as I've thought about what I want to spend my money on, and about using this for academic use... and like I mentioned before, I don't do fine-tuning quite yet, so that's a whole different conversation. Most of my uses are question answering, RAG, those types of things, writing code, all that kind of fun stuff. For that, I don't think I need different hardware than I have currently. And I think you probably don't need different hardware either, because if you have certain things that
11:30 - 12:00 you need to keep private, then there's probably a small model out there that'll work pretty well for you given your hardware constraints. And if you are like me and you don't want to subscribe specifically to one particular LLM, ChatLLM offers a really good opportunity here to try out all of these different models. And I mean, I honestly don't use DeepSeek, but it doesn't bother me that it's listed, because one day I might find something on DeepSeek I want to use. And then I don't use Grok. I haven't used Llama 4 Maverick
12:00 - 12:30 yet. I'm excited to do that; I just haven't done it yet. I like that they keep these things updated with the most recent models, and I don't have to buy another service to go get that. Now, ChatLLM has some pretty cool other things coming along as well. I don't want this to be like a marketing video for ChatLLM, so I'm not going to go through all the other features that it has right now. At some other point, whenever I do my next monthly review, or couple-month review... I'll probably do one at six months out from the initial video, so that's probably next month, coming up
12:30 - 13:00 here in a couple weeks, I would guess. I'll do a more in-depth review of ChatLLM then. But here's the takeaway message I want you guys to have from this. And I spent too much time thinking about this, to be very honest. I'm really curious about what you guys think too, so please let me know in the comments what your thoughts are. My thoughts right now, for students and faculty or other people who are resource constrained, and I know all academics are resource constrained, we don't just have tons of money lying around: the computer you have is
13:00 - 13:30 probably better than the computer you don't for LLM use cases. So unless you are working on machine learning, in which case you probably already have a pretty nice machine, or you are somebody who definitely needs to run large models, or your computer hardware is really quite old, then chances are you can run a reasonably sized model. If you have a computer from within the last five years, something relatively recent, you can probably run a reasonably sized model on the hardware that you have at reasonable
13:30 - 14:00 speeds. If I were going to make a blanket suggestion for the smallest model that I think will run on most machines and still be pretty good: Granite 3.3 2B. That is a quite small model. I think it will run pretty well even on machines that don't have a GPU, where you're running CPU only. Don't quote me on that, because I haven't tested it; I've only tested it on my machine, which has 8 gigs of VRAM. But I feel pretty confident recommending Granite 3.3 2B to people, because I use
14:00 - 14:30 that model. Even with more VRAM available, I still use the 2 billion parameter model. Particularly for RAG, I really like the 2 billion parameter model. The 8 billion is also really good. And like I said, the biggest model that I find myself using locally is Phi-4, and that's 14 billion parameters. So those models, I think, will run on many machines pretty well. And I know there are a lot of machines they won't run on, right? We just need to be honest with ourselves. Some of us might be using Chromebooks or something like that, where we don't have the compute available. And that's a little
14:30 - 15:00 bit different story, right? In that case, we might not actually have the compute resources to use some of these local AI models. But if you already have a computer that's capable of running Open WebUI and running some of these smaller models, or even these tiny models down at like two billion parameters, it might not actually be worth going ahead and buying more expensive equipment. One of my students pointed something out to me the other day. They said, "Noah, how much are you paying for ChatLLM?" And I said, "$10 a month, because that's what it costs. It costs 10 bucks a month for access to all these different LLMs." And they said, "So what if you buy that new
15:00 - 15:30 GPU?" Because I was talking about a 3090 at the time. So they're like, if you buy that new GPU, how many months of ChatLLM, with its better models, could you have paid for with that same money? And I was like, why do you have to bring things up like this? So, here's what we're going to do. We're going to actually do this math. So, I pay 10 bucks a month. I already know the answer to this, and I'm sure you do, too. But let's say we have $1,000 for a new GPU, and we're going to divide
15:30 - 16:00 that by the 10 bucks a month that I pay for ChatLLM. That would be a hundred months of use of ChatLLM. A hundred months, right? That's years. That's many, many years. So his point was, over that many years, say over the next six or seven years, do you really think that whatever model you got that 3090 for is still going to be the best model you want to use? And my answer is, of course not. Not with the speed that LLMs are moving at. And he's like, "So why would you invest in
16:00 - 16:30 hardware that you just told me is probably going to be outdated?" And I was like, "Man, you just saved me a bunch of money." So this all brings me back to my conclusion: in terms of large language models right now, unless you need the privacy of a private AI model for what you're doing, or your work is around fine-tuning and things like that, the hardware you have in this case is
16:30 - 17:00 probably better than the hardware you don't. That's Noah telling you that you probably don't need to spend more money. And this is coming from the guy who wants to build more computers. Like, I literally called my brother and said, "Talk me out of this." And he brought up the same thing my student did: "How long will that card be relevant?" I'm like, "Oh, I don't know, a couple months maybe if I'm lucky. Maybe a year, tops." And he's like, "What can you actually run on that that you can't already run?" I'm like, "I don't know, like one model that I use for some things, but not most things." They're
17:00 - 17:30 like, "What's the point?" So, I'll get to the point so I don't keep rambling here, friends. What is my point? My point is that the hardware you have is probably better than the hardware you don't, unless you need that privacy layer. If you need that privacy layer, it's a totally different conversation. Second, I'm still a fan of ChatLLM. I know in my last video it seemed like I was kind of going downhill on it and thinking about changing. That's actually not the case anymore. I'm back on the ChatLLM train. Could I save a
17:30 - 18:00 little bit of money if I was only using one LLM? Probably, for my use case. But what I realized over the last couple weeks since I made that last video is that I really like the flexibility of being able to jump between leading models all in one place. So that said, I'm going to stop talking here. I want to know your guys' thoughts on this. Drop me some notes in the comments if you don't mind. What are your thoughts on this idea that the hardware you have is better than the hardware you don't? And on this idea of using ChatLLM for web-based models? I do want to address one critique somebody had here, in case
18:00 - 18:30 somebody brings it up, which is that they said they didn't like ChatLLM because you couldn't have folders to save your chats in. You actually can do that. I don't know if that existed at the time; it might not have, but it does exist now. I'll show you guys that whenever I do the six-month review or whatever it is, which will be sometime next month. So, that said, I'm really curious about your thoughts on this. Do you agree? Do you disagree? Please feel free to share your thoughts. So, that said, have a great week, guys. I will see you in the next video.