• Wildmimic@anarchist.nexus · 5 points · edited · 24 hours ago

    I run my LLM locally, and I still have to turn the heating on because it doesn't put out enough power. A high-end card is normally rated at about 300 W, and it only runs in short bursts to answer questions. So even if you're really pushing it, over an hour you'll probably use around 150 Wh - that's nowhere near enough to heat a room. You'd for sure use more power playing a game on Unreal Engine 5.
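
    Back-of-the-envelope, the arithmetic looks like this (a rough sketch - the 50% duty cycle is my assumption, not a measurement):

    ```python
    # Rough energy estimate for local LLM inference (all figures assumed, not measured)
    card_power_w = 300  # typical high-end GPU board power rating
    duty_cycle = 0.5    # fraction of the hour the card is actually answering questions
    hours = 1.0

    energy_wh = card_power_w * duty_cycle * hours
    print(f"{energy_wh:.0f} Wh")  # -> 150 Wh; a 1500 W space heater uses ten times that per hour
    ```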

    Power consumption of LLMs is a lot lower than people think. And running one in a data center will surely be more energy-efficient than my aging AM4 platform.

    Edited to reduce @sadbutdru@sopuli.xyz’s twitching ;-)

      • Wildmimic@anarchist.nexus · 1 point · 24 hours ago

        It's Wh of course - I don't know why, but my brain always reads it as watts per hour, even though it's watts times hours (300 W sustained for half an hour is 150 Wh).

          • Wildmimic@anarchist.nexus · 2 points · 9 hours ago

            If that were fixable any faster than through psychotherapy, I would have done it already - and this minor bug would have to wait a while until the big-ticket items are fixed lol

    • sp3ctr4l@lemmy.dbzer0.com · 2 points · 1 day ago

      I run mine on a Steam Deck.

      Fairly low power draw on that lol.

      Though I’m using it as a coding assistant… not a digital girlfriend.

      … though I have modded my Deck a bit, so… I guess I already know what ‘she’ looks like on the inside, hahaha!

      • Wildmimic@anarchist.nexus · 3 points · 23 hours ago

        Mine is not really a girlfriend, it's more like a platonic, ADHD-riddled mentor helping me out with RegEx, Bash scripts and Python. My coding experience is decades old now, and I love how easily you can integrate programming into the everyday usage of a PC on Linux - I used Windows for so long, where all of this is abstracted away; this feels much more like I am in control.

        My Steam Deck doesn't run an LLM, but it has 2.5 TB of storage in total and is transparent. It's wild that you can run an LLM on it - which model do you use?

        • sp3ctr4l@lemmy.dbzer0.com · 2 points · 19 hours ago

          Qwen3, the 8B-parameter model, seems to be the most generally comprehensive model I can run on it, via the Alpaca flatpak.

          (Though I should note that Alpaca just recently revamped how it works internally, and currently has a few bugs that resulted from this, which its dev is working out.)
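
          As far as I know, Alpaca is a front end for a local Ollama instance, so the same model can also be scripted outside the GUI. A minimal sketch, assuming the ollama Python package is installed and the qwen3:8b tag has already been pulled:

          ```python
          # Minimal local chat via the ollama Python package.
          # Assumes an Ollama server is running locally and the qwen3:8b model
          # is already downloaded - adjust the tag to whatever Alpaca pulled.
          import ollama

          response = ollama.chat(
              model="qwen3:8b",
              messages=[{"role": "user", "content": "Write a regex that matches ISO 8601 dates."}],
          )
          print(response["message"]["content"])
          ```

          The chat call blocks until the full response is ready; passing stream=True instead returns the tokens as they arrive, which feels better on slow hardware.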

          It's not fast in terms of a real-time back-and-forth conversation, but it is pretty good at a lot of things, at least up to the cutoff of its training data set. So it works fairly well if you describe a scenario to it and then ask it to mock up, like you say, a complex regex term or a moderately complex Bash or Python file.

          You can also say: hey, I have a semi-thought-out idea for an app or feature, or just a fairly complex function; outline a number of possible specific methods or mathematical algorithms we might be able to use to achieve this. It'll mock out a project outline, and then you can have it develop the smaller components singly… sometimes this works, sometimes it makes syntax or conceptual or logical errors.

          It also generally works for refactoring a single script toward being either more modular or more monolithic, but when you have it try to work out how to refactor a complex project of many scripts, you'll basically exceed its capacity to keep everything straight.

          If you want a snappier though less comprehensive model, 3B-parameter models are a good deal quicker; they'd probably be what you want for, like, a relationship with a sycophant/airhead/confidently incorrect person, lol.