It helps differentiate between GNU/Linux users and the five people who use GNU/Hurd
Yes, of course. There’s nothing gestalt about model training: fixed inputs produce fixed outputs.
I suppose the importance of the openness of the training data depends on your view of what a model is doing.
If you think of a model as more like a media file that the model loader plays back, where the prompt is just a way of controlling how you access it, then sure, from a trustworthiness standpoint there’s not much to gain from the training corpus being open.
I see models more like any other text encoder or serializer, as if you were, say, manually encoding text. While there’s a very low chance of any “malicious code” being executed, what matters is that you can check how your inputs are actually being encoded against what the provider is telling you.
As an example attack vector, much like any malicious-replacement technique: if I downloaded a pre-trained model from what I thought was a reputable source, but was man-in-the-middled and handed a maliciously trained model instead, suddenly the system relying on that model is compromised in terms of its expected text output. Obviously that exact problem could be fixed with some hash checking (a rough sketch of which is below), but I hope you see that in some cases even that wouldn’t be enough. (Such as malicious “official” provenance.)
As these models become more prevalent, being able to guarantee integrity will become more and more of an issue.
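Roughly what I mean by hash checking, as a toy sketch: verify the downloaded file against a digest the provider published somewhere out-of-band. (The file name and digest here are placeholders, not any real release, and of course this does nothing if the “official” source itself is the malicious party.)

```python
# Verify a downloaded model file against a published SHA-256 digest.
# "model.safetensors" and the expected digest are placeholders.
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "0123abcd..."  # digest published by the provider (placeholder)
actual = sha256_of("model.safetensors")
if actual != expected:
    raise SystemExit("model file does not match the published digest")
```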
I’ve seen this said multiple times, but I’m not sure where the idea that model training is inherently non-deterministic is coming from. I’ve trained a few very tiny models deterministically before…
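To make “trained deterministically” concrete, here’s a minimal sketch (PyTorch on CPU, toy data, everything seeded). Large-scale GPU training needs more care (deterministic kernels, fixed data order, pinned software versions), but the principle is the same; this is just an illustration, not anyone’s actual training setup.

```python
# Two runs with the same seed and the same fixed inputs yield
# bit-identical weights.
import random
import numpy as np
import torch

def train_once(seed: int = 0) -> torch.Tensor:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.use_deterministic_algorithms(True)

    # Fixed inputs: a tiny linear-regression problem.
    x = torch.linspace(-1, 1, 64).unsqueeze(1)
    y = 3 * x + 0.5

    model = torch.nn.Linear(1, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(100):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return model.weight.detach().clone()

assert torch.equal(train_once(), train_once())  # fixed inputs, fixed outputs
```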
I’m not sure where you get that idea. Model training isn’t inherently non-deterministic; making fully reproducible models appears to be 360ai’s entire modus operandi.
There are VERY FEW fully open LLMs. Most are the equivalent of source-available in licensing, and at best they’re only partially open source because all they provide you with is the pretrained model.
To be fully open source they need to publish both the model and the training data. The point is for the model to be “fully reproducible”, which is what makes it trustworthy.
In that vein there’s at least one project that’s turning out great so far:
Holy crap there are still working nitter instances? God bless
You could try Guix! It’s ostensibly source-based, but you can use precompiled binaries as well (via the substitute system).
It’s a source-first functional package distro like Nix, but it uses Scheme to define everything from the packages to the way the init system (Shepherd) works.
It’s very different from other distros, but between being functional, source-first, and having Shepherd, I personally love it.
This is because all LLMs function primarily based on the token context you feed them.
The best way to use any LLM is to completely fill up its history with relevant context, then ask your question.
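In practice that just means assembling the prompt so the relevant material fills the window and the actual question comes last. A rough sketch (the message format, token budget, and word-count estimate are all stand-ins, not any particular API):

```python
# Pack relevant documents into the context until a rough budget is hit,
# then put the actual question at the end.
def build_messages(question: str, documents: list[str], budget_tokens: int = 8000) -> list[dict]:
    context, used = [], 0
    for doc in documents:
        cost = len(doc.split())  # crude stand-in for a real tokenizer
        if used + cost > budget_tokens:
            break
        context.append(doc)
        used += cost

    return [
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": "\n\n".join(context) + "\n\nQuestion: " + question},
    ]
```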
Doesn’t this just do what gets done through convolution anyway?
What’s the point of this?
The project was using a trick to proxy requests without needing a backing account, but the API update broke that.
The instances that chose (and choose) to go the extra mile by creating and maintaining proxy account(s) are the ones still working
If the instance gets too popular, though, the Twitter goons quickly figure out which account is the proxy and ban it. So it’s a constant game of cat and mouse.
Seems to confirm something I’ve always considered true: you can turn a mediocre game into a masterpiece with the right application of music.
Not that I’m saying Stardew is mediocre, but good music seems to uplift a game more than any other element.
I thought it was atomic age and information age…
Or was that just Empire Earth…
Tango closed because it was one of the only studios under Zenimax that wasn’t currently making a game with “executive producer: Todd Howard” squirted all over it.
What do you do for file syncing, if you don’t mind me asking?
There’s a weird implicit conservatism in tech circles around the dictatorial nature of corporate leadership.
It stems from this weird externalization of corporate decision making that just turns everything that happens at large companies into the machinations of the unknowable machine of capital.
“Of course they were fired, they protested in a way that disrupted the business, if the business is disrupted the machine must correct itself, and it did so by releasing the corporate antibodies of leadership to fire the disruptive element. Thus the machine is corrected. This is all logically sound, and thus impervious to moral inquisition.”
And how do you do this in gnu shepherd?
You don’t!
How do you end up doing this? I’ve been wanting to do the same thing and I’m curious how Proton and AppArmor interact.
It’s because Epic/McKesson have complete control over the EMR world, so everything has to work with them to some degree.
GNU Health is great, but I haven’t seen how it could handle the massive number of legal and monetary hoops that Epic and co. have to jump through as well.
For some reason there just isn’t much volunteer effort or space for open source development in the healthcare world.
I would recommend using the AI Horde instead: https://stablehorde.net/ It’s a collection of people hosting Stable Diffusion and text-generation models.
There’s also OpenRouter, which can connect to ChatGPT with a token-based system. (They check your prompts for hornyposting, though.)
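For reference, submitting a job to the AI Horde looks roughly like this. The endpoint paths, payload fields, and the anonymous “0000000000” key are from my memory of the v2 API docs, so treat them as assumptions and check https://stablehorde.net/api/ before relying on them:

```python
# Submit an image-generation request to the AI Horde and poll until done.
# Endpoints and fields below are assumptions based on the v2 API docs.
import time
import requests

BASE = "https://stablehorde.net/api/v2"
HEADERS = {"apikey": "0000000000", "Client-Agent": "example:1:anonymous"}

submit = requests.post(
    f"{BASE}/generate/async",
    headers=HEADERS,
    json={"prompt": "a lighthouse at dusk, oil painting", "params": {"n": 1}},
)
submit.raise_for_status()
job_id = submit.json()["id"]

while True:
    check = requests.get(f"{BASE}/generate/check/{job_id}", headers=HEADERS).json()
    if check.get("done"):
        break
    time.sleep(5)

status = requests.get(f"{BASE}/generate/status/{job_id}", headers=HEADERS).json()
print(status["generations"][0]["img"])  # URL or base64 of the generated image
```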