cross-posted from: https://infosec.pub/post/13676291
I've been building MinimalChat for a while now, and based on the feedback I've received, it's in a pretty decent place for general use. I figured I'd share it here for anyone who might be interested!
Quick Features Overview:
- Mobile PWA Support: Install the site like a normal app on any device.
- Any OpenAI-Formatted API Support: Works with LM Studio, OpenRouter, etc.
- Local Storage: All data is stored locally in the browser with minimal setup; in Docker, just enter a port and go.
- Experimental Conversational Mode (GPT Models for now)
- Basic File Upload and Storage Support: Files are stored locally in the browser.
- Vision Support with Maintained Context
- Regen/Edit Previous User Messages
- Swap Models Anytime: Maintain conversational context while switching models.
- Set/Save System Prompts: Set the system prompt. Prompts will also be saved to a list so they can be switched between easily.
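For anyone unfamiliar with the "OpenAI-formatted API" bullet above, here is a minimal sketch of the request shape such endpoints accept. The base URL and model name are assumptions for illustration (LM Studio, for example, serves a local OpenAI-compatible endpoint), not values taken from this project:

```python
import json

# Sketch of an OpenAI-format chat completion request body. Any
# OpenAI-compatible server (LM Studio, OpenRouter, etc.) accepts this
# shape; the base URL and "local-model" below are hypothetical.
base_url = "http://localhost:1234/v1"  # assumed local endpoint
payload = {
    "model": "local-model",            # whatever model the server exposes
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "stream": False,
}
body = json.dumps(payload)  # POST this to f"{base_url}/chat/completions"
```

Because every backend speaks this same format, swapping providers is just a matter of changing the base URL and model name.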
The idea is to make it essentially foolproof to deploy or set up while being generally full-featured and aesthetically pleasing. No additional databases or servers are needed; everything is contained and managed locally inside the web app itself.
It's another chat client in a sea of clients, but it's unique in its own ways, in my opinion. Enjoy! Feedback is always appreciated!
Self-Hosting wiki section: https://github.com/fingerthief/minimal-chat/wiki/Self-Hosting-With-Docker
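As a rough sketch of how simple the "enter a port and go" deployment is meant to be, a run command would look something like the following. The image name, tag, and port here are assumptions for illustration; the wiki link above has the real values:

```shell
# Hypothetical sketch only -- the image name and ports are assumed,
# not taken from the project; consult the Self-Hosting wiki for the
# actual image and the port it listens on.
docker run -d \
  --name minimal-chat \
  -p 3000:3000 \
  your-registry/minimal-chat:latest
```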
I thought sharing here might be a good idea as well, some might find it useful!
I've added some updates since the initial post that give a huge improvement to message rendering speed, as well as a plethora of new models to choose from and load/run fully locally in your browser (Edge and Chrome) with WebGPU and WebLLM.
If this project sees the value of privacy & security for local & self-hosted LLM chat, why does this project only offer proprietary, corpo means for contributions & communications?
I'm not sure I understand at all?
It's fully open source, can run/connect any number of fully local models as well as the big name models if a user chooses to use them.
Can you expand on what you mean?
Choosing proprietary tools and services for your free software project ultimately sends a message to downstream developers and users of your project that freedom of all users—developers included—is not a priority.
— Matt Lee
That seems like a pretty naive and biased approach to software to me honestly.
Ease of use, community support, feature set, CI/CD, etc. should all come into play when deciding what to use.
Freedom at all costs is great until you limit community development and your potential user base by 90% by using a completely open repo service that 5% of the population uses, or some small Discord alternative.
So then the option is to host on multiple platforms/communities, and the management and time investment goes up keeping them all in sync and active.
As with most things in life, it's best to look at things with nuance rather than a hard stance imo.
I may stand it up on another service at some point, but also anyone else is totally free to do that as well. There are no restrictions.
These have tradeoffs you don't see when certain groups cannot participate due to personal or systemic political or philosophical reasons. You also can't hear from that crowd, since they haven't been given a place to voice it.
In the case of chat & forges, these are solved problems with quality free options (& even decentralized ones in some cases). The choices are at least in the good-enough category, if not better in some aspects (& worse in others). For chat, a room on Libera.Chat or OFTC is free & meant for free software; even if it is labeled as unofficial, it still gives a sanctioned place to folks who wish to avoid Discord for privacy, security, performance, or US services being blocked (as well as being an out-of-band option for when a server is inevitably down). For forges, living in a part of the world where Microsoft often heavily throttles my bandwidth & all outages fall during my daytime, it is never a bad idea to configure your VCS to push to a second mirror like Codeberg et al., not just for freedom reasons but for resilience against server outages & censorship (see youtube-dl, or the Switch emulators, or nations that have had the whole IP blocked because a government didn't like something in someone else's repo).

When you start coding around Microsoft GitHub's Actions or API or Discussions or any specific integration without an eye to the generic/portable approach (which is easier done from the start), dependence starts to add up. While a read-only mirror would hurt freedom of contributions/communications, it is an option if supporting multiple forges is seen as too noisy or too much of a burden; outages & censorship are real (especially outside the West).

"Enshittification" is the buzzword for services whose quality goes down & devolves into ads + selling user data for profit maximization, usually because they can: users/groups are now locked into the service, having relied too heavily on its infrastructure. We see free software projects still stuck on SourceForge & Slack due to lock-in. Had they started with the free option, the lock-in probably couldn't have happened. Even having one free option supported as a backup makes one cognisant of features that aren't going to port when these US-based, profit-driven entities decide to gradually make things worse to the point where users want to leave; history shows us this has happened several times.
You might say it is pragmatic, but I think it's both lazy & short-sighted not to have these near-zero-effort options set up even as a backup (it truly can be set & forget if really wanted), especially when you think these values are good enough for the service you are building, & you are interacting on Lemmy, a decentralized, self-hostable platform (whose developers have said they have every intention of migrating their code to self-hosting as soon as ForgeFed is merged for federation).
This looks great! I imagine the documents you upload are used for RAG?
If so, do you also show citations in the chat answers for what context the model used to answer the user's query?
I ask because Verba by weaviate does that, but I like yours more and I'd like to switch to it (I've had a hard time getting Verba to work in the past).
Thanks!
Unfortunately, there currently isn't a true RAG implementation, largely because this site/app is fully self-contained with no additional servers, databases, etc., which are typically required for RAG.
For now file uploads are stored in the browser's own local database and the content can be extracted and added to the current conversation context easily.
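As a rough illustration of that approach (not the project's actual code; the function and field names here are hypothetical), injecting an uploaded file without RAG can be as simple as appending its extracted text to the message list sent to the model:

```python
# Hypothetical sketch of context injection without RAG: the extracted
# file text is appended to the conversation as an ordinary message,
# so the model sees the full document rather than retrieved chunks.
def add_file_to_context(messages, filename, file_text):
    """Append an uploaded file's extracted text to the chat context."""
    messages.append({
        "role": "user",
        "content": f"Contents of {filename}:\n\n{file_text}",
    })
    return messages

conversation = [{"role": "user", "content": "Summarize my notes."}]
conversation = add_file_to_context(conversation, "notes.txt", "Buy milk.")
```

The tradeoff versus RAG is that the whole file consumes context-window tokens, which is fine for small files but doesn't scale to large document sets.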
I definitely want to add a fuller RAG system, but it's a process to say the least, and if I implement it I want it to be quite effective. My experience with RAG has generally left me quite unimpressed, with a few quite decent implementations being the exception.
That's cool! Why a dedicated page/app and not a bot on e.g. Matrix?