In my prior post I wrote about how to run a chat with a large language model on your PC. This time I want to focus on scripting this with Node.js and letting the AI world and the "normal" world interact with each other.
Most programs in the AI field are built on Python and PyTorch, so the question arises why I chose Node.js. The answer is simple: because most people on the web can already use it. The learning curve is, at least in my perception, lower than picking up Python and a Python-based web framework at the same time.
LLaMA-Node provides an easy-to-use API for running llama.cpp directly within your Node process. All you have to do is install two dependencies, so you can begin by creating a `package.json` file:
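A minimal `package.json` could look like this; the version numbers are assumptions, so pin whatever is current when you try it. Setting `"type": "module"` lets us use ES imports and top-level `await` in the script below:

```json
{
  "name": "llama-chat",
  "version": "1.0.0",
  "type": "module",
  "dependencies": {
    "llama-node": "^0.1.6",
    "@llama-node/llama-cpp": "^0.1.6"
  }
}
```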
Now run `npm i` in your console to install these dependencies. Please note that I'm using Node 18, which provides the new promise-based readline API (`node:readline/promises`).
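With the dependencies in place, a small chat loop is all we need. The following is a minimal sketch along the lines of the llama-node examples; the model filename, the prompt template, and the sampling parameters are assumptions you will want to adapt:

```js
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import path from "node:path";
import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";

// The model filename is an assumption; point this at your own
// quantized Vicuna GGML file.
const modelPath = path.resolve(process.cwd(), "ggml-vic13b-q5_1.bin");

const llama = new LLM(LLamaCpp);
await llama.load({
  modelPath,
  enableLogging: false,
  nCtx: 1024,
  seed: 0,
  f16Kv: false,
  logitsAll: false,
  vocabOnly: false,
  useMlock: false,
  embedding: false,
  useMmap: true,
  nGpuLayers: 0,
});

// Node 18: the promise-based readline API keeps the REPL loop simple.
const rl = createInterface({ input, output });

while (true) {
  const question = await rl.question("You: ");

  // Vicuna-style prompt template; other models expect other templates.
  const prompt = `A chat between a curious user and a helpful assistant.
USER: ${question}
ASSISTANT:`;

  await llama.createCompletion(
    {
      prompt,
      nThreads: 4,
      nTokPredict: 512,
      topK: 40,
      topP: 0.1,
      temp: 0.2,
      repeatPenalty: 1,
    },
    (response) => process.stdout.write(response.token)
  );
  process.stdout.write("\n");
}
```

Save this as `index.js` and start it with `node index.js`; every line you type is wrapped into the prompt template and streamed back token by token.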
This time I used a Vicuna model with 13B parameters. In my opinion it works pretty well and did the job for my test requests. All of these models are based on LLaMA and may inherit its licensing, so be careful what you do with them. Wikipedia already lists a lot of models that are open source, and at some point these will become good enough to use as well.
Using German :)
Let’s make it a little more difficult:
Ok, that last one was a pipe dream.
I also tried the 7B variant of the Vicuna model, but it was not able to follow my prompts.
But we don't have to stop here. Let's improve our prompting a little by pinning the model down to a machine-readable answer.
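What I added boils down to an instruction like the following sketch (the exact wording and the example value are illustrative):

```
Always answer with a single number followed by its metric unit,
for example "2200 m". Do not explain your answer.
```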
Prompting again, the model now answers with a plain value and its unit, and a little more JavaScript lets us turn that into a number.
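A minimal sketch of that glue code; `scaleOf` and the example answer `"2200 m"` are assumptions for illustration:

```js
// Hypothetical helper: map a metric prefix to its scale factor.
function scaleOf(unit) {
  if (unit.startsWith("k")) return 1000;  // kilo
  if (unit.startsWith("c")) return 0.01;  // centi
  if (unit.startsWith("m")) return 0.001; // milli
  return 1; // no recognised prefix: keep the value as-is
}

// Assume the model answered something like "2200 m".
const answer = "2200 m";
const [value, unit = ""] = answer.trim().split(/\s+/);
console.log(Number(value) * scaleOf(unit)); // 2.2
```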
The output becomes 2.2. The `return 1;` at the end of `scaleOf` ensures the example still works in case we don't provide any units.
Of course this example is not that impressive, but it shows that integrating an LLM with "normal" software is not that complicated. You have to keep in mind that all of this is still in development, but I think it's very interesting to see that you can already do a lot with very little effort.
I also tried to improve the prompt so that it would directly give me the correct factor for the metric prefixes, but I failed to get consistent output containing both the unit and the factor.
LangChain provides another solution for JavaScript and TypeScript, but I'm not sure whether it already supports llama.cpp.