Hello! This is a new page, so expect it to grow in the
future.
This is an introduction to AI for people who don't know much about
it.
AI is a combination of a computer program and data that can help a
person find information or do tasks. The parts of the program that do
tasks are sometimes called "agents". AI can do many different things, so
the names for these things are not yet standardized.
Glossary
A "prompt" is the text you give the AI telling it what to do. It's a
good idea to be specific.
A "model" is the data the AI uses to answer questions or do tasks.
An "LLM" is a Large Language Model, a type of model for working with
text. This is the file you download, which the AI software uses to do
your task or answer your question. These files range from about 2GB to
200GB or more, so they can take a while to download. An LLM's size is
also described by its "parameters" (see below); LLMs typically have at
least a billion parameters.
"MoE" means "Mixture of Experts". An MoE model activates only a
subset of its parameters (the "experts") at a time, and some software
only keeps the active parts in memory, so you can use larger models.
"Parameters". These are the pieces of data the LLM learns during
training. The more parameters there are, the larger the LLM file is and
the more VRAM you need to run it.
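As a rough illustration of the parameters-to-memory relationship, here is a back-of-the-envelope calculation. The figures used (bytes per parameter for each quantization, plus about 20% overhead for context and activations) are rules of thumb, not exact requirements:

```python
# Back-of-the-envelope VRAM estimate for running an LLM locally.
# Assumption: memory ~= parameter count * bytes per parameter, plus
# roughly 20% overhead for context and activations (a rule of thumb).

def estimate_vram_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate VRAM needed, in gigabytes."""
    weights_gb = params_billions * bytes_per_param
    return round(weights_gb * 1.2, 1)  # add ~20% overhead

# An 8B-parameter model at FP16 (2 bytes/param) vs. 4-bit (0.5 bytes/param):
print(estimate_vram_gb(8, 2.0))  # about 19 GB
print(estimate_vram_gb(8, 0.5))  # about 5 GB
```

This is why quantized (4-bit) versions of models are popular for local use: the same model needs roughly a quarter of the memory.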
"Agent" is a tool that uses AI to get things done, or execute
tasks.
"1-bit AI". This is a type of AI that uses ternary weights: each
weight has one of 3 values (-1, 0, or +1, about 1.58 bits each). So the
AI file is smaller and needs less memory, but these models are not as
accurate. Microsoft's BitNet is one example of 1-bit AI.
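To illustrate the ternary idea, here is a toy sketch that rounds ordinary float weights to -1, 0, or +1. This is a conceptual illustration only, not BitNet's actual method (BitNet models are trained with ternary weights from the start rather than quantized afterward):

```python
# Toy sketch of ternary quantization: round each float weight to -1, 0,
# or +1, using the mean absolute weight as a scale so small weights
# become 0. Conceptual illustration only; not BitNet's actual method.

def ternarize(weights):
    """Map each float weight to -1, 0, or +1."""
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = []
    for w in weights:
        if w > 0.5 * scale:
            quantized.append(1)
        elif w < -0.5 * scale:
            quantized.append(-1)
        else:
            quantized.append(0)
    return quantized

print(ternarize([0.9, -0.7, 0.1, -0.05, 1.2]))  # -> [1, -1, 0, 0, 1]
```

Each ternary weight needs only about 1.58 bits instead of the usual 16, which is where the memory savings come from.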
Some AIs are free; some charge a monthly fee. There are also free local
AIs that you can run on your own computer, like Ollama and AnythingLLM,
which are two of the simplest to get running. But they generally need a
GPU (video card) with plenty of VRAM. The larger the VRAM, the larger
the AI model the computer can handle, because the software loads the
whole model into memory. Without a GPU your answer will take a while to
display. NVIDIA GPUs are the most widely supported.
AI is changing so quickly that tutorials are often outdated in 6-12
months (Microsoft Copilot Studio tutorials, for example). I recommend
you do not use tutorials older than a year, and expect to hit roadblocks
in almost every tutorial (that is what I've found).
2 Types of AI
There are several types of AI which are specialized to do different
things.
Chatbot. This is where you ask AI a question and it
gives you an answer based on the data it was trained on. If an AI is
trained on biased data the answers will still be biased.
Coding AI. These can help you write programs or
just give snippets of code. Some are better than others.
Image generator. This is an AI to generate images
based on your prompt. You can specify the style of the image, objects in
it, the background of the image, and more.
Image identifier. This AI can identify what is in
an image. You can ask it "Is this an image of a cat or dog?" Or just ask
it "What animal is this?" This normally requires uploading an image to
the AI.
RAG is Retrieval Augmented Generation. This allows
you to do research and get answers drawn only from the documents you
give it. Documents can include Word files, PDF files, YouTube videos,
.txt files, and more; which file types are supported varies by the
service. RAG is handy for loading your own research and asking for a
summary, among many other uses.
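The RAG idea can be sketched in a few lines: retrieve the most relevant document, then paste it into the prompt so the model answers only from your data. Real RAG systems use embeddings and a vector database; the keyword-overlap retrieval below is a stand-in to keep the sketch self-contained:

```python
# Minimal sketch of the RAG idea: retrieve the most relevant document,
# then paste it into the prompt so the model answers from YOUR data.
# Real RAG systems use embeddings and a vector database; plain keyword
# overlap is used here only to keep the example self-contained.

def retrieve(question, documents):
    """Return the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question, documents):
    context = retrieve(question, documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "The warehouse inventory is counted every Friday.",
    "Vacation requests must be filed two weeks in advance.",
]
print(build_prompt("When is the inventory counted?", docs))
```

The final prompt contains your document as context, which is why the answer comes from your data rather than only from what the model was trained on.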
Music generator. There are also AIs that can
generate music. Specify the style, the tempo, and more.
Voice impersonator. This can speak text using the
voice of someone else.
OCR Processing. These are models that are
specialized in extracting text from images. This is sometimes called
"image-to-text". Some PDFs, like scanned PDFs on Google Scholar, are
just a series of images, one image per page.
Translator. An AI that translates from one language
to another. This can be a written or spoken/audio language.
As AI improves there may be more types of AI created.
2.1 Model types
"Instruct" model variants are fine-tuned to follow instructions; they
tend to give more accurate, direct answers and use fewer tokens than
base models.
2.2 AI Switches
Some AIs use switches like "/switch". For example, Qwen is a very
wordy AI; what you see is its "thinking". To use a switch, add it to
the end of your prompt. Here's a Qwen switch: "Tell me what a group of
crows is called. /no_think". To turn thinking back on, use "/think".
See the individual model page for the switches it supports.
Or use a command inside an interactive Ollama session, for models
that support it (the exact command varies by model and Ollama version):
ollama run qwen3
/set nothink
How you use switches depends on how you are running the LLM. In
Ollama you can also set options permanently with a Modelfile:
# This is a comment. Put LLM name below.
FROM llama3.2:3b
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_predict 512
Using the Modelfile
Put the settings in a plain text file named Modelfile.
Run this command at the CLI:
ollama create my-llama -f Modelfile. This only needs to be
done once, or when you make changes to the Modelfile; the data is
stored internally.
Now run your model with ollama run my-llama. This
works because the base LLM is also listed in the Modelfile.
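To get a feel for what a setting like temperature does, here is a small sketch. Temperature divides the model's raw scores (logits) before they are turned into probabilities, so a low temperature concentrates probability on the top token (predictable output) and a high temperature spreads it out (more variety):

```python
import math

# What the "temperature" parameter does: raw scores (logits) are divided
# by the temperature before softmax, so a low temperature sharpens the
# distribution (more predictable output) and a high one flattens it.

def softmax_with_temperature(logits, temperature):
    scaled = [x / temperature for x in logits]
    exps = [math.exp(x) for x in scaled]
    total = sum(exps)
    return [round(e / total, 3) for e in exps]

logits = [2.0, 1.0, 0.1]  # the model's raw scores for 3 candidate tokens
print(softmax_with_temperature(logits, 0.5))  # sharp: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flat: choices more even
```

This is why temperature 0.7 (as in the Modelfile above) is a common middle ground: some variety without losing coherence.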
Switches in Python:
from transformers import pipeline
# Note: Llama 3.2 only comes in 1B and 3B sizes, and temperature/top_p
# only take effect when do_sample=True.
pipe = pipeline("text-generation", model="meta-llama/Llama-3.2-3B")
output = pipe(
    "Explain black holes simply:",
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=200,
)
2.2.2 Qwen3
/no_think. Usage, at the end of your prompt:
"What is a group of crows called? /no_think". To turn thinking
back on, use /think.
Some models will accept /set no_think or
/set think at the end of a prompt.
2.3 Free chatbots online
You will need to sign into some of these if you want them to save
your chat history. Many of these websites use common LLMs like Llama
(from Meta) or GPT (from OpenAI).
Aria. This is part of the Opera browser only. You
must install the browser to use it. https://opera.com
ChatGPT (by OpenAI). You have to sign up, but it no
longer requires a cell phone number. Limits are generous but it does
have them. GPT is the AI model; ChatGPT is the app. https://chatgpt.com
Copilot (Microsoft). Requires a free login. I don't
know what the limits are. Look in the sidebar for more features.
Library: where you can have the AI make images and do research reports.
Labs: other stuff MS is working on that you can try; software in Labs
may change frequently. https://copilot.com or https://copilot.microsoft.com
Gemini. https://gemini.google.com Generous limits, I have not
hit a limit yet after asking about 15 questions in one day, spaced out
throughout the day. Limits might be hourly. Gemini has built-in image
generation called Nano Banana. If you go to Gemini and ask it "Make me
an image of a cat" it will start up Nano Banana automatically.
Huggingface chat. Yes, Huggingface also has its own
chatbot. https://huggingface.co/chat/ Huggingface also has forums
on AI, a huge library of models to download, free 100GB of space for
projects, many demo websites, and more. It's a major AI hub.
Leo. This is part of the Brave browser only, it
does not have a separate AI website. You must install the browser then
open the Leo tab. https://brave.com
LocalAI. Text generation. Image generation, Audio
processing. Understand and analyze images. Multiple LLM model support.
Model Context Protocol (MCP) for agentic capabilities. WebUI and REST
API support. Local recall (based on your GPU memory). https://localai.io/features/
Perplexity AI is an AI-powered search engine and
chatbot that uses natural language processing (NLP) and machine
learning to answer user queries. https://perplexity.ai
Qwen. Chinese model for programming. This model is
better for programming than most other models. https://chat.qwen.ai
WebLLM. This runs the LLM locally inside your web browser, which
is how it differs from Gemini or ChatGPT; it downloads the model's
parameters first, then runs them on your machine. When I tried it, it
was very slow to respond and not usable; it could not answer basic
questions like "How do I define what parameters are for an LLM". No
login required as of 4/8/2025. Perhaps it is a prototype, or it is not
running on a GPU. As I ask a question it shows status messages: how
much data has been fetched and what percent of the parameters have
loaded. 105 seconds went by with 100% fetched after I asked "How is
food-based red dye made?" Chat: https://chat.webllm.ai/ Home: https://webllm.mlc.ai/
To do OCR on a PDF you generally have to convert each page of the PDF
to an image, like a PNG file, then use a loop to process each PNG file
and extract the text. Then check each page's extracted text for
correctness.
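The loop described above can be sketched like this. The page-rendering and OCR steps are passed in as functions because the real ones (for example pdf2image's convert_from_path and a transformers TrOCR pipeline) need external libraries and model downloads; the fake stand-ins just show the structure:

```python
# Sketch of the PDF OCR loop: render pages to images, OCR each one, and
# keep (page number, text) pairs so each page can be checked afterward.
# `render_pages` and `ocr` are hypothetical stand-ins for real libraries
# (e.g. pdf2image.convert_from_path and a transformers TrOCR pipeline).

def ocr_pdf(pdf_path, render_pages, ocr):
    """Return a list of (page_number, text) for every page in the PDF."""
    results = []
    for page_num, image in enumerate(render_pages(pdf_path), start=1):
        results.append((page_num, ocr(image)))
    return results

# Demo with fake stand-ins so the sketch runs without any PDF:
fake_render = lambda path: ["page-1-image", "page-2-image"]
fake_ocr = lambda image: f"text from {image}"
print(ocr_pdf("scan.pdf", fake_render, fake_ocr))
```

Keeping the page number with each text chunk makes the final checking step easier, since you can compare each page's text against its image.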
Some popular OCR models are microsoft/trocr-base-printed and
Salesforce/blip-image-captioning-large.
Codellama via Ollama. "A large language model that
can use text prompts to generate and discuss code". https://ollama.com
VSC extensions. There are many Visual Studio Code
extensions to help you connect with a remote (cloud) or local AI to help
you code. Be careful, some LLMs just are not good at coding. Qwen models
are generally good at coding.
I use Gemini with a VSC extension. It is slow on some days, but
limits are higher for the free account. Install the VSC extension and
sign in with your Gemini username and password.
2.8 Free music AI
Riffusion. https://www.riffusion.com/ Give it a prompt to make
music from. This is also a bot on Discord.
Run your AI locally at no charge! Some apps include RAG, some don't.
Some support both local and online models where you supply your own
API key (which costs money).
DeepSeek v3. At 671 billion parameters this is a
monster-sized file to run locally; the estimated file size is 671GB.
This is a Chinese model. https://deepseekv3.net
Guaardvark. Uses free Ollama models. Zero outbound
network activity; designed for secure, critical networks. Features:
agents, RAG search, video generation, video chat, image generation,
finding problems in programming code, GPU management, interconnector
sync, and content pipelines. https://guaardvark.com/ Github: https://github.com/guaardvark/guaardvark
Hermes. This makes agents. Some people like this
better than OpenClaw. It has scheduled automations, delegation and
parallelization, real sandboxing, and full web and browser control.
40+ built-in tools: web search, terminal, file system, browser
automation, vision, image generation, text-to-speech, code execution,
subagent delegation, memory, task planning, cron scheduling, multi-model
reasoning, and more. Share skills via https://agentskills.io. On Windows this requires WSL2.
Home page: https://hermes-agent.nousresearch.com/ Docs: https://hermes-agent.nousresearch.com/docs/
Openclaw. Open-source AI chatbot with agents you
can make. This one is very popular. https://openclaw.ai Get more skills from others at https://clawhub.ai/.
ZeroClaw. Rust-based lightweight software for making
agents. In contrast to traditional runtimes, which consume significant
memory and CPU even when idle, ZeroClaw operates with a minimal
footprint, making it accessible to a wider range of hardware, from
powerful servers to low-cost edge devices. https://zeroclaw.net/
3 AI Forums
AnythingLLM Discord.
Huggingface. This is one of the biggest AI forums.
Huggingface is also a source for LLM models to plug into software like
Ollama and AnythingLLM (currently about 2.7 million models), demo sites
for testing these models, and much more. HF gives users 100GB of free
space: "Spaces" are for running demo sites, and the "Hub" is for
storing LLMs and models. https://discuss.huggingface.co
Ollama Discord.
4 AI ranking lists
Aka "leaderboards". These are websites that test AI models and rank
them on each test.
LLM-stats. Top AI list by AICodeKing on Youtube.
Choose a category of AI to see the rankings for that category. Sort on
any column by clicking on the column header. One benchmark (column) is
for coding and is called "SWE-BENCH"; another is "CODE ARENA". https://llm-stats.com/
Memory and storage have gone up in price a lot since mid-2025; in some
cases the per-unit price of memory or storage has risen 700-1300%.
Several major manufacturers report their whole 2026 production run is
already sold out to major customers, mainly datacenters that are being
built, so the price of memory has become important. Meanwhile, DRAM
manufacturers' revenue has spiked along with prices. https://datatrack.trendforce.com/Chart/content/88/global-branded-dram-manufacturers-revenue-total
Is this just about greed?
HDD. This is Hard Disk Drives. They use magnetic
media on spinning platters and are not as fast as SSDs. This is an older
technology than SSDs.
NAND memory is used in flash storage such as SSDs and
USB flash drives. It is much faster than HDDs (hard disk drives) since
it has no moving parts.
DDR4 and DDR5 are main computer RAM types. Memory
types have different speeds even within the DDR4 and DDR5
categories.
LPCAMM2 (Low Power Compression Attached Memory
Module). This is the new laptop memory standard.
HBM4 (High Bandwidth Memory 4) is the new standard
for data centers and AI.
Historical prices
DRAM Exchange. Has current prices with daily high
and low, and a link to historical prices. But you need to be a member to
get the historical prices. This also has AI and memory-related news. https://www.dramexchange.com/
Jukan05. Graph of the long-term trend of DRAM prices
(2000-2025). You will notice that a graph line stops when that type of
memory is no longer sold. https://x.com/jukan05/status/1969551230881185866
PC Part Picker. Pick your parts to build your own
PC; it also has historical memory prices. https://pcpartpicker.com/ Go to the Trends page to see
historical prices for CPUs, memory, monitors, power supplies, storage,
and video cards (GPUs). https://pcpartpicker.com/trends/ These graphs appear to
show 18 months of data, and that is not changeable.