Hello! This is a new page so expect this page to grow in the
future.
This is an introduction to AI for people who don't know much about
it.
AI is a combination of a computer program and data which can help a
person find information or do tasks. An AI can do research very quickly
for you. An AI can also be an agent, which does tasks for you, even on a
specified schedule. AI can do different things so the names for these
things are not yet standardized.
This HTML page was created from Markdown using the wonderful and free Pandoc! Pandoc can read and convert to
many types of documents.
Glossary
These are not in order because we have to explain some things before
others.
A "prompt" is a sentence of what you tell the AI to do, or what you
want to ask it. It's a good idea to be specific.
A "model" is the data the AI uses to answer your question or perform
a task. An LLM is a type of model called a "Large language model".
An "LLM" is a Large Language Model. When running a local AI, this is
the file you download which the AI software loads to do your task or
answer your question. These files can range from 2GB in size to 200GB or
more, so they can take a while to download. An LLM's size also depends
on its "parameters", see below. LLMs typically have 1 billion parameters
or more.
An "MoE" model is a "Mixture of Experts" model. Only part of the model
(a few "experts") is used for each token, so it answers faster than a
regular model of the same total size. The whole model usually still has
to be loaded in memory.
"Parameters". These are the numbers (weights) the LLM learns during
training. The more parameters there are, the larger the LLM file is and
the more VRAM you need to run it.
"Agent" is a tool that uses AI to get things done, or execute
tasks.
"1 bit AI". This is a type of AI that uses ternary weights: each
parameter has one of 3 values (-1, 0, or 1) instead of a 16-bit number.
So the AI file is smaller and needs less memory, but it is generally not
as accurate. Microsoft's BitNet is one example of 1 bit AI.
A "token" is a unit which can be a phrase, one word, or even partial
words. Unfortunately each AI engine splits text into tokens differently,
so token counts are not directly comparable between engines. But you can
compare the net cost for the exact same prompts over different AI
services. Here's a good video explaining tokens:
https://www.youtube.com/watch?v=nKSk_TiR8YA by Matt Pocock.
"Context window" is how many tokens the LLM will save to give your
next questions context. Example: If you mention Python 3.14 in your
first chat, and that first chat "scrolls off" the context window (it is
no longer in the context window) you may have to mention again that you
are using Python 3.14.
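To make the "scrolls off" idea concrete, here is a minimal sketch of a chat history being trimmed to fit a token budget. It is only an illustration: the function names are made up, and counting one token per whitespace-separated word is a stand-in assumption, since every real AI engine has its own tokenizer.

```python
def count_tokens(text):
    # Assumption: one token per whitespace-separated word.
    # Real tokenizers split text differently.
    return len(text.split())


def trim_to_window(messages, max_tokens):
    """Keep the newest messages whose total token count fits in max_tokens."""
    kept = []
    used = 0
    # Walk backwards from the newest message to the oldest.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older "scrolls off"
        kept.append(message)
        used += cost
    kept.reverse()  # restore oldest-to-newest order
    return kept


history = [
    "I am using Python 3.14",
    "How do I read a file?",
    "How do I write a file?",
]
print(trim_to_window(history, max_tokens=12))
```

With a budget of 12 tokens only the two questions fit, so the first message about Python 3.14 is dropped and would have to be mentioned again.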
Some AIs are free, some cost a monthly fee, and for some you pay by
the tokens used. There are also free local AIs that you can run on your
own computer, like Ollama and AnythingLLM. These are the two simplest to
get running. But they generally need a GPU (video card) with plenty of
VRAM. The larger the VRAM, the larger the AI model the computer can
handle, because the software loads the whole model in memory. Without a
GPU your answer will take a while to display. Nvidia GPUs are the most
widely supported by local AI software, with minimal or no extra setup.
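As a rough guide to how much memory a model needs, you can multiply the number of parameters by the bits used to store each one. The sketch below is a rule of thumb only, not an exact formula: the function name is made up, and real usage is higher because the software also needs memory for the context window and its own overhead.

```python
import math


def model_size_gb(parameters_billion, bits_per_parameter):
    """Rough size in GB: parameters times bits each, divided by 8 bits per byte."""
    return parameters_billion * bits_per_parameter / 8


# Ternary weights (3 possible values) need log2(3) bits each, about 1.58.
TERNARY_BITS = math.log2(3)

for name, params, bits in [
    ("8B model, 16-bit", 8, 16),
    ("8B model, 4-bit", 8, 4),
    ("8B model, ternary", 8, TERNARY_BITS),
    ("671B model, 8-bit", 671, 8),
]:
    print(f"{name}: about {model_size_gb(params, bits):.1f} GB")
```

This is why the same model is often offered in several file sizes: versions stored with fewer bits per parameter are smaller and fit in less VRAM.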
AI is changing so quickly that online tutorials are often outdated in
6-12 months. Example: Microsoft Copilot Studio tutorials. I recommend
you do not use tutorials older than a year. And expect to hit roadblocks
in almost every tutorial because a screen has changed. That is what I've
found.
2 Types of AI
There are several types of AI which are specialized to do different
things.
Chatbot. This is where you ask AI a question and it
gives you an answer based on the data it was trained on. If an AI is
trained on biased data the answers will still be biased.
Coding AI. These can help you write programs or
just give snippets of code. Some are better than others.
Image generator. This is an AI to generate images
based on your prompt. You can specify the style of the image, objects in
it, the background of the image, and more.
Image identifier. This AI can identify what is in
an image. You can ask it "Is this an image of a cat or dog?" Or just ask
it "What animal is this?" This normally requires uploading an image to
the AI.
RAG is Retrieval Augmented Generation. This allows
you to do research and get answers only from the documents you give it.
Documents can include Word files, PDF files, YouTube videos, .txt files,
and more. Which file types are supported varies by AI service. These are
handy for putting your own research in and asking for a summary. There
are many uses for RAG AI.
Music generator. There are also AIs that can
generate music. Specify the style, the tempo and more.
Video generator. Generates videos. These use lots
of processing power so any free versions will be very limited.
Voice impersonator. This can speak text using the
voice of someone else.
OCR Processing. These are models that are
specialized in extracting text from images. This is sometimes called
"image-to-text". Some PDFs, like scanned PDFs on Google Scholar, are just
a series of images, one image per page.
Translator. An AI that translates from one language
to another. This can be a written or spoken/audio language.
As AI improves there may be more types of AI created.
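The RAG idea in the list above can be sketched in a few lines: find the document that best matches the question, then put its text in front of the question before sending it to the AI. This is a toy version under stated assumptions. It scores documents by shared words, while real RAG tools usually use embeddings, and the document names and function names are made up for the example.

```python
import re


def words(text):
    # Lowercase the text and split it into a set of words.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the names of the top_k documents sharing the most words with the question."""
    question_words = words(question)
    scored = []
    for name, text in documents.items():
        overlap = len(question_words & words(text))
        scored.append((overlap, name))
    # Highest overlap first; ties broken by name so the result is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for overlap, name in scored[:top_k] if overlap > 0]


def build_prompt(question, documents, top_k=1):
    """Put the retrieved text in front of the question, as a RAG tool would."""
    chosen = retrieve(question, documents, top_k)
    context = "\n".join(documents[name] for name in chosen)
    return f"Answer using only these notes:\n{context}\n\nQuestion: {question}"


# Hypothetical documents, made up for this example.
docs = {
    "crows.txt": "A group of crows is called a murder.",
    "dyes.txt": "Some red food dye is made from cochineal insects.",
}
print(build_prompt("What is a group of crows called?", docs))
```

Only the crows note ends up in the prompt, so the AI answers from your document instead of from whatever it was trained on.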
2.1 Model types
Instruct model variations tend to be more accurate
and use fewer tokens. These are also called "non-thinking" models,
meaning you will not see how they think.
Thinking models show you all the steps they take
to get to the result. They are very wordy. Some thinking models support
switches to turn off thinking, and some do not. See
"AI Switches" below.
2.2 AI Switches
Some AIs use switches like "/switch". For example, Qwen is a very
wordy AI; what you see is it "thinking". To use a switch, add it to
the end of your prompt. Here's a Qwen switch: "Tell me what a group of
crows is called. /no_think". To turn on thinking again, use "/think".
See the individual model page for the switches that it supports.
Or, for models that support it, turn thinking off on the command line
when you start the model (this flag needs a recent version of Ollama):
ollama run qwen3:4b --think=false
Using switches depends on how you are running the LLM and what
switches the LLM supports.
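If you send prompts from a script, the switch can be added automatically. This is a minimal sketch with a made-up helper name. It only builds the text of the prompt, and whether the model obeys the switch still depends on the model.

```python
def add_thinking_switch(prompt, think):
    """Append a Qwen-style switch to the end of a prompt."""
    switch = "/think" if think else "/no_think"
    # Remove a switch that is already there so we never send two.
    for old in ("/no_think", "/think"):
        if prompt.rstrip().endswith(old):
            prompt = prompt.rstrip()[: -len(old)]
    return f"{prompt.rstrip()} {switch}"


print(add_thinking_switch("Tell me what a group of crows is called.", think=False))
print(add_thinking_switch("What color is the sky? /no_think", think=True))
```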
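2.2.1 Llama 3.2
Run Llama 3.2 via Ollama from bash or another command line. Llama 3.2
text models come in 1B and 3B sizes. The ollama run command does not
take sampling options such as temperature, so start the model first and
then set them inside the chat session:
ollama run llama3.2:3b
/set parameter temperature 0.7
/set parameter top_p 0.9
/set parameter repeat_penalty 1.1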
Create a Modelfile, call it "my-llama".
# This is a comment. Put the LLM name below.
FROM llama3.2:3b
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_predict 512
Using the Modelfile
Edit a plain text file and put the settings in it.
Run this command at the CLI:
ollama create my-llama -f my-llama
The first my-llama is the name of the new model. -f means use the
following file as the Modelfile. This only needs to be done once, or
again when you make changes to the Modelfile. The data is stored
internally.
Now run your model with ollama run my-llama. You do not name the LLM
again because it is already listed in the Modelfile.
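Setting LLM switches in Python:
from transformers import pipeline

# Llama 3.2 text models come in 1B and 3B sizes. The repo is gated on Hugging Face.
pipe = pipeline("text-generation", model="meta-llama/Llama-3.2-3B")
output = pipe(
    "Explain black holes simply:",
    do_sample=True,  # temperature and top_p only apply when sampling is on
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=200,
)
print(output[0]["generated_text"])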
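The same settings can also be sent to a local Ollama server over its REST API, without the transformers library. The sketch below assumes Ollama is running on its default address and that the model name matches one you have already downloaded. The helper names are made up for this example. Running the block only builds and prints the request, and ask_ollama is the part that actually talks to the server.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(model, prompt, **options):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,     # ask for one complete answer instead of a stream
        "options": options,  # same names as the PARAMETER lines in a Modelfile
    }


def ask_ollama(model, prompt, **options):
    """Send the prompt to a running Ollama server and return its answer text."""
    body = json.dumps(build_request(model, prompt, **options)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]


payload = build_request(
    "llama3.2", "Explain black holes simply:",
    temperature=0.7, top_p=0.9, num_predict=200,
)
print(json.dumps(payload, indent=2))
```

With Ollama running, calling ask_ollama("llama3.2", "Explain black holes simply:", temperature=0.7) should return the answer as a string.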
2.2.2 Qwen3
/no_think. Put it at the end of your prompt:
What is a group of crows called? /no_think. To turn
thinking back on, use /think.
In an interactive Ollama chat session you can also type /set nothink or
/set think on a line by itself. This setting is saved
until changed again.
Not every Qwen3 version supports switches. Some are automatically set to
have thinking on. There may also be a version that does not think.
Recent versions of Ollama also take a flag when you start the model:
ollama run qwen3:4b --think=false
2.3 Free chatbots online
You will need to sign into some of these if you want them to save
your chat history. Many of these websites use common LLMs like Llama
(from Meta), or GPT (from OpenAI). There are often various versions and
sizes of each LLM.
Aria. This is part of the Opera browser only. You
must install the browser to use it. https://opera.com
ChatGPT. (by OpenAI) You have to sign up, but it no
longer requires a cell phone. Limits have been reduced
greatly; sometimes you have to wait 4 hours before chatting again. GPT
is the AI model, ChatGPT is the app. https://chatgpt.com
Copilot (Microsoft). Requires a free login. I don't
know what the limits are for the free personal account. Look on the
sidebar for more features. Library: where you have AI make images, do
research reports. Labs: other stuff MS is working on that you can try.
Software in Labs may change frequently. https://copilot.com or https://copilot.microsoft.com
4/8/26 Microsoft Copilot (work account only) now supports Claude
Sonnet LLM.
Gemini. https://gemini.google.com This has been slower
to answer questions since Apr 20, 2026. Generous limits, I have
not hit a limit yet after asking about 15 questions in one day, spaced
out throughout the day. Limits might be hourly. Gemini has built-in
image generation called Nano Banana. If you go to Gemini and ask it
"Make me an image of a cat" it will start up Nano Banana
automatically.
Gippr. Tusk browser is another browser as well. https://tuskbrowser.com/gippr/ The Tusk Browser can make
a news feed just for you.
Huggingface chat. Yes, Huggingface also has its own
chatbot. https://huggingface.co/chat/ Huggingface also has forums
on AI, 1000s of models to download, provides free 100GB of space for
projects, many demo websites, and more. It's a major AI hub.
Leo. This is part of the Brave browser only, it
does not have a separate AI website. You must install the browser then
open the Leo sidebar. In the Advanced Settings there is a choice of
different models, a free model is Claude Haiku, there are other premium
(paid) model choices as well where you might have to enter your own API
key. By default Leo can summarize the current web page you are looking
at. https://brave.com
LocalAI. Text generation. Image generation, Audio
processing. Understand and analyze images. Multiple LLM model support.
Model Context Protocol (MCP) for agentic capabilities. WebUI and REST
API support. Local recall (based on your GPU memory). https://localai.io/features/
Perchance. NO LIMITS. This is a more personable AI
with a little humor. The avatar looks like anime. You can choose
different personalities to talk to or make your own, generate images, do
some roleplaying on the fly, or have a therapist personality. If you
edit an existing character you can save it as a new character, or click
on the Unknown character and describe how it should act. You can also
back up your data using the Export button (which I assume covers
memories and settings). If you go into Settings, click on your character
on the left hand side to get back to your messages. https://perchance.org/ai-character-chat
Perplexity AI is an AI-powered search engine and
chatbot that utilizes advanced technologies such as natural language
processing (NLP) and machine learning to provide accurate and
comprehensive answers to user queries. https://perplexity.ai
Qwen. Chinese model for programming. This model is
better for programming than most other models. There are various
versions and sizes of Qwen. https://chat.qwen.ai
WebLLM. This was very slow
to respond; it is not usable. It could not answer basic
questions like "How do I define what parameters are for an LLM". No
login required as of 4/8/2025. But maybe this is some prototype, or they
don't yet have enough hardware, or they are not running it with a
GPU. I don't know how this is different from Gemini or ChatGPT. As I ask
a question it shows me status messages: how much data is fetched, and
what percent of parameters have been fetched. I asked "How is food-based
red dye made?" and 105 seconds went by with 100% fetched. Chat: https://chat.webllm.ai/,
https://webllm.mlc.ai/
2.4 Paid AI services
Keep tags on one line or they will become headers.
Tags will be put at the end of each bullet: #monthly = cost is
monthly, not based on tokens used. #tokenbased = cost is based on tokens
used. Buy tokens and then refill as needed.
Since the AI industry is changing rapidly I will not list each plan's
features and cost here.
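For #tokenbased services, a fair comparison is the net cost of the exact same prompt on each service, since each one counts tokens differently. Here is a minimal sketch of that arithmetic. The service names, prices, and token counts are all made up, and it assumes prices are quoted in dollars per one million tokens, which you should check on each pricing page.

```python
def prompt_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars; prices are dollars per one million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical services and prices, made up for this example.
services = {
    "Service A": {"input_price": 0.50, "output_price": 1.50},
    "Service B": {"input_price": 0.20, "output_price": 2.00},
}

# The same prompt can be counted as a different number of tokens by each service.
usage = {
    "Service A": {"input_tokens": 1200, "output_tokens": 800},
    "Service B": {"input_tokens": 1500, "output_tokens": 900},
}

for name in services:
    cost = prompt_cost(**usage[name], **services[name])
    print(f"{name}: ${cost:.6f}")
```

In this made-up case Service B has the cheaper input price but still costs more for the same prompt, because it counts more tokens and charges more for output.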
Groq. It uses Groq's own special chip, the LPU. Groq supports
many models like LLMs, Text-to-speech, Automatic Speech Recognition, and
other tools. They also have a table of how many tokens you get per
dollar by plan. https://groq.com Pricing: https://groq.com/pricing
#tokenbased
NinjaChat. There is no free option, it
always says "You are out of credits." At the top choose from
many LLMs like: Seedance, Gemini 2.5 flash, Gemini 2.5 Pro, Gemini 3
Flash Preview, GPT models, O3 mini, Claude Sonnet 4.5, Claude Haiku 4.5,
Llama models, Kimi models, GLM models. https://www.ninjachat.ai/
Ollama. It has a free service but paid services
start at $20usd/month or $200/year. Paid services are mostly to use the
large cloud-based models. https://ollama.com/pricing Bug reports or help go to: mailto:hello@ollama.com
Run your AI locally at no charge! Some apps include RAG generation,
some don't. Some support local and online models where you supply your
API key (which costs money).
Deepseek v3. At 671 billion parameters this is a
monster size file to run locally. Estimated file size is 671GB. I think
this one is Chinese. https://deepseekv3.net
Guaardvark. Uses free Ollama models. Zero outbound
network activity, designed for secure, critical networks. Features:
agents, RAG search, video generation, video chat, image generation, find
problems in programming code, GPU management, interconnector sync,
content pipelines. https://guaardvark.com/ Github: https://github.com/guaardvark/guaardvark
Hermes. This makes agents. Some people like this
better than OpenClaw. It has scheduled automations, it delegates and
parallelizes, and it has real sandboxing and full web and browser
control. 40+ built-in tools: web search, terminal, file system, browser
automation, vision, image generation, text-to-speech, code execution,
subagent delegation, memory, task planning, cron scheduling, multi-model
reasoning, and more. Share skills via https://agentskills.io. On Windows this requires WSL2.
Home page: https://hermes-agent.nousresearch.com/ Docs: https://hermes-agent.nousresearch.com/docs/
Pi.dev. The base on which OpenClaw was built. This
is a minimal tool for AI to build your own tools. I don't know how else
to describe it. "Pi is a minimal terminal coding harness. Adapt pi to
your workflows, not the other way around. Extend it with TypeScript
extensions, skills, prompt templates, and themes. Bundle them as pi
packages and share via npm or git." It supports 15+ AI providers and
hundreds of models. Sessions are stored as trees. https://pi.dev Github: https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent
ZeroClaw. Rust-based lightweight software to make
agents. In contrast to traditional runtimes which consume significant
memory and CPU even when idle, ZeroClaw operates with a minimal
footprint, making it accessible to a wider range of hardware, from
powerful servers to low-cost edge devices. https://zeroclaw.net/
2.6 Free image generators
NOTE: Free websites are often limited as using AI takes a lot of
processing power. Some limits are higher than others. This is why we use
local AIs.
LocalAI. Text generation. Image generation, Audio
processing. Understand and analyze images. Multiple LLM model support.
Model Context Protocol (MCP) for agentic capabilities. WebUI and REST
API support. Local recall (based on your GPU memory). https://localai.io/features/
Meshy. Convert a single image into a 3d printable
file. https://meshy.ai
Minimax. Free plan: Bonus credits for daily login.
Standard plan: $14.99usd/month. https://minimaxaivideo.com/
Sora. The app has been discontinued. I'm not sure about
the website; it looks like you have to log in to use it.
https://openai.com/sora/ https://sora.com
Vizard. Turn your long videos into short clips with
AI. No sign up required to try it. https://vizard.ai
2.9 OCR Models
To do OCR on a PDF you generally have to convert each page of the PDF
to an image, like a PNG file. Then use a loop to process each PNG file
to extract the text. Then check each image/page that the text is
correct.
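The loop described above can be sketched like this. It assumes the PDF pages were already saved as PNG files with zero-padded names (page-001.png, page-002.png) so they sort in page order. The OCR step is passed in as a function, so you can plug in whichever tool or model you use. The function names here are made up, and the demo uses empty stand-in files with a fake OCR function so it runs without any OCR software installed.

```python
import tempfile
from pathlib import Path


def ocr_folder(folder, ocr_page):
    """Run ocr_page on every PNG in folder, in name order, and collect the text."""
    results = {}
    needs_review = []
    for image_path in sorted(Path(folder).glob("*.png")):
        text = ocr_page(image_path).strip()
        results[image_path.name] = text
        if not text:
            # Blank output usually means a person should look at this page.
            needs_review.append(image_path.name)
    return results, needs_review


def fake_ocr(path):
    # Stand-in for a real OCR call; it pretends page 2 could not be read.
    return "Hello" if path.name == "page-001.png" else ""


# Demo with empty stand-in files, so it runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    for name in ("page-001.png", "page-002.png"):
        (Path(folder) / name).touch()
    texts, review = ocr_folder(folder, fake_ocr)

print(texts)
print(review)
```

A blank result is only the easiest problem to catch automatically. You still need to read each page's text against the image to confirm it is correct.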
Some popular models are: microsoft/trocr-base-printed, which reads
printed text, and salesforce/blip-image-captioning-large, which describes
what is in an image rather than reading its text.
Cursor. This is the AI that deleted PocketOS
production data and all backups on a server, then lied about it. It
eventually told the truth that it deleted these items. Use
agents carefully and have off-site backups. https://cursor.com
Codellama via Ollama. "A large language model that
can use text prompts to generate and discuss code". https://ollama.com
VSC extensions. There are many Visual Studio Code
extensions to help you connect with a remote (cloud) or local AI to help
you code. Be careful, some LLMs just are not good at coding. Qwen models
are generally good at coding.
I use Gemini AI through a VSC
extension. It is slow on some days, but limits are higher for
the free account. Install the VSC extension and sign in with your Gemini
username and password.
Ollama models can also be connected to VSC with VSC extensions.
2.12 Paid coding AI.
1. Mimo coding. $6 for 4 billion tokens? With a learning path for new users. Make real-world projects. Supports English and German. Paths for: Full-stack development, front-end or back-end, Python development.
   1. Free program. For casual learners. This must have ads.
   2. Pro. $9.99/month billed yearly. Unlimited keys, no ads.
   3. Max. $24.99/month billed yearly.
   4. Software dev with AI.
2.13 Chinese models
Chinese models have been shown to be a security risk where the actual
LLM leaks data back to Chinese servers. This is a large security risk
since China is famous for IP theft.
Huggingface. This is one of the biggest AI forums.
Huggingface is also a source for thousands of LLM models to plug into
software like Ollama and AnythingLLM, demo sites for testing these
models, and much more. HF gives 100GB of free space to users for these
items: "Spaces" are for running demo sites. "Hub" is for storing LLMs
and other models. Currently they have 2.7 million models. https://discuss.huggingface.co
Aka "leaderboards". These are websites that test AI models and rank
them on each test.
Artificialanalysis.ai. Has bar graphs comparing
LLMs for accuracy, cost per 1000 tokens, speed, etc. There is a scatter
plot for intelligence vs cost. You want high intelligence and low cost
for a good value. You can also get recommendations here based on your
priorities if it is: tokens, intelligence, or speed. It compares 521+
models but only the top ones are in the initial graphs. They also have
an "Artificial Analysis" index to give an overall rating of each LLM. https://artificialanalysis.ai
LLM-stats. Top AI list by AICodeKing on Youtube.
Choose a category of AI to see the rankings for that category. Sort on
any column by clicking on the column header. One benchmark (column) is
for coding and is called "SWE-BENCH" and another is "CODE ARENA". https://llm-stats.com/
2026-0200 The hype over AI has been over the top. First, around 70%
of firms actively use AI, particularly younger, more productive firms.
Second, while over two thirds of top executives regularly use AI, their
average use is only 1.5 hours a week, with one quarter reporting no AI
use. Third, firms report little impact of AI over the last 3 years, with
over 80% of firms reporting no impact on either employment or
productivity. https://www.nber.org/papers/w34836?utm_campaign=Artificial%2BIntelligence%2BWeekly&utm_medium=web&utm_source=Artificial_Intelligence_Weekly_467
2026-0223. GPT fails tests of triage. "ChatGPT Health performance in
a structured test of triage recommendations." "Among gold-standard
emergencies, the system undertriaged 52% of cases, directing patients
with diabetic ketoacidosis or impending respiratory failure to 24–48 h
evaluation rather than the emergency department, while correctly
triaging classical emergencies such as stroke and anaphylaxis." https://www.nature.com/articles/s41591-026-04297-7 Alt
link: https://pubmed.ncbi.nlm.nih.gov/41731097/
2026-0226. Eight times out of 10, ChatGPT Health sent a suffocating
woman to a future appointment she would not live to see, researchers
discovered. ChatGPT Health regularly misses the need for medical urgent
care and frequently fails to detect suicidal ideation, a study of the AI
platform has found, which experts worry could “feasibly lead to
unnecessary harm and death”. https://www.theguardian.com/technology/2026/feb/26/chatgpt-health-fails-recognise-medical-emergencies
2026-0408. Computerphile: Featuring academic experts, this channel
frequently discusses the technical "nightmare" scenarios and safety
concerns inherent in AI development. https://www.youtube.com/@computerphile
Memory and storage have gone up in price a lot since mid-2025. In some
cases the memory or storage per unit price has gone up 700-1300%.
Several major manufacturers are reporting their whole 2026 production
run is already sold out to major customers, mainly datacenters that are
being built. So the price of memory has become important. Meanwhile DRAM
manufacturer revenue spikes with price increase. https://datatrack.trendforce.com/Chart/content/88/global-branded-dram-manufacturers-revenue-total
Is this just about greed?
HDD. This stands for Hard Disk Drive. They use magnetic
media on spinning platters and are not as fast as SSDs. This is an older
technology than SSDs.
NAND memory is used for flash storage like SSDs and USB
flash drives. It is much faster than HDDs (hard disk drives) since it
has no moving parts.
DDR4 and DDR5 are main computer RAM types. Memory
types have different speeds even within the DDR4 and DDR5
categories.
LPCAMM2 (Low Power Compression Attached Memory
Module). This is the new laptop memory standard.
HBM4 (High Bandwidth Memory 4) is the new standard
for data centers and AI.
Historical prices
DRAM Exchange. Has current prices with daily high
and low, and a link to historical prices. But you need to be a member to
get the historical prices. This also has AI and memory-related news. https://www.dramexchange.com/
PC Part Picker. Pick your parts to build your own
PC, but it also has historical memory prices. https://pcpartpicker.com/ Go to Trends page to see
historical prices for CPUs, memory, monitors, power supplies, storage,
video cards (GPUs). https://pcpartpicker.com/trends/ These graphs appear to
show 18 months of data and that is not changeable.