Tutorial v1.1

Setting Up AI Agents for Academic Research

60-75 min read | For communication researchers, social scientists, and graduate students

Last updated July 4, 2026

A note on freshness: AI is moving at astonishing speed. Model names, prices, install commands, and best practices in this tutorial were accurate as of July 4, 2026, but some details may be outdated by the time you read this. When in doubt, check each tool's official site — or just ask your agent to confirm the current version and pricing before you follow a step.

Start here: let FireUp do the setup for you. Most readers should skip the manual assembly and use the FireUp wizard — a plain-English questionnaire that generates a ready-to-run setup file in about two minutes. It provisions your always-on research manager and its two coding assistants (Claude Code + Codex), bakes a provenance trail into their standing rules, and keeps every secret on your own machine.

Then read the rest of this tutorial to understand what FireUp set up and how to work well with your agent team — not to build it by hand.


Module 0: Why Should Researchers Care About AI Agents?

"The Principal Investigator Model"

I'm not a statistician. I'm not a programmer. But I've run research labs, managed graduate students, and coordinated multi-site studies. I know how to identify the right person for the right task, how to evaluate their work, and how to synthesize outputs into publishable research.

Working with AI agents is the same skill set. You don't need to become a programmer. You need to become a sophisticated research manager — someone who knows which agent to deploy for which task, how to read their outputs, how to orchestrate them into rigorous research workflows, and when to spot something that feels off. That gut feeling when an agent's output doesn't quite add up? We call this AI intuition — and it's a skill you'll develop.

The Philosophy Shift

Old Mental Model	New Mental Model
"I need to learn Python to do computational research"	"I need to learn to orchestrate agents that know Python"
"Programming is writing code from scratch"	"Programming is prompting, reviewing, and validating code written by AI"
"Solo researcher"	"Principal investigator managing an agent team"

What This Tutorial Teaches

This isn't a Python/R/Linux tutorial. This teaches you to:

Prompt agents in natural language to accomplish research tasks
Configure pre-built frameworks rather than coding from scratch
Validate agent outputs using standardized research methodologies
Orchestrate multiple agents for multi-author validation studies

The Intuitionist Approach

The Intuitionist methodology is something we're actively developing at AgentAcademy. It applies dual-process cognitive theory to AI-assisted social science research:

System 1 thinking (fast, intuitive, creative) — Let AI agents brainstorm freely, generate ideas, explore possibilities
System 2 thinking (slow, deliberate, critical) — Use different AI agents to rigorously validate, critique, and refine
Document everything with standardized templates for reproducibility

The key insight: AI agents can embody both thinking styles, and by orchestrating them properly, researchers can harness creative exploration AND rigorous validation in the same workflow.

Key Takeaway

"You're not becoming a developer. You're becoming a research manager with agents as your team."

Module 1: The Terminal — Your Co-Working Space

Don't panic. You don't need to become a Linux expert. You only need about 5-6 basic commands to get started — and once you have an AI agent running, you can just ask it to do the rest. The agent knows the commands; you just need to know enough to start a conversation with it.

The terminal is a text-based interface to your computer. It's faster, scriptable, and creates a log of everything you've done.

Where to Find the Terminal

The terminal is available on every major operating system:

macOS: Open "Terminal" from Applications → Utilities, or search with Spotlight (Cmd+Space)
Windows: Use "Windows Terminal" or "PowerShell". For full Linux compatibility, install WSL (Windows Subsystem for Linux)
Linux: Usually called "Terminal" or "Console" — accessible from your applications menu

The VPS Option: Your Dedicated Research Server

Prof. Wayne's preferred setup is running a Linux VPS (Virtual Private Server) — a remote computer in the cloud that you rent by the month. Recommended providers: Hostinger (~$9/month for 8 GB) and Contabo (~$7/month for 8 GB) — both far cheaper than DigitalOcean for the same memory. Other options include Linode and Hetzner. (This is not a product placement — just honest recommendations!)

How much RAM? For regular research, 8 GB is the sweet spot. 16 GB is ideal if you can afford it, but not always required. 4 GB is enough for light experimentation only.

Why use a VPS for AI research?

Separation: Your research environment is completely segregated from your personal and office laptops. No mixing of work files, no accidental data leaks.
Persistence: Long-running agent tasks continue even when you close your laptop. Start a 4-hour analysis, go to dinner, come back and check results.
Consistency: Same environment every time. No "it works on my machine" problems.
Security: API keys and sensitive data live on the VPS, not on laptops that could be lost or stolen.

# From your laptop, connect to your VPS
ssh root@203.0.113.10

Let's break this down:

ssh: The command to make a secure connection to a remote computer
root: Your username on the VPS. "root" means you have full administrator access
203.0.113.10: A placeholder IP address from a range reserved for documentation (yours will be different). Your VPS provider gives you your server's public IP address when you set it up

# cd means "change directory" — it moves you into a folder
cd ~/research-projects    # Go to the research-projects folder

# Now start Claude Code
claude

You don't need a VPS to start — your local terminal works fine for learning. But as your agent workflows grow, a dedicated research server becomes invaluable.

Essential Commands

What You Want	What You Type	What It Does
"Where am I?"	`pwd`	Shows your current folder location
"What's in this folder?"	`ls`	Lists files and folders here
"Go into a folder"	`cd foldername`	Moves you into that folder (cd = "change directory")
"Go back up"	`cd ..`	Moves you up one folder level
"Make a new folder"	`mkdir foldername`	Creates a new folder (mkdir = "make directory")
"Do something as admin"	`sudo command`	Runs with administrator privileges

A Note on `sudo` (Super User Do)

Some commands require administrator access — like installing software. When you see "Permission denied," prefix with sudo:

# This might fail with "Permission denied"
apt install tmux

# This works because sudo gives you temporary admin powers
sudo apt install tmux

The tmux Multiplexer: Keeping Your Sessions Alive

Here's a common frustration: You're running a 2-hour analysis on your VPS. You go to dinner, or your WiFi drops, or your laptop goes to sleep. When you reconnect... your session is gone. The analysis stopped when your connection died.

This is why we use tmux. It creates persistent sessions that keep running on your VPS even when your laptop goes to sleep, your internet connection drops, or you go to bed and come back in the morning.

# macOS
brew install tmux

# Ubuntu/Debian
sudo apt update && sudo apt install tmux

# Start a session
tmux new -s research

Action	Keys or Command
Create new window	`Ctrl+b` then `c`
Next window	`Ctrl+b` then `n`
Detach (session keeps running)	`Ctrl+b` then `d`
Reattach	`tmux attach -t research`

Key Takeaway

"The terminal is your co-working space with AI agents. They know the commands so you don't have to memorize them."

Module 2: Your Agent Roster

What Does "Agentic AI" Even Mean?

You've probably used ChatGPT or Claude in a browser — you type a question, it types back an answer. That's conversational AI. Agentic AI is different: these are AI systems that can actually do things on your behalf.

An AI agent can:

Read and write files on your computer
Run commands in the terminal
Search the web and summarize findings
Execute multi-step tasks without you supervising each step

Think of it this way: regular AI is like texting a friend for advice. Agentic AI is like hiring a research assistant who can actually go do the work — pull the data, run the analysis, write the first draft — and come back with results.

Quick Vocabulary

Here are terms you'll encounter throughout this tutorial, explained in plain language:

Prompt: The message or question you type to an AI agent. Just like emailing a research assistant — you write what you need, they respond.
Context window: How much text the AI can "remember" in a single conversation. Think of it like working memory. A larger context window means the agent can read longer documents.
Token: A chunk of text the AI processes. Roughly ¾ of a word. You pay per token, so this matters for cost.
API key: A password-like code that lets you access an AI service. You keep it secret like a password.
Open-source vs. Open-weight: Important distinction for AI models:
- Open-source = Full training code, data recipes, and weights are public. True transparency.
- Open-weight = Only final model weights released (you can run it), but training code/data are proprietary. Most "open" AI models are actually open-weight.
LLM (Large Language Model): The AI "brain" that powers agents. The most capable families as of mid-2026: the Claude 5 family and Claude Opus 4.8, and Fable 5. Other strong models: Claude Sonnet 4.6, GPT-5.5, Gemini 3 Pro, and the open-weight GLM 5.2, Kimi K2.6, and DeepSeek v4.
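
To make the token and context-window entries concrete, here is a minimal sketch that turns a word count into a rough token count using the ¾-of-a-word rule of thumb, then checks whether a document would fit in a given context window. The function names and the 200,000-token window are illustrative assumptions, not figures for any particular model, and real tokenizers vary by model and language.

```python
def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a word count (a rule of thumb, not a real tokenizer)."""
    word_count = len(text.split())
    # One token is roughly 3/4 of a word, so tokens ≈ words / 0.75.
    return round(word_count / words_per_token)


def fits_in_context(text: str, context_window_tokens: int) -> bool:
    """True if the estimated token count is within the given context window."""
    return estimate_tokens(text) <= context_window_tokens


if __name__ == "__main__":
    manuscript = "word " * 9000  # stand-in for a 9,000-word manuscript
    print(estimate_tokens(manuscript))           # 12000
    # 200_000 is a hypothetical window size; check your model's real limit.
    print(fits_in_context(manuscript, 200_000))  # True
```

Treat the result as a planning number only. If a document is close to the limit, ask your agent to report the exact token count.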

Skills — Extending What Agents Can Do

Skills are pre-packaged capabilities you can add to AI agents. Think of them like apps for your smartphone — the phone works fine without them, but skills add specialized functionality.

Examples of skills:

A code-reviewer skill that knows best practices for reviewing code
A literature-search skill that knows how to query academic databases
A CommDAAF skill that enforces our research validation framework
A citation-formatter skill that outputs proper APA/Chicago citations

How to use skills in Claude Code:

/skill code-reviewer

Or in natural language:

"Use the CommDAAF skill to set up a new validation study for this dataset."

Skills encode domain expertise — the skill author has already figured out the best prompts and validation steps. You're leveraging their expertise without developing it yourself.

Caution: Only install skills from trusted sources. A malicious skill could potentially access your files or API keys. Stick to official repositories and well-known contributors.

The AI Services Landscape

Before we dive into specific tools, here's a map of how AI services are organized:

Mainstream AI Model Providers — Companies that build and host their own AI models:

Provider	Latest Models (mid-2026)	How to Access
Anthropic	Claude Opus 4.8, Sonnet 4.6, Haiku 4.5	Claude.ai ($20 Pro / $100+ Max)
OpenAI	GPT-5.5, GPT-5.5-mini	Codex Go (~$8/mo) / Plus ($20) / Pro
Google	Gemini 3 Pro, Gemini 3 Flash	Google AI Studio

API Marketplaces — One API key to access many models:

OpenRouter (openrouter.ai): The main one we recommend. One account, 300+ models from 60+ providers. Often cheaper because it routes to the cheapest provider. Free tier includes 25+ models.
Ollama Cloud: Focuses on running open-weight models. Good for privacy-conscious researchers.

Open-Weight Models from China:

Model	Lab	Why It Matters
Kimi K2.6	Moonshot AI	Excellent reasoning, powers many agentic workflows
GLM 5.2	Zhipu AI (Tsinghua)	Strong multilingual; the recommended engine for your always-on manager via Ollama Cloud
DeepSeek v4	DeepSeek	Exceptional coding, very cost-effective
Qwen 3	Alibaba	Strong general-purpose, good context handling

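Censorship note: Models hosted in China may filter sensitive topics. When the same open-weight models are hosted on US/EU infrastructure (via OpenRouter or Ollama Cloud), that provider-side filtering is absent. Some refusals are trained into the weights themselves, though, so test a model on your own sensitive topics before relying on it for research.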

Different Tools for Different Jobs

There are two categories of AI agent tools you should know about:

Terminal-Based Coding Agents — Run in your terminal, can read/write files, execute code:

Agent	What It Is	Best For
Claude Code	Anthropic's terminal assistant	Deep research, file analysis. One of the two assistants FireUp sets up.
Codex	OpenAI's coding assistant	The other assistant FireUp sets up — excellent for code-heavy tasks
Gemini CLI	Google's command-line AI	Google integration, long context windows
OpenCode	Open-source terminal assistant	Full transparency, works with any model
Kimi Code	Moonshot AI's coding assistant	Cost-effective, strong performance

Messaging-Based RA Managers — Live in Telegram/Discord/Slack as your command center:

Agent	The Key Insight
Hermes Agent	Lives in Telegram/Discord/Slack. Takes notes, remembers context, orchestrates your terminal sessions. Your "RA manager."
OpenClaw	Personal AI in WhatsApp/Telegram. Coordinates your research workflow, triggers tasks in terminal agents.

The powerful insight: Hermes and OpenClaw aren't doing the heavy lifting — they're your command center. Chat with them on your phone, they take notes, and when you're ready for serious work, they help orchestrate your Claude Code/Codex sessions on your VPS.

The Hierarchy: How Your Agent Team Is Organized

FireUp arranges your agents into a simple chain of command. You talk to one always-on manager; it coordinates two coding assistants that do the heavy lifting:

You (the human): approve, decide, keep the final say
          │
          ▼
Always-on research manager (OpenClaw / Hermes, on Telegram)
GLM 5.2 via Ollama Cloud (~$20/mo), running 24/7
          │
   ┌──────┴───────┐
   ▼              ▼
Claude Code       Codex
own subscription  own subscription
(the two coding assistants, each on its own plan)

The manager is deliberately cheap to run (an open-weight model on a small server), because its job is to remember, notify, and route — not to write the analysis. The assistants run on their own subscriptions and do the actual coding and file work when you ask.

When to Use Which

The simple answer: Claude models are the gold standard. Codex (OpenAI) is also excellent. Chinese open-weight models (Kimi, GLM, DeepSeek) are significantly cheaper and catching up fast.

Prof. Wayne's Current Stack (mid-2026):

Always-on manager: OpenClaw/Hermes on Telegram, powered by GLM 5.2 via Ollama Cloud (~$20/mo, runs 24/7)
Coding assistants: Claude Code and Codex, each on its own subscription — the two the FireUp wizard wires up for you

Key Takeaway

"You don't need to pick one tool — build a stack: messaging-based RA manager + terminal-based coding agents."

Module 3: Setting Up Claude Code — Your Stellar RA

Why Start with Claude Code?

Claude Code is like having a research assistant sitting at your computer who can:

Read your files — It can look through your data files and documents
Write for you — It creates folders, writes scripts, and organizes your work
Explain as it goes — Unlike most software, it tells you what it's doing and why
Remember context — It can keep track of long conversations about your research project

What This Will Cost

The recommended approach: Subscribe to Claude Max ($100/month). This gives you generous usage limits sufficient for heavy-lifting agentic research — and predictable costs.

Why not pay-per-use API? API billing can be unpredictable and expensive. A single intensive research session can rack up $20-50 in charges. With Max, you know exactly what you're paying.

Access restrictions warning: AI companies are becoming more protective. Recent example: Anthropic restricted access when they detected OpenClaw orchestrating Claude Code sessions. Third-party orchestration may violate terms of service. Safest approach: use Claude Code directly with your own subscription.

Approach	Cost	Predictability
Claude Max subscription	$100/month	Predictable, sufficient for heavy research
Claude Pro subscription	$20/month	Predictable, good for moderate use
Pay-per-use API	Variable	Unpredictable, can get expensive

Modules 4 and 5 cover cheaper alternatives that use open-weight models.

Installing Claude Code

The simplest way (copy and paste this):

On a Mac or Linux computer, open Terminal and type:

curl -fsSL https://claude.ai/install.sh | bash

On Windows, open PowerShell and type:

irm https://claude.ai/install.ps1 | iex

That's it. When it's done, type claude --version to confirm it worked.

Connecting to Your Account

The easiest approach: just run claude in your terminal and follow the interactive prompts:

claude

The first time you run it, Claude Code detects you're not logged in and guides you through authentication. If you have Claude Pro ($20/month) or Max ($100/month), it will open your browser to log in. Follow the prompts and you're connected.

Your First Conversation

cd ~/Documents
claude

Now you're talking to Claude Code. Try typing:

"Can you create a folder called 'research-project' with subfolders for 'data', 'notes', and 'drafts'?"

You just gave an instruction in plain English, and a computer task got done. That's the whole idea.

Set Up Codex Too — Your Second Coding Assistant

FireUp's team uses two coding assistants: Claude Code and OpenAI's Codex. Having both lets your manager route each task to whichever is stronger, and lets you cross-check one against the other. Installing Codex is just as easy — and, like Claude Code, it needs no Node:

curl -fsSL https://chatgpt.com/codex/install.sh | sh

(Prefer npm? npm install -g @openai/codex also works, but it needs Node 22+.) Then start it and sign in with your OpenAI account:

codex

The first run walks you through logging in. The cheapest paid entry is Codex Go (~$8/month); the $20 Plus tier unlocks more usage and cloud features.

Key Takeaway

"Claude Code turns your natural language requests into computer actions — no programming required."

Module 4: Token Economics and the Political Economy of AI

The Political Economy Question

AI companies control the means of production. They set prices, decide who gets access, and can change terms at any moment. Recent examples: Anthropic restricting third-party orchestration of Claude Code, OpenAI changing pricing structures.

This isn't doom and gloom — it's just reality. The question: as users, what strategies can we adopt to maximize research output while managing this dependency?

Strategy 1: Understand the Cost Structure

Model	Input (USD per 1M tokens)	Output (USD per 1M tokens)
Claude Opus 4.8	$5.00	$25.00
Claude Sonnet 4.6	$3.00	$15.00
Kimi K2.6	~$0.50	~$2.00
GLM 5.2	~$0.40	~$1.50
DeepSeek v4	~$0.30	~$1.00

The insight: going by the table above, Chinese open-weight models are roughly 6x to 25x cheaper per token than the Claude models, and they're catching up in quality fast.
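
To see what those per-token prices mean for a real task, here is a minimal sketch that prices one hypothetical job across the models in the table. The prices are copied from the table (the "~" values are treated as exact for illustration). The job size of 500 articles at about 2,000 input tokens and 300 output tokens each is an assumption, not a measurement.

```python
# Prices in USD per 1M tokens, copied from the table above ("~" values treated as exact).
PRICES = {
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Kimi K2.6": (0.50, 2.00),
    "GLM 5.2": (0.40, 1.50),
    "DeepSeek v4": (0.30, 1.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one job at pay-per-use rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical job: code 500 articles, ~2,000 tokens in and ~300 tokens out per article.
    n_articles, tokens_in, tokens_out = 500, 2_000, 300
    for model in PRICES:
        cost = job_cost(model, n_articles * tokens_in, n_articles * tokens_out)
        print(f"{model:<18} ${cost:.2f}")
```

Under these assumptions the same job costs $8.75 on Claude Opus 4.8 and $0.45 on DeepSeek v4. Swap in your own article counts and current prices before budgeting.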

Strategy 2: Don't Put All Eggs in One Basket

Claude Code + Codex — The two coding assistants FireUp sets up; use whichever is stronger for the task and cross-check between them
OpenCode — Works with any model provider, gives you freedom
Hermes/OpenClaw — Your always-on manager, cheap to run on an open-weight model like GLM 5.2 (~$20/mo)

Smart researchers use expensive models for critical work and cheap models for exploration.

Strategy 3: Use Subscription Plans Wisely

Max plan ($100/month) — Predictable, sufficient for heavy research
Pro plan ($20/month) — Good for moderate use
Pay-per-use API — Unpredictable, use with caution

A typical FireUp starter stack runs about $50/month: Claude Pro ($20) for Claude Code + Codex Go (~$8) for Codex + Ollama Cloud Pro (~$20) for the always-on manager on GLM 5.2.

Key Takeaway

"Understand who controls access. Diversify your tools. Use expensive models when quality matters, cheap models when iterating."

Module 5: Setting Up OpenCode — Your Multi-Model Freedom

The Key Advantage: You're Not Stuck with One Model

Here's the limitation of Claude Code: you can only use Claude models. Yes, Claude models are arguably the best in the market right now — but they're also expensive. And what if you want to:

Use GLM 5.2 for a quick draft (roughly 8x to 17x cheaper per token than Claude, going by the Module 4 price table)?
Try Kimi K2.6 for a coding task?
Compare how DeepSeek v4 handles your analysis vs. Claude?

Claude Code can't do any of this. You're locked into Anthropic's ecosystem.

OpenCode solves this. It's built from the ground up to work with multiple model providers:

OpenCode's own plans — Direct access to various models
OpenRouter (the marketplace introduced in Module 2)
Ollama Cloud — Run open-weight models in the cloud
Direct API connections — Connect to any provider you want

Installing OpenCode

Same pattern as Claude Code:

curl -fsSL https://opencode.ai/install | bash

Then verify it installed:

opencode --version

Configuration — Just Run It

Like Claude Code, the easiest approach is to just run the tool and let it guide you:

opencode

The first time you run it, OpenCode will walk you through:

Choosing your model provider (OpenRouter, Ollama Cloud, direct API, etc.)
Entering your API credentials
Selecting your default model

You can always change these settings later with /settings or /connect.

Why Multi-Model Access Matters for Research

Scenario	Best Model Choice
Critical analysis requiring best reasoning	Claude Opus 4.8 (via API or Max plan)
Quick exploratory work, drafts	GLM 5.2 or Qwen 3 (cheap, fast)
Coding tasks	Kimi K2.6 (excellent at code)
Comparing model perspectives	Run same prompt through 3-4 models
Budget-conscious bulk processing	DeepSeek v4 (very cost-effective)

Key Takeaway

"OpenCode gives you the freedom to use any model from any provider. This means lower costs, more flexibility, and the ability to pick the right tool for each job."

Module 6: Setting Up OpenClaw — AI on Your Phone

Claude Code and OpenCode run on your computer in the terminal. OpenClaw runs on messaging apps you already use — WhatsApp, Telegram, Slack, Discord.

Why does this matter for researchers?

Work from your phone — Check on your analysis from anywhere
Get notifications — "Your 500-article coding task finished"
Collaborate easily — Team members can message the same AI assistant
Keep working while away — Start a task, go teach a class, come back to results

Is This Right for You? OpenClaw requires more setup than Claude Code. If you just want to get started, skip this module for now and come back later. Claude Code is easier.

The Lazy Setup (Prof. Wayne's Approach)

If you already have Claude Code set up, just ask it:

"Look at https://openclaw.ai/ and install it on this server and set up an OpenClaw agent for me to talk to in Telegram."

Yes, this really works. Your existing agent handles the installation.

Manual Setup

npm install -g openclaw@latest
openclaw onboard

The onboarding wizard guides you through setup in under 10 minutes.

Connecting Telegram

OpenClaw needs a way to communicate with you:

Create a Telegram bot — Open Telegram, search for @BotFather, use /newbot
Get your bot token — BotFather will give you a long token string
Enter it in OpenClaw setup — The setup wizard asks for this

Choosing Your AI Model

OpenClaw works with any model provider you prefer:

Codex Plan — If you have a Codex subscription
Ollama Cloud — The recommended default: run the manager on GLM 5.2 via Ollama Cloud (~$20/mo)
OpenRouter — Access to hundreds of models
Direct API keys — Connect to Anthropic, OpenAI, etc.

What You Can Do With OpenClaw

Once set up, you can text your AI things like:

"Review my draft paper and list the main issues"
"What's the status of my analysis task?"
"Summarize the 10 most recent papers on [topic]"

Key Takeaway

"OpenClaw brings your AI assistant to messaging platforms — powered by any model you choose, not locked to one provider."

Module 7: Keeping Your API Keys Safe

What's an API Key Again?

Remember from Module 2: an API key is like a password that lets you access AI services. You don't want to accidentally share it with the world.

The Danger: Accidentally Sharing Your Keys

Imagine writing your API key directly into a research script, then sharing that script with a colleague or uploading it to GitHub. Now anyone can use your key and charge things to your account. This happens more often than you'd think.

The Solution: Environment Variables

An "environment variable" is just a way to tell your computer: "Remember this value, but don't write it into any files."

Here's the simplest approach:

When you open your terminal, type this (replacing with your actual key):

export ANTHROPIC_API_KEY="sk-ant-api03-your-actual-key-here"

Now start Claude Code:

claude

That's it. Claude Code can now see your key, but the key isn't written in any file that could accidentally get shared.
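
The same idea applies to any research script you or your agent write: read the key from the environment instead of typing it into the file. Here is a minimal sketch. ANTHROPIC_API_KEY is the variable name used above; the helper names load_api_key and mask are hypothetical, chosen for illustration.

```python
import os


def load_api_key(var_name: str = "ANTHROPIC_API_KEY") -> str:
    """Read an API key from the environment; fail loudly if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f'{var_name} is not set. Run: export {var_name}="your-key-here"'
        )
    return key


def mask(key: str) -> str:
    """Show only the last four characters, for safe logging."""
    return "*" * max(len(key) - 4, 0) + key[-4:]


if __name__ == "__main__":
    try:
        print("Key loaded:", mask(load_api_key()))
    except RuntimeError as err:
        print(err)
```

Because the script contains no key, you can share it with a colleague or push it to GitHub without exposing your account.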

Making It Permanent

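The export command only works for your current terminal session. To make it permanent, add it to your shell's startup file. This does write the key into a file, but one that lives in your home folder rather than in a project you might share, so never copy that file into a shared folder or git repository. Open a new terminal afterward for the change to take effect: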

On Mac:

echo 'export ANTHROPIC_API_KEY="your-key-here"' >> ~/.zshrc

On Linux:

echo 'export ANTHROPIC_API_KEY="your-key-here"' >> ~/.bashrc

The Golden Rules

Never put API keys in files you share — Not in scripts, not in emails, not in screenshots
Try not to paste API keys into prompts — Use environment variables instead. (Prof. Wayne admits he sometimes pastes keys directly for convenience — it works, but not best practice.)
If you accidentally share a key, regenerate it — Go to your provider's website and create a new one
Set spending limits — Both OpenRouter and Anthropic let you cap how much can be charged

Routine Security Audits

Ask your AI agent to audit your own security. Try prompts like:

"Audit this server for cybersecurity vulnerabilities. Check for exposed API keys, weak permissions, and security misconfigurations."
"Review my .bashrc, .zshrc, and environment files for any hardcoded secrets."
"Check if any of my project files contain API keys that might accidentally get committed to git."

Make this a routine practice — once a week or before major project milestones.
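
If you are curious what such an audit does under the hood, part of it can be sketched as a scan for strings that look like keys. This is a minimal illustration, not a replacement for a real secret scanner. The pattern ("sk-" followed by 20 or more key-like characters) is an assumption based on the example key format shown earlier, and real key formats vary by provider.

```python
import re
from pathlib import Path

# Assumed pattern: "sk-" followed by 20+ key-like characters. Real formats vary by provider.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_\-]{20,}")


def find_possible_keys(text: str) -> list[str]:
    """Return substrings that look like API keys."""
    return KEY_PATTERN.findall(text)


def scan_folder(folder: str, suffixes=(".py", ".R", ".sh", ".txt", ".md")) -> dict[str, int]:
    """Map each matching file under `folder` to its count of key-like strings."""
    hits = {}
    for path in Path(folder).rglob("*"):
        # Files named exactly ".env" have no suffix, so check the name separately.
        if path.is_file() and (path.suffix in suffixes or path.name == ".env"):
            count = len(find_possible_keys(path.read_text(errors="ignore")))
            if count:
                hits[str(path)] = count
    return hits


if __name__ == "__main__":
    sample = 'client = make_client(api_key="sk-ant-api03-EXAMPLEEXAMPLEEXAMPLE1234")'
    print(find_possible_keys(sample))
```

To check a project before committing it, call scan_folder("path/to/project") and review every file it reports.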

Key Takeaway

"Keep your API keys out of your files and your prompts. Use environment variables, and ask your AI agent to routinely audit your security."

Module 8: The CommDAAF Framework

CommDAAF (Computational Multi-Model Data Analysis and Augmentation Framework) is AgentAcademy's quality assurance system for AI-conducted research.

CommDAAF builds on DAAF (Data Analysis and Augmentation Framework), originally developed by the DAAF Contribution Community. CommDAAF extends DAAF for communication research, adding multi-model validation protocols, communication-specific codebook templates, and integration with Intuitionist workflows.

Using CommDAAF as a Skill

CommDAAF is available as an importable skill for Claude Code:

/skill commdaaf

Or in natural language:

"Import the CommDAAF skill and set up a new content analysis study with 3-model validation."

The skill creates proper folder structure, guides codebook definition, sets up multi-model validation, and generates publication-ready methodology documentation.

The Q1-Q3 Validation Framework

Checkpoint	Question	Purpose
Q1	"What exactly was asked?"	Documents prompts, data, and task design
Q2	"What exactly was received?"	Captures raw agent outputs
Q3	"What does this mean?"	Interpretation with human validation
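
One way to picture the three checkpoints is as a record saved alongside each agent task. The sketch below is a hypothetical schema for illustration only: the class and field names are assumptions, not the format the CommDAAF skill actually writes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ValidationRecord:
    """One agent task documented at the three checkpoints (hypothetical schema)."""
    q1_what_was_asked: str       # prompt, data, and task design
    q2_what_was_received: str    # raw agent output, unedited
    q3_what_it_means: str = ""   # human interpretation, filled in after review
    human_validated: bool = False

    def is_complete(self) -> bool:
        """Complete only once a human has both interpreted and validated the output."""
        return bool(self.q3_what_it_means.strip()) and self.human_validated


if __name__ == "__main__":
    record = ValidationRecord(
        q1_what_was_asked="Code each headline as episodic or thematic framing.",
        q2_what_was_received='{"headline_1": "episodic", "headline_2": "thematic"}',
    )
    print(record.is_complete())  # False: no human interpretation yet
    record.q3_what_it_means = "Labels match my reading of both headlines."
    record.human_validated = True
    print(json.dumps(asdict(record), indent=2))
```

Keeping Q2 as the unedited raw output is the design choice that matters: it lets a reviewer compare what the agent actually returned with what you concluded from it.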

Core Principles

Multi-Model Validation: 3+ AI models independently analyze identical datasets
Reliability Metrics: Cohen's κ (for each pair of models) and Fleiss' κ (across three or more models) reported for inter-model agreement
Adversarial Review: AI reviewers critique studies before publication
Transparent Failures: Corrections and retractions published openly
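
For readers who have not computed Cohen's κ by hand, here is a minimal sketch for two models that labeled the same items. It uses the standard formula κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is agreement expected by chance. The labels are made up for illustration, and Fleiss' κ for three or more models is not shown.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(labels_a)
    # Observed agreement: share of items where the two raters chose the same label.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Made-up labels from two models coding the same 10 headlines.
    model_a = ["episodic"] * 5 + ["thematic"] * 5
    model_b = ["episodic"] * 4 + ["thematic"] * 5 + ["episodic"]
    print(f"{cohens_kappa(model_a, model_b):.2f}")  # 0.60
```

Here the models agree on 8 of 10 items (p_o = 0.8), chance agreement is 0.5, and κ comes out at 0.60, which is noticeably lower than the raw 80% agreement suggests.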

Key Takeaway

"CommDAAF is your quality assurance system — it ensures your AI-assisted research is rigorous enough to publish. Import it as a skill to automate best practices."

Module 9: Setting Up Hermes Agent — Your Always-On RA Manager

Both Hermes Agent and OpenClaw (Module 6) live in your messaging apps. The key difference:

OpenClaw — More technical, designed for researchers who want fine-grained control
Hermes Agent — Smoother onboarding, stronger memory features, built by Nous Research

Think of them as two options for the same role: your "RA manager" who lives in Telegram/Slack.

This is the top of the hierarchy. The manager is the always-on layer between you and your two coding assistants (Claude Code + Codex) — see the diagram in Module 2. FireUp sets it up for you, running on GLM 5.2 via Ollama Cloud (~$20/mo). Module 15 covers the provenance trail, knowledge base, and server maintenance it comes configured with.

Why a Messaging-Based RA Manager?

Claude Code and OpenCode are powerful — but they run in your terminal. A messaging-based RA manager like Hermes:

Takes notes for you — "Remember that I'm working on a framing study"
Stays available 24/7 — Message from your phone while commuting
Coordinates your agents — "What was I working on yesterday?"

Choosing Your AI Model

Like OpenClaw, Hermes can be powered by any model you prefer:

Recommended default: GLM 5.2 via Ollama Cloud (~$20/mo) — cheap enough to leave running 24/7, with solid reasoning
Via OpenRouter: Access to hundreds of models
Via Codex Plan: If you have an OpenAI subscription

The Lazy Setup (Ask Claude Code)

If you already have Claude Code running:

"Look at the Hermes Agent documentation and install it on this server. Set it up to connect to my Telegram."

Manual Setup

curl -fsSL https://hermes-agent.nousresearch.com/install | bash
hermes setup

Hermes vs. Claude Code: Not Either/Or

Task	Best Tool
Serious analysis, file reading, code execution	Claude Code / Codex (terminal)
Quick questions while mobile	Hermes (Telegram)
Remembering context across days	Hermes (memory)

Key Takeaway

"Hermes Agent is your always-on RA manager in messaging apps — powered by any model you choose, great for mobile access and memory across sessions."

Module 10: Research Data Access — The Key Bottleneck

You've learned to set up AI agents. You understand the frameworks. But here's the reality: The biggest bottleneck isn't the AI tools — it's access to research data.

The Data Landscape for Social Scientists

AgentAcademy maintains a curated registry at the Data Landscape tab. Key categories:

Survey Data — Gold Standard Sources:

Source	Coverage	Access
ANES	U.S. elections, 1948-present	Open
GSS	U.S. social attitudes, 1972-present	Open
ESS	30+ European countries, 2001-present	Open
World Values Survey	100+ countries, 7 waves	Open

Political & Policy Data:

Source	What It Contains	Access
V-Dem	Democracy indicators, 202 countries	Open
ACLED	Conflict events worldwide, real-time	Free with key
Voteview	U.S. Congress roll-call votes	Open

Academic & Bibliometric Data:

Source	What It Contains	Access
OpenAlex	250M+ scholarly works	Open
Semantic Scholar	AI-powered academic search	Free with key
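
Sources like OpenAlex are reached through a web API, which your agent will normally call for you. As a minimal sketch of what such a request looks like, the code below builds a query for works matching a search term published since a given date. The helper names are hypothetical. The search, filter (from_publication_date), and per-page parameters reflect the OpenAlex API as I understand it, so check the current OpenAlex documentation (including any key or rate-limit requirements) before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

OPENALEX_WORKS = "https://api.openalex.org/works"


def build_works_url(search_term: str, from_date: str, per_page: int = 25) -> str:
    """Build an OpenAlex works query for a search term and earliest publication date."""
    params = {
        "search": search_term,
        "filter": f"from_publication_date:{from_date}",
        "per-page": per_page,
    }
    return f"{OPENALEX_WORKS}?{urlencode(params)}"


def fetch_titles(url: str) -> list[str]:
    """Fetch one page of results and return the work titles (needs internet access)."""
    with urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return [work.get("title") or "" for work in payload.get("results", [])]


if __name__ == "__main__":
    # Building the URL needs no network; call fetch_titles(url) to run the query.
    print(build_works_url("framing", "2020-01-01"))
```

This returns only the first page of results. A full literature pull needs paging through the results, which is a good task to hand to your agent.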

The Social Media Access Crisis

Warning: Social media data access has collapsed since 2023. Enterprise-tier Twitter/X API access starts at roughly $42,000/month. Reddit has restricted its API access. Alternatives: Bluesky (fully open), Reddit archives (Pushshift), SOMAR repository.

Working with Your Agent on Data

Once you have data, your agent can help:

"I have a CSV of 10,000 survey responses. Read the file and describe the variables."

"Use the OpenAlex API to find all papers on 'framing' published since 2020."

Key Takeaway

"AI agents amplify your research capacity — but only if you can access data. Know the landscape, start with open sources."

Module 11: Research Ethics in the Age of AI

AI is transforming academic research faster than institutions can respond. This creates genuine ethical ambiguity.

The Originality Question

If an AI agent writes significant portions of your paper, what does "authorship" mean?

The emerging consensus:

Authorial responsibility matters, not production method — You stand by the final product regardless of how it was produced
Disclosure is trending toward standard practice — AI usage declarations should become as ubiquitous as conflict of interest declarations

Data Privacy: When Research Becomes Training Data

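When you paste research data into AI prompts, that data is sent to the provider's servers, where it may be retained and, depending on the provider's policy and your account settings, used to train future models.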

Practical safeguards:

De-identify before prompting — Strip names, locations, dates from data
Use local models for sensitive data — Ollama runs entirely on your machine
Check provider data policies — API access and consumer subscriptions often carry different data-retention and training terms
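As a rough illustration of "de-identify before prompting", the sketch below masks email addresses, US-style phone numbers, and a list of known names before text leaves your machine. The patterns and the `redact` helper are illustrative assumptions, not a vetted de-identification method. Real data usually needs more (locations, dates, indirect identifiers) and a human check of the result.

```python
import re

# Deliberately simple patterns; they will miss many real-world formats
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact(text, names):
    """Replace emails, phone numbers, and listed names with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    for name in names:
        # Whole-word, case-insensitive match on each known participant name
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text,
                      flags=re.IGNORECASE)
    return text


sample = "Contact Maria Lopez at maria.lopez@example.org or 555-123-4567."
print(redact(sample, ["Maria Lopez"]))
# Contact [NAME] at [EMAIL] or [PHONE].
```

Running a step like this locally, and spot-checking a sample of the redacted text, keeps the de-identification decision with you rather than with the model provider.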

The Training Pipeline Disruption

AI now handles much of the "grunt work" through which junior researchers were traditionally trained. As one researcher admitted: "It is rational to adopt these tools. That's precisely why the problems will keep getting worse: no individual has an incentive to slow down."

Practical Ethical Guidelines

Disclose AI use — In methods sections, acknowledgments, or author notes
Verify agent outputs — Never publish AI-generated claims without checking
Protect human subjects — De-identify data before prompting
Maintain judgment — You're the PI; final responsibility is yours
Don't flood — Use productivity gains to do better work, not more mediocre work

Key Takeaway

"AI-assisted research raises genuine ethical questions. The answer isn't to avoid AI — it's to adopt it transparently, protect research subjects, and focus on work that advances knowledge."

Module 12: Research Ideation

Idea Generator 1: Replication at Scale

Take an existing study (e.g., Entman 1993 on framing) and ask: "Could I replicate this finding at 100x scale using agents to code news archives?"

Idea Generator 2: Multi-Model Reliability

Inspired by Bean et al. (2026): "How reliable are different LLMs at content analysis tasks in my domain?"

Idea Generator 3: Longitudinal Coordination Analysis

Detect coordinated information campaigns: "Are certain actors sharing messages in coordinated patterns?"

Idea Generator 4: Automated Literature Reviews

Synthesize faster with agent assistance: "What are the emerging themes in [topic] over the past 5 years?"

The Study Design Template

1. Research Question
2. Data Source (What exists? What's accessible?)
3. Agent Assignment (Which agent for which task?)
4. Validation Strategy (CommDAAF Q1-Q3)
5. Resource Estimate (Tokens × Cost)
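The "Tokens × Cost" line in the template is simple arithmetic, and it is worth doing before you launch a large coding run. Here is a minimal sketch. The per-item token counts and the per-million-token prices are placeholders chosen for illustration, not real rates, so substitute your provider's current pricing.

```python
def estimate_cost(n_items, input_tokens_per_item, output_tokens_per_item,
                  price_in_per_million, price_out_per_million):
    """Estimate API cost in dollars for coding n_items with one model."""
    total_in = n_items * input_tokens_per_item
    total_out = n_items * output_tokens_per_item
    return (total_in / 1_000_000 * price_in_per_million
            + total_out / 1_000_000 * price_out_per_million)


# Hypothetical run: 465 items, ~400 input and ~100 output tokens each,
# at placeholder prices of $3 (input) and $15 (output) per million tokens
cost = estimate_cost(465, 400, 100, 3.00, 15.00)
print(f"Estimated cost: ${cost:.2f}")
```

For a multi-model design, run the estimate once per model with that model's prices and add the results.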

Key Takeaway

"The tutorial infrastructure is now your launchpad. The research ideas are your original contributions — agents just help you execute faster."

Module 13: Your First Agent-Assisted Research Task

Let's do something concrete. We'll use Claude Code to create a coding scheme for analyzing how news articles frame climate change — a classic communication research task.

Step 1: Create a Project Folder

Open your terminal and type:

cd ~/Documents
mkdir climate-framing-study
cd climate-framing-study
mkdir data output

Step 2: Start Claude Code

claude

Step 3: Ask Claude to Brainstorm Categories

Now you're talking to Claude Code. Type something like:

"I'm doing a content analysis of climate change news coverage. Can you brainstorm 5-7 potential framing categories? For each frame, give me a name and a brief definition."

Step 4: Get a Second Opinion

Here's where it gets interesting. Ask Claude to critique its own work:

"Now pretend you're a different analyst reviewing this list. Which categories overlap? Which need clearer definitions? What's missing?"

Or even better: close this session and have a different model (GPT-4o via OpenRouter) review the categories. A different model has different training and blind spots, so its critique is a more independent check than self-review.

Step 5: Finalize Your Codebook

Based on the brainstorm and critique, ask Claude:

"Create a final codebook with these refined categories. For each category include: name, definition, what counts as an example, and what doesn't count. Save this to output/codebook.md"

What You Just Accomplished

In about 30-45 minutes, you:

Generated initial ideas creatively (brainstorming phase)
Critically evaluated those ideas (validation phase)
Synthesized them into a usable research instrument
Documented the process for reproducibility

Notice that you didn't write any code. You had a conversation. The AI did the work. Your job was to know what you wanted, ask clear questions, and evaluate whether the output made sense.

Key Takeaway

"Your first agent-assisted task shows the pattern: describe what you want in plain English, let the AI do the work, then evaluate and refine."

Module 14: How Researchers Work With Agents

You've learned the tools and frameworks. Now let's see how researchers carry out studies with agents — the prompts they use, how they iterate with Claude Code, and how they manage multi-model validation.

A note on these examples: The prompts shown below are simulated reconstructions based on how researchers describe their process in the published papers — not verbatim transcripts. More importantly, this is a gross simplification of real agentic research. Actual studies involve dozens of back-and-forth exchanges, failed attempts, misunderstood instructions, and — if we're being honest — Prof. Wayne growing impatient with agents that confidently produce nonsense, forget context mid-conversation, or need the same correction explained three different ways. The clean workflows below represent the idealized version. Your real experience will be messier, and that's normal.

Case Study 1: Nonprofit Framing — Autonomous Coding at Scale

Challenge: 465 nonprofit mission statements to code for technocratic framing. Manual coding would take weeks.

Step 1: Project Setup

cd ~/research
mkdir nonprofit-framing-study
cd nonprofit-framing-study
claude

First prompt to Claude Code:

"I'm starting a content analysis study of nonprofit mission statements. I have 465 mission statements from IRS Form 990 filings. I want to code each one for 'technocratic framing.' Help me set up the project structure following CommDAAF conventions."

Step 2: Codebook Development (20 min)

"Based on academic literature on technocratic governance, draft a coding scheme with clear inclusion/exclusion criteria and 3 example mission statements for each category."

After 3-4 iterations refining fuzzy distinctions, the codebook is solid.

Step 3: Pilot Coding (30 min)

"Read the first 20 mission statements and code each one. Output: Organization name, Classification, Key phrases, Confidence level. Format as markdown table."

Researcher reviews, catches errors, refines codebook.

Step 4: Full Autonomous Coding (45 min)

"Now code all 465 mission statements using the finalized codebook. Save results to output/full_coding.csv."

Claude Code processes all 465 statements, reporting progress every 50 items.

Step 5: Validation

"Generate a stratified random sample of 50 for human validation. Calculate inter-rater reliability between AI and my human coding."

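Result: Cohen's κ = 0.78, which falls in the "substantial agreement" range (0.61-0.80) on the Landis and Koch scale and is generally acceptable for publication.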
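To make the validation step less of a black box, here is a minimal sketch of the two calculations behind it: drawing a stratified sample and computing Cohen's κ. The row layout (a `label` field) and the toy data are assumptions for illustration. The allocation is proportional with rounding, so the sample size can differ from the target by an item or two.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(rows, key, n, seed=42):
    """Sample about n rows, proportionally to each stratum's share of the data."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = min(len(members), max(1, round(n * len(members) / len(rows))))
        sample.extend(rng.sample(members, k))
    return sample


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy example: T = technocratic, N = not technocratic
human = ["T", "T", "T", "N", "N", "N", "T", "N", "T", "N"]
ai = ["T", "T", "N", "N", "N", "N", "T", "T", "T", "N"]
print(round(cohens_kappa(human, ai), 2))  # 0.6
```

In the toy example the two coders agree on 8 of 10 items (80%), but half of that agreement is expected by chance, which is why κ comes out at 0.6 rather than 0.8.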

Total time: ~3 hours from start to publication-ready dataset.

Read the full study →

Case Study 2: Medical Triage — Multi-Model Testing

Challenge: Test 100 medical scenarios across 5 different models, then have physicians validate.

The Multi-Model Setup (using OpenCode):

opencode

"Test medical triage across 5 models. For each of 100 scenarios, send to: claude-sonnet-4.6, gpt-5.5, gemini-3-pro, kimi-k2.6, glm-5.2. Log each response. Do NOT let any model see other models' responses."

Agreement Analysis:

"Calculate pairwise Cohen's κ between each model pair. Compare to ground truth triage level. Identify high-disagreement scenarios."

The Discovery: Analysis revealed the "safety-burden trade-off" — models that never under-triaged had the highest over-triage rates.
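The trade-off is easier to see once under-triage and over-triage are computed side by side. The sketch below assumes triage levels are integers where a higher number means more urgent, and that under-triage means the model assigned a lower level than the ground truth. Both are our assumptions for illustration, and the study's own coding and metrics may differ.

```python
def triage_error_rates(predicted, truth):
    """Share of scenarios triaged below (under) or above (over) the ground truth."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("need two equal-length, non-empty lists")
    n = len(truth)
    under = sum(p < t for p, t in zip(predicted, truth)) / n
    over = sum(p > t for p, t in zip(predicted, truth)) / n
    return {"under": under, "over": over}


# Toy data: five scenarios, urgency levels 1 (low) to 3 (high)
truth = [1, 2, 3, 2, 1]
cautious_model = [2, 3, 3, 3, 2]   # never under-triages, often over-triages
balanced_model = [1, 2, 2, 2, 2]   # one miss in each direction

print(triage_error_rates(cautious_model, truth))  # {'under': 0.0, 'over': 0.8}
print(triage_error_rates(balanced_model, truth))  # {'under': 0.2, 'over': 0.2}
```

A model that "plays it safe" on every case scores perfectly on under-triage while pushing the burden onto whoever handles the over-triaged cases, which is the pattern described above.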

Read the full study →

Case Study 3: Frame Analysis — The Full CommDAAF Protocol

Challenge: 719 social media posts coded by 3 independent models with complete documentation.

Three-Model Setup (separate sessions):

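# Pass each agent command to tmux so it starts inside the named session.
# (With "tmux new -s NAME && claude", claude would only start after tmux exits.)

# Terminal 1 — Claude
tmux new -s claude-coding 'claude'

# Terminal 2 — GLM
tmux new -s glm-coding 'opencode --model glm-5.2'

# Terminal 3 — Kimi
tmux new -s kimi-coding 'opencode --model kimi-k2.6'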

Identical prompt to all three:

"Code these social media posts for communication frames: (1) Economic, (2) Moral/Values, (3) Conflict, (4) Human Interest, (5) Attribution. Output frame number and one-sentence justification for each."

Key discipline: each model codes in its own session, never seeing other outputs.

Consensus Calculation:

Unanimous agreement: 605 posts (84.1%)
Majority agreement: 89 posts (12.4%)
No consensus: 25 posts (3.5%)
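The consensus tally is simple enough to check by hand. As a minimal sketch, assuming each post has one frame number from each of the three models, the categories can be assigned like this (the function names are ours, not part of CommDAAF):

```python
from collections import Counter


def consensus_level(labels):
    """Classify one post's model labels as unanimous, majority, or none."""
    top_count = Counter(labels).most_common(1)[0][1]
    if top_count == len(labels):
        return "unanimous"
    if top_count > len(labels) / 2:
        return "majority"
    return "none"


def summarize(all_labels):
    """Count posts at each consensus level."""
    counts = Counter(consensus_level(labels) for labels in all_labels)
    return {level: counts[level] for level in ("unanimous", "majority", "none")}


# Toy data: (Claude, GLM, Kimi) frame numbers for four posts
toy = [(1, 1, 1), (1, 1, 2), (1, 2, 3), (4, 4, 4)]
print(summarize(toy))  # {'unanimous': 2, 'majority': 1, 'none': 1}

# Sanity check on the figures reported above: counts should sum to 719
reported = {"unanimous": 605, "majority": 89, "none": 25}
total = sum(reported.values())
for level, count in reported.items():
    print(f"{level}: {count} ({100 * count / total:.1f}%)")
```

With three models, "no consensus" means all three chose different frames, so those cases deserve the closest human attention.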

HILAR — Human Review:

"Export all non-unanimous cases for human review. Include each model's classification and reasoning."

Researcher adjudicates each case, documenting reasoning in Q3.

Read the preprint →

The AgentAcademy Workflow Pattern

Phase	What You Do	Tools
1. Setup	Create project structure, import CommDAAF	Claude Code
2. Codebook	Develop and iterate coding scheme	Claude Code
3. Pilot	Test on sample, refine	Claude Code
4. Full Coding	Process full dataset	Claude Code / OpenCode
5. Multi-Model	Run through 2-3 independent models	OpenCode + OpenRouter
6. Consensus	Calculate agreement, flag disagreements	Claude Code
7. HILAR	Human review of disagreements	Manual + Claude Code
8. Documentation	Compile Q1/Q2/Q3 reports	Claude Code

Prompts That Work

For codebook development:

"Based on [theoretical framework], develop a coding scheme for [construct].
Include: clear definitions, inclusion/exclusion criteria, 3 examples per category."

For autonomous coding:

"Code all items in [file] using the codebook. For each: ID, classification,
key evidence, confidence. Save to [output file]."

For multi-model validation:

"Send the same task to multiple models independently. Do not reference other models."

For reliability:

"Calculate inter-coder reliability between [file1] and [file2].
Report: Cohen's κ, percentage agreement, confusion matrix."
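When the agent returns a reliability report, it helps to know what the percentage agreement and confusion matrix actually contain. The sketch below computes both from two lists of labels. The function name and toy labels are illustrative assumptions, and a real run would read the labels from your two coding files.

```python
from collections import Counter


def agreement_report(coder_a, coder_b):
    """Return (labels, confusion matrix, percent agreement) for two coders.

    Rows of the matrix are coder A's labels, columns are coder B's,
    both in the sorted order given by `labels`.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty label lists")
    labels = sorted(set(coder_a) | set(coder_b))
    pair_counts = Counter(zip(coder_a, coder_b))
    matrix = [[pair_counts[(row, col)] for col in labels] for row in labels]
    agreed = sum(pair_counts[(label, label)] for label in labels)
    return labels, matrix, 100 * agreed / len(coder_a)


human = ["econ", "moral", "econ", "conflict"]
model = ["econ", "moral", "conflict", "conflict"]
labels, matrix, pct = agreement_report(human, model)
print(labels)  # ['conflict', 'econ', 'moral']
print(matrix)  # [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
print(pct)     # 75.0
```

Off-diagonal cells show which categories get confused with each other, which is usually the most useful input for the next codebook revision.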

Key Takeaway

"Agent-assisted research follows a clear workflow: develop codebook iteratively, run autonomous coding, validate with multiple models, apply HILAR for disagreements, document everything. The prompts are conversational — you're managing a research team, not writing code."

Module 15: Responsible Agentic Research — What FireUp Bakes In

AI has made producing research cheap. Verifying it has not gotten any cheaper — and that gap is the whole problem. This module explains the design thinking behind the setup FireUp gives you, and the working habits that keep AI-assisted research trustworthy.

The mission in one line: responsible agentic research rests on three pillars — transparency (the team shows its work), reproducibility (decisions and results live in files, not chat), and accountability (a human approves every finding). FireUp wires all three into your agents' standing rules by default.

Verification Debt

The economist Paul Goldsmith-Pinkham names the core risk in "Integration and Collaboration in AI Research Work" (2026): agents can now generate analyses, tables, and prose far faster than any human can check them. The unchecked output piles up as verification debt — a backlog of results that look finished but nobody has actually confirmed.

His fix is to borrow the scaffolding software engineers already use to review one another's work, so that verifying an agent's output stays as cheap as producing it. FireUp builds that scaffolding into your agents from the first run.

The Provenance Trail

Instead of handing you a finished wall of code and numbers to audit at the end, your agents are instructed to leave a reviewable trail as they work. FireUp writes these habits into the agents' standing-rules files (CLAUDE.md for Claude Code, AGENTS.md for Codex) and a research-provenance skill:

Meaningful commits — save work at checkpoints with a git message that says why a change was made, not just what changed.
Bounded-diff review — you review small, self-contained changes as they happen, instead of auditing thousands of lines all at once.
Decisions and log files — DECISIONS.md (why we chose a method), LOG.md (what was done and when), and SAMPLE_FLOW.md (a sample-attrition table: how many cases you started with, what got dropped, and why). Reasoning lives in files, not lost in a chat scroll.
No number is ever typed by hand — every figure in the paper comes from a code-generated file (results.tex, tables/, figures/) pulled into the manuscript with \input or \includegraphics. If a number can't be traced back to the code that produced it, it doesn't go in the paper.
Trace-a-number — at any point you can ask an agent "where did this number come from?" and it walks you back from the paper to the exact line of code and slice of data.
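To illustrate the "no number is ever typed by hand" habit, here is a minimal sketch of how an analysis script might write its results as LaTeX macros that the manuscript pulls in with \input. The macro names and the helper functions are hypothetical, and the files FireUp's agents actually generate may be organized differently.

```python
from pathlib import Path


def format_result_macros(results):
    r"""Turn {"name": value} into lines like \newcommand{\name}{value}."""
    lines = []
    for name, value in sorted(results.items()):
        if not name.isalpha():
            # LaTeX command names defined this way may contain letters only
            raise ValueError(f"macro name must be letters only: {name!r}")
        lines.append(f"\\newcommand{{\\{name}}}{{{value}}}")
    return "\n".join(lines) + "\n"


def write_result_macros(results, path="results.tex"):
    """Write the macros to a file the manuscript can \\input."""
    Path(path).write_text(format_result_macros(results), encoding="utf-8")


# Values taken from the nonprofit framing case study earlier in this tutorial
print(format_result_macros({"kappaHuman": 0.78, "nStatements": 465}))
```

In the paper you would then write something like "agreement was $\kappa = \kappaHuman$" after `\input{results.tex}`, so rerunning the analysis updates the number everywhere it appears.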

Optional but recommended: GitHub as a shared inbox. If you turn it on in FireUp, your project also gets a private GitHub repo where issues become a feedback inbox (labels like methodology, code-fix, writing, needs-human), and Overleaf syncs with the repo so the numbers in your draft always match the code. The human stays the approver.

The Three Pillars in Practice

Pillar	What it means	How the setup delivers it
Transparency	The team shows its work; every number traces back to the code that produced it	Meaningful commits, no hand-typed numbers, trace-a-number
Reproducibility	Decisions, data steps, and results live in files, so the work re-runs to the same answer	DECISIONS.md, LOG.md, SAMPLE_FLOW.md, code-generated results
Accountability	A machine can't be held responsible for a finding, so the human keeps the final say	Human approves at checkpoints; agents flag "needs-human" calls

A Cited Knowledge Base for Your Field

An agent's training data is frozen at some past date, so left alone it reasons from whatever the field looked like then. FireUp closes that gap. Your research field plus the methods you selected drive a set of deep-research prompts that the agents run after go-live to build a cited knowledge base in .openclaw/knowledge/ — current norms, key references, and data-collection conventions for your area, with sources attached.

Your agents consult this knowledge base before doing methodological work, so they reason from today's field standards rather than stale training data. It refreshes roughly every six months: your manager notices when the refresh date has passed and nudges you in conversation — never on an automatic timer.

Keeping the Server Healthy

Your always-on manager lives on a small rented server, so FireUp sets up two layers of maintenance:

Automatic security updates — unattended-upgrades applies security patches on its own, no action needed.
A monthly full update, only with your approval — about once a month the manager nudges you in chat. Only when you say yes does it run one fixed, root-owned update script. If that update needs a reboot, the manager asks you to do it — the agent never reboots on its own.

This is deliberately conservative: the service account has no general admin rights, only permission to run that one specific update command after you approve it.

A note on security: No secret ever passes through AgentAcademy's servers. FireUp's setup script prompts for your API keys and channel tokens on your own machine and stores them in a locked-down file readable only by you. The manager runs as a dedicated non-root user, listens only on the server's own loopback interface, and — as above — holds exactly one narrowly-scoped admin command.

Key Takeaway

"Producing research is cheap now; verifying it isn't. Responsible agentic research keeps verification cheap — a reviewable trail, every number traced to code, and a human who approves each finding."

Technical Appendix

Terminal Commands Reference

Command	Purpose
`pwd`	Print current location
`ls -la`	List files (detailed)
`cd <dir>`	Change directory
`mkdir <name>`	Create directory
`cat <file>`	View file contents
`grep <text> <file>`	Search in file

Troubleshooting

Problem	Solution
"command not found: claude"	Restart terminal, or run install script again
"API key invalid"	Check key at provider dashboard, regenerate if needed
"rate limit exceeded"	Add delays between calls, upgrade API tier
"context length exceeded"	Summarize input, use chunking, or use longer-context model

Completion Checklist

Can open and navigate the terminal
Can set environment variables and source the file that defines them
Has Claude Code and Codex installed and configured
Can explain the difference between Claude Code, Codex, OpenClaw, and Hermes
Understands the hierarchy: human → always-on manager → Claude Code + Codex
Understands token costs and cost routing via OpenRouter
Can describe System 1 vs System 2 in the Intuitionist Workflow
Can articulate the CommDAAF Q1-Q3 framework
Understands the provenance trail (meaningful commits, DECISIONS/LOG/SAMPLE_FLOW, no hand-typed numbers)
Has completed at least one practice research task
