n8n OpenAI node:
what it does, how it works, patterns I actually use.
The first time I used the n8n OpenAI node, I spent 20 minutes wondering why the output was unparseable before I realized I'd never told the model to return JSON. The node does exactly what the prompt says. If the prompt is vague, the output is vague. Every tutorial skips this. Here's the part they leave out.
The OpenAI node is in more of my production workflows than any other AI node. Not because it's the flashiest option, but because it's the most predictable when you use it correctly. Text classification, data extraction, email routing, summarization: all running 24/7 with no manual intervention. What's below is what actually works for me, not what the docs say should work.
What the OpenAI node actually does (beyond just passing prompts)
The n8n OpenAI node wraps the OpenAI API with a point-and-click interface. That sounds obvious, but it's worth spelling out because the node has multiple distinct operations, and picking the wrong one for a task is the first place people get stuck.
Here's what you get:
- Message a Model: The one you'll use 90% of the time. Sends a prompt (system message + user message) to a chat model and returns the response. This is your gpt-4o, gpt-4o-mini, o3-mini path. Supports JSON output mode, temperature, max tokens, and most of the parameters you'd pass via the API directly.
- Generate an Image: Calls DALL-E 3. Takes a text prompt, returns an image URL. Useful for thumbnail generation, social graphics, mockups. The image URL expires, so save it via an HTTP Request node immediately if you need to keep it.
- Analyze an Image: Vision mode. Send an image URL or binary alongside a text prompt and the model describes, evaluates, or extracts data from the image. Good for automating OCR-style extractions on documents that arrive as images.
- Transcribe a Recording: Whisper. Send an audio file, get back a text transcript. Works well for meeting notes, voice messages, call center routing workflows.
- Translate a Recording: Also Whisper, but outputs English in one step instead of transcribing in the original language first.
- Text to Speech: The TTS endpoint. Takes text, returns audio. Not something I use in most automation workflows, but useful for voice response systems or content pipelines.
For building AI-powered automation (classification, extraction, summarization, routing), you're almost always in Message a Model. The rest are specialized. Don't confuse yourself by browsing the full operation list and wondering which one handles "AI logic." That's the chat operation. Everything else is a specific media conversion.
Setting it up: credentials, models, and the limits you will hit
Credentials
In n8n, open the OpenAI node and click "Create New Credential." You need an API key from platform.openai.com. Go to API Keys, create one, copy it immediately (OpenAI only shows it once), and paste it into the n8n credential field. That's it.
One thing that catches people: the API key and the ChatGPT account are not the same thing. Paying for ChatGPT Plus does not give you API credits. API usage is billed separately through platform.openai.com. If the credential works but the node errors with "insufficient quota," you need to add a payment method and load credits on the API side.
Model selection
The model dropdown populates from your OpenAI account. The ones I use:
- gpt-4o-mini: Fast, cheap, surprisingly capable. For classification and short extraction tasks where cost matters, this is the default. At the time of writing, it runs about $0.15 per million input tokens.
- gpt-4o: Slower and more expensive, but meaningfully better on complex reasoning, nuanced classification, and long-document tasks. Worth the cost when accuracy matters more than throughput.
- o3-mini: OpenAI's reasoning model at a lower price point. Worth testing for tasks that involve multi-step logic or structured analysis. Not always faster than gpt-4o in practice, because it "thinks" before responding.
For anything running at volume, start with gpt-4o-mini. Bump to gpt-4o only when you see it getting the classification wrong in ways that matter.
The system message field
This is the field most people ignore or leave blank. Don't. The system message is where you define the model's role and constrain its output format. A well-written system prompt is the difference between reliable, parseable output and a response that looks different every single run.
Set the system message to something specific and operational, not motivational. "You are a helpful assistant" is useless. "You are an email classifier. Respond only with one of these exact labels: URGENT, REPLY_TODAY, FYI, JUNK. No explanation, no punctuation, just the label." That's a system prompt that will actually work in automation.
The pattern I use most: AI text classification
Text classification is probably the highest-ROI use of the n8n OpenAI node for most automation workflows. You feed in unstructured text (an email, a support ticket, a form submission) and the model categorizes it. Then you route on the category using an IF node or Switch node.
The email triage use case is the cleanest example because the value is immediate and obvious. Instead of reading every incoming email manually, the workflow reads it and routes it to the right bucket automatically.
Here's the prompt setup I use for email classification:
// System Message field:
You are an email triage assistant. Read the email subject and body and
classify it into exactly one of these categories:
URGENT - needs a response within 2 hours (client emergency, payment issue,
account access, legal matter)
REPLY_TODAY - needs a response today but not immediately (general questions,
follow-ups, meeting requests)
FYI - informational, no reply needed (newsletters, receipts, notifications,
automated reports)
JUNK - spam, marketing, unsolicited sales, cold outreach
Respond with only the category label. No explanation. No punctuation.
No extra text. Just one of: URGENT, REPLY_TODAY, FYI, JUNK
// User Message field (expression):
Subject: {{ $json.subject }}
Body:
{{ $json.body }}
A few things to notice here. The system message defines the categories with concrete examples: "client emergency, payment issue" is better than just "urgent." That specificity reduces edge-case misclassification. The user message uses n8n expressions to inject the actual email data from the previous node. And the system message explicitly prohibits explanation. That last part is critical. Without it, the model will often prefix the label with "The email should be classified as:" and break any downstream string matching.
Routing on the classification output
After the OpenAI node runs, the response comes back in $json.message.content (or $json.text, depending on the node version). That's a plain string: "URGENT", "FYI", and so on.
Pipe that into a Switch node. Set the value field to {{ $json.message.content.trim() }}; the .trim() strips any rogue whitespace that occasionally appears. Then create a route for each label. From there, each branch does its thing: URGENT sends a Telegram alert and flags the email, REPLY_TODAY adds it to the task queue, FYI just archives, JUNK deletes. See the IF node guide for the routing mechanics if you haven't wired those branches before.
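If you'd rather validate in code before the Switch node, a small Code node can normalize the label and catch anything unexpected. This is a sketch, not part of the node itself; the fallback choice (FYI, because archiving a misrouted email is safer than deleting it) is my own assumption:

```javascript
// Allowed labels, matching the system prompt exactly.
const LABELS = ["URGENT", "REPLY_TODAY", "FYI", "JUNK"];

// Normalize whatever the model returned: trim whitespace, uppercase,
// and strip trailing punctuation the prompt told it not to add.
function normalizeLabel(raw) {
  const label = String(raw ?? "").trim().toUpperCase().replace(/[.!]+$/, "");
  // Fall back to FYI (archive, don't delete) on anything unexpected.
  return LABELS.includes(label) ? label : "FYI";
}

// In an n8n Code node you'd read $input.first().json.message.content
// and return [{ json: { label: normalizeLabel(content) } }].
```

The fallback means a hallucinated label degrades gracefully instead of falling through every Switch route silently.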
The full end-to-end version of this (Gmail trigger, OpenAI classification, Notion task creation, Telegram alerts) is the AI Email Triage Bot. I'll point you to it at the end of the post.
The pattern for AI-assisted content: structured data extraction
The second pattern I use constantly is extracting structured fields from messy, unstructured input. Someone submits a support request. A scraped page returns a wall of text. A customer sends an email that has a product name, order number, and complaint all mixed together in prose. The OpenAI node can pull those fields out as structured JSON.
The key is combining the system prompt with JSON output mode. Here's what that configuration looks like:
// System Message:
You are a data extraction assistant. Extract the following fields from the
customer message and return them as a JSON object with these exact keys:
{
  "order_id": "string or null if not found",
  "product_name": "string or null if not found",
  "issue_type": "one of: wrong_item, damaged, not_received, billing, other",
  "sentiment": "one of: angry, frustrated, neutral, positive",
  "summary": "1-2 sentence summary of the issue"
}
Return only the JSON object. No markdown formatting. No explanation.
// Response Format setting in the node:
JSON Object (enables OpenAI's JSON mode; the model is forced to return valid JSON)
// User Message:
{{ $json.email_body }}
JSON output mode (the "Response Format: JSON Object" setting in the node) tells the OpenAI API to enforce valid JSON output at the model level. Without it, the model will sometimes wrap the JSON in a markdown code block (```json ... ```) or add a sentence before the JSON, both of which break downstream parsing. With it enabled, the response is clean parseable JSON every time.
Parsing the response back into workflow data
When JSON output mode is on, n8n may already parse the response and put the fields directly in $json.message.content as an object. Test it first. If it comes back as a string, you need to parse it. Here's the expression for a Set node that handles both cases:
// In a Set node, use these as separate field assignments:
// order_id field:
{{ (typeof $json.message.content === "string"
? JSON.parse($json.message.content)
: $json.message.content).order_id }}
// issue_type field:
{{ (typeof $json.message.content === "string"
? JSON.parse($json.message.content)
: $json.message.content).issue_type }}
// Or parse once in a Code node and output the full object:
const raw = $input.first().json.message.content;
const parsed = typeof raw === "string" ? JSON.parse(raw) : raw;
return [{ json: parsed }];
The Code node approach at the bottom is cleaner when you're extracting more than two or three fields. Parse once, return the full object, then reference $json.order_id, $json.issue_type, etc. in downstream nodes without re-parsing every time.
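If you ever run without JSON mode (an older node version, or a model that doesn't support it), the content can come back wrapped in a markdown code fence, as mentioned above. A defensive variant of the same parser handles that case too; the fence-stripping regex is my own sketch, not something the node provides:

```javascript
// Parse a model response that may already be an object, a bare JSON
// string, or a JSON string wrapped in a markdown code fence.
// The fence delimiter is built at runtime to avoid literal backtick runs.
const FENCE = "`".repeat(3);
const FENCED_RE = new RegExp(
  "^" + FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE + "$"
);

function parseModelJson(content) {
  if (typeof content === "object" && content !== null) return content;
  let text = String(content).trim();
  const m = text.match(FENCED_RE);
  if (m) text = m[1]; // strip the fence, keep the JSON inside
  return JSON.parse(text);
}
```

Drop this into the Code node in place of the plain typeof check and the workflow survives both response shapes.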
Common mistakes and things that will waste your afternoon
Not constraining the output format
The single most common failure mode. You write a prompt, the model responds with a useful answer, but the answer is formatted differently each run. Sometimes it says "URGENT." Sometimes "The classification is: URGENT." Sometimes it writes three sentences explaining why. String matching fails, routing breaks, the workflow falls apart.
Fix: be explicit in the system prompt about exact format. "Respond with only X" is not enough on its own; include an example of the exact format you want. "Respond with exactly one of these labels, nothing else: URGENT, REPLY_TODAY, FYI, JUNK." Specificity matters.
Token limits killing long documents
gpt-4o has a 128k context window. gpt-4o-mini has the same. In practice, this is rarely the problem. The issue is more often cost: sending a full 50,000-word document through the node for a classification that only needs the first 500 words. Trim your input. Use a Code node before the OpenAI node to truncate long text: text.substring(0, 3000) is usually plenty for classification. For extraction, send only the relevant section, not the whole document.
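A slightly nicer version of that substring trim cuts at a word boundary and flags the truncation so the model knows the text is incomplete. A sketch; the 3000-char default and the "[truncated]" marker are my own choices:

```javascript
// Truncate long input before it reaches the OpenAI node.
// 3000 chars is roughly 750 tokens, plenty for classification.
function truncateForPrompt(text, maxChars = 3000) {
  if (text.length <= maxChars) return text;
  const cut = text.substring(0, maxChars);
  // Back up to the last space so we don't cut a word in half.
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.substring(0, lastSpace) : cut) + " [truncated]";
}
```
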
Forgetting to handle null/empty fields in your prompt
If an email has no subject line and your prompt injects {{ $json.subject }}, the model sees "Subject: undefined" or "Subject: " and may interpret that as meaningful data and classify incorrectly. Defensive expression: {{ $json.subject || "(no subject)" }}. Same for body. The model handles empty states more gracefully when you explicitly label them rather than leaving a raw undefined in the prompt.
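The same defensive pattern works in a Code node if you'd rather assemble the user message in one place instead of scattering fallbacks across expression fields. A sketch; the field names mirror the triage example above:

```javascript
// Build the user message with explicit placeholders for missing fields,
// mirroring the {{ $json.subject || "(no subject)" }} expression pattern.
function buildUserMessage(email) {
  const subject = email.subject || "(no subject)";
  const body = email.body || "(empty body)";
  return `Subject: ${subject}\nBody:\n${body}`;
}
```
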
JSON mode without a JSON schema in the prompt
JSON output mode forces valid JSON. It doesn't force your JSON schema. Turn on JSON mode without describing the exact structure in the system prompt, and the model picks its own keys, which vary between runs. Always include the exact schema in the prompt: field names, types, and allowed enum values. The extraction example above shows the right pattern.
Retries and error handling
The OpenAI API returns errors. Rate limit errors (429), timeout errors, occasional 500s. The built-in n8n OpenAI node doesn't retry automatically. If a failure in the middle of a long workflow means losing data, add explicit error handling: either a retry loop using a Wait node and a counter, or at minimum an error output branch that logs the failure somewhere you'll see it. For high-volume workflows, this is not optional.
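The retry logic itself is simple; here's the shape of it as a single helper. In a real workflow you'd spread this across an HTTP Request node, a Wait node, and a counter, but the logic is the same. The function and parameter names are my own, and the delays are illustrative:

```javascript
// Generic retry-with-exponential-backoff wrapper for a flaky API call.
// callFn is any async function that may throw (e.g. on a 429 or 500).
async function withRetry(callFn, { maxAttempts = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await callFn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Exponential backoff: 1s, 2s, 4s... (the Wait node's job in n8n)
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError; // surface the failure to n8n's error output branch
}
```

Rethrowing the last error matters: a swallowed failure is the kind that silently drops an email from the queue.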
Underestimating token costs at scale
A workflow that processes 500 emails a day at ~1000 tokens per email hits 500k tokens daily. At gpt-4o's pricing, that adds up fast. Profile your actual token usage in the OpenAI dashboard before running at scale. gpt-4o-mini handles most classification tasks at 1/10th the cost. Test it first.
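The arithmetic is worth doing explicitly before you commit a workflow to a model. A back-of-envelope sketch; the per-million-token prices are as of this writing and will drift, so check the OpenAI pricing page before trusting the numbers:

```javascript
// Estimate daily input-token cost for a classification workflow.
// pricePerMillion is USD per million input tokens (output tokens excluded).
function dailyInputCostUSD(emailsPerDay, tokensPerEmail, pricePerMillion) {
  return (emailsPerDay * tokensPerEmail / 1_000_000) * pricePerMillion;
}

// 500 emails/day at ~1000 tokens each = 500k input tokens daily.
const miniCost = dailyInputCostUSD(500, 1000, 0.15); // gpt-4o-mini, assumed price
const gpt4oCost = dailyInputCostUSD(500, 1000, 2.5); // gpt-4o, assumed price
```

At those assumed rates the mini run costs pennies a day while the gpt-4o run is over a dollar; at 10x the volume the gap becomes a real line item.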
When to use the built-in node vs. the HTTP Request node
The built-in OpenAI node is the right call for the common operations: chat completions, image generation, Whisper transcription. It handles credentials, request formatting, and response parsing automatically. For 95% of OpenAI use cases in n8n, use the node.
The HTTP Request node makes sense when:
- You need a parameter the node UI doesn't expose. OpenAI's API has options the n8n node doesn't surface in its fields: some sampling parameters, certain beta features, structured outputs with explicit JSON schemas using the response_format object syntax. The HTTP Request node gives you full control over the request body.
- You're hitting a non-standard endpoint. OpenAI's Batch API, the Assistants API, fine-tuning endpoints, or the Realtime API aren't wrapped in the standard node. HTTP Request is the path there.
- You're using an OpenAI-compatible API. Ollama, Together AI, Groq, Mistral, and others implement the OpenAI API spec. The n8n OpenAI node only points at OpenAI. For compatible alternatives, you use an HTTP Request node with the base URL swapped out, or n8n's dedicated nodes for some of those providers.
- You need direct control over the exact JSON schema in structured output mode. OpenAI's structured outputs feature (different from plain JSON mode) requires passing a specific response_format object with a full JSON schema definition. The built-in node doesn't support this yet; an HTTP Request node with a custom body is the current workaround.
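For that last case, here's roughly what the request body looks like, reusing the extraction schema from earlier. The json_schema shape follows OpenAI's structured outputs format as I understand it; verify the exact field names against the current API reference before relying on it:

```javascript
// Request body for POST https://api.openai.com/v1/chat/completions
// using structured outputs (strict JSON schema enforcement).
const body = {
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "Extract the order fields from the customer message." },
    { role: "user", content: "Order #A123 arrived damaged." },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "order_extraction",
      strict: true, // reject any output that doesn't match the schema
      schema: {
        type: "object",
        properties: {
          order_id: { type: ["string", "null"] },
          issue_type: {
            type: "string",
            enum: ["wrong_item", "damaged", "not_received", "billing", "other"],
          },
        },
        required: ["order_id", "issue_type"],
        additionalProperties: false,
      },
    },
  },
};
```

In the HTTP Request node this goes in the JSON body field, with your OpenAI credential supplying the Authorization header.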
For standard chat completions with system + user messages (which covers almost all the automation use cases above), the built-in node is cleaner, faster to set up, and handles auth without any extra configuration. No reason to reach for HTTP Request unless you've hit a specific limitation.
The email triage bot uses exactly this pattern
This is the exact AI classification pattern I use in the email triage workflow inside the Google Workspace MCP. Gmail trigger → OpenAI classification → Switch routing → Notion task + Telegram alert. The whole thing, annotated and ready to import.
See Google Workspace MCP → One-time $97 · Instant download · 30-day money-back guarantee
Further reading
The OpenAI node is not complicated once you stop treating it as a black box. The prompt is code. The output is data. Parse it like data, route it like data, handle its failures like data. Everything else follows from that. The posts below cover the pieces that come before and after it in a real workflow:
- n8n Expressions Guide: how $json, template literals, and optional chaining work, so you can build prompts that inject real data correctly
- n8n IF Node Guide: routing on classification output, branching logic, Switch node patterns
- n8n Set Node Guide: how to extract and store fields from the AI response into clean workflow data
- Building an AI Email Triage Bot in n8n: the full walkthrough of the classification workflow from trigger to notification