Prototype · Technical write-up

Screener Copilot

A Chrome extension that turns plain English into a live stock screen – built in about 40 minutes.

This prototype was created before my interview at Finviz.com. (The position wasn’t a match in the end – but that’s another story.) While I was researching their competition, I noticed that TradingView.com had just released a beta AI chart copilot.

So I thought: Finviz has this famous Screener that lets you apply a ton of filters and parameters – and, coincidentally, they all go into the URL. Prototyping an AI copilot for it shouldn’t be hard.

So with about 40 minutes until the interview, I kicked it off. A few minutes before it started, I had this prototype:

The prototype, a few minutes before the interview.

Afterwards I figured I’d share it as an example of the kind of thing I prototype. Prototypes like this open a very different kind of conversation.

The first version called the Anthropic API directly from the extension – fine for a 40-minute prototype, but not something you’d ship in a published widget, where the key would be sitting right there in the browser. So I went looking for alternatives, and ended up trying three different ways to run the model.

The first surprise: Chrome now ships a built-in model, Gemini Nano, so the extension can do everything on-device, without calling any external service at all. I wanted to try it, and it works. It’s “nano”, though – small context window, modest capabilities – so it’s far from perfect. But (1) it’s perfectly fine for a prototype, and (2) on-device models are improving quickly, so the gap should keep narrowing. Either way, it was a fun experiment with in-browser models, one I’ll likely reuse in other prototypes and products.

I also wanted to try a free or cheap hosted model. From a Chrome extension you need a proxy for that, so that no credentials ship inside the extension – and that proxy can be as little as a simple Cloudflare Worker calling Cloudflare’s own free-tier AI models, like Llama. The quality is genuinely good: noticeably better than Gemini Nano, and close to Claude Haiku.

You can download the extension and try it yourself: open it on finviz.com and start chatting. In the settings you can toggle between all three models – on-device Nano, hosted Llama, and Claude – and feel the difference.

Install it

Unpacked Chrome extension · v1.0.0 · ~24 KB

Download .zip
  1. Download and unzip the file.
  2. Open chrome://extensions and turn on Developer mode (top-right).
  3. Click Load unpacked and pick the unzipped screener-copilot folder.
  4. Pin the extension and click its icon to open the side panel.
  5. Open finviz.com/screener and start typing.

Defaults to Chrome’s on-device model (Gemini Nano) – no key, nothing leaves your machine. Other models are optional, in Settings. Independent technology demo – not affiliated with or endorsed by Finviz.

What follows is a technical write-up of how it all works under the hood, if you’re interested. And honestly – this is the point where I mostly stop writing and hand over to the AI, just steering and polishing.

Architecture at a glance

Chrome side panel (the extension)
 │
 ├─ callModel(text)
 │    ├─ Nano          → runs on-device, inside Chrome (free, no key)
 │    └─ callWorker()  → Cloudflare Worker ─┬─ Workers AI · Llama  (free)
 │                                          └─ Anthropic · Claude  (key on Cloudflare)
 │
 └─ buildUrl() → navigate the Finviz tab → scrape tickers → feed back into context

Three model backends sit behind one interface, and the model never touches the page directly – it only emits a small JSON object. Everything else is plain DOM and URL work.

The one trick that makes it work

The screener keeps its entire state in the URL – filters in f=, an optional preset signal in s=, sort order in o=, the column view in v=:

/screener?v=111&f=cap_small,fa_pe_profitable&s=ta_newhigh&o=-marketcap

So “the AI toggles the dropdowns” is, underneath, just building that string and navigating to it – no fragile clicking, no automating someone else’s UI:

function buildUrl({ filters = [], signal, view = "111", order }) {
  // Build the query by hand: values are sanitized codes, and Finviz wants
  // literal commas in f= — URLSearchParams would percent-encode them.
  const parts = [`v=${view}`];
  if (filters.length) parts.push(`f=${filters.join(",")}`);
  if (signal)         parts.push(`s=${signal}`);
  if (order)          parts.push(`o=${order}`);   // e.g. "-marketcap" = biggest first
  return `https://finviz.com/screener?${parts.join("&")}`;
}
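
Because the state lives entirely in the URL, the same mapping can be read in the other direction, which is handy when a follow-up message refines the screen already on the page. A minimal sketch, assuming the same four parameters; parseScreenerUrl is a hypothetical helper, not part of the extension as described.

```javascript
// Hypothetical inverse of buildUrl: read the screen state back out of a URL.
function parseScreenerUrl(url) {
  const params = new URL(url).searchParams; // also decodes %2C back to ","
  const f = params.get("f");
  return {
    view: params.get("v") || "111",
    filters: f ? f.split(",").filter(Boolean) : [],
    signal: params.get("s") || undefined,
    order: params.get("o") || undefined,
  };
}
```

Passing the URL from the section above through it yields the two filters, the ta_newhigh signal and the -marketcap order as separate fields.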

One contract for every model

Every backend returns the same JSON object – an action discriminator plus the screen, or a plain-text reply. No tool-use, no provider-specific parsing:

// apply / refine a screen
{ "action": "screen",
  "filters": ["cap_small", "fa_pe_profitable", "sh_price_u10"],
  "signal": "ta_newhigh", "order": "-marketcap", "view": "111",
  "explanation": "Profitable small caps under $10 at new highs, biggest first." }

// answer a question instead
{ "action": "reply",
  "text": "Those 12 are mostly regional banks — here's why each fit…" }

The shape is enforced differently per backend – Chrome’s Prompt API takes a responseConstraint JSON schema, the hosted models use JSON mode – but the extension parses one format, so adding or swapping a model is a contained change.
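
For the on-device path, the constraint could look roughly like the sketch below. The field names come from the two example objects above; ACTION_SCHEMA and promptForAction are hypothetical names, and the session is passed in as a parameter (in Chrome it would be a Prompt API session) so the function can be exercised with a stub.

```javascript
// JSON Schema for the shared contract. Only "action" is required here;
// the tolerant parser is expected to cope with the rest.
const ACTION_SCHEMA = {
  type: "object",
  properties: {
    action: { type: "string", enum: ["screen", "reply"] },
    filters: { type: "array", items: { type: "string" } },
    signal: { type: "string" },
    order: { type: "string" },
    view: { type: "string" },
    explanation: { type: "string" },
    text: { type: "string" },
  },
  required: ["action"],
};

// `session` is assumed to expose prompt(text, options) and resolve to a string.
async function promptForAction(session, text) {
  const raw = await session.prompt(text, { responseConstraint: ACTION_SCHEMA });
  return JSON.parse(raw);
}
```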

The request loop

  1. Ask the model – with a curated cheat-sheet of filter codes and a forced JSON shape, so it returns structured data, never prose to scrape.
  2. Decide – the model picks screen or reply; the code doesn’t guess intent.
  3. Build the URL from the returned codes.
  4. Drive the tab – navigate the live screener tab.
  5. Read it back – scrape the matched tickers from the rendered table and feed them into the model’s context, so the next turn is grounded in the actual results.
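
The five steps can be sketched as a single turn function. This is an illustration under assumptions rather than the extension’s code: runTurn, navigate, scrapeTickers and the shape of state are hypothetical, and every dependency is injected so the flow can be followed (and tested) without a browser.

```javascript
// One turn of the loop. All dependencies are passed in via `deps`.
async function runTurn(userText, state, deps) {
  const { callModel, buildUrl, navigate, scrapeTickers } = deps;

  // Steps 1 and 2: ask the model; it decides between "screen" and "reply".
  const action = await callModel(userText, state.context);
  if (action.action === "reply") {
    return { ...state, lastReply: action.text };
  }

  // Steps 3 and 4: build the URL from the returned codes and drive the tab.
  const url = buildUrl(action);
  await navigate(url);

  // Step 5: read the results back so the next turn is grounded in them.
  const tickers = await scrapeTickers();
  return {
    ...state,
    lastReply: action.explanation,
    context: [...state.context, { url, tickers }],
  };
}
```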

Three model backends

The only part that costs anything is the model call, so the extension defaults to the cheapest capable option:

  • Gemini Nano (on-device) – Chrome ships a small model in the browser. On a capable desktop it runs locally, free, nothing leaves the machine. Feature-detected and used by default.
  • Llama (Workers AI) – the keyless fallback. Runs on a Cloudflare Worker via the AI binding; free, no external account.
  • Claude Haiku (Anthropic) – the most reliable option, for when the small models get sloppy. Same Worker, different branch.

async function callModel(text) {
  if (model === "gemini-nano") {
    if (await nanoUsable()) return callNano(text);   // on-device when available
    // …else fall through to the free hosted backend
  }
  const backend = model === "worker-anthropic" ? "anthropic" : "workersai";
  return callWorker(text, backend);
}
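
The feature check behind nanoUsable() is not shown here. A minimal sketch of what it might look like, assuming Chrome’s Prompt API exposes a global LanguageModel whose availability() resolves to a status string; the scope parameter is an addition for testability, not something from the extension.

```javascript
// Returns true only when the on-device model is ready to use right now.
// `scope` defaults to the global object so a stub can be passed in tests.
async function nanoUsable(scope = globalThis) {
  const api = scope.LanguageModel;
  if (!api || typeof api.availability !== "function") return false;
  try {
    return (await api.availability()) === "available";
  } catch {
    return false; // any failure means "not usable", so the caller falls back
  }
}
```

Treating every state other than "available" as unusable keeps the first message fast: a model that still needs downloading simply routes the request to the hosted backend.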

Both hosted models go through a small Cloudflare Worker. That matters for one reason: no API key ever lives in the browser (or in the published source). The key is a Worker secret, encrypted on Cloudflare, injected only at runtime; the Worker also forces a cheap model and rate-limits per IP, so a public endpoint can’t be coerced into unbounded spend.

// inside the Worker — the key only exists here, never client-side
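if (backend === "anthropic") {
  const res = await fetch(ANTHROPIC_URL, {
    method: "POST",
    headers: { "content-type": "application/json",
               "x-api-key": env.ANTHROPIC_API_KEY,
               "anthropic-version": "2023-06-01" },
    // max_tokens is required by the Messages API; 1024 is an arbitrary cap
    body: JSON.stringify({ model: "claude-haiku-4-5", max_tokens: 1024, system, messages }),
  });
  if (!res.ok) throw new Error(`Anthropic API error ${res.status}`);
  const { content } = await res.json();          // Claude → content blocks
  // keep only text blocks, so other block types don't join in as "undefined"
  return reply(content.filter((b) => b.type === "text").map((b) => b.text).join(""));
}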
// Workers AI — runs on Cloudflare, JSON mode
const { response } = await env.AI.run(LLAMA, {
  messages, response_format: { type: "json_object" },
});
return reply(typeof response === "string" ? response : JSON.stringify(response));
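
The per-IP rate limit mentioned above is not shown. One simple way to do it is a fixed-window counter, sketched below under assumptions: createRateLimiter and its default limits are hypothetical, and in-memory state only lives as long as a single Worker isolate, so this is best-effort rather than a global limit.

```javascript
// Fixed-window limiter keyed by client IP.
function createRateLimiter({ limit = 20, windowMs = 60_000 } = {}) {
  const hits = new Map(); // ip -> { windowStart, count }
  return function allow(ip, now = Date.now()) {
    const entry = hits.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(ip, { windowStart: now, count: 1 }); // start a fresh window
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Inside the Worker's fetch handler, Cloudflare provides the client IP
// in the CF-Connecting-IP request header:
//   const ip = request.headers.get("CF-Connecting-IP") || "unknown";
//   if (!allow(ip)) return new Response("Too many requests", { status: 429 });
```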

Making unreliable models safe

Small models are sloppy in specific, repeatable ways, and an invalid filter code doesn’t error on Finviz – it’s silently dropped, so a screen looks like it worked when it didn’t. Two thin, model-agnostic layers catch the common failures.

First, models copy the human glosses straight out of the cheat-sheet – emitting fa_div_high(>5%) instead of fa_div_high. So every code is reduced to its bare token before use:

// "fa_div_high(>5%)" → "fa_div_high"
const cleanCode = (s) =>
  (String(s).toLowerCase().match(/[a-z][a-z0-9_.]*/) || [""])[0];

Second, the parser is deliberately tolerant: small models drop the action field, nest the filters array, or wrap each code as an object. Rather than reject that, it infers and repairs – so malformed JSON never leaks into the chat as raw text:

// infer "screen" even if the model forgot to say so
const isScreen = a.action === "screen"
  || (a.action !== "reply" && (Array.isArray(a.filters) || a.explanation));

// flatten nesting, and unwrap {code:"sec_energy"} → "sec_energy"
const filters = (a.filters || []).flat(Infinity)
  .map((x) => typeof x === "string" ? x : x?.code || "")
  .filter(Boolean);

What it deliberately does not do is guess intent – it won’t flip a sort direction or invent a code the model didn’t produce. Structural repair, yes; semantic guessing, no. When a small model gets the meaning wrong, the answer is a better model (Claude), not a heuristic.
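
Put together, the two layers amount to one small normalization step. The sketch below combines the snippets above into a hypothetical normalizeAction; it repairs structure only and leaves signal, order and view exactly as the model produced them.

```javascript
// Reduce a code to its bare token: "fa_div_high(>5%)" -> "fa_div_high".
const cleanCode = (s) =>
  (String(s).toLowerCase().match(/[a-z][a-z0-9_.]*/) || [""])[0];

// Structural repair only: never changes what the model asked for.
function normalizeAction(a) {
  const isScreen = a.action === "screen"
    || (a.action !== "reply" && (Array.isArray(a.filters) || Boolean(a.explanation)));
  if (!isScreen) return { action: "reply", text: String(a.text || "") };

  // Wrapping in an array first also tolerates a bare string instead of a list.
  const filters = [a.filters || []].flat(Infinity)
    .map((x) => cleanCode(typeof x === "string" ? x : x?.code || ""))
    .filter(Boolean);
  return { ...a, action: "screen", filters };
}
```

For example, an object with no action field and filters of [["cap_small"], { code: "sec_energy" }, "fa_div_high(>5%)"] comes out as a screen action with the three bare codes.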

Where it honestly stops

The filter and sort codes are a curated, verified subset, not the whole of Finviz. Covering everything is really a data problem – you’d generate the full catalog from Finviz’s own filter data, then either prompt a large model with all of it or retrieve the relevant slices for a small one, and validate every emitted code against the catalog. Right-sized for a demo; deliberately not built here.
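
The validation step described above is small once a catalog exists. A sketch, assuming the catalog is just a set of known codes; CATALOG here is a tiny illustrative stand-in built from codes that appear in this write-up, and validateCodes is a hypothetical name.

```javascript
// Illustrative stand-in; a real catalog would be generated from Finviz's filter data.
const CATALOG = new Set([
  "cap_small", "fa_pe_profitable", "sh_price_u10", "fa_div_high", "sec_energy",
]);

// Split emitted codes into known and unknown, so unknown ones can be
// reported to the user instead of being silently dropped by the screener.
function validateCodes(codes, catalog = CATALOG) {
  const valid = [];
  const rejected = [];
  for (const code of codes) {
    (catalog.has(code) ? valid : rejected).push(code);
  }
  return { valid, rejected };
}
```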

Two harder edges: Finviz’s free filters are fixed buckets (“under $10”, not “under $12.50”), so continuous intent snaps to the nearest available threshold; and the whole approach hinges on screen state living in the URL. It ports cleanly to other URL-driven screeners and not at all to single-page apps that keep filter state in memory – there you’d be reverse-engineering a private API instead. The idea travels; this implementation mostly doesn’t.
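
Snapping to a bucket is easy to make explicit. A minimal sketch, with loud assumptions: the threshold list is illustrative rather than the verified set, only sh_price_u10 appears in this write-up, and the sh_price_u prefix for the other codes is extrapolated from it. snapPriceUnder is a hypothetical helper.

```javascript
// Illustrative "price under $X" thresholds, not the verified catalog.
const PRICE_UNDER = [5, 10, 15, 20];

// Pick the closest available threshold; on a tie the lower one wins.
function snapPriceUnder(target, thresholds = PRICE_UNDER) {
  let best = thresholds[0];
  for (const t of thresholds) {
    if (Math.abs(t - target) < Math.abs(best - target)) best = t;
  }
  // `exact` lets the UI say "snapped to under $10" when the match is approximate.
  return { code: `sh_price_u${best}`, threshold: best, exact: best === target };
}
```

Returning the exact flag alongside the code keeps the snapping visible, so the explanation shown to the user can mention that the threshold was rounded.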

What it’s a demo of

A working AI feature end to end: a structured output contract instead of hoping for parseable prose; a model-agnostic provider layer with graceful degradation from on-device to hosted; a key kept server-side so the client and the repo stay clean; resilience tuned to how real models actually fail; and reading the world back in to keep the conversation grounded. Built small, on purpose, and honest about its edges.

If you want to talk about how I prototype, email me.

Screener Copilot is an independent technology demonstrator. It is not affiliated with, endorsed by, or connected to Finviz; it simply drives Finviz’s public screener to show the interaction pattern. All trademarks belong to their respective owners.