Developer experience

One POST /scan away.

The API is RESTful and so are the docs. First-party SDKs for Python and TypeScript, plain curl for everyone else. All open-source, Apache-2.0.

Install

Python alpha

pip install scrapesmith
Requires Python ≥3.10 · source

TypeScript alpha

npm i @scrapesmith/sdk
Node ≥18, Deno, Bun, Workers · source

curl always

curl …
No client needed · examples below

Submit a URL

curl

curl -X POST https://api.scrapesmith.io/scan \
  -H "X-API-Key: $SCRAPESMITH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://suspicious.example/login"}'
# => {"id":"","status":"pending"}

Python

from scrapesmith_client import ScrapeSmith

client = ScrapeSmith(api_key="ss_...")
scan = client.scan_and_wait("https://suspicious.example/login")
print(scan.verdict, scan.score)
# malicious 78
for sig in scan.signals:
    print(" -", sig["id"], sig["message"])

TypeScript

import { ScrapeSmith } from "@scrapesmith/sdk";

const client = new ScrapeSmith({ apiKey: process.env.SCRAPESMITH_API_KEY });
const scan = await client.scanAndWait("https://suspicious.example/login");
console.log(scan.verdict, scan.score);
for (const sig of scan.signals) {
  console.log(" -", sig.id, sig.message);
}
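
The raw POST returns a pending scan, while the SDK helpers block until a verdict is ready. The sketch below shows the kind of polling loop such a helper presumably runs; it is not the SDK's actual implementation. The `fetch` callable is hypothetical (for example, a thin wrapper around an assumed GET /scan/{id} endpoint), and the interval and timeout defaults are illustrative.

```python
import time
from typing import Callable


def wait_for_scan(
    fetch: Callable[[str], dict],
    scan_id: str,
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the scan leaves "pending" or the timeout elapses.

    `fetch` is a hypothetical callable that returns the scan record as a
    dict; only the "pending" status value is taken from the docs above.
    """
    deadline = time.monotonic() + timeout
    while True:
        scan = fetch(scan_id)
        if scan.get("status") != "pending":
            return scan
        if time.monotonic() >= deadline:
            raise TimeoutError(f"scan {scan_id} still pending after {timeout}s")
        sleep(interval)
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any HTTP client and makes it easy to exercise in tests without real waiting.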

Find similar kits

curl

curl https://api.scrapesmith.io/scan/$SCAN_ID/similar
# {"results":[{"id":"...","url":"...","distance":3,"verdict":"malicious","score":75}, ...]}

Python

similar = client.similar(scan.id, max_distance=8)
for s in similar:
    print(f"distance={s.distance}  {s.verdict:>10}  {s.url}")

TypeScript

const similar = await client.similar(scan.id, { maxDistance: 8 });
for (const s of similar) {
  console.log(`distance=${s.distance}  ${s.verdict ?? "—"}  ${s.url}`);
}
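
If you call the endpoint with curl, you may want to narrow the raw response yourself. A minimal sketch that keeps only malicious matches within a distance threshold, nearest first. The field names follow the response shape shown above; the sample payload and the `closest_malicious` helper are made up for illustration.

```python
import json

# Hypothetical payload in the shape of the /similar response shown above.
SAMPLE = """
{"results": [
  {"id": "a1", "url": "https://one.example/login", "distance": 9, "verdict": "malicious", "score": 80},
  {"id": "b2", "url": "https://two.example/login", "distance": 3, "verdict": "malicious", "score": 75},
  {"id": "c3", "url": "https://three.example/", "distance": 1, "verdict": null, "score": null},
  {"id": "d4", "url": "https://four.example/signin", "distance": 5, "verdict": "malicious", "score": 71}
]}
"""


def closest_malicious(payload: str, max_distance: int = 8) -> list:
    """Return malicious matches with distance <= max_distance, nearest first."""
    results = json.loads(payload)["results"]
    hits = [
        r for r in results
        if r["distance"] <= max_distance and r.get("verdict") == "malicious"
    ]
    return sorted(hits, key=lambda r: r["distance"])


for hit in closest_malicious(SAMPLE):
    print(hit["distance"], hit["url"])
```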

Watch a brand

Register a brand and we'll match its keywords against every new scan. Hits fire a brand.match webhook event and surface as brand_watch.host_match / dom_match signals on the scan record. Hosts in allowed_domains (and their subdomains) never fire, so the brand's real site doesn't trip its own watch.

curl

curl -X POST https://api.scrapesmith.io/admin/brands \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Bank",
    "keywords": ["acme", "acmebank", "acme-bank"],
    "allowed_domains": ["acmebank.com", "acme.com"]
  }'

Python

import os

import httpx

ADMIN_TOKEN = os.environ["ADMIN_TOKEN"]

httpx.post(
    "https://api.scrapesmith.io/admin/brands",
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    json={
        "name": "Acme Bank",
        "keywords": ["acme", "acmebank", "acme-bank"],
        "allowed_domains": ["acmebank.com", "acme.com"],
    },
).raise_for_status()

TypeScript

const ADMIN_TOKEN = process.env.ADMIN_TOKEN;

const res = await fetch("https://api.scrapesmith.io/admin/brands", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${ADMIN_TOKEN}`,
    "Content-Type":  "application/json",
  },
  body: JSON.stringify({
    name: "Acme Bank",
    keywords: ["acme", "acmebank", "acme-bank"],
    allowed_domains: ["acmebank.com", "acme.com"],
  }),
});
// fetch does not reject on HTTP errors, so check the status explicitly.
if (!res.ok) throw new Error(`brand registration failed: ${res.status}`);
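
To make the allowed_domains rule concrete, here is a rough sketch of how a host could be checked against a brand. It assumes case-insensitive substring matching for keywords and exact-or-subdomain matching for allowed domains; the service's real matching logic is not documented here and may differ. All function names are hypothetical.

```python
def is_allowed(host: str, allowed_domains: list) -> bool:
    """True if host is an allowed domain or one of its subdomains."""
    host = host.lower().rstrip(".")
    for domain in allowed_domains:
        domain = domain.lower()
        # Require a dot boundary so "notacmebank.com" does not pass as
        # a subdomain of "acmebank.com".
        if host == domain or host.endswith("." + domain):
            return True
    return False


def keyword_hits(host: str, keywords: list) -> list:
    """Keywords that appear anywhere in the host (assumed substring match)."""
    host = host.lower()
    return [k for k in keywords if k.lower() in host]


def would_fire(host: str, keywords: list, allowed_domains: list) -> bool:
    """A host match fires only for non-allowed hosts containing a keyword."""
    return not is_allowed(host, allowed_domains) and bool(keyword_hits(host, keywords))


KEYWORDS = ["acme", "acmebank", "acme-bank"]
ALLOWED = ["acmebank.com", "acme.com"]

print(would_fire("login.acmebank.com", KEYWORDS, ALLOWED))       # False
print(would_fire("acmebank-secure.example", KEYWORDS, ALLOWED))  # True
print(would_fire("notacmebank.com", KEYWORDS, ALLOWED))          # True
```

The dot-boundary check is the part worth copying: a plain suffix comparison would treat lookalike registrations such as notacmebank.com as the brand's own site.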

Verify webhook signatures

Every webhook is HMAC-SHA256 signed over <timestamp>.<body>. Verify it on your endpoint; the SDKs ship a one-liner.
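
If you are on a stack without an SDK, the same check can be done by hand. A minimal sketch using only the Python standard library. It assumes the signature header carries a hex-encoded digest, that the timestamp is in Unix seconds, and that a 5-minute replay window is acceptable; none of these details are confirmed above, so check them against the API reference. The `verify_signature` name is hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_signature(
    secret: str,
    body: bytes,
    signature: str,
    timestamp: int,
    tolerance: int = 300,          # assumed replay window, in seconds
    now: Optional[float] = None,   # injectable clock for testing
) -> None:
    """Raise ValueError unless signature == HMAC-SHA256(secret, "<timestamp>.<body>")."""
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance:
        raise ValueError("timestamp outside tolerance window")

    # Sign the exact raw bytes received, not a re-serialized JSON object.
    signed = str(timestamp).encode() + b"." + body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()

    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("signature mismatch")
```

Always pass the raw request body: parsing and re-encoding the JSON can change whitespace or key order and break the signature.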

Python (FastAPI)

from fastapi import FastAPI, Header, Request, HTTPException
from scrapesmith_client import verify_webhook

app = FastAPI()
SECRET = "..."  # from POST /admin/webhooks

@app.post("/scrapesmith-hook")
async def hook(req: Request,
               x_scrapesmith_signature: str = Header(...),
               x_scrapesmith_timestamp: int = Header(...)):
    # Use the raw bytes: the signature covers the body exactly as sent.
    body = await req.body()
    try:
        verify_webhook(secret=SECRET, body=body,
                       signature=x_scrapesmith_signature,
                       timestamp=x_scrapesmith_timestamp)
    except ValueError as e:
        raise HTTPException(status_code=401, detail=str(e))
    # ... event is authentic; process body

TypeScript (Hono/Express)

import express from "express";
import { verifyWebhook } from "@scrapesmith/sdk";

const app = express();

// Express shown. express.text() keeps the raw body as a string so the
// signature is checked against exactly what was sent.
// In Hono, read the raw body with `await c.req.text()` instead.
app.post("/scrapesmith-hook", express.text({ type: "application/json" }), async (req, res) => {
  const body = req.body as string;
  try {
    await verifyWebhook({
      secret: process.env.WEBHOOK_SECRET!,
      body,
      signature: req.header("x-scrapesmith-signature"),
      timestamp: Number(req.header("x-scrapesmith-timestamp")),
    });
  } catch (e) {
    return res.status(401).send((e as Error).message);
  }
  // ... event is authentic
  res.sendStatus(204);
});

Register a webhook with one POST to /admin/webhooks (admin token required, see API docs). The secret is returned once; keep it in your secret manager. We retry deliveries 4 times with exponential backoff and persist every attempt for the audit trail.
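
To give a feel for what the retry behaviour means for your endpoint, here is a rough sketch of a delivery loop with exponential backoff. The 1-second base delay, the doubling factor, and the reading of "4 retries" as one initial attempt plus four more are all assumptions; the actual schedule is not stated above. The `deliver` function and its `send` callable are hypothetical.

```python
import time
from typing import Callable


def deliver(
    send: Callable[[], bool],
    retries: int = 4,
    base: float = 1.0,     # assumed first delay, in seconds
    factor: float = 2.0,   # assumed backoff multiplier
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Try `send` once, then up to `retries` more times, recording each attempt."""
    attempts = []
    for n in range(retries + 1):
        ok = send()
        attempts.append({"attempt": n + 1, "ok": ok})
        if ok:
            break
        if n < retries:
            sleep(base * factor ** n)  # 1s, 2s, 4s, 8s with the defaults
    return attempts
```

Because a slow or briefly failing endpoint can receive the same event more than once, it is worth making your handler idempotent.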

Build something with this.

The free tier is unlimited for evaluation. Production plans start at $49/mo; see pricing.

Try a scan Full API reference