Self-hosted search/scraping infrastructure on a mini-PC: a Cloudflare-fronted SERP API, a parallel Rust crawler, and 36 production sites on one box. Everything below is actually running: click the demo, copy the curl, ask for a code walkthrough.
Type a query. This hits a public, rate-limited (30 req/min/IP) proxy of my own production search API at search.knew.kr — same backend, no key needed for evaluation.
Press Enter to search. Returns the same JSON shape as a SERP-API call.
curl -s "https://me.knew.kr/api/search?q=tantivy&format=json" \
  | jq '.results[0] | {title,url,engine}'
const r = await fetch(
  "https://me.knew.kr/api/search?q=" + encodeURIComponent("how does BM25 work") + "&format=json"
);
const data = await r.json();
console.log(data.results.slice(0, 3));
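Both snippets return the same envelope. The shape below is illustrative: the jq filter above pins down results[].{title,url,engine}, and any other fields are elided rather than guessed.

{
  "results": [
    { "title": "…", "url": "…", "engine": "…" }
  ]
}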
Source repos are private during application review, but the demo above is live output from the production backend. Happy to walk through any of it on a call or share read-only access.
Powering this demo: Cloudflare TLS, nginx API gate with per-key rate limits, SearXNG meta-search in Docker, structured-JSON output.
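A condensed sketch of that gate. Zone names, the keyed rate, and the upstream port are assumptions; the 30 req/min/IP figure is the demo's stated limit.

# Per-IP limiting for the public demo, per-key limiting for keyed traffic.
# Behind Cloudflare the real client address arrives in CF-Connecting-IP,
# so the per-IP zone keys on that header instead of $binary_remote_addr.
limit_req_zone $http_cf_connecting_ip zone=per_ip:10m rate=30r/m;
limit_req_zone $http_x_api_key zone=per_key:10m rate=120r/m;  # keyed rate assumed

server {
    listen 443 ssl;                    # ssl_certificate directives omitted
    server_name me.knew.kr;

    location /api/search {
        limit_req zone=per_ip burst=10 nodelay;
        proxy_pass http://127.0.0.1:8888;   # SearXNG container, port assumed
    }
}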
→ ask for a walkthrough
High-performance parallel web crawler for news, blogs, communities, and YouTube. Site-specific extractors, structured JSON, exposed as a tiny HTTP API.
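Not the production crawler (the repos are private), but a minimal sketch of its core pattern: bounded parallel fetching with tokio and reqwest. The concurrency cap, user agent, and URLs are illustrative.

// Cargo.toml (assumed): tokio = { version = "1", features = ["full"] },
// reqwest = "0.12"
use std::sync::Arc;
use tokio::sync::Semaphore;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::builder()
        .user_agent("demo-crawler/0.1")          // illustrative UA
        .build()?;
    let limit = Arc::new(Semaphore::new(16));    // cap in-flight fetches

    let urls = vec![
        "https://example.com/a".to_string(),     // placeholder targets
        "https://example.com/b".to_string(),
    ];

    let mut tasks = Vec::new();
    for url in urls {
        let client = client.clone();
        let permit = limit.clone().acquire_owned().await?;
        tasks.push(tokio::spawn(async move {
            let _permit = permit;                // released when the task ends
            let body = client.get(url.as_str()).send().await?.text().await?;
            // site-specific extractors would turn `body` into structured JSON here
            Ok::<_, reqwest::Error>((url, body.len()))
        }));
    }
    for t in tasks {
        if let Ok(Ok((url, bytes))) = t.await {
            println!("{url}: {bytes} bytes");
        }
    }
    Ok(())
}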
→ ask for a walkthrough
YouTube subtitle & comment extraction at scale — the kind of structured-data work SearchApi's vertical APIs (YouTube SERP) are made of.
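One small, self-contained example of that kind of extraction: turning YouTube's legacy timedtext XML into typed JSON records. The record schema and sample payload are illustrative; the production extractors live in the private repos.

// Cargo.toml (assumed): regex = "1", serde = { version = "1", features = ["derive"] },
// serde_json = "1"
use regex::Regex;
use serde::Serialize;

#[derive(Serialize)]
struct SubtitleLine {   // illustrative schema, not the production one
    start_ms: u64,
    duration_ms: u64,
    text: String,
}

fn main() {
    // In production this payload comes from an HTTP fetch, elided here.
    let xml = r#"<transcript>
        <text start="9.5" dur="5.2">BM25 scores a document by term frequency</text>
        <text start="14.7" dur="4.1">normalized by document length</text>
    </transcript>"#;

    let re = Regex::new(r#"<text start="([\d.]+)" dur="([\d.]+)"[^>]*>([^<]*)</text>"#).unwrap();
    let lines: Vec<SubtitleLine> = re
        .captures_iter(xml)
        .map(|c| SubtitleLine {
            start_ms: (c[1].parse::<f64>().unwrap() * 1000.0) as u64,
            duration_ms: (c[2].parse::<f64>().unwrap() * 1000.0) as u64,
            text: c[3].trim().to_string(),
        })
        .collect();
    println!("{}", serde_json::to_string_pretty(&lines).unwrap());
}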
→ ask for a walkthrough
How one mini-PC runs 36 sites: nginx vhost discipline, automated Let's Encrypt behind Cloudflare proxy, log triage, failure recovery. Notes & scripts on request.
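The per-site pattern, condensed; every name and path below is illustrative. Cloudflare terminates public TLS, while the origin still serves a Let's Encrypt cert so the Cloudflare-to-origin hop stays encrypted.

# /etc/nginx/sites-available/example.knew.kr   (one small vhost per site)
server {
    listen 443 ssl;
    server_name example.knew.kr;

    ssl_certificate     /etc/letsencrypt/live/example.knew.kr/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.knew.kr/privkey.pem;

    access_log /var/log/nginx/example.knew.kr.access.log;  # per-vhost logs for triage

    location / {
        proxy_pass http://127.0.0.1:3000;  # per-site upstream, port assumed
    }
}

Renewal is one scheduled command, e.g. certbot renew --quiet --deploy-hook "systemctl reload nginx" under cron or a systemd timer.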
→ ask for a walkthrough
SearchApi turns the messy surface of the web into clean, structured JSON. That's the exact problem I've already been solving for my own products — fronting SearXNG, writing parallel crawlers in Rust, keeping a small fleet of scrapers alive under real traffic. I want to bring that depth and shipping speed to making SearchApi the most reliable way to query the world's search engines.