Manual link checking works for occasional audits, but modern web development demands automation. When you're deploying multiple times per day, managing a portfolio of client sites, or maintaining a content-heavy platform, you need programmatic broken link detection. An API-based approach integrates link checking directly into your development workflow—catching broken links before they reach production, alerting you when external resources disappear, and maintaining link health without manual intervention.
TL;DR
- TinyUtils Dead Link Finder provides a JSON API for automation
- Integrate into CI/CD pipelines to fail builds on broken links
- Set up scheduled monitoring for production sites
- Parse structured JSON responses for automated remediation
- No API key required for reasonable usage
Why Use an API for Link Checking?
Automation Over Manual Clicks
Browser-based tools require human interaction—someone has to click, wait, review, and export. API-based checking runs unattended. Set it up once, and it works while you sleep. This isn't about convenience; it's about catching problems before users report them.
CI/CD Integration
Modern deployment pipelines run automated tests before releasing code. Link checking belongs in that pipeline. A broken link introduced in a content update should block deployment just like a failed unit test. The API enables this integration—your GitHub Action, GitLab CI, or Jenkins job calls the endpoint and acts on the results.
Continuous Monitoring
External links break without warning. The resource you linked to last month might be gone today. Scheduled API calls—daily, weekly, or hourly depending on your needs—catch these changes. Alert integrations (Slack, email, PagerDuty) notify your team immediately when links fail.
Programmatic Remediation
API responses are structured data, not visual reports. Parse the JSON, identify 404s, cross-reference with your CMS, and trigger automated fixes. Some teams automatically replace broken links with archived versions from the Wayback Machine. Others open tickets in their issue tracker. The API enables whatever workflow fits your needs.
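As a minimal sketch of that last pattern, here is how the rows array from a check response could be fed to the Wayback Machine's public availability endpoint; wiring the suggested replacement into your CMS is left to you:

import requests
def suggest_archive_replacements(rows):
    """For each 404, ask the Wayback Machine for the closest archived snapshot."""
    for row in rows:
        if row["status"] != 404:
            continue
        resp = requests.get(
            "https://archive.org/wayback/available",
            params={"url": row["link"]},
            timeout=10,
        )
        closest = resp.json().get("archived_snapshots", {}).get("closest")
        if closest and closest.get("available"):
            print(f'{row["link"]} -> {closest["url"]}')
        else:
            print(f'{row["link"]} -> no archived copy found')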
The Dead Link Finder API
The TinyUtils Dead Link Finder exposes a JSON API that mirrors the web interface's functionality. Everything you can do through the browser, you can do programmatically.
Request Format
POST /api/check
Content-Type: application/json
{
  "pageUrl": "https://example.com/page-to-check",
  "scope": "domain",
  "includeAssets": false,
  "httpFallback": false,
  "robots": "respect"
}
Parameters
| Parameter | Type | Description |
|---|---|---|
| pageUrl | string (required) | The URL to crawl and check links from |
| scope | string | "domain" (links on the same domain), "same-origin" (same scheme, host, and port), or "all" (external links too) |
| includeAssets | boolean | Check images, scripts, and stylesheets in addition to links |
| httpFallback | boolean | Try HTTP if HTTPS fails (not recommended for HSTS sites) |
| robots | string | "respect" (honor robots.txt) or "ignore" |
Response Format
{
  "ok": true,
  "rows": [
    {
      "link": "https://example.com/missing-page",
      "status": 404,
      "statusText": "Not Found",
      "finalUrl": null,
      "redirectChain": [],
      "error": null
    },
    {
      "link": "https://example.com/moved-page",
      "status": 301,
      "statusText": "Moved Permanently",
      "finalUrl": "https://example.com/new-location",
      "redirectChain": ["https://example.com/moved-page"],
      "error": null
    }
  ],
  "meta": {
    "runTimestamp": "2024-01-15T10:30:00Z",
    "mode": "domain",
    "totals": {
      "checked": 47,
      "ok": 42,
      "broken": 3,
      "redirects": 2
    },
    "requestId": "abc-123-def"
  }
}
Error Response
{
  "ok": false,
  "message": "Invalid URL format",
  "code": "INVALID_URL",
  "requestId": "abc-123-def"
}
Integration Patterns
GitHub Actions
Add link checking to your GitHub workflow. This example runs on every push to main and fails if broken links are found:
name: Link Check
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 6 * * *'  # Daily at 6am UTC
jobs:
  check-links:
    runs-on: ubuntu-latest
    steps:
      - name: Check for broken links
        run: |
          RESPONSE=$(curl -s -X POST https://tinyutils.com/api/check \
            -H "Content-Type: application/json" \
            -d '{"pageUrl":"https://your-site.com","scope":"domain"}')
          BROKEN=$(echo "$RESPONSE" | jq '.meta.totals.broken')
          if [ "$BROKEN" -gt 0 ]; then
            echo "Found $BROKEN broken links!"
            echo "$RESPONSE" | jq '.rows[] | select(.status >= 400)'
            exit 1
          fi
Node.js Script
For more complex logic, use a Node.js script that can process results and take actions:
// Works on Node 18+, where fetch is available globally
const checkLinks = async (siteUrl) => {
  const response = await fetch('https://tinyutils.com/api/check', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      pageUrl: siteUrl,
      scope: 'all',
      includeAssets: true
    })
  });
  const data = await response.json();
  if (!data.ok) {
    throw new Error(data.message);
  }
  // A null/0 status means the connection itself failed, so count it as broken
  const broken = data.rows.filter(r => !r.status || r.status >= 400);
  const redirects = data.rows.filter(r => r.status >= 300 && r.status < 400);
  return { broken, redirects, meta: data.meta };
};
Python Integration
import requests
def check_site_links(url, scope="domain"):
    """Check every link on a page and return the broken ones plus run totals."""
    response = requests.post(
        "https://tinyutils.com/api/check",
        json={"pageUrl": url, "scope": scope},
        timeout=120,  # pages with many slow external links can take a while
    )
    data = response.json()
    if not data["ok"]:
        raise RuntimeError(data["message"])
    # A 0/null status means the connection itself failed, so count it as broken
    broken = [r for r in data["rows"] if not r["status"] or r["status"] >= 400]
    return broken, data["meta"]["totals"]
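A quick run against your own site (the URL here is a placeholder) then looks like:

broken, totals = check_site_links("https://your-site.com")
print(f"{totals['broken']} of {totals['checked']} links broken")
for row in broken:
    print(row["status"], row["link"])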
Scheduled Monitoring with Cron
Set up a cron job to check your site regularly and send alerts:
# Run daily at midnight
0 0 * * * /path/to/link-check.sh | mail -s "Link Check Report" team@company.com
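If you would rather keep the scheduled job in Python, here is a minimal sketch of what a script like link-check.sh might do, posting to Slack instead of email; the webhook URL is a placeholder for your own Slack incoming webhook:

import requests
SLACK_WEBHOOK = "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"  # placeholder
def daily_check(site_url):
    """Run one check and alert only when something needs attention."""
    data = requests.post(
        "https://tinyutils.com/api/check",
        json={"pageUrl": site_url, "scope": "domain"},
        timeout=120,
    ).json()
    if not data["ok"]:
        requests.post(SLACK_WEBHOOK, json={"text": f"Link check failed: {data['message']}"})
        return
    broken = data["meta"]["totals"]["broken"]
    if broken > 0:
        # Slack incoming webhooks accept a plain {"text": ...} payload
        requests.post(SLACK_WEBHOOK, json={"text": f"{broken} broken links on {site_url}"})
if __name__ == "__main__":
    daily_check("https://your-site.com")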
Understanding Status Codes
| Code Range | Meaning | Action |
|---|---|---|
| 200-299 | Success | Link is working |
| 301, 308 | Permanent redirect | Update to final URL |
| 302, 307 | Temporary redirect | Monitor but keep original |
| 400 | Bad request | Check URL format |
| 401, 403 | Authorization required | Link may require login |
| 404 | Not found | Resource is gone—fix or remove |
| 500-599 | Server error | Temporary—recheck later |
| 0 or null | Connection failed | DNS failure or timeout |
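As a sketch, the table translates directly into a triage pass over the rows array from an API response:

def triage(rows):
    """Bucket each checked link by the action the status table suggests."""
    buckets = {"working": [], "update_url": [], "monitor": [],
               "fix_or_remove": [], "recheck": [], "review": []}
    for row in rows:
        status = row["status"] or 0  # 0/null means the connection itself failed
        if 200 <= status < 300:
            buckets["working"].append(row)
        elif status in (301, 308):
            buckets["update_url"].append(row)   # permanent: switch to finalUrl
        elif status in (302, 307):
            buckets["monitor"].append(row)      # temporary: keep the original URL
        elif status == 404:
            buckets["fix_or_remove"].append(row)
        elif 500 <= status < 600:
            buckets["recheck"].append(row)      # often transient; retry later
        else:
            buckets["review"].append(row)       # 400/401/403, DNS failures, timeouts
    return buckets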
Common Use Cases
Pre-Deploy Validation
Check staging environments before promoting to production. Catch content editor mistakes, broken CMS migrations, and misconfigured redirects before users see them.
Content Migration Audits
Moving to a new CMS or redesigning your site? Run link checks before and after to ensure no links were broken in the transition. Compare results to identify regressions.
SEO Monitoring
Broken outbound links can hurt your search rankings. Monitor external links regularly—especially to high-value resources you cite frequently. When external sites restructure, you'll know immediately.
Client Site Management
Agencies managing multiple client sites need automated monitoring. Set up scheduled checks for each client, aggregate results into dashboards, and demonstrate proactive maintenance in client reports.
Documentation Freshness
Technical documentation links to APIs, libraries, and external resources that change frequently. Regular link checks ensure your docs stay accurate and useful.
Best Practices
Rate Limiting
The API includes built-in concurrency limits to respect target sites. For checking multiple pages, stagger your requests rather than firing them all simultaneously. A 1-2 second delay between calls is courteous.
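Reusing check_site_links from the Python example above, a staggered multi-page run takes only a few lines:

import time
def check_pages(urls, delay=2):
    """Check pages one at a time with a courteous pause between API calls."""
    results = {}
    for url in urls:
        results[url] = check_site_links(url)
        time.sleep(delay)
    return results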
Error Handling
Always check the ok field before processing results. Handle timeouts gracefully—some servers respond slowly. Implement retry logic with exponential backoff for 5xx errors.
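A sketch of that retry logic in Python (four attempts with 1s, 2s, 4s waits is an arbitrary but reasonable choice):

import time
import requests
def check_with_retry(url, attempts=4):
    """Retry 5xx responses with exponential backoff before giving up."""
    for attempt in range(attempts):
        response = requests.post(
            "https://tinyutils.com/api/check",
            json={"pageUrl": url, "scope": "domain"},
            timeout=120,
        )
        if response.status_code < 500:
            data = response.json()
            if not data["ok"]:
                raise RuntimeError(f"{data['code']}: {data['message']}")
            return data
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ... between attempts
    raise RuntimeError(f"API still returning server errors after {attempts} attempts")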
Result Caching
Don't check the same URLs repeatedly in short periods. Cache results for at least a few hours. This reduces load on both the API and the target sites you're checking.
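For most scripts, an in-memory cache with a few-hour TTL is enough; a sketch, again wrapping the earlier check_site_links helper:

import time
_cache = {}  # url -> (checked_at, result)
CACHE_TTL = 4 * 60 * 60  # four hours, in seconds
def cached_check(url):
    """Reuse a recent result instead of re-checking the same URL."""
    entry = _cache.get(url)
    if entry and time.time() - entry[0] < CACHE_TTL:
        return entry[1]
    result = check_site_links(url)
    _cache[url] = (time.time(), result)
    return result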
Scope Selection
Start with "domain" scope for internal link validation. Use "all" scope when you need to verify external links too—but be aware this takes longer and includes links you can't fix directly.
Frequently Asked Questions
Are there rate limits?
The API has built-in concurrency limits to be respectful to target sites. For high-volume needs, batch your requests and add delays between calls. Excessive usage may be throttled.
Can I check multiple pages in one request?
Each API call checks links from a single page. For site-wide audits, call the API for each page you want to check. Parallelize within reason, but respect rate limits.
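A sketch of a polite site-wide audit, fanning out over a page list with a deliberately small worker pool (check_site_links is the earlier Python helper):

from concurrent.futures import ThreadPoolExecutor
def audit_pages(urls, max_workers=3):
    """Check several pages in parallel, but keep concurrency modest."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(urls, pool.map(check_site_links, urls)))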
What about authenticated pages?
The API checks publicly accessible pages. Content behind login, paywalls, or IP restrictions won't be accessible. For authenticated content, you'll need custom tooling that can handle your authentication mechanism.
How long do requests take?
Response time depends on the number of links and the speed of the target servers. A typical page with 50 links might take 5-15 seconds. Pages with many external links to slow servers take longer.
What's the difference from the web tool?
Same functionality, same backend, same accuracy. The web tool is for one-off manual checks; the API is for automation.
Can I use this for competitive analysis?
You can check any publicly accessible URL. However, respect robots.txt and don't hammer competitor sites with excessive requests.
Why Use an Online API?
- No infrastructure: No servers to maintain, no binaries to update
- Consistent behavior: Same results from any client or platform
- External perspective: Checks from outside your network, like real users
- Always current: Latest detection logic without client updates
- Simple integration: Standard REST API works with any language
Ready to Automate?
Start with the interactive Dead Link Finder to understand the output format, then integrate the API into your workflow. Catch broken links automatically, fail builds on errors, and maintain link health without manual effort.
For agency-scale monitoring, see our agency broken link workflow. For recovering already-broken links, check out fixing broken links with Archive.org.