A technical SEO site audit checks crawlability, index status, speed, and structure so you can fix blockers and lift organic reach.
You want a clear, no-nonsense path to review a site, spot blockers, and ship fixes. This guide walks through every layer: crawl access, indexing, speed, internal links, structured data, and more. You’ll get practical checks, tool tips, and a sane order of operations so your work lands real gains.
Audit Overview And Flow
Start wide, then go deep. First confirm that bots can fetch pages, that the right URLs are indexed, and that templates render clean HTML. Next, profile load speed, measure Core Web Vitals, and scan links. Finish with structured data checks, security, and log insights. The sections below give the exact steps.
Quick Checklist By Area
Use this at the start to frame your pass. Then work through each section with the detailed steps that follow.
| Area | What To Check | Helpful Tools |
|---|---|---|
| Crawl Access | robots.txt, server status codes, blocked paths, CDN rules | Search Console, curl, server logs |
| Indexing | Index coverage, canonical tags, pagination, parameters | Search Console, site: queries |
| Sitemaps | Presence, freshness, correct URLs, size | Sitemaps report, XML validators |
| Speed | Core Web Vitals, image weight, JS bloat, caching | PageSpeed Insights, Lighthouse |
| Internal Links | Depth, orphan pages, nav patterns, redirects | Crawlers, exports, logs |
| Structured Data | Valid types, rich result eligibility, errors | Rich Results Test |
| Content Templates | Head tags, H1-H4 order, duplication, thin pages | Browser view-source, crawlers |
| International | hreflang, regional URLs, language tags | Crawlers, hreflang validators |
| Security | HTTPS, mixed content, HSTS | SSL Labs, browser console |
| Logs | Googlebot hits, crawl budget sinks, status spikes | Server logs, log parsers |
Technical SEO Audit Steps And Criteria
This section gives the full process. Follow the order to surface high-impact issues early and avoid chasing edge cases.
Step 1: Verify Crawl Access
Fetch the robots.txt file and read the rules. Confirm it loads fast and returns a 200. Check that key paths like product, blog, and category folders are not blocked by mistake. Note that Google reads only the first 500 KiB of the file and generally caches rules for up to 24 hours, so stale rules can linger after deploys. If you run behind a CDN, make sure the origin and edge both serve the same file.
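A minimal sketch of the blocked-path check, using Python's standard `urllib.robotparser`. The robots.txt body, the `example.com` URLs, and the `blocked_paths` helper are illustrative assumptions; in a real audit you would fetch the live file. Note that this parser does plain prefix matching and does not replicate Google's wildcard handling, so treat it as a first pass only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body used for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""

# Hypothetical key URLs that should stay crawlable.
KEY_URLS = [
    "https://example.com/products/widget",
    "https://example.com/blog/launch",
    "https://example.com/search?q=widget",
]


def blocked_paths(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given user agent is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    for url in blocked_paths(ROBOTS_TXT, KEY_URLS):
        print("blocked:", url)
```

Any product, blog, or category URL that shows up in the output deserves a closer look at the rule that matched it.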
Load a few key pages with curl to see raw headers. Look for 200s where expected, and single-hop 301/308 redirects for old URLs rather than chains. Watch for 4xx or 5xx bursts that hint at throttling or bad firewall filters. Then sample robots meta tags and X-Robots-Tag headers on pages that should index, and confirm none carry a stray noindex.
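The robots meta and X-Robots-Tag sampling can be scripted once you have the headers and HTML for a page. The sketch below uses only the standard library; `is_indexable` and `RobotsMetaParser` are hypothetical names, and the check deliberately ignores user-agent-scoped header values such as `googlebot: noindex`, which you would need to handle separately.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_indexable(headers: dict[str, str], html: str) -> bool:
    """Return False when a header or meta tag carries noindex (or none)."""
    directives: set[str] = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives |= {d.strip().lower() for d in value.split(",") if d.strip()}
    parser = RobotsMetaParser()
    parser.feed(html)
    directives |= parser.directives
    return not ({"noindex", "none"} & directives)
```

Feed it the response headers and body from each sampled URL and flag any page that should index but returns `False`.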
Step 2: Confirm Index Status
Open the Page indexing report in Search Console (the successor to the older coverage report) and note the indexed and not-indexed buckets along with the reason listed for each. Sample URLs from each bucket to understand patterns. Pay close attention to canonicalization: the declared canonical should match the version you want in results, and the Google-selected canonical shown in URL Inspection should match the declared one. Remove thin variants where consolidation makes sense.
Check parameters. If you have sorting or tracking params, keep them crawlable only when they add content value. Use clean linking to preferred URLs. For pagination, use clear links and avoid infinite scroll that hides page discovery from bots.
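One way to keep internal links pointed at preferred URLs is to normalize them before they are rendered. The sketch below strips tracking and sorting parameters and orders the rest; the parameter lists and the `preferred_url` name are assumptions for illustration, and you would adjust them to the parameters your site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed lists of parameters that add no content value on this hypothetical site.
TRACKING_PREFIXES = ("utm_",)
DROPPED_PARAMS = {"gclid", "fbclid", "sort"}


def preferred_url(url: str) -> str:
    """Drop tracking/sorting params and the fragment, and sort what remains."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in DROPPED_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    kept.sort()  # stable ordering so equivalent URLs compare equal
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(preferred_url("https://example.com/shoes?utm_source=news&color=red&sort=price&page=2#top"))
```

Parameters that change the content, such as `color` or `page` here, survive, so pagination stays discoverable.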
Step 3: Review Sitemaps
Confirm that sitemaps exist, list canonical URLs, and update on content changes. Large sites can shard by type or by date. Keep each file under the protocol limits of 50,000 URLs and 50 MB uncompressed, and include only indexable pages. Submit the top-level index file and watch fetch dates and errors in the report.
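Sharding is easy to get wrong by hand, so here is a hedged sketch that splits a URL list into files of at most 50,000 entries (the per-file URL limit in the sitemaps protocol) and serializes one shard. The function names and sample URLs are hypothetical; it does not enforce the byte-size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemaps protocol


def chunk_urls(urls: list, size: int = MAX_URLS_PER_FILE) -> list:
    """Split a list into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap(entries: list[tuple[str, str]]) -> bytes:
    """Serialize (loc, lastmod) pairs as a UTF-8 <urlset> document."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        node = ET.SubElement(root, "url")
        ET.SubElement(node, "loc").text = loc
        ET.SubElement(node, "lastmod").text = lastmod  # e.g. "2024-01-15"
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    pages = [(f"https://example.com/p/{i}", "2024-01-15") for i in range(3)]
    for shard in chunk_urls(pages, size=2):
        print(build_sitemap(shard).decode("utf-8"))
```

Each shard would then be listed in the sitemap index file that you submit.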
Step 4: Measure Speed And Core Web Vitals
Run PageSpeed tests on key templates and devices. Note field data first, then lab detail. Chase largest elements: hero images, video, and large script bundles. Serve next-gen image formats, compress text, and cache static assets with long lifetimes. Defer non-critical JavaScript and ship less of it. Treat third-party tags as a budget item and remove what you don’t need.
On content pages, aim for a fast Largest Contentful Paint and responsive input handling. Lazy-load below-the-fold media, but not the hero image. Preload critical fonts and avoid layout shifts from banners or embeds. If CLS spikes on ads, give slots fixed dimensions and reserve space.
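To make field data easier to triage, you can bucket each metric against Google's published Core Web Vitals thresholds, which are assessed at the 75th percentile of page loads. This is a small sketch; the `rate` helper is a hypothetical name.

```python
# (good upper bound, poor lower bound) per metric, per Google's published thresholds.
THRESHOLDS = {
    "LCP": (2.5, 4.0),    # Largest Contentful Paint, seconds
    "INP": (200, 500),    # Interaction to Next Paint, milliseconds
    "CLS": (0.1, 0.25),   # Cumulative Layout Shift, unitless
}


def rate(metric: str, p75_value: float) -> str:
    """Classify a 75th-percentile field value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if p75_value <= good:
        return "good"
    if p75_value <= poor:
        return "needs improvement"
    return "poor"


if __name__ == "__main__":
    for metric, value in [("LCP", 3.1), ("INP", 180), ("CLS", 0.31)]:
        print(metric, value, rate(metric, value))
```

Templates rated "poor" on any metric are usually the ones to profile first.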
Step 5: Inspect Internal Links
Run a full crawl. Sort by click depth to find pages buried in the nav. Add links from hubs, breadcrumbs, and related modules so key pages sit closer to the home page. Fix redirect hops and loops. Replace temporary redirects with permanent ones when the target is final.
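If your crawler exports a link graph, click depth is a breadth-first search from the home page. The sketch below assumes a hypothetical export shaped as a dict mapping each page to the pages it links to; `click_depths` and `orphan_pages` are illustrative names.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search from the home page; depth = minimum clicks to reach a page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(links: dict[str, list[str]], home: str, known_urls: list[str]) -> list[str]:
    """Known URLs (for example from sitemaps) that no crawl path reaches."""
    return sorted(set(known_urls) - set(click_depths(links, home)))


if __name__ == "__main__":
    graph = {"/": ["/blog", "/products"], "/blog": ["/blog/post-1"]}
    print(click_depths(graph, "/"))
    print(orphan_pages(graph, "/", ["/blog", "/landing/old-promo"]))
```

Sorting the depth map in descending order gives a quick list of pages that need links from hubs or breadcrumbs.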
Step 6: Validate Structured Data
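Pick markup that matches the content: Article, Product, FAQPage, Organization, and so on. Validate with the Rich Results Test and keep to the spec. Keep in mind that valid markup does not guarantee a rich result; Google now shows FAQ rich results only for a narrow set of sites, for example. Use the same content that users see; don’t add fake ratings or hidden text. Watch the Search Console enhancement reports to track errors over time.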
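Before pasting pages into a validator one by one, a quick local pass can pull JSON-LD out of the HTML and flag obviously missing fields. The field lists below are an illustrative assumption, not Google's official requirements, and the sketch assumes each JSON-LD block is a single object rather than an array or `@graph`.

```python
import json
from html.parser import HTMLParser

# Assumed minimal field expectations for illustration; check the official docs per type.
EXPECTED_FIELDS = {
    "Product": {"name", "offers"},
    "Article": {"headline", "datePublished"},
}


class JsonLdParser(HTMLParser):
    """Collect parsed <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def missing_fields(html: str) -> dict[str, list[str]]:
    """Map each JSON-LD @type to the expected fields it lacks."""
    parser = JsonLdParser()
    parser.feed(html)
    report = {}
    for block in parser.blocks:
        kind = block.get("@type")
        missing = EXPECTED_FIELDS.get(kind, set()) - set(block)
        if missing:
            report[kind] = sorted(missing)
    return report
```

This only catches structural gaps; eligibility for rich results still needs the official testing tool.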
Step 7: Check Templates And Head Tags
Open each major template and scan the head and the heading outline. Keep one H1, orderly H2-H4, a single canonical, and clean titles. Avoid duplicate meta descriptions across large sets. Remove legacy noindex flags left over from earlier deploys. If multiple frameworks render pages, make sure the final rendered HTML is consistent and indexable.
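The template checks above can be approximated with a small parser run against the rendered HTML of one page per template. This is a sketch under the assumption that you already have the final HTML as a string; `audit_template` is a hypothetical helper and only covers the title, canonical, and H1 counts.

```python
from html.parser import HTMLParser


class TemplateAudit(HTMLParser):
    """Count the tags that should appear exactly once per page."""

    def __init__(self):
        super().__init__()
        self.canonicals: list[str] = []
        self.h1_count = 0
        self.title_count = 0

    def handle_starttag(self, tag, attrs):
        attr = {key: (value or "") for key, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonicals.append(attr.get("href", ""))
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "title":
            self.title_count += 1


def audit_template(html: str) -> list[str]:
    """Return human-readable issues for canonical, H1, and title counts."""
    audit = TemplateAudit()
    audit.feed(html)
    issues = []
    if len(audit.canonicals) != 1:
        issues.append(f"expected 1 canonical, found {len(audit.canonicals)}")
    if audit.h1_count != 1:
        issues.append(f"expected 1 h1, found {audit.h1_count}")
    if audit.title_count != 1:
        issues.append(f"expected 1 title, found {audit.title_count}")
    return issues
```

An empty list means the template passes these three checks; anything else becomes a ticket.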
Step 8: International And Mobile
Where you ship content to more than one region or language, add hreflang pairs across variants and include self-references. Keep codes valid: an ISO 639-1 language code, optionally followed by an ISO 3166-1 alpha-2 region code such as en-GB, and map each URL to a single region. Test on real phones. Check tap targets, lazy loading, and render blocking on mobile networks.
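Missing return links and missing self-references are the usual hreflang failures, and both are simple to detect from a crawl export. The sketch assumes a hypothetical data shape: a dict mapping each page URL to its declared `{hreflang: alternate URL}` annotations.

```python
def hreflang_issues(annotations: dict[str, dict[str, str]]) -> list[str]:
    """Flag pages without a self-reference and alternates that do not link back."""
    issues = []
    for page, alternates in annotations.items():
        if page not in alternates.values():
            issues.append(f"{page}: missing self-reference")
        for lang, target in alternates.items():
            if target == page:
                continue
            # The alternate must declare this page among its own annotations.
            if page not in annotations.get(target, {}).values():
                issues.append(f"{page}: no return link from {target} ({lang})")
    return issues


if __name__ == "__main__":
    sample = {
        "https://example.com/en/": {
            "en": "https://example.com/en/",
            "de": "https://example.com/de/",
        },
        "https://example.com/de/": {"de": "https://example.com/de/"},
    }
    for issue in hreflang_issues(sample):
        print(issue)
```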
Step 9: Security And Clean URLs
Serve HTTPS across the whole site. Fix mixed content calls from scripts and images. Enforce one host and one scheme with 301s. Trim tracking junk from paths and move state to cookies where safe. Keep query strings stable and map legacy routes to today’s patterns.
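Enforcing one host and one scheme works best when every non-canonical request is sent to its final URL in a single hop, rather than http to https and then to the preferred host. A minimal sketch of that mapping follows; `CANONICAL_HOST` and `redirect_target` are assumptions for illustration, and in practice this logic usually lives in the web server or CDN config.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # assumed preferred host


def redirect_target(url: str) -> Optional[str]:
    """Return the single-hop 301 target, or None if the URL is already canonical."""
    parts = urlsplit(url)
    if parts.scheme == "https" and parts.netloc.lower() == CANONICAL_HOST:
        return None
    # Fix scheme and host together so the client needs only one redirect.
    return urlunsplit(("https", CANONICAL_HOST, parts.path or "/", parts.query, ""))


if __name__ == "__main__":
    print(redirect_target("http://example.com/pricing?plan=pro"))
```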
Step 10: Logs, Crawl Budget, And Monitoring
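Pull a two-to-four week slice of access logs and filter for Googlebot, verifying hits by reverse DNS where you can because the user agent string is easy to spoof. Graph hits by section and status code. Look for waste on search pages, cart steps, and calendar views. Move low-value patterns behind disallow rules or noindex headers, but not both on the same URL: a disallowed page is never fetched, so its noindex is never seen. Make sure high-value sections get regular hits. Set alerts for 5xx spikes and sitemap fetch failures.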
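A rough sketch of the log tally, assuming the common combined log format. The regular expression and the `googlebot_hits` name are assumptions; the filter matches on the user agent string only and does not perform reverse DNS verification.

```python
import re
from collections import Counter

# Assumes combined log format: request line, status, size, referer, user agent.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines: list[str]) -> Counter:
    """Count Googlebot requests by (top-level section, status code)."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        first_segment = match.group("path").lstrip("/").split("/", 1)[0]
        section = "/" + first_segment.split("?", 1)[0]
        counts[(section, match.group("status"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '66.249.66.1 - - [10/Mar/2024:06:25:14 +0000] "GET /blog/post-1 HTTP/1.1" 200 5123 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    ]
    for (section, status), hits in googlebot_hits(sample).most_common():
        print(section, status, hits)
```

Sections with many hits but little search value, such as internal search, are candidates for crawl controls.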
Tool Setup And Data Sources
You don’t need a huge stack. A small set covers most needs: Search Console for coverage and enhancements, a crawler for internal links and tags, PageSpeed field data for user experience, and server logs for crawl truth. Keep results in a tracker so patterns pop fast. For baseline rules on content, technical basics, and spam, review Google Search Essentials.
Minimal Tool Stack
Here’s a lean setup that teams of any size can run.
| Need | Primary Tool | Notes |
|---|---|---|
| Coverage | Search Console | Use URL Inspection for live checks |
| Crawling | Desktop crawler | Export depth, titles, canonicals |
| Speed | PageSpeed Insights | Check field and lab data |
| Markup | Rich Results Test | Validate and compare pages |
| Logs | Server access logs | Sample two weeks per pass |
Detailed Checks With Fixes
This section lists common findings and straight fixes you can ship without drama.
Robots Rules And Edge Cases
Disallow only what should never appear in results, like admin paths and search results. Avoid blocking CSS or JS folders that render content. Rule paths are case-sensitive, so account for trailing slashes and case variants to make rules match real URLs. Keep the file under Google’s 500 KiB limit and serve UTF-8 plain text.
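If you need to slow crawlers, don’t rely on crawl-delay in robots rules for Google, which ignores that directive. The crawl rate setting in Search Console was retired in early 2024, so to shed load for a short period return 503 or 429 responses instead. When outages hit, a 503 with a Retry-After header keeps trust while you fix things.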
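As an illustration of the outage response, here is a minimal WSGI app that answers every request with a 503 and a Retry-After header. The one-hour value and the `maintenance_app` name are assumptions; most teams would configure this at the load balancer or CDN rather than in application code.

```python
def maintenance_app(environ, start_response):
    """WSGI app that tells clients and crawlers to come back later."""
    body = b"Service temporarily unavailable. Please retry shortly.\n"
    start_response(
        "503 Service Unavailable",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
            ("Retry-After", "3600"),  # seconds; assumed one-hour window
        ],
    )
    return [body]


if __name__ == "__main__":
    # Serve locally for a quick check with curl -I http://localhost:8000/
    from wsgiref.simple_server import make_server

    make_server("", 8000, maintenance_app).serve_forever()
```

Keep the 503 window short; a site that returns errors for days risks having URLs dropped from the index.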
Index Bloat And Canonicals
Too many near-duplicate URLs make coverage messy and dilute signals. Point variants to one clean version with rel=canonical and internal links that match. Remove tag pages that have no value and block endless filter combos. On user-generated content (UGC) pages, add noindex to empty or low-quality states.
Sitemaps That Help Crawlers
List only indexable URLs. Keep lastmod fresh and avoid dates that never change. Use absolute, preferred URLs. If you run many languages, split sitemaps by locale so reporting stays clear. When content gets removed, delete the entry rather than serving 404s in the feed.
Speed Wins That Move Metrics
Compress and resize images. Ship AVIF or WebP where supported. Inline the CSS needed for first render and load the rest after first paint. Break large script bundles into smaller chunks and defer non-critical parts. Cache static assets with a long max-age and use ETags for revalidation. Reduce third-party scripts that block the main thread.
Internal Link Patterns That Lift Pages
Place key pages in header, footer, and hub lists so they get steady link flow. Use breadcrumb trails with markup. On large blogs or catalogs, build topic hubs that link to subtopics and new pieces. Fix orphan pages by linking them from category or tag lists with context.
Structured Data Fit And Quality
Match schema to visible content. Keep fields truthful and complete. If you mark reviews, keep names, ratings, and URLs clear. Use Organization markup on the home page for logo and contacts. For products, include price, availability, and SKU where possible and matched to the page.
International Setups That Work
Serve one language per URL. Use hreflang across all variants including self. Keep country codes and language tags valid. Map users to the right variant with links, not forced redirects, so bots can reach each version. Include regional sitemaps in the index file.
Security, Redirects, And Clean States
Force HTTPS and one host. Keep 301s short and direct. Retire chains from old site moves. Watch for parameters that explode into near endless URLs; cap them and keep crawl paths tidy. Use a clear 410 for gone content when you want fast removal.
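Chains and loops are easy to find once you have a redirect map from a crawl export. The sketch below assumes a hypothetical dict of source URL to redirect target and labels each starting URL as `ok` (zero or one hop), `chain`, or `loop`.

```python
def trace_redirects(start: str, redirects: dict[str, str], max_hops: int = 10):
    """Follow redirects from `start`; return (chain, verdict)."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            return chain, "loop"
        seen.add(current)
    # One URL (no redirect) or two URLs (single hop) is fine; more is a chain.
    return chain, ("ok" if len(chain) <= 2 else "chain")


if __name__ == "__main__":
    redirect_map = {"/old": "/older", "/older": "/new", "/a": "/b", "/b": "/a"}
    for url in ["/old", "/a", "/new"]:
        print(url, trace_redirects(url, redirect_map))
```

For every `chain` result, point the first URL straight at the last entry in the chain.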
Prioritize, Ship, And Track Results
Not every issue needs a sprint. Rank fixes by user impact and effort. Start with crawl blocks, index bugs on core templates, and speed wins on pages with traffic. Then move to link depth, sitemaps, and markup. Close the loop by tracking changes in coverage, field data, and conversions.
Priority Fix Matrix
Use this matrix to sort the backlog and plan sprints with your dev team.
| Issue | User Impact | Priority |
|---|---|---|
| robots rules blocking content | Pages vanish from results | High |
| slow LCP on main template | Lower engagement and revenue | High |
| redirect chains on core paths | Wasted crawl and slow loads | Medium |
| conflicting or duplicate canonical tags | Wrong version in results | Medium |
| stale or noisy sitemaps | Wasted crawl on low value URLs | Medium |
| orphan pages | Content never gets seen | Low |
Sample Timeline For Your First Pass
Day 1: set up tools, fetch robots rules, run a quick crawl, and queue PageSpeed tests. Day 2: review coverage buckets, sitemaps, and top templates. Day 3: map redirects, fix easy speed wins, and write tickets for canonicals and links. Day 4: validate markup and push sitemap updates. Day 5: review logs, set alerts, and publish a short change log.