One of the fastest ways to train your AI assistant is by pointing it to your website. VozAgent scrapes the page content, processes it, and makes it searchable so your assistant can reference it during phone calls.

How Website Scraping Works

When you add a URL, VozAgent:
  1. Fetches the page at the URL you provide
  2. Extracts the text content from the page (stripping out navigation, ads, scripts, etc.)
  3. Indexes the content so your assistant can search it during calls
  4. Stores a snapshot of the content at the time of scraping
The item appears in your knowledge list with a Website badge and a globe icon. While the page is being scraped, you’ll see a Scanning status badge. Once processing is complete, the status clears and the content is ready for your assistant to use.
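
To picture the extraction step (step 2 above), here is a minimal sketch using only Python's standard library. It is not VozAgent's actual implementation, which this article does not describe. It only illustrates how visible text could be pulled out of a page while skipping scripts, styles, and navigation. The names `TextExtractor`, `extract_text`, and `SKIPPED_TAGS`, and the choice of which tags to skip, are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical set of tags whose contents are treated as non-content.
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside", "noscript"}


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        # One line per text fragment, in document order.
        return "\n".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML string (illustrative only)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><script>var x = 1;</script></head>"
    "<body><nav>Home</nav><h1>Services</h1><p>We fix pipes.</p></body></html>"
)
print(extract_text(sample))
```

In this sketch the navigation link and the script body are dropped, and only the heading and paragraph text remain. This also shows why pages with mostly images or interactive widgets yield little content: there is not much text left to keep.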

Adding a Website URL

There are two ways to add website content, depending on where you start.

From the Knowledge page

  1. Go to Knowledge in your dashboard sidebar
  2. Click the Add Knowledge button
  3. Select Website from the resource type options
  4. Enter the website URL in the Website URL field (e.g., https://example.com/services)
  5. Click Add Knowledge
The title is automatically generated from the domain name. For example, if you enter https://acmeplumbing.com/services, the title will be set to acmeplumbing.com.
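
As a rough illustration of that title rule, the domain can be read from a URL with Python's standard `urllib.parse` module. The function name `title_from_url` is hypothetical, and whether VozAgent normalizes the domain further (for example, removing a `www.` prefix) is not stated here, so the sketch leaves the hostname as-is.

```python
from urllib.parse import urlparse


def title_from_url(url):
    """Return the host part of a URL, for use as a default title (illustrative)."""
    # urlparse(...).hostname is None when the URL has no host, so fall back to "".
    return urlparse(url).hostname or ""


print(title_from_url("https://acmeplumbing.com/services"))  # acmeplumbing.com
```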

From the Paste URL dialog

If you’re using the legacy document view, you may see the Paste from URL option:
  1. Click Add Document and select Paste from URL
  2. Enter a Name for the resource (e.g., “Services Page”)
  3. Enter the URL of the page you want to scrape (e.g., https://example.com/services)
  4. Click Upload
The helper text under the URL field reads: “Paste here the link of the website to scan.”

What Makes a Good URL to Add?

Not all pages are equally useful. Here are the best pages to add:
  • Services pages — what you offer, service descriptions, specialties
  • FAQ pages — common questions and answers your callers typically ask
  • About pages — business background, team info, company story
  • Pricing pages — rates, packages, estimate information
  • Location/contact pages — areas served, hours, addresses
  • Policy pages — cancellation policies, warranties, guarantees
Avoid adding pages that are mostly images, videos, or interactive content with little text. The scraper extracts text content, so pages with more written information produce better results.

One Page per URL

Each URL you add scrapes a single page. It does not automatically follow links or crawl your entire website. If you want your assistant to know about your services page, FAQ page, and pricing page, you need to add each URL separately.
For example, to cover your main website content, you might add:
  • https://yourbusiness.com/services
  • https://yourbusiness.com/faq
  • https://yourbusiness.com/about
  • https://yourbusiness.com/pricing

Auto-Created Website Knowledge

When you first set up VozAgent and provide your business website, the platform automatically scrapes your site and creates a knowledge item. This item is marked with a System badge and cannot be deleted. System website items have a sync button (refresh icon) that lets you re-scrape the website to pull in the latest content. This is useful if you’ve recently updated your website.
To re-scrape a system website item:
  1. Find the item in your Knowledge list (look for the System badge)
  2. Click the refresh icon button
  3. Wait for the scanning to complete
The button tooltip reads “Re-scrape website” for website-based system items.

Viewing Scraped Content

To see what content was extracted from a URL:
  1. Find the website item in your Knowledge list
  2. Click the eye icon (View content) button
A dialog opens showing the full extracted content. The dialog title reads “Website Training Resource” and shows the date the content was added. The scraped content is rendered as formatted text, so headings, paragraphs, and lists from your website are preserved.

Processing Status

After adding a URL, the item goes through a brief processing period:
Status | What it means
Scanning | The page is being fetched and processed. This usually takes a few seconds to a minute.
(No status badge) | Processing is complete. The content is ready and searchable by your assistant.
Error | Something went wrong. The page may be inaccessible, blocked, or contain no extractable text.
If an item shows an error, you’ll see a brief error message explaining what went wrong. You can try adding the URL again or use a different page.

Important Notes

  • URLs must start with http:// or https:// — the system validates the URL format
  • The content is a snapshot — if your website changes, you’ll need to re-scrape system items or delete and re-add user-created items to pull in the updated content
  • Password-protected pages won’t work — the scraper needs public access to the page
  • Website items cannot be edited after creation — you can view or delete them, but you cannot modify the scraped content directly. To update, delete the item and add the URL again.
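
If you want to check a list of URLs before adding them, the format rule in the first note can be sketched as follows. This is an assumption about what "validates the URL format" involves, not VozAgent's actual check: it only confirms that the scheme is `http` or `https` and that a host is present. The function name `is_valid_website_url` is hypothetical.

```python
from urllib.parse import urlparse


def is_valid_website_url(url):
    """Return True if url uses http or https and includes a host (illustrative)."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


candidates = [
    "https://yourbusiness.com/services",
    "yourbusiness.com/faq",          # missing http:// or https://
    "ftp://yourbusiness.com/about",  # unsupported scheme
]
for url in candidates:
    print(url, is_valid_website_url(url))
```

A check like this says nothing about whether the page is publicly reachable. A well-formed URL can still end in an Error status if the page is password-protected or blocked.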