If you want to create a price comparison site or dropshipping store, WordPress scraper plugins can be very useful. Web scraping consists of gathering information from the web. That information is then organized or imported.
Some people consider scraping an unethical or questionable activity. In practice, web scraping can help you stay on top of changes. For example, price comparison sites can use scraped data to provide visitors with the most accurate information available.
There are plenty of WordPress scraping plugins available. In this post, I will mention some of the best WordPress content crawler plugins and their features so that you can choose the right tool for your needs.
Best WordPress Scraper Plugins
Here are some of the best WordPress content scraper plugins you can use. Though they are paid options, all of them are packed with useful features.
Octolooks Scrapes is the most advanced content crawler and WordPress scraper plugin by far. It uses a visual selector to scrape content from any site automatically. To set it up, you pick elements on the target page with the visual selector and match them to the corresponding WordPress fields. You don’t need any programming knowledge or expertise.
The plugin’s easy-to-use interface was created to provide the best possible user experience. Configuration takes only a few basic steps. After that, you can leave it running in the background, and information will be pulled from the source websites.
You can create new tasks for crawling or use the default settings. You can also use this plugin as a WordPress RSS aggregator plugin.
Scrapes automatically fills out all supported fields. The Octolooks WordPress scraper plugin will automatically match the next page, featured image, content, and other important information with the source websites’ corresponding fields.
You can use the template option to personalize post layouts and choose in what order the information you scrape will appear on your website.
The regular expression find & replace feature can remove certain words or phrases from the scraped text. You can also use your own words to replace them. There are no limits to the number of rules that you can run.
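To give a rough idea of what a find & replace rule set does, here is a minimal Python sketch. It is not the plugin's code: the `RULES` list and the `apply_rules` helper are hypothetical names I made up for illustration, and the patterns are only examples.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs applied in order.
RULES = [
    (r"\bSource:\s*\S+", ""),   # remove a "Source: example.com" credit
    (r"\bcolour\b", "color"),   # replace a word with your own
    (r"\s{2,}", " "),           # collapse whitespace left behind by removals
]

def apply_rules(text, rules=RULES):
    """Apply each find-and-replace rule in order and return the cleaned text."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text.strip()

print(apply_rules("Great colour options.  Source: example.com"))
```

Because the rules run in order, a cleanup rule such as the whitespace one works best at the end of the list.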
Subtraction, addition, division, multiplication, and other mathematical operations can be run. This WordPress content crawler plugin can create new formulas and combine numbers in different custom fields.
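As an illustration of combining numbers from different custom fields, the sketch below calculates a discount percentage from two scraped prices. The field names `regular_price` and `sale_price` and the `discount_percent` function are assumptions for this example, not names taken from the plugin.

```python
def discount_percent(fields):
    """Combine two numeric custom fields into a discount percentage."""
    regular = float(fields["regular_price"])
    sale = float(fields["sale_price"])
    # Subtraction, division, and multiplication in one formula.
    return round((regular - sale) / regular * 100, 1)

# Hypothetical custom fields scraped from a product page.
scraped_fields = {"regular_price": "80.00", "sale_price": "60.00"}
print(discount_percent(scraped_fields))
```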
Yandex Translate, DeepL Translate, Bing Microsoft Translate, or Google Translate can automatically translate scraped content. Alternatively, you can translate your WordPress site automatically using plugins like Weglot (check Weglot review) and WPML (see WPML review).
You can use one of the WordPress auto spinner plugins to change scraped content, or let a third-party spinner service such as WordAi (see WordAi review) or Spin Rewriter (check Spin Rewriter review) do the work for you.
Information scraped from source websites can be filtered to ensure that it meets the set rules. Monitor the content to ensure that it successfully passes from the filters to your site.
Support for custom fields and custom post types means that scraped content can be saved as products in your WooCommerce store.
External Importer Pro
The External Importer Pro plugin allows you to extract product data from eCommerce websites and import it into your WooCommerce site. No API access, CSV feeds, or XML is needed.
The plugin extracts complete product data directly from store sites. All you need to do is enter the specific listing or product URL. There are no bulky CSV files or API access to deal with. Product availability and prices are automatically updated. You can manage every aspect of the imported information.
Your existing affiliate IDs will automatically be used (if you added them in the plugin settings) when creating affiliate links. You can even set dropshipping product margins if you want to import products for dropshipping purposes.
- Automatic synchronization – Product availability and pricing information is automatically updated. Any products that are currently out of stock can be removed automatically. Updates are scheduled in the background so that they won’t interfere with any other operations.
- Automatic import – Once new products appear on the target site’s listing page, they will also automatically be imported to your website. You’ll always have the most updated products in your store.
- Unlimited products – You can import as many products as you want, from as many online stores as you need.
- Avoid getting blocked – The plugin supports cookie sessions, daily query quotas, random query intervals, real browser headers, robots.txt rules, user-agent rotation, request throttling, and more, so that you don’t get blocked.
- Use affiliate networks – Use deep links or dynamically change them to generate affiliate links.
- Dropshipping features – You can create a dropshipping store, and items can be added as “simple” WooCommerce products. Flexible rules can be set for price markups.
- Local and global attributes – You get to determine the product specifications assigned as global attributes (or taxonomies). You can then implement various WooCommerce catalog filters and widgets.
- External images by URL – You can display external images without saving them to your local media library. External source sites can be scraped to pull the featured images and galleries you want to show on your site. This greatly reduces the amount of storage used on your server.
- Dynamic categories – Products with extracted category paths will be automatically imported to the corresponding category.
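To make the "avoid getting blocked" point more concrete, here is a small sketch of two of the techniques listed above: checking robots.txt rules and waiting a random interval between requests. It uses Python's standard `urllib.robotparser` and is not the plugin's implementation. The robots.txt content, the `my-importer` user agent, and the `next_delay` helper are all made up for this example.

```python
import random
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a target store.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Crawl-delay: 2
"""

def build_parser(robots_text):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

def next_delay(parser, user_agent, base=1.0, jitter=0.5):
    """Seconds to wait before the next request: the larger of the base delay
    and the site's Crawl-delay, plus a random amount up to `jitter`."""
    crawl_delay = parser.crawl_delay(user_agent) or 0
    return max(base, crawl_delay) + random.uniform(0, jitter)

parser = build_parser(ROBOTS_TXT)
print(parser.can_fetch("my-importer", "https://shop.example/product/42"))
print(parser.can_fetch("my-importer", "https://shop.example/cart/"))
print(round(next_delay(parser, "my-importer"), 2))
```

A real importer would call `time.sleep()` with the returned delay before each request and skip any URL for which `can_fetch` returns `False`.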
For more info about this content crawler plugin for WordPress, you can check my External Importer Pro review.
WP Content Crawler
The WP Content Crawler plugin can automatically extract information from almost any site. It uses CSS selectors to find content. Its Visual Inspector tool simplifies finding CSS selectors: you click on the relevant elements on the target site.
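If you have never worked with CSS selectors, the sketch below shows roughly what a selector such as `span.price` picks out of a page. It is a deliberately simplified matcher that only understands the `tag.class` form and uses Python's standard `html.parser`. It is not how the plugin works internally, and the `SelectorText` class, the `select_text` function, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser

class SelectorText(HTMLParser):
    """Collect the text of every element matching a simple 'tag.class' selector."""

    def __init__(self, selector):
        super().__init__()
        self.tag, _, self.cls = selector.partition(".")
        self.depth = 0      # > 0 while the parser is inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == self.tag:
                self.depth += 1   # nested element with the same tag name
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.cls in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data

def select_text(html, selector):
    """Return the stripped text of all elements matching the selector."""
    parser = SelectorText(selector)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]

# Hypothetical markup of a target page.
HTML = (
    '<article><h1 class="entry-title">Blue Widget</h1>'
    '<span class="price sale">$19.90</span>'
    '<span class="sku">BW-1</span></article>'
)
print(select_text(HTML, "h1.entry-title"))
print(select_text(HTML, "span.price"))
```

Full CSS selectors support much more (IDs, descendants, attribute matches), which is why a point-and-click tool that generates them for you is convenient.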
- Visual Inspector – Clicking on an element will identify the CSS selector for that element. You can also find alternate CSS selectors that could be used. You don’t have to leave your admin panel to accomplish these tasks.
- Crawl posts (scrape, grab and save) – Once the post URLs have been defined, this WordPress content crawler will automatically crawl them in the background. This will occur after settings are configured.
- Recrawl (update) posts – Posts can be recrawled automatically to ensure that you have the most up to date content. You can opt to ignore older posts, select your update interval, and limit the number of times a particular post can be updated.
- Content templates – Shortcodes can be used to create a gallery, list item, title, post content, and excerpt templates. You can use the options box to create templates for all CSS selector values.
- Paginated posts – Paginated posts can also be saved. You don’t have to limit your searches to single page posts anymore.
- Custom general settings for each website – Custom general settings can be set for each site you crawl.
- Save all images – You can save all images in the post’s content.
- Save images as a gallery – Images found on a target page can be saved as a gallery.
- Proxy options – If your IP doesn’t have access to a particular site, you can use one or more proxies to pull information from target sites.
- Automatic translation – Amazon Translate API, Google Cloud Translation API, Microsoft Translator Text API, or Yandex Translate API can be used to translate posts automatically.
- Automatic spinning – Spinning can rewrite crawled content automatically. This can help to increase your search engine rankings. The plugin offers integration with paid services like Turkce Spin API and Spin Rewriter API.
- Save WooCommerce products – Attributes, advanced options, inventory, shipping, and product prices can be saved. Items can be saved as either external or simple products. You can also define items as virtual or create a downloadable file option.
- Regular expressions – Regular expressions can be specified in your “find-replace” options. This makes it easier to find and replace anything. Modifiers and delimiters can also be implemented to refine searches further.
- Save “alt” and “title” attributes – All “title” and “alt” attributes are automatically retrieved from the target site when you save images. Those attributes are then assigned to the respective saved images. Templates can be created to align with your search engine optimization strategies.
- Manual crawling tool – You can enter various URLs to save more than one post at a time using the manual crawling utility. Category URLs can also be entered for the tool to obtain the appropriate post URLs. You can set the crawler to crawl different posts simultaneously.
Scraper – Content Crawler Plugin for WordPress
The Scraper Content Crawler plugin for WordPress automatically copies content and posts from any site. It takes content creation to another level with its unique features and functions.
- Any website can be scraped – Using regex and XPath methods means that you can scrape any site you want.
- You can scrape attributes – Scraper can also retrieve element attributes. That means you can get links, image sources, and video sources.
- Featured image – Any image can be extracted and set as the featured image.
- Content spinner – The A.I. Spinner plugin is fully supported. You can use this plugin to create unique content.
- Language translation – The scraper will automatically detect content, which can then be translated into whatever language you prefer.
- Gallery images – Any image can be parsed. You can use those images to create image galleries.
- WooCommerce products – All WooCommerce tags are also supported. This simplifies adding WooCommerce products to your store.
- Mathematical calculations – Math functions can subtract, add, divide or multiply numbers. This may come in handy in price calculations.
- Schedule tasks – You can assign tasks to be conducted at various intervals.
- Strip links – Strip links from original post content.
- Proxy support – You can use proxies for scraping purposes.
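For readers curious about the XPath and regex methods mentioned in the list above, here is a minimal sketch of pulling attributes (links and image sources) and a price out of a page. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and needs well-formed markup, so treat it as an illustration rather than what the plugin does. The sample page and URLs are made up.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page markup.
PAGE = """<html><body>
  <div class="post">
    <a href="https://example.com/first">First post</a>
    <img src="https://example.com/first.jpg" alt="First"/>
    <p>Price: $24.50</p>
  </div>
</body></html>"""

root = ET.fromstring(PAGE)

# XPath: read attributes from matching elements.
links = [a.get("href") for a in root.findall(".//div[@class='post']/a")]
images = [img.get("src") for img in root.findall(".//img")]

# Regex: grab the first dollar amount found in the raw markup.
match = re.search(r"\$(\d+\.\d{2})", PAGE)
price = match.group(1) if match else None

print(links)
print(images)
print(price)
```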
Crawlomatic Multisite Scraper
The Crawlomatic Multisite Scraper plugin is an autoblogging plugin that crawls and scrapes websites and generates posts from them. You don’t need APIs to scrape content.
This plugin will crawl the URL (it will search all links on a page), then visit and extract content from each crawled URL. The crawling process is customizable. You can set the crawling depth, crawling rate, and maximum crawled article count, restrict crawling to links with a specific class or ID, and so on.
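Crawling depth and maximum article count are easier to picture with a small example. The sketch below is a generic breadth-first crawl with both limits, run against an in-memory map of links instead of a real site. The `crawl` function, its parameters, and the `LINKS` map are hypothetical and only approximate how such settings might interact.

```python
from collections import deque

# Hypothetical site: page URL -> links found on that page.
LINKS = {
    "/": ["/blog", "/shop"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/shop": ["/shop/item-1"],
    "/blog/post-1": ["/"],
}

def crawl(start, get_links, max_depth=1, max_pages=10):
    """Visit pages breadth-first, following links at most `max_depth` levels
    from `start` and stopping once `max_pages` pages have been visited."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited

def get_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return LINKS.get(url, [])

print(crawl("/", get_links, max_depth=1))
print(crawl("/", get_links, max_depth=2, max_pages=4))
```

With a depth of 1, only the start page and the pages it links to are visited. Raising the depth reaches the individual posts and items, and the page limit cuts the crawl short.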
- The crawling of sitemaps is fully supported.
- A visual content selector is supported.
- You can paginate site crawling. Article crawling will resume on the next page of the target site.
- You can import prices for all crawled products (for WooCommerce-compatible sites). Dropshipping prices are automatically adjusted accordingly.
- You can raise the prices of imported items by a predefined number. You could also multiply the amount by a set number, which is a useful option for dropshippers.
- Proxies can be used for crawling.
- If you cannot crawl a page directly (if you’re blocked, for example), you can crawl that page from the Google cache instead.
- Google Translate is supported. You can choose the language you want your site’s articles to appear in.
- Text spinners are also fully supported. You can change the text that’s generated automatically. Words can be changed with their synonyms if you prefer. SpinRewriter, The Best Spinner, TurkceSpin, WordAI, and others can be used.
- Site scraping and crawling can be configured to respect the robots’ HTML headers of scraped pages and robots.txt files of scraped sites.
- Tags and post categories of products can be created automatically.
- Website crawling and scraping can be used to embed DailyMotion, Flickr, IGN, Ustream.tv, Vimeo, or YouTube videos.
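The price options in the list above (raising a price by a fixed amount or multiplying it by a set number) come down to simple arithmetic. Here is a rough sketch. The `adjust_price` function and its parameter names are hypothetical, and the order of operations (multiply first, then add) is my assumption, not something documented by the plugin.

```python
from decimal import Decimal, ROUND_HALF_UP

def adjust_price(source_price, multiply="1", add="0"):
    """Multiply the source price, then add a fixed amount, rounded to cents.
    The multiply-then-add order is an assumption made for this sketch."""
    price = Decimal(source_price) * Decimal(multiply) + Decimal(add)
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(adjust_price("10.00", add="2.50"))                 # fixed amount only
print(adjust_price("10.00", multiply="1.3"))             # multiplier only
print(adjust_price("19.99", multiply="1.25", add="2"))   # both
```

Using `Decimal` instead of floating-point numbers avoids small rounding surprises when working with money.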
WP Scraper Pro
WordPress Automatic Plugin
The WordPress Automatic plugin is a convenient tool that can automatically post to WordPress from almost any site. There are plenty of import sources to choose from.
Besides the usual articles, you can also import the following content: Amazon and Walmart products, YouTube, Vimeo, and DailyMotion videos, Flickr and Instagram images, eBay auctions, social media posts (tweets, pins, Reddit and Facebook posts), classifieds from Craigslist, iTunes content (such as songs, podcasts, apps, eBooks), SoundCloud songs, and even Envato items.
You can select the content source and apply filter options by tag, author, and category. This means that not all of the target information will be imported.
You get to choose the images, format, post template, type, and status that the plugin will fetch. There are also advanced translation and rewriting options. You can even automatically replace certain words that you don’t want to be displayed on your site.
You can set post statuses to either published or draft. Certain phrases or words can be excluded. You can also strip all links before publishing a post. Featured images can be automatically set.
Settings can be adjusted so that duplicate titles, non-English posts, and posts without any images are skipped. Custom fields are automatically added to posts, and multisite installations are supported.
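To show what skipping rules like these amount to, here is a minimal sketch that drops posts with duplicate titles and posts without images. Language detection is left out because it needs an external library or service. The `filter_posts` function and the post dictionaries are hypothetical and not part of the plugin.

```python
def filter_posts(posts):
    """Keep only posts that have a new title and at least one image."""
    seen_titles = set()
    kept = []
    for post in posts:
        title = post["title"].strip().lower()
        if title in seen_titles:
            continue  # skip: duplicate title
        if not post.get("images"):
            continue  # skip: no images
        seen_titles.add(title)
        kept.append(post)
    return kept

# Hypothetical fetched posts.
posts = [
    {"title": "Best Hiking Boots", "images": ["boots.jpg"]},
    {"title": "best hiking boots ", "images": ["copy.jpg"]},
    {"title": "Trail Snacks", "images": []},
]
print([post["title"] for post in filter_posts(posts)])
```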
WP Robot is an autoblogging and content curation plugin. It allows you to automatically create WordPress blog posts by scraping content from other sites. It drip-feeds information related to your particular specialty or niche. This ensures that you’ll always have the most current content.
More than 30 content sources are supported, and each content source is automated. They can be used in whatever combination you prefer to find quality content for your website.
WP Robot can pull content from eCommerce sites if you’re looking to post products from Amazon, AliExpress, Etsy, etc. The plugin can also pull images from Flickr and Pixabay, songs from iTunes, videos from YouTube and Vimeo, and more.
Commission Junction and Linkshare are some of the affiliate networks that WP Robot supports, and you can automatically post offers from them. RSS feed content can also be added to your site, which gives you some added freedom if you want more than the existing modules provide. For more info, be sure to check my WP Robot review.
WordPress Scraper Plugins Conclusion
Web scraping (also known as web harvesting, web data extraction, and screen scraping) collects large amounts of information from various sites. This data is then saved to another website or a database. Many web scraping solutions require additional knowledge and can be rather complicated. With the WordPress scraper plugins mentioned above, content scraping is very easy.
If you want to create an affiliate store, price comparison site, deal site, or dropshipping store, you will need to add products to your site. It is better to automate that process than to add products manually.
For that purpose, you will need a good plugin for importing products. While there are many solutions available, most of them require that you have a feed or API which will be used to import products.
But what if you don’t have a feed? How do you import products to your site without access to feeds? In this case, you will need a WordPress web scraper plugin.