We are experts in crawling and scraping the internet to retrieve the data you need.


Crawl Solutions

As an Apache Nutch committer, Openindex is an expert in web crawling. We can assist in setting up a Nutch web crawler service that fits your needs, or we do the crawling for you and deliver the data you need. In either case we can set up a single machine or a Hadoop cluster, depending on the scale of the crawl.

Feed your search engine

A web crawler is often used to deliver data to your search engine. We can help you set up a crawler that feeds your search engine and tackles the difficulties that come with it: content extraction, crawler traps, duplicates, and so on.

Using Apache Solr/Lucene as the search engine, we can offer a crawler-based, custom open-source search solution for your website, intranet or document management system.

Collect data

We can provide you with a crawler that collects data from the internet. For example, we can deliver a list of domains that use a certain CMS, or that contain certain words, content or widgets. Such data sets are useful for purposes like research or sales leads.
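As a hedged illustration of how a crawler might classify a page by CMS, the sketch below checks the `<meta name="generator">` tag, which many CMSes (WordPress, Joomla, Drupal) emit. This is a minimal example using only the Python standard library, not Openindex's actual tooling; real detection combines many more signals (paths, headers, markup fingerprints).

```python
from html.parser import HTMLParser

class GeneratorSniffer(HTMLParser):
    """Collects the content of <meta name="generator"> tags,
    a common (though not universal) CMS fingerprint."""
    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            if d.get("name", "").lower() == "generator" and d.get("content"):
                self.generators.append(d["content"])

def detect_cms(html):
    """Return any generator strings found in the page's HTML."""
    sniffer = GeneratorSniffer()
    sniffer.feed(html)
    return sniffer.generators

# Hypothetical fetched page body:
page = '<html><head><meta name="generator" content="WordPress 6.4"></head></html>'
print(detect_cms(page))  # ['WordPress 6.4']
```

Run against every fetched page in a crawl, this yields exactly the kind of "domains using CMS X" data set described above.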

Scrape websites

We can provide you with a scraper that collects specific data from specific websites. This is a great solution if you would like to obtain, for example, all product descriptions from a certain (set of) webshop(s) on a regular basis.
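A scraper of this kind is tailored to each target site's markup. The sketch below, again standard-library Python, extracts text from elements carrying a hypothetical `product-description` class; the class name and sample HTML are illustrative assumptions, not any particular webshop's structure.

```python
from html.parser import HTMLParser

class ProductDescriptionScraper(HTMLParser):
    """Collects the text content of elements whose class list contains
    'product-description' (a hypothetical, site-specific selector)."""
    def __init__(self):
        super().__init__()
        self.descriptions = []
        self._depth = 0  # > 0 while inside a matching element

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        if self._depth or "product-description" in classes:
            self._depth += 1
            if self._depth == 1:
                self.descriptions.append("")  # start a new description

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.descriptions[-1] += data

# Hypothetical webshop listing page:
html = """
<div class="product"><h2>Mug</h2>
  <div class="product-description">A sturdy ceramic mug, 300 ml.</div>
</div>
<div class="product"><h2>Pen</h2>
  <div class="product-description">A refillable ballpoint pen.</div>
</div>
"""
scraper = ProductDescriptionScraper()
scraper.feed(html)
print(scraper.descriptions)
# ['A sturdy ceramic mug, 300 ml.', 'A refillable ballpoint pen.']
```

Scheduled to run periodically, such a scraper keeps a data feed of product descriptions up to date.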

Data as a service

Openindex is happy to do the crawling or scraping for you. In this case, we provide you with the data you need, in the format you like, either on a regular basis or as a one-off.