Ask About Oil

FAQ

How do you select the sites that you index?

We find candidate sites in web directories and in query results from our own search engine and others. Users may send suggestions to help@askaboutoil.com.

Sometimes I see search results that are not relevant to the petroleum industry. Why?

Some web sites mix petroleum information with unrelated content. If such a site is not well organized, our default spider rules may collect some irrelevant pages. We are working to identify these sites and adjust our spider rules to account for them.

Sometimes a web site relevant to petroleum is abandoned, and spammers take over its web address. The replacement pages may have very inappropriate content. If you see results like that, notify us at help@askaboutoil.com and we will remove them from our index.

I can't find a document that I know is on one of the sites you index. Why?

Our goal is to do the most valuable indexing we can within our bandwidth constraints. As of July 2005 we index 6 levels deep into each site at least once every 30 days, which should cover most well-designed web sites. We try to index all the HTML content we find. Some reasons why HTML content may be missing from our index include:

  • A poor site design may have placed the page too far from the starting page for our spider to reach it. Studies have shown that most users will go no further than 3 clicks into a site to find information. Our spider goes as far as 6 clicks into a site.
  • A poor site design may have hidden the link that needs to be followed inside script code, e.g., to implement a custom menu. Our spider cannot follow these links.
  • The site may have been down or unreachable when our spider tried to visit the page. If so, the page should be indexed on the next visit.
  • The URL for the page may have included punctuation, such as a question mark (?), that signals to the spider that the page is generated in response to a query. Following query links may cause problems, so our spider does not do this by default.
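
The link rules in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our spider's actual code: the names AnchorCollector, should_follow, and crawlable_links are hypothetical, and the only value taken from this FAQ is the 6-click depth limit. The sketch reads links from plain <a href> tags only, so a link buried in script code is never seen, and it skips any URL containing a question mark.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

MAX_DEPTH = 6  # from the FAQ: the spider goes as far as 6 clicks into a site


class AnchorCollector(HTMLParser):
    """Collects href values from plain <a> tags; script code is not inspected."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def should_follow(url, depth, max_depth=MAX_DEPTH):
    """Return True if a link sitting `depth` clicks from the start page is crawled."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False  # e.g. javascript: pseudo-links cannot be followed
    if "?" in url:
        return False  # query links are skipped by default
    return depth <= max_depth


def crawlable_links(base_url, html, page_depth):
    """List the links on a page (itself `page_depth` clicks deep) worth following."""
    collector = AnchorCollector()
    collector.feed(html)
    absolute = [urljoin(base_url, href) for href in collector.hrefs]
    # Links on a page are one click deeper than the page itself.
    return [u for u in absolute if should_follow(u, page_depth + 1)]
```

In this sketch the starting page is at depth 0, so a page 6 clicks in is still fetched but its outgoing links are not followed. A menu entry written as an onclick handler never appears in the result, because the parser only records href attributes on <a> tags.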

Other document types, such as PDFs and office file formats, tend to be large for the amount of information they contain. These documents are usually referenced from an HTML page with a summary or a descriptive link, and that page is indexed by default. In some cases the documents have been locked to disallow text extraction, or are too large to download in a reasonable time. As a result, we index these document types only by special arrangement, and even then we limit the size of the documents we retrieve and respect any content locking we find on them.
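
As a rough sketch, the policy in the paragraph above amounts to a short decision function. Everything specific here is an assumption made for illustration: the 5 MB cap, the extension list, and the names Document and should_index are hypothetical and do not describe our actual configuration. The sketch also simplifies by treating any URL without a listed extension as HTML.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import urlsplit

# Hypothetical values: this FAQ does not state a size limit or a type list.
MAX_DOCUMENT_BYTES = 5 * 1024 * 1024
NON_HTML_EXTENSIONS = {".pdf", ".doc", ".xls", ".ppt"}


@dataclass
class Document:
    url: str
    size_bytes: int
    extraction_locked: bool = False  # True if the file disallows text extraction


def should_index(doc, site_has_arrangement):
    """Decide whether a document is indexed under the policy described above."""
    extension = posixpath.splitext(urlsplit(doc.url).path)[1].lower()
    if extension not in NON_HTML_EXTENSIONS:
        return True  # simplification: everything else is treated as HTML
    if not site_has_arrangement:
        return False  # PDFs and office files only by special arrangement
    if doc.extraction_locked:
        return False  # content locking is respected
    return doc.size_bytes <= MAX_DOCUMENT_BYTES
```

Under these assumptions, a small unlocked PDF on a site with an arrangement is indexed, while the same PDF on any other site is represented only by the HTML page that links to it.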

If you are the webmaster of a site with special requirements for indexing, we can discuss these with you and either help you reconfigure your site or adjust our default spider and indexing rules.



Ask About Oil search covers the oil, gas, and petroleum-related web, including government, education, and research sites as well as commercial oil & gas web sites. You get better search results for questions about petroleum because we focus on bringing the best web search technology to indexing the oil and gas web.

This site was created by MesaVida as a service to the petroleum and mineral energy community. It was made possible by the availability of high quality open source software, including: