My domain is not being archived by spiders, crawlers, robots

toria · Feb 5, 2008

I was looking for the history of a .COM that I own. (I bought it recently and put up a blog. I went to http://www.archive.org trying to make sure it doesn't have a negative history for me to clean up.)

Their site tells me it cannot find any info on my site due to a Robots.txt file. It says "We're sorry, access to basicpolitics.com has been blocked by the site owner via robots.txt." Their FAQ page says the robots.txt file forbids their web crawler from archiving any domain with such a file.

It offers me a link to view the Robots.txt file, but when I click on it, I get this: "Failed Connection. We're sorry. Your request failed to connect to our servers."

I have searched all of the files on my server and do not find such a file. I never placed one there and I don't think anyone else has, either.

A previous owner may have had a Robots.txt file that is causing them to keep bypassing my site when their spiders are wandering about.

I sent them an e-mail to [email protected] asking them how to get my site off their list of sites they don't archive. It has been several days and I have had no response.

Has anyone else encountered this?

How do I get my site off the list of what they don't archive?

I am the registrant of the site now, and I wish to get it archived, and otherwise taken notice of so I can drive more traffic to it.

Thank you in advance for your help.

toria

whitebark · Feb 5, 2008

Why do you care if archive.org is indexing your website? They won't send you any real traffic. Be more concerned about getting listed in Google, Yahoo and MSN. If you are not getting indexed there, then you have a problem.

toria · Feb 5, 2008

How can I tell if google, yahoo and msn are indexing my site?

thanks,

toria

dcristo · Feb 5, 2008

toria said:
I was looking for the history of a .COM that I own. (I bought it recently and put up a blog. I went to http://www.archive.org trying to make sure it doesn't have a negative history for me to clean up.)

Their site tells me it cannot find any info on my site due to a Robots.txt file. It says "We're sorry, access to http://www.basicpolitics.com/ has been blocked by the site owner via robots.txt." Their FAQ page says the robots.txt file forbids their web crawler from archiving any domain with such a file.

It offers me a link to view the Robots.txt file, but when I click on it, I get this: "Failed Connection. We're sorry. Your request failed to connect to our servers."

I have searched all of the files on my server and do not find such a file. I never placed one there and I don't think anyone else has, either.

A previous owner may have had a Robots.txt file that is causing them to keep bypassing my site when their spiders are wandering about.

I sent them an e-mail to [email protected] asking them how to get my site off their list of sites they don't archive. It has been several days and I have had no response.

Has anyone else encountered this?

How do I get my site off the list of what they don't archive?

I am the registrant of the site now, and I wish to get it archived, and otherwise taken notice of so I can drive more traffic to it.

Thank you in advance for your help.

toria

Hi toria - Getting listed in archive.org does not drive more traffic to your site. Disregard the previous history of the site and concentrate on future traffic growth and profits.

toria said:
How can I tell if google, yahoo and msn are indexing my site?

thanks,

toria

In google.com you use the site command, e.g.

Code:

site:domain.com

whitebark · Feb 5, 2008

toria said:
How can I tell if google, yahoo and msn are indexing my site?

thanks,

toria

Go to each and type in your domain name with its extension, e.g. mydomain.com

You can also sign up at each for the webmaster tools. That way you can verify ownership, upload sitemaps, and check your indexing and backlinks etc.

https://www.google.com/webmasters/sitemaps/
http://webmaster.live.com/
https://siteexplorer.search.yahoo.com

toria · Feb 5, 2008

dcristo said:
Hi toria - Getting listed in archive.org does not drive more traffic to your site. Disregard the previous history of the site and concentrate on future traffic growth and profits.

In google.com you use the site command, e.g.

Code:

site:domain.com

Thanks dcristo. I really appreciate those of you who are willing to help us newbies!

Best,

toria

whitebark said:
Go to each and type in your domain name with its extension, e.g. mydomain.com

You can also sign up at each for the webmaster tools. That way you can verify ownership, upload sitemaps, and check your indexing and backlinks etc.

https://www.google.com/webmasters/sitemaps/
http://webmaster.live.com/
https://siteexplorer.search.yahoo.com

Thanks Whitebark, you guys are so incredibly helpful!

Best,

toria

dcristo · Feb 5, 2008

toria said:
Thanks dcristo. I really appreciate those of you who are willing to help us newbies!

Best,

toria

No worries. The reason I only gave the Google site command is that you will probably find it accounts for more than 90% of your search engine traffic.

Yahoo Site Explorer is also helpful:

http://siteexplorer.search.yahoo.com/

dummyhalf · Feb 5, 2008

Just on dcristo's point, I've noticed I get different results for:

site:www.blah.com
site:blah.com
site:www.blah.com/folder1
site:blah.com/folder2

...to be sure, to be sure....

HTH
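
The variants above can be generated rather than typed by hand. This is only a rough sketch: the helper name `site_query_urls` is made up for illustration, and it assumes the usual `https://www.google.com/search?q=...` URL form. It builds one search URL per host and folder combination so the result counts can be compared side by side.

```python
from urllib.parse import urlencode


def site_query_urls(domain, folders=()):
    """Build Google search URLs for site: queries on the www and bare hosts."""
    hosts = [f"www.{domain}", domain]
    targets = list(hosts)
    # Add one query per host for each folder of interest.
    for folder in folders:
        targets += [f"{host}/{folder.strip('/')}" for host in hosts]
    return [
        "https://www.google.com/search?" + urlencode({"q": f"site:{target}"})
        for target in targets
    ]


for url in site_query_urls("blah.com", folders=["folder1"]):
    print(url)
```

Opening each printed URL in a browser shows whether the www and non-www versions are indexed separately.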

Deleted member 5660 · Feb 6, 2008

Create a robots.txt file and save it in the main directory of your site. The file should be just 2 lines:

User-agent: *
Disallow:
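
To see what those two lines actually permit, they can be run through Python's standard `urllib.robotparser`. This is a sketch only: the `allows` helper is a made-up name, and `ia_archiver` is assumed here to be the user-agent the archive.org crawler answers to. An empty `Disallow:` blocks nothing, while `Disallow: /` blocks everything.

```python
from urllib.robotparser import RobotFileParser

# The two-line file suggested above: every crawler, nothing disallowed.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# For contrast, a file that blocks every crawler from the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def allows(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ia_archiver" is an assumed crawler name, used for illustration.
print(allows(ALLOW_ALL, "ia_archiver", "http://www.basicpolitics.com/"))
print(allows(BLOCK_ALL, "ia_archiver", "http://www.basicpolitics.com/"))
```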

dcristo · Feb 6, 2008

dummyhalf said:
Just on dcristo's point, I've noticed I get different results for:

site:www.blah.com
site:blah.com
site:www.blah.com/folder1
site:blah.com/folder2

...to be sure, to be sure....

HTH

You would get varied results if you have both www and non-www pages indexed. This can be resolved by doing a 301 redirect from one version to the other.
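
How the redirect is configured depends on the web server, so the following only sketches the decision logic. The function name `canonical_redirect` and the host `www.example.com` are placeholders, and it assumes the www version is the one being kept. Any request arriving on another host gets a permanent redirect to the same path on the canonical host.

```python
def canonical_redirect(host, path, canonical_host="www.example.com"):
    """Return (status, location) for a redirect, or None if none is needed."""
    if host.lower() == canonical_host:
        # Already on the preferred host: serve the page normally.
        return None
    # 301 tells search engines the move is permanent, so they merge the
    # two versions into one instead of indexing both.
    return ("301 Moved Permanently", f"http://{canonical_host}{path}")


print(canonical_redirect("example.com", "/about"))
print(canonical_redirect("www.example.com", "/about"))
```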

VirtualT · Feb 6, 2008

Upload a sitemap to webmaster tools, and get a PR3 or higher link or two, and your site will be indexed in a couple of days.

PM me if you need a link; I can give you one till you're indexed.

toria · Feb 6, 2008

VirtualT said:
Upload a sitemap to webmaster tools, and get a PR3 or higher link or two, and your site will be indexed in a couple of days.

PM me if you need a link; I can give you one till you're indexed.

WOW! I really feel like a newbie now!

I know what a sitemap is and I downloaded a sitemap tool for Wordpress (what I'm using to create my blog). I haven't tried to use the sitemap tool yet, though.

But I don't know what it means to:

-- upload a sitemap to webmaster tools,
-- get a PR3 or higher link or 2

Sorry, those terms are not yet in my brain/database for reference. Can you explain a bit more?

Thanks for your kindness and patience with me.

Best,

Toria

VirtualT · Feb 6, 2008

toria said:
WOW! I really feel like a newbie now!

I know what a sitemap is and I downloaded a sitemap tool for Wordpress (what I'm using to create my blog). I haven't tried to use the sitemap tool yet, though.

But I don't know what it means to:

-- upload a sitemap to webmaster tools,
-- get a PR3 or higher link or 2

Sorry, those terms are not yet in my brain/database for reference. Can you explain a bit more?

Thanks for your kindness and patience with me.

Best,

Toria

Sure. Sign up for a Google webmaster tools account, add your domain and verify it.
Then create a sitemap of your site using one of the free tools on the net. It's basically a list of the URLs on your site, which makes it easier for Google to spider.
Upload the sitemap to your webmaster tools account.

If you have links to your site from other sites that Google deems important (high PR, or PageRank), you will get indexed a lot quicker.
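
For anyone curious what the sitemap tools produce, here is a minimal sketch of the "list of URLs" idea using Python's standard library and the sitemaps.org XML format. The function name `build_sitemap` and the `/about/` page are made up for illustration; a WordPress sitemap plugin would normally generate this file for you.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML listing each URL in a <url><loc> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# The second URL is a hypothetical page, used only as an example.
sitemap_xml = build_sitemap([
    "http://www.basicpolitics.com/",
    "http://www.basicpolitics.com/about/",
])
print(sitemap_xml)
```

The output would be saved as sitemap.xml in the site's main directory and then submitted through the webmaster tools account.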

toria · Feb 6, 2008

VirtualT said:
Sure. Sign up for a Google webmaster tools account, add your domain and verify it.
Then create a sitemap of your site using one of the free tools on the net. It's basically a list of the URLs on your site, which makes it easier for Google to spider.
Upload the sitemap to your webmaster tools account.

If you have links to your site from other sites that Google deems important (high PR, or PageRank), you will get indexed a lot quicker.

Thanks, Kris. You're so helpful!

Best,

Toria
