Google Search Engine

This is a demo of the Google Search Engine. Note that it is research in progress, so expect some downtime and malfunctions. You can find the older Backrub web page here.

Google is being developed by Larry Page and Sergey Brin with very talented implementation help by Scott Hassan and Alan Steremberg.



Search Stanford

Search The Web

Current Status of Google:

Web Page Statistics
  Number of Web Pages Fetched                     24 million
  Number of Urls Seen                             76.5 million
  Number of Email Addresses                       1.7 million
  Number of 404's                                 1.6 million

Storage Statistics
  Total Size of Fetched Pages                     147.8 GB
  Compressed Repository                           53.5 GB
  Short Inverted Index                            4.1 GB
  Full Inverted Index                             37.2 GB
  Lexicon                                         293 MB
  Temporary Anchor Data (not in total)            6.6 GB
  Document Index Incl. Variable Width Data        9.7 GB
  Links Database                                  3.9 GB
  Total Without Repository                        55.2 GB
  Total With Repository                           108.7 GB
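
As a quick sanity check on the storage figures above, the following Python sketch re-adds the components. It assumes that "Total Without Repository" is the sum of the two inverted indexes, the lexicon, the document index, and the links database (with the temporary anchor data left out, as the table notes), and that 1 GB = 1000 MB when converting the lexicon size. Both are assumptions rather than something this page states, and all names in the code are illustrative.

```python
# Storage figures in GB, copied from the status table.
# The grouping below is an assumption about how the totals were computed.
FETCHED_PAGES_GB = 147.8
REPOSITORY_GB = 53.5

INDEX_COMPONENTS_GB = {
    "short_inverted_index": 4.1,
    "full_inverted_index": 37.2,
    "lexicon": 293 / 1000,  # 293 MB; assumes 1 GB = 1000 MB
    "document_index": 9.7,
    "links_database": 3.9,
}
# Temporary anchor data (6.6 GB) is listed as "not in total", so it is omitted.


def storage_totals():
    """Return (total without repository, total with repository) in GB."""
    without_repo = sum(INDEX_COMPONENTS_GB.values())
    return without_repo, without_repo + REPOSITORY_GB


def compression_ratio():
    """Ratio of raw fetched-page size to compressed repository size."""
    return FETCHED_PAGES_GB / REPOSITORY_GB


if __name__ == "__main__":
    without_repo, with_repo = storage_totals()
    print(f"Total without repository: {without_repo:.1f} GB")
    print(f"Total with repository:    {with_repo:.1f} GB")
    print(f"Compression ratio:        {compression_ratio():.2f}:1")
```

Under these assumptions the sums round to the 55.2 GB and 108.7 GB listed, and the repository works out to roughly 2.76:1 compression of the fetched pages.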

Known Problems:

  1. We have only crawled US-looking domains so as not to congest international links. This makes the search engine somewhat incomplete.
  2. There has been some corruption in docids for anchor hits. This results in some random-looking matches (about 1 in 10). SB: I have tried to patch the code to account for this, but there are still many problems.
  3. Also, some docinfo pointers are corrupted. SB: I have patched the code to account for most of these, but I don't have tight bounds on the extent of corruption.
  4. Performance is somewhat poor right now. This is partly due to data going over NFS and partly due to antiquated hardware. However, we are anticipating equipment donations from IBM and Intel to help with performance and increase our disk capacity so we can scale to 100 million pages.
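
For a rough sense of what scaling to 100 million pages might mean for disk capacity, here is a back-of-the-envelope sketch in Python. It assumes total storage grows linearly with the number of pages fetched, which is a simplification and not a claim made on this page; the function name is hypothetical.

```python
# Back-of-the-envelope estimate; linear scaling is an assumption.
CURRENT_PAGES = 24_000_000          # pages fetched, from the status table
TARGET_PAGES = 100_000_000          # scaling goal mentioned in item 4
TOTAL_WITH_REPOSITORY_GB = 108.7    # from the status table


def estimate_storage_gb(target_pages, current_pages=CURRENT_PAGES,
                        current_gb=TOTAL_WITH_REPOSITORY_GB):
    """Linearly extrapolate total storage (in GB) to a larger crawl."""
    if current_pages <= 0:
        raise ValueError("current_pages must be positive")
    return current_gb * target_pages / current_pages


if __name__ == "__main__":
    estimate = estimate_storage_gb(TARGET_PAGES)
    print(f"Estimated storage at 100 million pages: {estimate:.0f} GB")
```

Under that linear assumption the estimate comes to roughly 453 GB, a bit more than four times the current total with repository.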

Before emailing, please read the FAQ. Thanks.

Please send any comments to backrub@google.stanford.edu.

Copyright ©1997 Larry Page, Sergey Brin, Scott Hassan, Alan Steremberg


Backrub
Last modified: Thu Dec 4 10:09:44 PST