((( If you host your website on www.1and1.com, as I do, then this script should work with little more than copying the search software you purchase, by FTP or otherwise, to a new folder on your website. I can also FTP it to your website for you for an additional fee. )))
The Installation phase consists of transferring the script and data files to your server and setting permissions on them.
This phase should be very easy, but it can be difficult if you have never installed a script before, or if your web server has not been set up for CGI.
Manual installation of Search Engine is described in detail in the install.html file. If you need help, there is an auto-install option which works for most users.
The First-Time Configuration phase involves logging into the admin area and creating your first search index.
Begin by visiting your default search page. The exact URL depends on where you installed the script.
This help file will refer to your search page as http://yoursite.tld/search/search.pl.
Your first visit to your search page should show a search form followed by the search tips. If you see an error like "404 Not Found", then you probably have not entered the correct URL. If you see an error like "500 Internal Server Error", then refer to the Installation section above.
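The two error cases above can be summarized in a short Python sketch. It is not part of Search Engine; the function names diagnose_search_page and fetch_status are made up for this illustration, and the advice strings simply restate this help file.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url (hypothetical helper).

    urlopen raises HTTPError for 4xx/5xx responses, so the code is read
    from the exception in that case. Network-level failures (URLError)
    are deliberately left to propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def diagnose_search_page(status_code):
    """Map a status code to the advice given in this help file."""
    if status_code == 200:
        return "OK: the search form and search tips should be displayed."
    if status_code == 404:
        return "404 Not Found: you probably have not entered the correct URL."
    if status_code == 500:
        return "500 Internal Server Error: refer to the Installation section."
    return "Status %d is not covered by this help file." % status_code


print(diagnose_search_page(404))
```

To check a live page, you could call diagnose_search_page(fetch_status("http://yoursite.tld/search/search.pl")) with your own URL substituted.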
Once you have accessed your search page, you need to access the admin page. The search file and admin file are the same -- the only difference is that you add the "Mode=Admin" query string for administration. The "Mode=Admin" string is case sensitive.
If your search page is http://yoursite.tld/search/search.pl, then your admin page will be http://yoursite.tld/search/search.pl?Mode=Admin.
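As an illustration of the rule above, this Python sketch derives the admin URL from a search URL by adding the case-sensitive "Mode=Admin" query string. The function name admin_url is hypothetical. Appending to an existing query string is an assumption made for this sketch, not something this help file documents.

```python
from urllib.parse import urlsplit, urlunsplit


def admin_url(search_url):
    """Return search_url with the "Mode=Admin" query string added."""
    parts = urlsplit(search_url)
    # Assumption: if a query string is already present, append to it.
    query = parts.query + "&Mode=Admin" if parts.query else "Mode=Admin"
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


print(admin_url("http://yoursite.tld/search/search.pl"))
# prints http://yoursite.tld/search/search.pl?Mode=Admin
```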
If you receive the error "500 Internal Server Error" when visiting the admin page, and yet you do not receive any error on the normal search page, then review the help topic "Search page works, but Admin page returns '500 Internal Error'".
If you receive the error "search data folder is not writable - Permission denied. Please manually set permissions on the data folder", then return to the Installation Manual and review the section on file permissions. If you have trouble with file permissions on Windows web servers, you may want to review the more in-depth help file "Remotely setting file permissions on Windows".
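When tracking down the "not writable" error, a quick check like the following Python sketch may help. It is not part of Search Engine. The path search/searchdata is taken from later in this help file, and the function name is made up for this example. The result only reflects what the web server sees if the check runs under the same account as the web server, and os.access is known to be less reliable on Windows.

```python
import os


def data_folder_is_writable(path):
    """Return True if path is an existing folder that the current user
    can write to and enter."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


# Assumed location of the data folder, relative to the current directory.
print(data_folder_is_writable("search/searchdata"))
```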
Hopefully, when you first visit the admin page, you will be shown two password entry boxes. Create an admin password by entering the same password into both boxes and submitting the form.
Next you will be required to log in to the admin page.
If you forget your password, you can reset it. See "Security: How to force a reset of the admin password".
Once you have logged in, you will see the admin page. The admin page consists of a left-hand navigation frame with sections such as "Manage Realms" and "Usage Statistics". In the main frame there will be two forms, "Add New URL" and "Add New Site".
We need to determine whether your web hosting provider allows CGI scripts to make socket connections. Almost every provider allows this except the free web hosting providers. There are also some web sites which use a creative network architecture that prevents their scripts from making connections.
To test, you should enter several URLs into the "Add New URL" form. Always enter a few major sites like http://www.yahoo.com and http://www.whitehouse.gov. Then enter your own site URL http://yoursite.tld. Also enter the website of the company that provides your hosting, like http://www.pair.com. If you receive errors, try a few related URLs until you determine whether all URLs of that type are failing.
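If you have shell access to your server, a rough equivalent of this test can be sketched in Python. This is only an approximation: it is not how Search Engine itself performs the check, the function name can_connect is made up for this example, and the result is meaningful only when the code runs on the web server rather than on your own computer.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

For example, calling can_connect("www.yahoo.com") and can_connect("yoursite.tld") on the server gives one remote and one local data point.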
You should experience one of the following:
No failures. The script was able to access all URLs.
Your provider rocks and Search Engine will work well for you.
Local failures. You cannot access your own web site, but you can access major sites like http://www.yahoo.com.
Most likely, your provider has a complex network configuration.
Remote failures. You can access your own web site, but you experience failures when you access major sites like http://www.yahoo.com.
Most likely, your provider has a complex network configuration or a firewall.
All failures. You receive errors no matter what site you try to access.
Most likely, your provider has disabled socket connections as a matter of policy.
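The four outcomes above amount to a small decision table. The Python sketch below restates it. The function name and the two boolean inputs are invented for this illustration, and the likely causes are quoted from the list above.

```python
LIKELY_CAUSE = {
    "No failures": "The script was able to access all URLs.",
    "Local failures": "Your provider has a complex network configuration.",
    "Remote failures": "Your provider has a complex network configuration "
                       "or a firewall.",
    "All failures": "Your provider has disabled socket connections "
                    "as a matter of policy.",
}


def classify_results(local_ok, remote_ok):
    """Name the outcome, given whether your own site (local_ok) and
    major sites (remote_ok) could be accessed."""
    if local_ok and remote_ok:
        return "No failures"
    if remote_ok:
        return "Local failures"
    if local_ok:
        return "Remote failures"
    return "All failures"


outcome = classify_results(local_ok=False, remote_ok=True)
print(outcome + ": " + LIKELY_CAUSE[outcome])
```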
If you want to search your local web site, but you cannot make network connections to it, then you must use the file system crawler. See "Administration: Creating a 'file system' realm".
If you want to search a remote web site, but you cannot make network connections to it, then there is nothing we can do. Your only options would be to complain to your web hosting provider about their restrictive network architecture and policies, or to get a different provider.
While testing the "Add New URL" form, the script will have auto-created a realm named "My Realm 1". It will be listed at the bottom of the Admin Page under "Open Realms". You can hit the "Delete" link to get rid of that realm once testing is complete. Or you can keep it.
If socket connections worked for you, then the next step is to create your first website realm. Enter the URL to your website in the "Add New Site" form (the second form on the Admin Page).
The process of building the search index is iterative, so you will need to keep your browser open while the crawler passes over your site many times. Eventually it will complete with the message "Success: finished crawling site."
If your provider does not allow sockets, then create a file system realm using the help link listed earlier.
From the main Admin Page, your new search index will be listed now under "Website Realms". The Size property will show how many kilobytes of disk space are used by the index file. The Pages property shows how many documents were found.
Click the "Review" action link to see all of the documents that were found.
Now it is time to customize your index.
There may have been some documents on your site that you know exist, but that were not listed in the search index when you did the Review.
If there are missing documents, are you sure that they are linked either directly or indirectly from the main page? The web crawler can only discover documents that are linked. If your site uses unlinked content, then consider using a file system realm instead of a web crawler realm.
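To see why unlinked documents are missed, it can help to think of the site as a graph of links. The Python sketch below is a generic breadth-first traversal, not the actual Search Engine crawler, and the page names are invented. It shows that a page nobody links to is never reached from the main page.

```python
from collections import deque


def reachable_pages(links, start):
    """Return every page reachable from start by following links.

    links maps each page to the list of pages it links to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: /orphan.html links out, but nothing links to it.
site = {
    "/index.html": ["/about.html", "/products.html"],
    "/products.html": ["/widget.html"],
    "/orphan.html": ["/index.html"],
}
found = reachable_pages(site, "/index.html")
print(sorted(set(site) - found))  # prints ['/orphan.html']
```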
There may be some documents on your site that should not be included in the index. Use one of these techniques to exclude unwanted content. With all of these techniques, you will need to rebuild the search index after you have made the change.
If documents should not be seen by the public, then consider moving them off your site. Download the documents to your local computer and then delete the copies from your web site.
Or, you can use industry-standard robots exclusion techniques.
See "How to prevent your pages from being indexed".
Or, you can use the Search Engine filter rules with the "Deny" action.
See "How to forbid pages that you don't control".
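As a sketch of the robots exclusion technique, the following Python example uses the standard library's robots.txt parser to show how a Disallow rule is interpreted. The robots.txt content and the /private/ folder are made up for this example. How closely Search Engine's own crawler follows the same rules is not verified here, so treat this as an illustration of the standard rather than of the script.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps every crawler out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url, user_agent="*"):
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_allowed("http://yoursite.tld/index.html"))          # prints True
print(is_allowed("http://yoursite.tld/private/notes.html"))  # prints False
```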
Return to your search page and perform some searches. Confirm that the expected pages are returned. Confirm that you can click on the links in the search results and have the expected page appear.
Return to the Admin Page and click on the "Usage Statistics" link in the navigation frame. From there, you can review the log of what keywords were entered. You can use this feature in the future to see what your visitors are searching for.
To customize the look and feel of your search engine, click on the "User Interface" link in the navigational frame. On the resulting page, scroll to the bottom and find the "Advanced: Edit Templates" section.
Search Engine is a template-based script. You can edit those templates to control 95% of the output. If you are familiar with HTML and CSS, then you should be able to easily customize your layout.
There is a section in the Search Engine help file devoted to customizing the HTML.
You can add search forms to your other HTML pages. Visit the "Admin Page" => "User Interface" page again, and scroll down to the sections labeled "HTML Forms for Searching". Those forms can be cut-and-pasted onto your other pages.
Once you have set up the script, you should choose a license mode. Search Engine can be run in Freeware mode or in Registered mode. See "Admin Page" => "Update License" to learn about the differences between the modes and to select the mode you would like.
The Day-to-Day Maintenance consists of things you need to do every so often to keep your script running smoothly.
The primary maintenance task is rebuilding the search index. Whenever your content is updated, you should log in to the admin page and follow the "Rebuild" link for each of your search indexes.
You may also want to review "How to automatically rebuild the index".
The log of searches will grow as more and more people search. Every so often you should go to "Admin Page" => "Usage Statistics" and delete the log. You may also download it first to your local computer so that you have a long-term log of activity (the log is the file search/searchdata/search.log.txt on the server).
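If you keep the downloaded logs, storing each copy under a dated name avoids overwriting the previous one. The Python sketch below shows one way to do that on your local computer. The function name, the archive folder, and the naming scheme are all assumptions made for this example.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_log(downloaded_log, archive_dir, today=None):
    """Copy a downloaded search.log.txt into archive_dir under a
    date-stamped name and return the path of the new copy."""
    today = today or date.today()
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / ("search.log.%s.txt" % today.isoformat())
    shutil.copy2(str(downloaded_log), str(target))
    return target
```

For example, archive_log("search.log.txt", "search-log-archive") would create a copy named after today's date inside the search-log-archive folder.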
The Search Engine can be easily uninstalled.
Remove any custom search forms that you have placed within your other HTML files.
Delete all files and folders within the "search" folder.
The Search Engine distribution includes the hidden file "search/searchdata/.htaccess". That hidden file may prevent FTP clients from deleting the directory structure (they will fail with the error "unable to delete folder 'searchdata': directory not empty" or something similar). If you have trouble with this, review this help file for suggestions.
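If you have a local copy of the folder, or shell access to the server, a sketch like the following Python snippet can list hidden files such as .htaccess that an FTP client may not show. It is a generic directory walk, not one of the suggestions from the help file mentioned above, and the function name is made up for this example.

```python
import os


def find_hidden_files(root):
    """Return the sorted paths of all files and folders under root
    whose names start with a dot."""
    hidden = []
    for folder, subfolders, files in os.walk(root):
        for name in subfolders + files:
            if name.startswith("."):
                hidden.append(os.path.join(folder, name))
    return sorted(hidden)


# Assumed location of the search folder, relative to the current directory.
for path in find_hidden_files("search"):
    print(path)
```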