Processing of robots.txt files has been improved. Robots.txt rules are now treated as case sensitive, and “allow” directives are taken into account. All common text files have been integrated into Sphider. The user may assign a default language to a web site, but Sphider will also try to detect the language used on each page and use the appropriate common text set. A new feature is the option of setting built-in pauses during indexing. For running from a command prompt, the user help has been updated to give better instruction in the use of “must-include” and “must-not-include” directives. A bug that allowed the “index to” level to be left blank has been fixed.
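To illustrate what case-sensitive matching with “allow” directives can look like, here is a minimal sketch in Python. It is not Sphider's own code (Sphider is written in PHP); the function name `is_allowed`, the rule format, and the "longest matching rule wins" tie-breaking are assumptions made for the example, and wildcard patterns are not handled.

```python
def is_allowed(path, rules):
    """Decide whether `path` may be crawled.

    `rules` is a hypothetical list of (directive, pattern) tuples,
    e.g. ("disallow", "/Private/"). Not Sphider's internal format.
    """
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        # str.startswith is case sensitive: "/Private/" != "/private/"
        if path.startswith(pattern):
            is_allow = directive.lower() == "allow"
            # Assumption: the longest match wins, and allow wins a tie
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


rules = [("disallow", "/Private/"), ("allow", "/Private/press/")]
print(is_allowed("/Private/notes.html", rules))
print(is_allowed("/Private/press/release.html", rules))
print(is_allowed("/private/notes.html", rules))
```

With these rules, the first path is blocked, the second is permitted by the more specific “allow” rule, and the third is permitted because the lower-case path does not match the capitalized pattern.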
In the full version, a bug in which Sphider did not obey the “must-not-include” directives during image indexing has been corrected. Also fixed was Sphider’s failure to pick up the width, height, and alt attributes in the img tag. Additionally, ‘jpeg’, ‘webp’, and ‘svg’ files are now recognized. Support for ‘tif’ image files has been dropped. (Does anyone even use tif/tiff anymore?)
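As a rough illustration of the image handling described above, the following Python sketch collects the width, height, and alt attributes from img tags and skips files whose extension is not recognized. It is not Sphider's implementation; the class name `ImgCollector` and the exact extension set are assumptions (only ‘jpeg’, ‘webp’, and ‘svg’ being recognized and ‘tif’ being dropped come from the notes above).

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed extension set; 'jpg', 'png', and 'gif' are guesses for the example
RECOGNIZED = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg"}


class ImgCollector(HTMLParser):
    """Hypothetical helper that records attributes of recognized images."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        src = a.get("src") or ""
        # Compare the extension of the URL path, ignoring any query string
        ext = os.path.splitext(urlparse(src).path)[1].lower()
        if ext not in RECOGNIZED:
            return  # e.g. .tif files are skipped
        self.images.append({
            "src": src,
            "width": a.get("width"),
            "height": a.get("height"),
            "alt": a.get("alt") or "",
        })


collector = ImgCollector()
collector.feed('<img src="img/logo.svg" width="10" height="20" alt="Logo">'
               '<img src="scan.tif" alt="Old scan">')
print(collector.images)
```

Here only the svg image is collected, along with its width, height, and alt text; the tif image is ignored.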
The User Guide has also been updated.