New Sphider downloads available for PDO versions

A minor problem was found affecting the PDO versions (PDO, SQLite PDO, and PostgreSQL PDO) of Sphider.

During indexing, if the “Use site map” switch was set but the site map was not found or was not usable, the code that updates the database to turn the switch off failed to execute.

The code has been corrected so that the database update now runs. The updated downloads are designated as “b” versions.
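
For readers curious about what this kind of fallback looks like, here is a minimal sketch in Python (not Sphider's actual PHP code). The table name "sites", the column "use_sitemap", and the function name are all hypothetical and chosen only for illustration. It shows the intended behavior: if no usable site map entries were found, the switch is turned off in the database.

```python
import sqlite3


def disable_sitemap_if_unusable(conn, site_id, sitemap_urls):
    """Turn the site map switch off when no usable site map was found.

    Returns True if the switch was turned off, False if it was left alone.
    The 'sites' table and 'use_sitemap' column are hypothetical names.
    """
    if sitemap_urls:
        # A usable site map exists, so the switch stays on.
        return False
    conn.execute(
        "UPDATE sites SET use_sitemap = 0 WHERE site_id = ?", (site_id,)
    )
    conn.commit()
    return True


# Demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sites (site_id INTEGER PRIMARY KEY, url TEXT, use_sitemap INTEGER)"
)
conn.execute("INSERT INTO sites VALUES (1, 'http://example.com/', 1)")

# An empty list stands in for "site map not found or not usable".
switched_off = disable_sitemap_if_unusable(conn, 1, [])
flag = conn.execute(
    "SELECT use_sitemap FROM sites WHERE site_id = 1"
).fetchone()[0]
```

After the call, the stored flag for the site is 0, so later indexing runs would no longer try to use a site map for it.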

The non-PDO version was unaffected. This was strictly a PDO issue.

Thanks go out to Webbo for the catch.

Minor corrections to PDO Sphider versions

It has come to our attention that there are typos in the code for all PDO versions. For the normal PDO version (MySQL/MariaDB), spider.php and spiderfuncs.php have been slightly modified: spider.php had a single typo, and spiderfuncs.php was missing 5 lines of code. While the version number is unchanged, the new download is designated as 2.0.0a.

For the PostgreSQL and SQLite versions, only spider.php was affected, with a single typo in each. No other files are affected. As with the regular PDO version, the version number is unchanged, but the downloads are designated as 2.0.0a.

Our apologies for the inconvenience. These anomalies went uncaught during testing of all these versions, so it seems that, for the most part, crawling functionality was not adversely impacted, although it COULD be under certain circumstances.

Our deepest thanks go out to Ed Parrish for having caught these issues.

Sphider 2.0.0 nearing release

Sphider 2.0.0 is undergoing final testing and will probably be released by mid-October.

Virtually every file has undergone at least some alteration. The features of Sphider 2.0.0 are:
– Better page charset handling to ensure that the database receives only UTF-8 input. UTF-8 encoding of web pages already in UTF-8 format is avoided to eliminate garbled entries.
– Phrase searches have been improved.
– This version is PHP 7.1 ready.
– Integrated indexing of images, with the option to NOT index images. An image search page is also provided.
– RSS content may also be indexed and searched.
– jQuery has been updated to a more recent version.
– While not fully PSR-2 compliant when it comes to PHP coding standards, the code is a LOT closer than it ever has been. This involved the renaming of many functions and the elimination of a few functions which were found to be obsolete (and thus, unused). Coding style had to be changed in virtually every module. This is why so much code has been altered, affecting nearly every Sphider PHP code segment.
– The search page is integrated for legacy, RSS, and image searches. Knowing that RSS and images are something not every user will be interested in, an updated (as in 2.0.x compliant) version of the 1.6.x search page is provided. The revised 1.6.x search form will work fine with 2.0.x, but it will need to be renamed to replace the provided search.php.
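
To illustrate the charset item in the list above, here is a minimal sketch in Python (not Sphider's PHP code) of the general idea: content that is already valid UTF-8 is left alone, and only other content is converted from its declared charset. The function name and the ISO-8859-1 default are assumptions made for this example.

```python
def to_utf8(raw: bytes, declared_charset: str = "iso-8859-1") -> str:
    """Return page text as a Unicode string without double-encoding.

    If the bytes are already valid UTF-8 they are decoded as-is.
    Otherwise they are decoded using the declared charset.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(declared_charset, errors="replace")


# A page that is already UTF-8 is not converted a second time.
already_utf8 = to_utf8("café".encode("utf-8"))

# A Latin-1 page is converted using its declared charset.
from_latin1 = to_utf8("café".encode("iso-8859-1"), "iso-8859-1")

# What goes wrong when UTF-8 bytes are treated as Latin-1 anyway.
garbled = "café".encode("utf-8").decode("iso-8859-1")
```

Both calls yield the same clean text, while the last line reproduces the kind of garbled entry that results from converting a page that was already UTF-8.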

Also, finding that porting PDO to databases other than MySQL was messier than anticipated (too many DB specific requirements for each), Sphider 2.0.0 will actually have 4 flavors. The “kits” for PostgreSQL and SQLite were too cumbersome and confusing.
1) The legacy Sphider, using the MySQL database (or MariaDB) and using MySQLi and MySQLnd.
2) PDO Sphider, also using the MySQL database (or MariaDB), but using a PDO implementation (for installations lacking MySQLnd support).
3) PostgreSQL version, using a PostgreSQL database accessed via PDO.
4) SQLite version, using a SQLite database accessed via PDO.

All flavors are testing well and, after working out some “peculiarities” for each, it seems no more coding changes will be needed. Now each version must have a final full set of operations performed to ensure everything works. This includes new installation via PHP script, installation using SQL queries, upgrade installation, adding sites, indexing sites, deleting sites, and adding, editing, and deleting categories. The same is also done for RSS indexing. The search functions need to be tested for various situations. We have found a few websites which have, uh… what you might call “unusual” methods resulting in unusual problems. (Ever seen an image “alt” tag with text running in excess of 1000 characters? We have!)

Future considerations for Sphider (but not guarantees)

I’ve been giving thought to just what should come next for Sphider.

Integrating the Sphider Image Indexing functions with the main Sphider, thus making content and image indexing a single operation, is a rather obvious improvement.

The ability to index and search RSS feeds would also be a nice addition. I actually have an alpha of this running on both Linux and Windows machines. Since the spidering operations can be done from a command prompt, a simple cron is keeping the feeds updated on the Linux box. The Windows task scheduler is being a bit more stubborn, mainly because of a pesky PHP error I haven’t solved yet. PHP is fine in a browser, but the command prompt is giving trouble. It works, but I keep getting an error that DEMANDS a response! I’ll figure it out.

Since searching for content is different from searching for images, which in turn is different from searching for RSS feeds, three different sets of search and results pages are needed. To a user, the only obvious difference is the search page, as the results portion is integrated. So I am giving thought to a possible “unified” search page with tabs so that the appropriate search form (and corresponding results) can be presented to the user. This is not definite yet, just a thought.

These are all ideas for the future. For now, version 1.6 remains the latest. If the need arises, minor release improvements/fixes are not out of the question.

Anything you would like to see in the Sphider of the future? Give me your ideas and … well, who knows? It might be a very good, very doable idea!

Sphider 1.6 Release Status

The regular version of Sphider 1.6.0 and the associated Sphider Image Indexer are completed, tested, and ready to go. Since I want to release the PDO version in tandem, that is the only holdup.

The PDO version and associated Image Indexer are also essentially completed, but are undergoing further testing due to some last-minute code changes. These changes involve code portability between database types. The release, as usual, targets MySQL (and presumably, MariaDB). There will also be a small set of four replacement modules (install.php, database.php, db_main.php, and db_backup.php) available for SQLite users! It is anticipated that a similar set will soon be introduced for PostgreSQL users. The power of PDO will finally be realized.

As soon as everything has been more thoroughly tested, the appropriate zips will be posted in the Downloads section.

Preview of the OPTIONAL Sphider Image Indexer search results

Work has progressed to the testing phase of both Sphider 1.6 and the OPTIONAL* Sphider Image Indexer. This is a screenshot of the results of an image search during testing. To get these results, the PHP installation needs to have the imagick module installed. The search will still work without it, but the thumbnail previews will be absent. The rest of the results will remain. You can choose to search by image name, image URL, or alt tag contents. A search can cover all indexed sites or be site-specific.

The target release date is mid-July.


* – Sphider 1.6 will work normally without the Sphider Image Indexer and will automatically detect when it has been installed. Image indexing is integrated into Sphider.

What’s next for Sphider?

Work is proceeding with Sphider 1.6!

What will be new in 1.6?

  • The ability to truncate selected tables from the database tab
  • The ability to clear all site data without deleting the site
  • The ability to crawl a site using a sitemap.xml, provided one exists
  • The option to preview pages from the results listing
  • An issue with resuming suspended indexing has finally been resolved
  • Support for an optional Sphider Image Indexer
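
As a rough illustration of the sitemap.xml item above, here is a minimal sketch in Python (not Sphider's PHP code) that pulls page URLs out of a site map and returns an empty list when the site map is not usable. The function name and sample data are made up for this example. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return the list of <loc> URLs in a site map.

    An empty list means the site map was not usable, so a crawler
    would fall back to following links instead.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return []
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and loc.text.strip()
    ]


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/about.html</loc></url>
</urlset>"""

urls = urls_from_sitemap(sample)
unusable = urls_from_sitemap("this is not a site map")
```

With the sample above, the first call returns the two page URLs and the second returns an empty list.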

At this point, the changes have been made in both the vanilla and PDO versions of 1.6 and testing is ongoing.

And what? An optional Sphider Image Indexer? This is an add-on that will work with Sphider 1.6. You will be able to build a catalog of images from sites where you have previously indexed the pages. Currently, the indexer itself is being tested, with excellent results. Work has begun on an image search function, but that is still in the VERY early stages and nowhere near being a viable tool. While the indexer required some modification of the core Sphider, the search function will not.

What this means is that once testing of the vanilla and PDO versions of 1.6 is complete, it can be released. The Image Indexer add-on still has to have the search function completed, then both the indexer and search function ported to PDO, and finally fully tested. At that time it will be released as version 0.99.

Since the search function of the add-on is in the very early stages of development, input on how you would like to see it operate will be considered.