Sphider 1.5.4 and Sphider 1.5.4-PDO to be released on 29 May

On 29 May 2017, Sphider versions 1.5.4 and 1.5.4-PDO will be released and posted on our Downloads page.

Although addressed in the 1.5.3 series, table prefixes containing a hyphen continued to be a problem. Hopefully this time we have tracked down ALL the sources of this problem and corrected them.

Another problem was that the presence of an emoji on a web page (generally uncommon except on blog or forum pages) would cause an error and that page would not be indexed. Emojis are now purged before indexing.
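
For the curious, purging emojis before indexing can be sketched roughly as follows. This is an illustration in Python, not Sphider's actual code: the name `purge_emoji` is hypothetical, and the character ranges are an assumption that covers the common emoji blocks without claiming to be exhaustive.

```python
import re

# Hypothetical emoji filter. The ranges are illustrative assumptions,
# not taken from Sphider, and do not cover every emoji sequence.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters (flags)
    "\U00002600-\U000027BF"  # miscellaneous symbols and dingbats
    "\U0000FE0F"             # variation selector that often trails an emoji
    "]+"
)


def purge_emoji(text):
    """Return text with the matched emoji characters removed."""
    return EMOJI_PATTERN.sub("", text)
```

Accented and Cyrillic letters fall outside these ranges, so ordinary page text passes through unchanged.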

The ability to index decimal numbers has been added. In earlier versions, numbers could be indexed but decimal numbers could not. For example, ‘12345.56789’ would be indexed as ‘12345’ and ‘56789’. If the setting for indexing decimals (on the settings page) is checked, ‘12345.56789’ will now be indexed as a single number. A side benefit is that ANY numerical string containing periods will be recognized. For example, ‘123.456.789’ would be indexed as is. This could be useful for pages containing part numbers. Strings that mix numeric and alpha characters will still be split at the period: ‘12345.abcde’ will still be indexed separately as ‘12345’ and ‘abcde’.
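
The splitting rules described above can be illustrated with a small Python sketch. It is only a model of the behavior, not the code Sphider uses; the function `tokenize` and the flag `index_decimals` are hypothetical names standing in for the real setting.

```python
import re

# Runs of digits joined by periods ("123.456.789"), tried first,
# then any plain run of letters and digits.
DECIMAL_AWARE = re.compile(r"\d+(?:\.\d+)+|[^\W_]+")
# Older behavior: the period always separates words.
PLAIN = re.compile(r"[^\W_]+")


def tokenize(text, index_decimals=True):
    """Split text into index words, optionally keeping decimals whole."""
    pattern = DECIMAL_AWARE if index_decimals else PLAIN
    return pattern.findall(text)


print(tokenize("12345.56789", index_decimals=False))  # ['12345', '56789']
print(tokenize("12345.56789"))                        # ['12345.56789']
print(tokenize("123.456.789"))                        # ['123.456.789']
print(tokenize("12345.abcde"))                        # ['12345', 'abcde']
```

The decimal pattern only keeps a period when digits sit on both sides of it, which is why the mixed string still splits in two.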

Also changed in these versions are the language files. Since the search page is utf-8 compliant, “special characters” like è or ç would fail to display properly. The Cyrillic alphabet with characters such as Ц or й will also now display correctly. This does NOT mean the text displayed will be the proper translation, as I am no linguist and am either relying on the work of others where possible, or winging it with the use of Google Translate. Simply put, these characters are now coded in the language files as Unicode entities.
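
As an illustration of what coding a character as an entity means, here is a short Python sketch. It assumes the entities are numeric character references of the form `&#NNNN;`, and the helper name `to_entities` is hypothetical; the language files themselves may have been converted by other means.

```python
import html


def to_entities(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_entities("è ç"))  # &#232; &#231;
print(to_entities("Ц й"))  # &#1062; &#1081;

# A browser decodes the references back to the original characters;
# html.unescape does the same thing here.
print(html.unescape("&#1062; &#1081;"))  # Ц й
```

Because the encoded text is pure ASCII, it displays the same way regardless of the encoding the file happens to be saved in.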

Tax Freedom Day

This being April 15th, and being in the United States, I got to thinking about taxes. And then I started thinking about Tax Freedom Day, the day by which, on average, a person has earned enough in the current year to pay all of his or her federal, state, and local taxes for the year (and starts working to provide for his or her own needs).

First of all, April 15th isn’t the day taxes are due in 2017. April 15th being a Saturday, and Monday, April 17th being Emancipation Day (I never even knew there was such a holiday), taxes aren’t due (in the USA) until Tuesday, April 18th, 2017.

Anyway, back to Tax Freedom Day… it turns out that this year Tax Freedom Day in the USA falls on April 24th. I was thinking “Gee, that kinda s**ks!” Then I found out what it is like elsewhere. In the United Kingdom, Tax Freedom Day doesn’t arrive until 13 May. But it could STILL be worse. In Finland, it isn’t until 15 June, and in Sweden it is 30 June. The end of June, which means you work half the year just to pay your taxes. Turns out, in Germany the day doesn’t arrive until 19 July, and in France it is 26 July. And worst of all is Belgium with a date of 3 August! I didn’t check to see if there were any countries even worse off. It would have been too depressing.

I guess 24 April isn’t that bad after all.

Sphider 1.5.3 and Sphider 1.5.3.PDO have been released

Updates to the Sphider search engine have been made. The latest version is 1.5.3. Sphider 1.5.3 is for use when both the MySQLi and MySQLnd modules are available in PHP. For individuals whose host does NOT provide MySQLnd support but DOES provide PDO support, Sphider 1.5.3.PDO is also available. You may find both on the Downloads page. (Click the Downloads tab at the top of this page.)

To avoid confusion concerning versions, the PDO version no longer contains a “.1” at the end of the version number, but a simple “.PDO” to distinguish it from the non-PDO version. (Some people thought 1.5.2.1 was a minor update from 1.5.2 when it actually was identical but coded for PDO instead of MySQLnd.)

Changes in 1.5.3 from 1.5.2 are:
Better support for https sites.
Ability to better recognize and follow the directives in a robots.txt file.
Correction of a potential problem when using the CleanDomains function in the event there was only a single domain to clean.
Fixed a number of errors which could appear when a database table prefix contains a hyphen.
Fixed a potential error when running under PHP 7.
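
On the hyphen item: MySQL does not accept a hyphen in an unquoted identifier, so a table name built from a prefix such as ‘my-site_’ has to be wrapped in backticks wherever it appears in a query. The Python sketch below shows the idea only. The helper names and the table and column names are hypothetical and are not taken from Sphider's code.

```python
def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


def table_name(prefix, base):
    """Build a quoted table name from a user-chosen prefix and a base name."""
    return quote_identifier(prefix + base)


# Hypothetical prefix, table, and column names, for illustration only.
query = "SELECT link_id FROM " + table_name("my-site_", "links")
print(query)  # SELECT link_id FROM `my-site_links`
```

Missing the quoting in even one query is enough to produce an error, which is likely why problems of this kind turn up in more than one place.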

Sphider Help Forum is now available

The new Sphider Help Forum for help concerning Sphider 1.4.2 or later is now open, at least on a trial basis. Out of necessity, ALL posts will be moderated. This is because of the tremendous amount of blog, forum, and guestbook spam present on the internet. Apologies for the inconvenience, but that’s life.

Hopefully, this forum can be used by the slowly growing community of users of the updated Sphider. The original Sphider Forum (located at sphider.eu) has become steadily less help and more sales pitch for Sphider-Plus. We have no gripe about Sphider-Plus, per se, but the original Sphider was free and just because the original developer moved on to other interests several years ago, we don’t see why the original can’t live on and evolve with the rest of technology.

The original (1.3.6 and before) has problems with anything later than PHP 5.4, and here we are, most platforms on 5.5 or 5.6 and the trend well underway towards PHP 7.  Any internet technology which simply stands still for 4 to 7 years is going to become lost in the cloud of dust.

Anyway, hopefully the forum will be a better place to air problems and find solutions than blog comments.

Considering another Sphider improvement

The original version of Sphider had very erratic support for indexing HTTPS pages, and wouldn’t even look at the robots.txt file on an HTTPS site. That failing has never been addressed, and even the latest version, 1.5.2, has the same failings when it comes to HTTPS. This has never really been an issue for me before, and even now it is more annoyance than issue as I can work around it.

Still, the “problem” does seem intriguing. After a bit of experimenting, a fix may not be all that difficult. (Famous last words, right?)

I am debating now whether or not to continue investigating alternatives and make more code changes which would improve HTTPS support in Sphider, not only to ensure more reliable connectivity but to enable the robots.txt to be utilized as well. I don’t know that there is that big of a need. We’ve never received any complaints or comments on the issue…

Anyway, at this point there is a POSSIBILITY, but no definite plans one way or the other.

*******************************

UPDATE (Apr 6): I was able to get the robots.txt file read from a https site. First problem, regardless of http or https, the parsing of allowed or disallowed user agents and disallowed files/directories was iffy. If the robots.txt file had lines like “user-agent” or “disallow”, it was parsed, but “User-agent” or “Disallow” was not. It was a case issue. That is now fixed (on my side, not published yet). Second problem, now that I know the file IS being read and parsed, Sphider will STILL index some files in disallowed directories!

If you have any files or directories listed as “url_not_inc” in your settings, those exclusions will work, but the robots.txt disallows will not, even though they SHOULD. Well, this situation certainly has gotten my interest!
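
To show what the case issue amounts to, here is a rough Python sketch of a robots.txt parser that lowercases the field name before comparing it, so that “Disallow” and “disallow” are treated alike. It is a simplified illustration rather than Sphider's code: the function names are hypothetical, only plain path prefixes are handled, and precedence between a named agent's group and the “*” group is not modeled.

```python
def parse_disallows(robots_txt, agent="*"):
    """Collect the Disallow paths that apply to the given user agent."""
    disallowed = []
    group_agents = []       # agents named by the current group
    reading_agents = False  # True while consecutive User-agent lines are read
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()  # the fix: compare field names case-insensitively
        value = value.strip()
        if field == "user-agent":
            if not reading_agents:
                group_agents = []  # a new group starts here
                reading_agents = True
            group_agents.append(value.lower())
        else:
            reading_agents = False
            if field == "disallow" and value:
                if "*" in group_agents or agent.lower() in group_agents:
                    disallowed.append(value)
    return disallowed


def is_allowed(path, disallowed):
    """Return True unless the path starts with a disallowed prefix."""
    return not any(path.startswith(prefix) for prefix in disallowed)
```

Parsing is only half the job. The list returned here still has to be consulted for every link before it is fetched, which is the step that appears to be going wrong in the second problem above.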

*******************************

UPDATE (Apr 7): I have begun the process of troubleshooting the code to see what is going awry and where. Working alone and having other things to do in life, this can be both time consuming and frustrating. So far, I do know the robots.txt is read and parsed properly. Just where and why the instructions are not acted upon is another matter. At least the question of whether or not I will be attempting another modification has been answered!

*******************************

UPDATE (Apr 8): GOT IT! Preliminary tests show robots.txt is now being followed in both http and https. More testing to follow (found a couple other misc issues and fixed them). Once everything is validated, there will be a 1.5.3. Stay tuned.

Point to ponder

When told the reason for daylight saving time the old Indian said…
‘Only a white man would believe that you could cut a foot off the top of a blanket and sew it to the bottom of a blanket and have a longer blanket.’

Daylight Saving Time is NOT followed in Arizona, with the exception of the Navajo Nation in the northeast corner of the state, which does. Meanwhile, the Hopi Reservation in Arizona, which is COMPLETELY surrounded by the Navajo Nation, does not. Does this make sense?

The reason given for this is that the Navajo Nation covers 27,245 square miles in parts of three states, Arizona, Utah, and New Mexico, and that Utah and New Mexico DO follow DST. Rather than having two different times in just one nation, the Indian leaders have opted to follow DST on the Arizona part of the Nation.

You can see from the map that the vast majority of the Navajo Nation is in Arizona. While it does make sense for the entire Navajo Nation to be observing a single time, wouldn’t it make more sense for the Navajo leaders to follow Arizona’s lead and declare that the parts of the Nation in Utah and New Mexico NOT follow DST?

Just wondering…

Current state of rocket landings

As of this time, Blue Origin has nailed 5 successful landings in a row. The last landing was actually unexpected as the launch was to test (successfully) the launch escape system. The push back was expected to damage the launcher and make it unable to land. In a big plus for Blue Origin, not only did the escape system perform well, the booster was able to make a successful landing as well. Blue Origin may start launching tourists for suborbital flights this year. At least, that’s the plan.

Meanwhile, SpaceX just nailed a landing in Florida after a successful launch of a Dragon cargo vessel to ISS. This was the third success of bringing Falcon 9 first stage back to LZ1. There have also been 5 successful barge landings (4 in the Atlantic, 1 in the Pacific). So what is SpaceX’s record at this juncture? They have 8 successful landings in 18 tests. Consider that on the first 5 tests, all at sea, there was no barge involved. These were strictly systems tests and all the stages were intentionally lost at sea. Now we are talking 8 of 13. There have been 4 successes in a row, 7 successes in the last 8 attempts, 8 successes in the last 11 attempts. Overall, considering the complexity of the systems involved, not a bad record at all!

SpaceX is currently constructing a second landing pad in Florida, LZ2, and LZ3 is in the works. This will come into play when the Falcon Heavy comes online later this year. Three cores coming down at once! It is anticipated that two will return to LZ1 and LZ2, and the third to a barge in the Atlantic. If SpaceX can pull this one off, it will be a sight to see. A Falcon Heavy launch and three core landings in a single act!

Between Blue Origin and SpaceX, 2017 could be quite a year.

Sphider 1.5.2 and 1.5.2.1 (the PDO version) have been released

The newest versions of the Sphider search tool have been released and are available from the Downloads tab above. While there isn’t really anything NEW in these releases, they do address a couple of problems encountered. Of most importance, the problem of having Sphider exit during indexing due to web page coding errors on the site being indexed has been addressed. Instead of issuing a fatal error and stopping, only warnings are generated and indexing continues on its merry way. A potential database error when updating the settings has also been thwarted.

Also, the previous PDO version had a bug in which descriptions could disappear from search results listings. This has been fixed.
If you had the previous PDO version (1.5.1.1) and have lost the descriptions, you will need to restore them after upgrading to 1.5.2.1. Go to the settings tab, scroll down to the “Search settings” section where it says “Maximum length of page summary displayed in search results”, change the selection to 250, and click “Save settings”. (Updating the settings in the previous version would change this from the default 250 to either 0 or 1!)

Happy Holidays and Happy indexing!

Sphider 1.5.2 – coming soon

The next version of the Sphider search tool is now in testing. Sphider 1.5.2 (and its companion PDO version, 1.5.2.1) is not very different from the previous version, except for a couple of minor fixes on the Settings tab and the fact that the indexing portion has been toned down to issue warnings only when an improperly coded web page is encountered. Sphider 1.5.1 exits with a fatal error instead of continuing to index the site. While improper coding in a web page (commonly having to do with some offbeat special character the database has no idea how to interpret) is rare, it sure was a monkey wrench when it came to indexing a web site. A couple of other page conditions which could have produced a fatal exit now simply issue warnings (like the URL exceeding the length the database could store).
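
The change from fatal errors to warnings follows a familiar pattern: catch the problem for the one page, report it, and move on to the next. A minimal Python sketch of that pattern is below. The names `index_site`, `store`, and `MAX_URL_LENGTH` are hypothetical, and the limit of 255 is an assumed value, not the actual column size in Sphider's database.

```python
import warnings

MAX_URL_LENGTH = 255  # assumed limit for illustration only


def index_site(pages, store):
    """Index (url, html) pairs, warning about bad pages instead of stopping."""
    indexed = []
    for url, html in pages:
        try:
            if len(url) > MAX_URL_LENGTH:
                raise ValueError("URL too long to store")
            store(url, html)  # may raise on characters it cannot handle
            indexed.append(url)
        except (ValueError, UnicodeError) as exc:
            # Report the problem page and carry on with the rest of the site.
            warnings.warn("Skipping %s: %s" % (url[:60], exc))
    return indexed
```

One bad page then costs a single warning rather than the whole indexing run.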

At any rate, both the PDO and non-PDO varieties are now being tested to make sure the intended fixes work properly, and that we haven’t introduced any new problems. Expected arrival at this time is early December.

Blue Origin does it yet again. One booster, three launches, three landings.

On April 2, Blue Origin launched its New Shepard booster for the third successful West Texas landing after a suborbital flight. Previous landings of the same booster took place on January 22, 2016, and November 23, 2015.

The crew capsule successfully landed by parachute shortly after the booster landed.

SpaceX has been successful only once (so far), but it has to be noted that the Falcon 9 is larger and, being orbital, has a greater velocity to contend with. SpaceX hopes to be able to recover and reuse a booster sometime in 2016.

Whether it is Blue Origin or SpaceX, recovering a booster is no simple matter. It is, after all, rocket science!