Search Done (Almost) Right

Yes, it's yet another post about search algorithms and how I'm never satisfied with any of the versions that I write against a MySQL database[1]. That said, I've managed to cobble something together in the v5 API that doesn't completely frustrate me every time the thing runs, and it's live on my v5 test site right now.

Unlike many of my past attempts to work with search, I wanted the Anri theme to have a very focused way of enabling search from any page without requiring a page load. When the search modal is triggered, the entire screen will be covered and a single text box will appear for search criteria to be entered.

001 - Search For Nozomi

When the results come back, the search criteria are moved to the top of the page and the bottom 75% shows the results with the specified words highlighted in off-yellow. Like other versions of 10C, an icon will appear on the left of the title to signify what kind of item was returned. The title is a proper link to the item but, for people who want a bit more of a peek, there's a "Show More" button[2]. Clicking this will open a simplified version of a post with keywords highlighted and all of the HTML stripped out. I may need to change this in the future to allow images, though, as my posts about Nozomi look a little weird without the visual elements.
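For the curious, the "strip the HTML, then highlight the keywords" step can be sketched roughly like this. This is an illustration in Python, not the actual Anri theme code: the function names and the use of a `<mark>` tag are assumptions, and the real theme may well do this differently.

```python
import html
import re
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment: str) -> str:
    """Return the text of `fragment` with tags removed and whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


def highlight(text: str, keywords: list[str]) -> str:
    """Wrap case-insensitive keyword matches in <mark> (hypothetical markup)."""
    # Longest keywords first so "nozomi" wins over a shorter overlapping term.
    words = sorted((re.escape(w) for w in keywords if w), key=len, reverse=True)
    if not words:
        return html.escape(text)
    pattern = re.compile("(" + "|".join(words) + ")", re.IGNORECASE)
    # Splitting on a capturing group puts the matches at the odd indices.
    pieces = pattern.split(text)
    out = []
    for i, piece in enumerate(pieces):
        safe = html.escape(piece)
        out.append(f"<mark>{safe}</mark>" if i % 2 else safe)
    return "".join(out)


plain = strip_html("<p>Nozomi <b>loves</b> walks</p>")
print(highlight(plain, ["nozomi"]))
```

Escaping each piece after stripping keeps stray angle brackets in the post text from being treated as markup. The sketch does not special-case `<script>` or `<style>` contents, which a real implementation would want to drop.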

002 - Search Results for Nozomi

This form of search is not as instantaneous as the one built for v4's default blogging theme, but it's a lot more comprehensive.

003 - Bright Yellow

With v4 I would often run into issues when trying to return search results for my own site in under two seconds, which is why I "cheated" with the EzReader theme by merging search with an archive page. A full list of blog posts would be retrieved from the API along with some additional metadata such as tags and stored in local memory. This information would be used to generate the full list of blog posts in reverse chronological order and, when filter criteria were entered into the search box on that page, the data stored in local memory would be read to show the results. Unfortunately this would only include blog posts. Social and other post types were completely ignored because including them typically results in too large a volume of data to work with.
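The in-memory approach amounts to something like the following. It's a minimal sketch in Python rather than the JavaScript the theme would actually run, and the `CachedPost` fields and the "every term must match" rule are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CachedPost:
    """One blog post as held in local memory (hypothetical shape)."""
    title: str
    published: str  # ISO 8601 date, so string order matches date order
    tags: list[str] = field(default_factory=list)


def filter_posts(posts: list[CachedPost], criteria: str) -> list[CachedPost]:
    """Return posts newest-first, keeping those that match every search term."""
    newest_first = sorted(posts, key=lambda p: p.published, reverse=True)
    terms = criteria.lower().split()
    if not terms:
        return newest_first  # an empty search box shows the full archive

    def matches(post: CachedPost) -> bool:
        haystack = " ".join([post.title, *post.tags]).lower()
        return all(term in haystack for term in terms)

    return [p for p in newest_first if matches(p)]


cache = [
    CachedPost("A Walk in the Park", "2018-03-01", ["nozomi", "photos"]),
    CachedPost("Search Done Right", "2018-09-10", ["10c", "search"]),
    CachedPost("Nozomi at the Vet", "2018-06-15", ["dogs"]),
]
for post in filter_posts(cache, "nozomi"):
    print(post.published, post.title)
```

Nothing here touches the network once the cache is loaded, which is where the speed comes from, and also why the volume of data becomes the limiting factor.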

Lazy, lazy, lazy!

With v5 I've set aside the goal of returning data for my sites in under two seconds and instead opted to return a more complete set of results by querying the database properly for all post types. This will be important going forward as there is no limit to the number of post types a channel may contain.

004 - Expanded Results

The v4 API did have a Search API that could be called to query the database, but this was rarely ever used. What I plan on doing with the v5 implementation is seeing how well it returns data for people and then improving its ability to handle accounts with more than 100,000 items in a single channel.

Using the v5 Search API

If you'd like to see how well the v5 Search API responds to requests, you can do it like this:


Required variables:


Optional variables:


Including the HTML body will send the full, original text of the item; it is not included by default. The count value defaults to 75. Authentication is not a requirement but, if the request contains a valid authentication token, account-level search results are returned. This means that if there are private or "invisible" posts on a site, they will appear in search so long as the signed-in account has the appropriate level of permission.
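The visibility rule can be pictured with a small sketch. Only the default of 75 comes from the description above; the `Item` shape, the visibility values, and the per-item list of permitted accounts are hypothetical stand-ins, not the v5 API's actual data model.

```python
from dataclasses import dataclass, field
from typing import Optional

DEFAULT_COUNT = 75  # the default count described above


@dataclass
class Item:
    """A search match (hypothetical shape, for illustration only)."""
    title: str
    visibility: str = "public"  # assumed values: "public", "private", "invisible"
    allowed_accounts: set[str] = field(default_factory=set)  # assumed permission model


def visible_results(matches: list[Item],
                    account_id: Optional[str] = None,
                    count: int = DEFAULT_COUNT) -> list[Item]:
    """Drop items the caller may not see, then cap the list at `count`."""

    def can_see(item: Item) -> bool:
        if item.visibility == "public":
            return True
        # Non-public items need a signed-in account with permission.
        return account_id is not None and account_id in item.allowed_accounts

    return [m for m in matches if can_see(m)][:count]


found = [
    Item("Public post"),
    Item("Private note", "private", {"acct-1"}),
    Item("Hidden draft", "invisible", {"acct-2"}),
]
print([i.title for i in visible_results(found)])
print([i.title for i in visible_results(found, account_id="acct-1")])
```

An anonymous request sees only the public item, while a request from an account with permission also gets the matching private one.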

Hopefully this is a solid start to search done right.

  1. I have some pretty fancy code to whip out for SQL Server that gives me properly weighted search results with pretty good consistency.

  2. This will appear only when the full text is longer than the summary.