Real-Time Search

This morning, while staring at the day-job workload waiting for me in the form of GitHub tickets, I started thinking about a problem on 10Centuries that I have long wanted to solve. It's a topic that has come up again and again over the years, and I think it's almost "solved". The problem, of course, is search.

Search on a website is theoretically pretty simple. People enter some words in a field, those words are checked against the content in the database, and results are returned. In its simplest form, only the exact search term is sought. This means that if I were to ask for all posts containing the words bright and yellow, I would see only this result as there is an exact match for "bright yellow". But if I were to ask for yellow and bright, nothing would come back. This is clearly suboptimal, so it's better to have all of the words split apart, with results that include all posts with the words bright or yellow, ideally scoring the posts in such a way that the above-referenced post is at the top of the list. A lot of effective software uses this weighted search result method to return relevant results, but I wanted to do something different still.
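To make the difference concrete, here is a minimal sketch of the weighted approach, written in Python for brevity. It is not the code running on 10Centuries; the function names, the sample posts, and the 10-point phrase bonus are all made up for illustration. Each query word found in a post earns a point, and an exact phrase match earns a bonus so that post floats to the top.

```python
import re

def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def score(query: str, body: str) -> int:
    """One point per distinct query word present, plus a bonus for the exact phrase."""
    terms = tokenize(query)
    if not terms:
        return 0
    body_tokens = tokenize(body)
    body_words = set(body_tokens)
    points = sum(1 for term in set(terms) if term in body_words)
    # Hypothetical weight: an exact phrase match is worth 10 extra points.
    # Padding with spaces keeps the phrase aligned to whole words.
    if f" {' '.join(terms)} " in f" {' '.join(body_tokens)} ":
        points += 10
    return points

def search(query: str, posts: dict[str, str]) -> list[tuple[str, int]]:
    """Return (title, score) pairs for posts scoring above zero, best first."""
    scored = [(title, score(query, body)) for title, body in posts.items()]
    hits = [pair for pair in scored if pair[1] > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Made-up sample posts.
posts = {
    "Sunflowers": "The field was a bright yellow in the morning light.",
    "Night Sky": "One bright star hung over the hills.",
    "Bananas": "A yellow banana is ripe; a green one is not.",
    "Rain": "It rained all day.",
}

for title, points in search("yellow bright", posts):
    print(points, title)
```

With this scheme, "yellow bright" still finds the post containing "bright yellow" because both words are present, while searching for "bright yellow" itself pushes that post even further ahead of the others.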

I wanted people to see something instantaneously.

The New Search Box

One of the tricky parts of instantaneous results is dealing with network latency, server load, and all sorts of less-than-desirable problems that can make a theoretically semi-decent idea practically untenable. On top of this is the general response time of the service. Most people can type several characters per second, and sending multiple calls to the API just for the illusion of supplying decent results in real time seems silly. So I decided to go about solving the problem a little differently: a subset of every post is loaded into memory and searched when requested.

This blog post is number 2,428 in publication order (so long as I haven't back-dated anything since this post went live) and the average size of each post is roughly 618 words. That's about 1.5 million words. A crazy number one might say, but then I have been blogging for almost a decade. 150,000 words a year is nothing compared to the number of words that have been published on various social networks, forums, and IRC channels over the years. Loading all of these words into a browser would be absolute overkill so, instead, I am loading just a subset of the words that constitute a post. As people type their search query into the box, the browser scans through the data stored in memory, finds matches, scores them, and then updates the results. People with relatively recent hardware will see that the operations are pretty much smooth and responsive. People with hardware as old as this blog … will unfortunately suffer some stuttering. People searching other 10C-powered sites will likely not notice a hiccup at all.
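The general idea can be sketched in a few lines. This is an illustration in Python rather than the JavaScript that actually runs in the browser. The `PostSummary` record, the `live_search` function, and the sample entries are all hypothetical. A small summary of each post sits in a list in memory, and every keystroke re-scans that list instead of calling the API.

```python
from dataclasses import dataclass, field

@dataclass
class PostSummary:
    """Hypothetical lightweight record held in memory for each post."""
    title: str
    url: str
    tags: list[str] = field(default_factory=list)

    def haystack(self) -> str:
        """All searchable text for this post, lower-cased."""
        return " ".join([self.title, self.url, *self.tags]).lower()

def live_search(query: str, index: list[PostSummary], limit: int = 5) -> list[str]:
    """Return titles of the best matches for a possibly half-typed query."""
    terms = query.lower().split()
    if not terms:
        return []
    scored = []
    for post in index:
        text = post.haystack()
        # Substring matching, so a partially typed word still counts.
        hits = sum(1 for term in terms if term in text)
        if hits:
            scored.append((hits, post.title))
    # Stable sort: posts with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:limit]]

# Made-up sample data.
index = [
    PostSummary("Real-Time Search", "/2013/real-time-search", ["search", "10centuries"]),
    PostSummary("Theme Weekend", "/2013/theme-weekend", ["themes", "design"]),
    PostSummary("A New Site Design", "/2013/a-new-site-design", ["design"]),
]

# Simulate someone typing "design" one keystroke at a time.
for typed in ("d", "de", "des", "design"):
    print(typed, "->", live_search(typed, index))
```

Because nothing leaves the machine, the cost of each keystroke is a single pass over the list, which is why the size of the list (and the age of the hardware) decides how smooth it feels.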

The browser is working with a subset of the posts, though. What's not included? The content.

For the moment, search will pull from titles, URLs, tags, and author names. Future updates will include the content of the pages and posts. Before that can happen, two things must take place.

  1. I need to see that people are able to use the search in any browser on any platform. This is still in testing.
  2. I need to create a cached result for every post that contains just a single copy of every word in the article, excluding certain common words in various languages.

Once these two things are done, then I can build on the existing search tool in order to provide much better, more specific results.
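The cached result described in the second item could look something like the sketch below, again in Python and again only as an assumption about how it might be built. The `build_word_cache` function and the tiny stop-word list are hypothetical; a real list would be much longer and would cover each language in use.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is", "it", "was"})

def build_word_cache(content: str, stop_words: frozenset[str] = STOP_WORDS) -> list[str]:
    """Return each distinct word in the content exactly once, sorted, minus stop words."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII ones.
    words = re.findall(r"[^\W\d_]+", content.lower())
    return sorted(set(words) - stop_words)

cache = build_word_cache("The search box is in the archive, and the archive is searchable.")
print(cache)
```

Storing one copy of each word keeps the cached result far smaller than the post itself, at the cost of losing word order, so exact phrase matching would not be possible against this cache alone.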

In the meantime, people using the default blog theme on 10Centuries will see an "Archives" link in their navigation bar. Every post will be listed in reverse chronological order, and the search bar up top can be used to quickly find published items. If you don't see this link, it's because the cache for your site has not been refreshed. Simply write a new post (or update an existing one) to force the system to regenerate your website.

This isn't a perfect solution by any stretch of the imagination, but it solves a number of problems that I've been thinking about for quite some time, and it does so in the browser rather than taxing my own servers to deliver Google-like search speeds. Hopefully this same search method will be employed in every theme going forward.

Theme Weekend

With every spare minute, I've decided to flex some creative muscle, spend a little bit of PayPal cash, and add five themes to 10Centuries for people to complain about. Here is the first one, which is styled after Tumblr and will go live on the site this coming Monday. The remaining four are all of different colours and will round out the basic set of themes. From then on, I'll get to work on revamping the administration screens to make them easier to use.

Screen Shot 2013-06-02 at 1.05.22 AM

Looking at some of Nozomi's better pictures can really help take the edge off the ever-present sense that I'm never doing enough. That girl can sleep 20 hours a day and still look like she's been up all night cramming for some crucial examination.

A New Site Design

Matt Gemmell wrote a pretty good blog post a few days back explaining the key design elements to an effective blog. As a man who's been blogging and designing websites for over a decade, his comments are incredibly valuable to anyone who would like to take their website from a hobby to something a little more substantial. I took his words to heart and decided to update the look and feel of this site. The result isn't half bad.

Oh Captain, My Captain

I picked this theme up from ThemeForest and really trimmed it down to make it even more minimalist than the original developers intended. A number of other tweaks have also been made so that the site will not load things that are unnecessary. There are still a few optimisations that I will perform over the coming days, such as the reduction of CSS and JavaScript files, but what we see here will likely see this site through for a while.

Where's All Your Stuff?

One of the first things people will probably notice about this design is that a number of once prominent items are now missing completely. The most recent remarks, the most recent posts, a list of posts written on this day in the past, a list of the last 15 months of archives, my ugly face, profile, and some pages have been completely scrubbed from the site. People rarely used them, just like Mr. Gemmell said in his post, and the screen real estate needed to be used better. This, I think, accomplishes the ultimate goal.

Search Results with Captain

One of the screens that I'm quite happy with is the search result screen. While not yet 100% complete, it provides just about everything people could ask for in a search panel. That said, the search algorithm still sucks. I plan on overhauling that function early next month so that it's more useful than the current implementation.

Now I just need to figure out what to do with the previous design …

Another New Layout!

Another New Site Design for j2fi.net

It seems that this site gets a makeover about as often as a new Japanese Prime Minister takes office. The last design managed to eke out a day under 5 months … which isn't much worse than former PM Aso's term. That said, this new design is lookin' nice with a lot less white and a bit more focus on minimalization (is that even a word?).

There's still a bit of work I need to do on the CSS to make it work with some of the plugins that are employed on here, but this certainly seems like a worthwhile 3-hour effort.  I don't know whether this design will last longer than the former, but I do know that this will be my last HTML 4.0 WordPress theme. The next one will be HTML5 with some interesting visualizations.

Why? Because I can. That's why :P

New Theme to Celebrate a New Home!

July 2010's Theme ReDesign!

Today's the day that the Mrs. and I have confirmed the location of our new home: in Kashiwa City, Chiba Prefecture. It's hard to believe that we'll be moving again, after just recently returning to live with her parents, but when opportunity comes knocking, we'd be foolish to not answer the door.

So, to celebrate all of these occasions, I've redesigned this site's theme from the ground up! One of the things that really bugged me about the last theme was just how bandwidth intensive it was. This is something that I usually gripe about regarding other sites … so it was time to fix the issue. You'll be happy to know that this site is now a full 23% slimmer on the bandwidth while also resolving some of the pesky JavaScript issues that have been a problem for some browsers … particularly Opera.

So what do you think? Is this design better than the last five that have been used on this site?  I'd love to hear your thoughts.

My First HotaruCMS Theme

HotaruCMS is a new web project that's being led by Nick Ramsay as an alternative to systems such as Pligg.  He's put in an incredible number of hours on the project and it's currently sitting at version 1.0.4.  I see Hotaru as an exciting project because, unlike the other projects I've had over the last two years, this one can't be canceled by my employer.

Now that I am devoting much more time to the project, one of the tasks that I was asked to do was convert my current blog theme to something that would work on HotaruCMS … but make it more blue.  WordPress themes are something I can slap together pretty quick, and the one you're looking at now took just an afternoon during the winter break.  HotaruCMS, however, required about five hours over two days.  I'm sure this number will decrease dramatically as I become more familiar with the inner workings of the system.

So, without further delay, here is my very first theme for HotaruCMS … Default Blues:

HotaruCMS Theme | Default Blues

What do you think?  Is it too blue?  Is it just a cheap knock off of the current theme?

There are still a few things that I'd like to finalize before including this with the HotaruCMS trunk, but it's certainly shaping up to be a slick little design for a slick little project.

A New Site Design for 2010

With 2010 around the corner, it seems like as good a time as any to update the look and feel of this blog to something a little less … noob. As a result, I've spent the better part of the afternoon putting together this design and making the whole thing look as professional as possible, as it'll soon be part of my professional profile (the design, not this site).

To show the differences between the two designs, I've put them here side by side:

2009 Site Design | 2010 Site Design

One thing that people will probably notice is that the advertising is quite obvious, now.  While I've always tried to keep ads to a minimum, I've decided to make them just a little more prominent in order to earn a few extra cents here and there. I've been unable to earn enough through advertising to pay for hosting, which means 2010 will be the first time in over five years I've had to pay for hosting from my pocket.

In addition to the advertising, I've restored the Contact page and updated the Tweets page so that it doesn't look so … blah. All in all, not bad for 4 hours' work. Heck, it was so fun, I might just do a lot more of this!

What do you think? Is this a worthy creation, or was the last theme better?