Getting Better

Over the last couple of days there have been some pretty decent updates to 10Centuries that have resolved a number of bugs that people have — and sometimes haven't — reported, as well as a couple of features that made sense to bring back from v4 with some logical updates. There are still a number of areas that need to see some attention, but the platform is inching towards being a better system for anyone who might want to use it. Hopefully by this time next week we'll see the return of the main landing page, which will include such necessary features as the ability to create an account.

Clearly I was a lot less prepared for the migration to v5 than I had originally thought.

That said, with the weekend here, a lot of the core development will need to come to a stop. Coding on the weekends is incredibly difficult given the people vying for attention, and family time is something I generally look forward to, so unless something is broken or a really quick job, there won't be any new features until Monday at the very earliest … and I'm okay with the delay. Although it's strange to say, I might be getting better at being offline for much of the weekend.

Last August, when Reiko shattered her phone and she had to use mine for a while, I made the conscious decision to be offline a little more often. While having a mini-tablet with an always-on network connection was nice, it didn't make sense to pay $50 a month for the phone and data plan. I work from home, which means that my devices are either connected directly to the network via CAT6 or connected to the WiFi. When I go out for a walk, I'm out for less than an hour. If I can't be offline for 1 out of 24 hours a day, then there's a problem.

As one would expect, there was an adjustment period where I had to remember that random trivia questions that popped into my head couldn't be quickly researched while walking Nozomi. Not having the ability to check the global timeline on 10C or post an update was a little annoying. But all of these things were relatively easy to overcome[1]. Right now I have no plans to pick up a SIM card for the phone, nor do I see why I should pay crazy rates for data and the very occasional phone call. For my current use case, a phone plan is just bad value.

Without the digital tether, I find that when I'm with the boy or Nozomi, I am more present. The phone is now just a camera that plays podcasts when I'm not at home. Inside the house the unit is also good for messaging, reading RSS feeds, and using the browser. This is ultimately a good thing as it means that I have the opportunity to focus a lot more on what I'm doing rather than what's going on elsewhere. Being present is important, particularly given how rare it seems to be in this part of the country.

Not everyone can go without a phone, nor should anyone ditch their digital devices just because there have been some positive results from this change in my life. Being able to spend more time with the family is great, and not being distracted means that when I sit down at the notebook I can focus more on writing code or working with databases, but I'm just an edge case.


  1. I'll admit that I sometimes "cheat" and bring the corporate iPad out, which has a data connection. This is generally only done when I'll be away from the house for more than an hour so that I can deal with some limited problems at the day job should there be any server trouble.

Documentation

In order for any bit of complex software to be better understood and effectively utilized, documentation must be made available to the people who will use the tool. Unfortunately, documentation is the least favourite task that faces every developer. I can count on one hand the number of full-time software people I know who actually enjoy putting the code editors away to instead write complete sentences. There are automation tools out there that will try to write the documentation for you, but these can only go so far. At the end of the day, the best author is going to be someone — or a collection of someones — who have a good understanding of the system … which can certainly be a problem for tools that are created by one or two people.

At the day job I'm fortunate enough to be in a position where I get to make new software every couple of months. These are tools that don't exist one day, then spring into existence 20 minutes after a meeting that sanctions their creation comes to an end. A lot of the tools start out small with just three or four functions that are relatively intuitive to anyone who has worked for my employer for a couple of months. However, as people begin to use the simple system and ask for "just one more thing", the software becomes more complex. The rules become more opaque. The volume of emails from people asking how to do something becomes unworkable. By this point, documentation is not only needed, but late.

Near the end of last year I was given the opportunity to create a piece of software that would be used by colleagues all over the world and integrate with our HR systems. After a couple of discussions with the project owners it became clear that documentation was something that couldn't wait until later; it needed to be part of the development cycle[1]. The HR department wanted a Word file that could be updated easily and sent out as a PDF to everyone who used the system. I balked at the idea and suggested that documentation be built right into the application, complete with screenshots, videos, and links to the pages being discussed. The management wasn't keen on the solution initially, but they quickly saw the benefit once the feedback started coming in. People were actually reading the documentation that was going up, and they thanked the HR managers for making it happen so quickly.

Score one for preparedness.

In addition to this documentation, though, is the developer documentation. This is generally something that doesn't get seen by people but, because this HR project is owned by HR, some key people have access to the GitHub repository where the source is kept. These people have been reading the commit messages, Wiki pages, and Issues, and they're quite impressed with the level of detail that goes into the internal docs.

Writing a great deal on GitHub is nothing new for me, as it's sometimes necessary to have a single place where the rules and reasoning behind certain design decisions are stored. To help future me, I try to include screenshots and lists of reasons why some functions or classes were created the way they were. If something is particularly complicated, then the messages in the commits will be a little more colourful than the dry words found in the Wiki or supporting Markdown files. This is something I try to do with all of my applications, as most of them start out small and simple, then quickly start battling scope creep as more functionality is built in. There's just one problem, though: there's almost nothing (documentation-wise) for 10Cv5 in GitHub.

The vast majority of the notes for v5 have been written to A5-sized notepads and I'm not yet 100% sure how this information will get shared with the world in a readable format. Scanning with OCR could work to a certain degree, but these notes are not always written with complete sentences (or grammar) in mind.

Documentation for v5 is slowly being released with more going out every few days. Regardless of how many people use the system, having it documented will make it easier for anyone to understand how and why it does what it does. Had I been as proactive with the v5 documentation as I have been with the day-job projects, then there would likely be less missing from the platform[2]. Fortunately there is still time to remedy this issue.


  1. Generally this is the rule for larger organisations.

  2. One interesting thing that I have noticed is that by writing documentation for the system, I get to revisit the core functionality with a semi-fresh mind. If something doesn't make sense when I'm trying to write it down, then that's a pretty good indication that something can be improved.

Missing Chronology

Last month when someone wanted to find a specific post on my blog they would open the archives page, type in a few keywords, and let the incomplete search mechanism try to find the item they were looking for. If that didn't work, then clearing the filters and scrolling down would show every post in reverse chronological order going all the way back to April 1979. The default blogging theme on v5 works a little differently in that the search box is available on every page and, unlike the previous mechanism, will actually result in a database search. As people had a way to find items on a site, it never crossed my mind to build a page showing a site's table of contents until Larry reminded me.

Whoops.

Fortunately, building a page like this isn't incredibly complicated. The fact that the archives page does not need a search box also means it's possible to change how the page displays information. But how could the information be changed to show things that people might want to see? I thought about this question a bit this weekend and came up with this:

The Anri Archives Page

There were a couple of things that I liked about the previous design:

  1. posts were numbered
  2. posts were grouped by month, with the month being a title
  3. grouping was done based on the time zone of the reader, not the author

These three features needed to be brought forward with the understanding that Bookmarks and Quotations would also appear on the archives page. Social posts, called notes, are not visible in the archives as this would be noise. Should there be a need to see all social posts in reverse chronological order, there is always the Notes page.
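
As a rough illustration of the third point, here is a minimal sketch of grouping posts by month in the reader's time zone rather than the author's. The function name and the (timestamp, title) shape are hypothetical, not the actual 10C code, and it assumes every timestamp is stored as a timezone-aware UTC value.

```python
from datetime import datetime, timedelta, timezone


def group_by_month(posts, reader_tz):
    """Group (published_utc, title) pairs by (year, month) in the reader's zone.

    Hypothetical helper: posts are assumed to carry timezone-aware datetimes.
    Groups and the titles inside them are returned newest first.
    """
    groups = {}
    for published_utc, title in sorted(posts, key=lambda p: p[0], reverse=True):
        # Convert to the reader's clock before deciding which month it falls in.
        local = published_utc.astimezone(reader_tz)
        groups.setdefault((local.year, local.month), []).append(title)
    return groups


# A post published at 16:00 UTC on May 31 lands in June for a reader at UTC+9.
example = [(datetime(2019, 5, 31, 16, 0, tzinfo=timezone.utc), "Late May")]
reader_in_japan = timezone(timedelta(hours=9))
june_for_japan = group_by_month(example, reader_in_japan)
```

The same post would be filed under May for a reader at UTC-5, which is the whole point of grouping on the reader's side.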

The previous version of 10C generally cheated with the archives page by presenting a blank page, querying the API for a list of posts with supporting meta data, then building the results. This works in most situations, but can cause some headaches for search engines that do not parse JavaScript or for people using a browser with JavaScript disabled. To help resolve this, archives are now presented in plain HTML and then modified after the fact.
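
The "plain HTML first" idea could look something like the sketch below. The markup, class name, and dictionary keys are assumptions made for illustration and are not the real Anri theme output. Each entry carries its UTC timestamp in a time element, so a script can regroup or relabel the list afterwards while crawlers and script-free browsers still get every link.

```python
from html import escape


def render_archive(items):
    """Render archive entries as a plain HTML list.

    Hypothetical shape: each item is a dict with 'url', 'title', and
    'published_utc' (an ISO 8601 string). All values are escaped.
    """
    rows = []
    for item in items:
        rows.append(
            '<li><a href="{url}">{title}</a> '
            '<time datetime="{ts}">{ts}</time></li>'.format(
                url=escape(item["url"], quote=True),
                title=escape(item["title"]),
                ts=escape(item["published_utc"], quote=True),
            )
        )
    return '<ul class="archive">\n' + "\n".join(rows) + "\n</ul>"
```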

One item I'm not too sure about at this point is the numbering. As the screen capture will show, the numbers count differently based on the kind of object. Articles, bookmarks, and quotations are all shown with an icon unique to their type, and the counter is for that type as well. Does this make sense? Does it matter whether these are split apart at all? Could everything have the same icon, or none at all, with the understanding that clicking the title will bring you to the author's page regardless of the type? I'm not 100% sure. Fortunately, the community on 10C will let me know when something doesn't quite work or needs improvement.
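
To make the two numbering options concrete, here is a small sketch that counts per type and, for comparison, with a single shared counter. The function names and the (kind, title) tuples are hypothetical, and items are assumed to arrive oldest first.

```python
from collections import Counter


def number_by_type(items):
    """Number each item within its own kind (article, bookmark, quotation)."""
    counters = Counter()
    numbered = []
    for kind, title in items:
        counters[kind] += 1
        numbered.append((kind, counters[kind], title))
    return numbered


def number_shared(items):
    """Alternative: one running counter across every kind."""
    return [(kind, n, title) for n, (kind, title) in enumerate(items, start=1)]
```

With per-type counting, the second article is number 2 even if a bookmark sits between it and the first; with a shared counter it would be number 3.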

The archive theme was deployed with release 19D150, which is live on the server now. Every site with at least one article, bookmark, or quotation will see the "Archives" link in their navigation menu.

Gaps

For the better part of six months, I would keep two browser tabs open on my phone and notebooks for nice.social and beta.nice.social. The first site ran v4 of the platform while the beta ran v5. This was sub-optimal, but allowed for a good deal of testing to take place with the newer software in a realistic setting. Earlier this week when a server update took down the v4 service, the decision was made to move everyone and everything over to the new platform because I felt that it was ready despite a handful of incomplete items. As was to be expected, there were a whole lot more gaps in the tool than I had anticipated.

A good amount of time has been dedicated to migrating data and resolving reported bugs over the last three days and it has brought back memories of many other migrations I've done over the years for personal projects, client projects, and with several employers. When things go smoothly, it means that something is most probably wrong. When things are hectic, it means that something's wrong but the people reporting the issues give a darn. Crazy as it might sound, I generally prefer any sort of migration that is going to involve people who give a darn.

Some of the problems reported include missing posts, broken avatars, missing functions, and site routing issues. When something is reported, I write it to an ever-growing list of tasks, making sure to set aside the time to resolve the matter. If the missing or broken item is actively affecting people, then it gets pushed closer to the top. As of this writing the critical items have been resolved[1], and a half-dozen other issues remain. The ones that will be tackled next include:

  • change the font on the Anri blogging theme to a better sans serif font
  • resolve some of the reported CSS issues on the Anri theme
  • enable messages via the OpsBar (the bar that runs along the top of a 10C site when signed in)
  • return a JSON response for an object with a canonical URL when the HTTP header requests a JSON response
  • enable follow/block lists on the social site
  • complete password-protection handling in the Anri theme
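
The JSON item in the list above is essentially content negotiation on the Accept header. A minimal sketch of the idea follows; the function names and the post dictionary are hypothetical, wildcards such as */* are deliberately ignored, and ties go to HTML so that ordinary browsers keep getting the page.

```python
import json
from html import escape


def parse_accept(header):
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media = pieces[0].lower()
        if not media:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not wanted"
        prefs[media] = q
    return prefs


def wants_json(header):
    """True only when application/json is explicitly ranked above text/html."""
    prefs = parse_accept(header or "")
    return prefs.get("application/json", 0.0) > prefs.get("text/html", 0.0)


def respond(post, accept_header):
    """Return (content_type, body) for the same canonical URL."""
    if wants_json(accept_header):
        return "application/json", json.dumps(post, sort_keys=True)
    return "text/html", "<h1>{}</h1>".format(escape(post["title"]))
```

A client sending "Accept: application/json" would get the JSON form of the post, while a browser's usual header falls through to HTML.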

There are also close to 1800 blog posts that still need to be brought over, and the podcasts need additional work to ensure all of the meta data is imported and sent properly in the syndication feeds. If all goes according to plan, all of the core items will be resolved on Monday or early Tuesday and then the focus can shift from "Identify and Repair" to "Converse and Extend".

If there's one thing I can take away from this experience, it's that I should really look at having data migrated daily in an automated fashion during the development phase. This would ensure that migration scripts were complete, meaning the actual migration would be done at the full speed of the server.
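
A repeatable migration of that sort might be sketched as follows. Everything here is an assumption for illustration: the table and column names are invented, and SQLite stands in for whatever database the platform really uses so that the example runs on its own. The key properties are that the job can be rerun every day without duplicating rows, and that it reports how many source rows are still missing afterwards.

```python
import sqlite3


def migrate_posts(v4, v5):
    """Copy posts from the old database to the new one; safe to rerun.

    Returns the number of source rows that did not arrive (0 means complete).
    Table and column names are hypothetical.
    """
    v5.execute(
        "CREATE TABLE IF NOT EXISTS post "
        "(legacy_id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    rows = v4.execute("SELECT id, title, body FROM posts").fetchall()
    # Keyed on the old ID, so a second run overwrites instead of duplicating.
    v5.executemany(
        "INSERT OR REPLACE INTO post (legacy_id, title, body) VALUES (?, ?, ?)",
        rows,
    )
    v5.commit()
    migrated = v5.execute("SELECT COUNT(*) FROM post").fetchone()[0]
    return len(rows) - migrated
```

Run from a scheduler once a day, a non-zero result would flag a gap in the script long before the real cut-over.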


  1. If they weren't resolved, I wouldn't be blogging.

Server Down

So much for my five nines of availability[1] in 2019. Today I had a couple of minutes between meetings at the day job, so I decided to connect to the web server hosting 10Cv4 and install some operating system updates. This is something that I've done hundreds if not thousands of times with various servers over the years. After the installation scripts completed I saw that I was within the 38-minute "lull period" where traffic to the service is generally at its lowest for a Wednesday and issued a sudo shutdown -r now command, telling the server to reboot.

Less than 30 seconds later I was reconnected and checking available storage space when my phone notified me of an issue with 10C. The site was offline. I checked with the notebook and found that the service was indeed unresponsive. The server was running, as I was connected via SSH. Apache was running on the server. The database was also operating well. But no traffic was being received. I checked to ensure that the firewalls were configured correctly, and that the IP address of the server hadn't changed[2]. I cycled the software. I rebooted the machine. I checked error logs, installation logs, and configuration files. Everywhere I looked, the server appeared to be fine.

Cloudflare's Dreaded Error 523

By this time the service had been down for five minutes and a recovery plan needed to be enacted pronto. There were three viable options:

  1. Restore the VPS: This would essentially see me wipe the server clean and start with a fresh installation of 10Centuries. A backup would be pulled down and restored, returning the system to its previous state seconds before the reboot that brought the service down. Total recovery time: 90 minutes.
  2. Transfer 10Cv4 to the backup VM: As one would expect, I have a virtual machine image set up on the same server that is running 10Cv5. The machine could be brought online in less than 30 seconds with the most recent database restored and ready less than 45 seconds after that. I test this process every morning and it consistently takes between 73 and 75 seconds to complete. Once done, I would need to ensure the routing and forwarding were properly configured on the v5 server, which could interfere with some of the Apache settings that allow v5 to do what it does. Total recovery time: 15 minutes.
  3. Migrate v4 to v5: With the virtual server in Osaka slated to be decommissioned in two weeks when the annual service package expires, the v4 service would have to be migrated to v5 in the very near future anyway. One could argue that it's better to rip off the band-aid now rather than buy time and delay the process any further. Total recovery time: the rest of the day.

Yes, I went with the third option.

While it may not seem like the wisest decision given the lack of complete documentation, the lack of notice, and the stunning lack of functional code in various parts of the system, forcing the migration to v5 should work out to be a net positive. There will be more incentive to complete the outstanding items, as if there wasn't enough already, and it will be possible to see how well the home network can handle the traffic. If problems crop up right away, then it will still be possible to renew the VPS service with the Osaka data centre[3], set up a newer infrastructure, and move everything over as a single package.

This is the plan, anyways. And with everyone on the same version on the same server, there will be a singular place to read updates rather than the plurality of timelines that has existed for the last eight months.

To the people who use 10Centuries on a semi-regular basis, I am very sorry for the downtime and hassle that will come from changing DNS records, workflow processes, source code, and preferences. One thing is for certain, though: once the migration is complete (along with a little more documentation and coding), people will prefer what v5 has to offer.


  1. Five nines generally means a service is accessible and usable 99.999% of the year, which means the system must be down for less than 315.6 seconds per year. My servers can generally shut down and reboot in 23 seconds when everything is running properly, allowing for regular maintenance windows for security patches and other items to be installed.

  2. This would be weird, given that the 10Cv4 server is running in a data centre in Osaka with an IP that hasn't changed in years.

  3. 10Cv4 used a 2G VPS with 50GB of SSD for the web server and a 4G VPS with 100GB SSD for the database server.
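
The five-nines figure in the first footnote can be checked with a little arithmetic. This sketch assumes a year of 365.25 days, which is what produces 315.6 seconds, and uses the 23-second reboot and the 15-minute and 90-minute recovery estimates quoted in this post.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # 31,557,600 seconds


def downtime_budget(availability, period=SECONDS_PER_YEAR):
    """Seconds of allowed downtime for a given availability (0.99999 = five nines)."""
    return (1 - availability) * period


budget = downtime_budget(0.99999)          # roughly 315.6 seconds
clean_reboots = int(budget // 23)          # full 23-second reboots that fit in a year
backup_vm_fits = 15 * 60 <= budget         # the 15-minute option
restore_fits = 90 * 60 <= budget           # the 90-minute option
```

Thirteen clean reboots fit inside the yearly budget, but neither of the faster recovery options does, so the five nines were gone whichever option was picked.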