The boy will be two in January. This single-digit number has very quickly snuck up with every passing day and every new thing this child has learned. He’s mastered walking forwards and backwards, can climb stairs and furniture, is able to feed himself with a spoon or fork without getting too much on the floor1, can count from 7 to 10 properly and read every number from one to ten just fine, and — interestingly enough — has the wherewithal to quickly mimic what people around him are doing and saying. Like a lot of people his age, his mind is a sponge and is busy making sense of the world. Watching this person go from a blank slate to a semi-autonomous entity is much more interesting now than it was three decades ago when my mother was having children every two years. All this aside, keeping up with this tiny person has been quite the challenge. He just won’t stop talking nonsense with his limited vocabulary and non-existent grammar.

How is it that parents are able to maintain any semblance of sanity when raising multiple children? Is the time away from the house the only thing keeping them sane? How do people who work from home manage it? These are questions I think about when the boy, who has just recently entered his “No!” stage of development, is cooperating about as much as a typhoon in September. For him the day is a nonstop party full of new and interesting things. For me the day is a myriad of cleanups interspersed with headaches, sore ears, work expectations, and half-heard conversations.

In a decade or so the rose-tinted glasses will make this time seem a lot easier than it is. Here in the moment, however, I am amazed by parents who can manage multiple children, the household responsibilities, a job, and anything else that pops up. Extra kudos to the single parents who don’t have a spouse to help out.

  1. Usually.

Soliloquies in the Inbox

How long is "too long" when email is involved? Ask this question to a hundred people and you'll get a hundred answers. However, one of the general rules that I've tried to stick to over the last few years is "anything longer than 3 paragraphs should be handled with a phone call". This generally works quite well when calling people in the same country and can be a challenge when working with people on the opposite side of the planet.

Power of Words

Since joining "The Big Project" at the day job, email has gone from being a simple communications medium to a document distribution mechanism. People ask a few questions using incomplete sentences replete with typos and I respond with what can only be described as a wall of text with the occasional attachment thrown in to supplement the 1000-word monologue. Some colleagues have joked that when it comes time to write the technical documentation for the new systems, they'll just print these electronic missives.

There's no denying that I enjoy putting ideas into a readable format. Depending on the subject and audience, an incredible amount of context will also be shared in an effort to reduce confusion or answer questions that naturally arise from reading about decisions made and directions taken. Context is generally the bulk of any textual message I share unless communicating with someone I know really, really well. Unfortunately, this often creates more problems than it's intended to solve, as people can get confused or begin to ignore my messages due to the sheer length they can reach.

What are the alternatives, though? This large project involves almost 80 people spread across four continents and 13 time zones. While we all technically work for the same organisation, a lot of regions have been operating almost autonomously for a quarter century or more. As we try to consolidate the best processes and procedures from each region into one global system, context plays a huge role in helping people understand why certain decisions were made.

Or so I thought.

This week there have been more than a few people who have asked me to write much shorter emails because they either will not invest the time to read a few hundred words or because they get confused while trying to parse the information. Having a TL;DR at the start of an email is nothing new, but it does make me question if decisions are being made without a complete understanding of the problems.

A silly question, perhaps, as decisions within organisations are often made without a complete understanding. How else can one explain the universality of corporate inefficiencies and office politics?

50 Days

Fifty days ago Jeremy Cherfas wrote about a fun blogging challenge he took part in a decade ago with the goal of writing 50 posts of exactly 100 words for 50 consecutive days. That idea got me thinking about trying it here, albeit without the word limit. Anyone who has had a conversation with me will know that there's a very clear reason why I don't like arbitrary limits on communication. This personal challenge was started fifty days ago and this post makes for 50 in as many days. Success.

An Open Notebook

This does bring me to the next challenge, though. Writing every day is fine for a casual pastime, but something that I've wanted to do for quite a while is to write better posts on topics that give people a reason to invest their time in reading each paragraph. The vast majority of the items I've published on here over the last few years have been essentially stream-of-consciousness posts. Sure, many of them would be written one day, edited another, and published later, but the short pieces could have been done better. Heck, even this post could be written better if I were to properly plan it out on paper, organise the key concepts ahead of time, and write with the beginning in mind. The concept is not at all new as this is how I would often write for the web between 2008 and 2010, otherwise known as "the time before I became really active on Twitter".

One of the primary reasons I'd like to write better isn't to gain readers, but to better organise my thoughts on a topic. This past year I've been studying philosophy — particularly the different forms of existentialism — by reading book after book in my spare time1. The authors of these cognitively deep tomes have given me a great deal to consider. In order to better organise my thoughts on each chapter, I've written notes in an A5 notebook dedicated to the subject. Keywords, relational arrows, questions, quotes, and further areas to explore are just some of what's been written down, each page of which could be expanded to a 1000-word essay given the opportunity. Topics such as the meaning of (my) life, isolation, death, God, friendship, and the forgotten lessons from the 20th century would make for an interesting series of introspective commentaries.

Well ... interesting to me. If anyone else were to find value or entertainment from such a frivolous use of time, then all the better. Regardless of public interest, writing more thoughtfully and purposefully would offer the opportunity to slow down, plan, and clearly enunciate the ideas that are still plastic in the mind. This form of analysis can be quite useful when breaking down a concept into its individual components before holding them up for further exploration. What I like about doing this is the challenge of justifying a position, testing its correctness, and making refinements along the way. Any idea that cannot stand up to a little scrutiny is not worth holding on to. This is what long-form writing enables.

What I am ultimately looking for from my own writing is a better understanding of myself and my place in the grand scheme of things. Stream of consciousness blog posts can offer a future me a glimpse into the mental state of a past me, but they seldom answer the over-arching question of why I thought a given way at a given time, or how my understanding of the world has evolved.

Can the time be dedicated to write proper essays, though? Let's find out.

  1. I will admit that there's not a lot of spare time during the day. When 15 minutes avails itself, I will read a chapter, though.

Why Developers Are Often Frustrated

Today was supposed to be an "easy" day at work. On the agenda were four items:

  • respond to {person}'s email regarding progress report scoring matrix
  • export system usage data from LMS for {person}
  • load CSV1 data from vendor into migration SQL Server instance
  • transform data from CSV into rational information

The first two items would require less than half an hour, the third was expected to take 15 minutes, and the last one would consume the remaining time in the day, as the work that needs to be done can only be semi-automated. A person still needs to sanity check the output and make revisions where necessary.

Unfortunately, that third item refused to cooperate. Regardless of what tool I tried to import the data with, the contents simply could not be consistently read without illogical errors being thrown. I could open the CSVs in Notepad on the server just fine, but the data could not be read by anything else, including SQL Server Management Studio!

This was going to require some pre-processing.

Pulling the data onto my development machine, I took a look at the file and discovered, much to my horror, that the problem boiled down to the encoding of the file. The vast majority of data I work with is encoded as UTF-8, as a great deal of my day involves using non-Latin character sets. The files I needed to inject into a database were encoded as EUCJP-WIN, which is just about the last format anyone should choose for storing anything of value. Before I could continue, all of the CSV files would need to be re-encoded, sent back to the server, and then imported.
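The re-encoding step itself is straightforward. Here is a minimal sketch in Python, assuming its `euc_jp` codec is close enough to the vendor's EUCJP-WIN for the data at hand (the file names are purely illustrative):

```python
from pathlib import Path

def reencode(src: Path, dst: Path) -> None:
    """Read an EUC-JP encoded file and write it back out as UTF-8."""
    text = src.read_text(encoding="euc_jp")
    dst.write_text(text, encoding="utf-8")

# Convert every CSV in a hypothetical dump folder
for source in Path("vendor_dump").glob("*.csv"):
    reencode(source, source.with_suffix(".utf8.csv"))
```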

Annoying, but doable. I had my machine convert the 9 files to UTF-8 and sent the data2 back to the SQL Server for insertion. The first file went in just fine. The remaining eight, however, refused. I looked at the files in Notepad on the server and saw that some of the column names had upper-ASCII characters. Annoying, but workable. I changed the column names, saved the file, and tried the import again.

No dice.

This time there were some rows that had more columns than should have existed, meaning that a value somewhere contained commas and the field was not properly escaped. Given that these files contained hundreds of thousands of rows each, it did not make sense to use Notepad to try and deal with the problem. Instead, I would attempt to solve this programmatically. The sun was going to set soon and Nozomi needed her afternoon walk. What should have been a 15-minute task was now into its fourth hour.

I hammered out some quick code to read through the file line by line, comparing the number of columns to the definition on Row 0, and found that all 8 files had several hundred records with unescaped commas. While I could resolve this by hand, it would take the rest of the day and would be more an exercise in patience training than anything else. The solution I opted for was "more code".

Visually examining some of the bad data, I noticed that the problem only happened with data that belonged to a column with one of three different names. Fortunately, the 8 files did not have more than one of these columns defined, meaning a programmatic solution could very quickly be worked out.

  1. Split the values into an array by using the commas as separators
  2. Moving from left to right, copy the good values into a new row until reaching the offending column
  3. Moving from right to left, copy the good values into the same row until reaching the offending column
  4. Concatenate the values that were not copied, separating them with a comma
  5. Write the new string into the row, properly escaped so that it could be imported into SQL Server
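Those five steps translate into only a few lines of code. A hypothetical Python reconstruction (detecting which column is the offending one, and the final SQL escaping, are assumed to happen elsewhere):

```python
def repair_row(line: str, expected_cols: int, bad_col: int) -> list:
    """Rebuild a row whose value in one known column contains raw commas.

    Values left of the offending column are trusted as-is, values to its
    right are trusted when counted from the end, and everything in between
    is re-joined into the single offending value.
    """
    values = line.rstrip("\n").split(",")
    extra = len(values) - expected_cols
    if extra <= 0:
        return values  # the row already has the correct shape
    left = values[:bad_col]
    right = values[bad_col + extra + 1:]
    middle = ",".join(values[bad_col:bad_col + extra + 1])
    return left + [middle] + right
```

Running this against a row like `a,b,hello, world,c` with four expected columns puts `hello, world` back into a single field, ready to be quoted before being handed to SQL Server.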

Annoying, but doable. The computer is doing all the work with the data validation and correction. I'm simply providing the instructions on how to do it.

Once done, I copied the data back to the server and tried the import. Errors again. This time there was a problem with data types. A table had been auto-created from the CSV, and the system chose the wrong data type for a column that appeared to contain only integers at first but started looking like AB48910 from row 48,913.

Going back to the code, I wrote some additional logic that would read the file into memory, construct a list of column names, then run through all of the rows in the file and examine the data type and its length or maximum value depending on the type the machine thought it was. If an Integer column suddenly became a string, then items would be updated accordingly. The output of this code was an auto-generated SQL Table creation script, minus the Indexes3.
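The heart of that type-inference pass can be approximated with very little code. A simplified sketch of the idea (the real version also tracked maximum values and emitted the full CREATE TABLE script):

```python
def infer_sql_type(values: list) -> str:
    """Guess a SQL Server column type from a column's string values."""
    non_empty = [v for v in values if v]
    # A column is an integer only if every populated row parses as one
    if non_empty and all(v.lstrip("-").isdigit() for v in non_empty):
        return "INT"
    # Otherwise fall back to a string sized to the longest value seen
    max_len = max((len(v) for v in values), default=1)
    return f"NVARCHAR({max_len})"
```

This is exactly how a column of apparent integers becomes an NVARCHAR the moment a value like AB48910 shows up deep in the file.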

Looking at the output, I saw something strange. One of the columns that should have only contained numbers in the range of 58,000 to 114,000 was given a data type of NVARCHAR(25). Why?

Writing yet more code, I listed the non-numeric values in that column and found this: １１２０６６

Full-width ASCII characters where half-width, proper ASCII characters should have been for the numbers. FUN! So the next round of updates to the code would identify values like this and convert them to their proper lower-ASCII equivalents everywhere. Maybe now it would be accepted and imported into the database?
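In Python, that full-width-to-half-width conversion is a one-liner thanks to Unicode NFKC normalisation, though NFKC also folds a few other compatibility characters, so the output is worth spot-checking on real data:

```python
import unicodedata

def to_halfwidth(s: str) -> str:
    """Fold full-width ASCII (e.g. １１２０６６) into half-width characters."""
    return unicodedata.normalize("NFKC", s)
```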

No, of course not. There was more dirty data in the CSVs. This time in the form of URL-encoded values in a dozen or so rows where plain Kanji should have been ... like in the other 100,000+ rows from the same file.
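Undoing the stray URL-encoding is another small step; a sketch using the standard library, since percent-encoded UTF-8 decodes straight back to Kanji:

```python
from urllib.parse import unquote

def decode_url_values(values: list) -> list:
    """Turn percent-encoded strings back into plain text."""
    return [unquote(v) for v in values]
```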

Six hours later a solution was completed. I had written a small application for the sake of 8 files that would:

  • generate SQL scripts to drop and create tables complete with accurate data definitions and sensible-length strings when working with NVARCHAR values
  • ensure that 100% of the data going into the database was properly encoded as UTF-8
  • ensure that numbers and the letters A through Z were all half-width, lower-ASCII values
  • ensure that malformed rows were corrected in the event of having too many — or too few — columns of information
  • generate insert statements for the data, grouped by 50 rows, and save the statements to .sql files
  • write a report after each process outlining what was wrong with the CSV so that a bill could be sent back to the vendor for wasting my time
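The insert-statement generation from that list, grouped by 50 rows, might look something like this (a hypothetical sketch; the real code would also escape embedded quotes and handle NULLs):

```python
def batched_inserts(table, columns, rows, batch_size=50):
    """Yield multi-row INSERT statements, batch_size rows per statement."""
    col_list = ", ".join(columns)
    for i in range(0, len(rows), batch_size):
        chunk = rows[i:i + batch_size]
        values = ",\n".join(
            "(" + ", ".join(f"N'{v}'" for v in row) + ")" for row in chunk
        )
        yield f"INSERT INTO {table} ({col_list})\nVALUES\n{values};"
```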

With all of this done, I was finally ready to have the data loaded into the SQL Server database. I copied the files over, loaded them into SSMS4, pressed [F5] to run the statements, and held my breath.

So what should have been a 15-minute job wound up taking just over 7.5 hours in total because the vendor who supplied the data5 seems to have a serious case of GIGO when it comes to customer data. Thankfully, their services are being terminated later this month. May I never have to turn their infuriating garbage into my consistent data ever again.

  1. Comma-Separated Value
  2. All 1.87GB of it
  3. Indexes would be added later manually, because like heck I'm farming that out for 8 tables when a person would be so much better at the job than a couple lines of code
  4. SQL Server Management Studio — a tool for working with SQL Server databases
  5. after we had to wait 12 business days for a data dump

A Little Bit Better

Today, while in Tokyo for a series of meetings, I reached out to a couple of people I’ve openly clashed with in order to get to the bottom of a problem; the company simply has no time to lose to petty politics and protectionist fiefdoms. We were assigned a task nearly a month ago, I’m responsible for making it happen, and they’ve been asked to support by providing the necessary resources when needed. We’re currently three weeks past due on this relatively simple task and, before this week, there did not seem to be any end in sight. Something had to be done.

Over the last year I’ve been investing a great deal of time into becoming a better person and working with different teams of people. This often means finding the common ground and speaking plainly, being careful to not let the politics embedded within various groups get in the way of a positive result. Today I think some very positive progress was made. Now comes the hard part: maintaining that positivity.

There are better uses of time and energy than petty corporate bickering, and these better uses are getting easier to find.

10C Locker

A couple of days ago I released a new feature that's built on the 10Cv5 API called Locker. The concept is hardly original, but it does solve an immediate problem that I've had when trying to work with sensitive data at the day job where colleagues are involved. Generally the problem goes something like this:

  • I need access to a resource or need to see a config file, so I ask the person in charge
  • Assuming all the security conditions are met, they grant access and send the resource via an "anonymous" online service that requires a password
  • I type in the password and see the resource, which then "disappears" in a cloud of binary zeroes

Why this is the process is something I will probably wonder about for a long time. That said, I am not at all comfortable with the idea of putting sensitive data on a website that purports to be a secure way to share information. I do not know the people behind the site, nor do I have any guarantee that data is actually secured when at rest and deleted after being accessed. I am at the mercy of the people who created the site.

So, rather than allow colleagues to continue using services like this when there could be better ways, I decided to invest a couple of hours to build my own version of this service, and it's called Locker.

The Locker Landing Page

While others may ask the very same questions about the security and trustworthiness of this site, I can rest assured that the thing does what is advertised. This first release of the tool does not yet have the "delete on access" feature nor an expiration date for the generated URL, but these will come over the coming days as responsibilities at the day job start letting up. What I like about having this feature built into 10Cv5 is that I will not be the only person to benefit from it. People who choose to self-host the software will have the option to enable the feature if they think it would be of value.

Once the two remaining key features are implemented, I'll consider the feature "complete" and invest in a memorable URL. While I don't expect this sort of service to be popular by any stretch of the imagination, I hope it can provide some value to people who need such a tool.

How It Works

For the moment, encryption is done on the server rather than in the browser. The content can be text of any reasonable length1 and the password can also be of any reasonable length and include multiple input types, including emoji. Once received, a key is derived from the password using OpenSSL ciphers and the content is encrypted with 256-bit AES. From there the content is stored in an encrypted area on the storage medium — not in the database — and the initialisation vector for the decryption is written into the database. The password is not stored anywhere on the server, as this would defeat the entire purpose of the feature.
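The important property is that the password never persists anywhere; only a salt and the initialisation vector sit alongside the encrypted file. A rough sketch of the key-derivation half of that flow — 10C's actual implementation may differ, and PBKDF2 stands in here for whatever the server uses to turn a password into an AES-256 key:

```python
import hashlib
import os

def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte (AES-256 sized) key from a password.

    The salt and the AES initialisation vector would be stored in the
    database; the password and derived key are discarded after use.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)

# A fresh salt per secret means identical passwords yield different keys
salt = os.urandom(16)
key = derive_key("correct horse 🔑", salt)
```

On a decryption request, the server would re-derive the key from the submitted password and the stored salt; a wrong password simply produces a key that fails to decrypt the content.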

Once the encryption is complete, a URL is returned and people can send that to the intended recipient. At the moment it's a pretty long URL, but I do plan on picking up a convenient domain so that short-links can be generated for this and other parts of the v5 system.

When a person visits the link to view the content, they're greeted with a password entry field. That string is then tested against the encrypted content and, so long as everything is good, the text is decrypted and returned to the browser.

This system does not require authentication at this point, as that would defeat the anonymity element of the feature, and the API endpoints are also available for people to use if they so choose. API documentation is on the way for v5 and should be online in the coming weeks in preparation for the v5 roll out this winter.

  1. ideally under 100MB

Blogging in 2023

Software is an iterative exercise with wheels being invented, reinvented, and reinvented again by people around the world who are unsatisfied with the existing tools. I was “late” to blogging, joining the communities that sprang up in 2004 on MSN Spaces, and soon jumped over to a self-hosted WordPress installation for a few years before taking a stab at writing my own blogging engine in 2012. In the 14 years that I’ve been blogging quite a bit has changed, but how will it change in the coming five years?

Until a decade ago, blogs were seen as social islands. People would write a post and then directly engage people in the comments. Linkbacks from sites would be followed and long-form conversations on topics simple or complex could take place over a period of days as though the writers were mailing each other. Blogging in 2006 and 2007 was incredibly interesting in this regard. As social networks began to expand and people left their blogs for the more immediate mediums, the writers who continued publishing longer-form items needed to learn how to market their sites in order to gain new readers. People who didn’t do this would see fewer readers and less advertisement revenue. Blogging was for some people quite the profitable pastime up until 2011. In the last five years the medium has seen a slow evolution to where we are today, with dedicated authors sharing their ideas, knowledge, and photos with anyone who might be interested on sites with a limited amount of advertising and a good amount of social service integrations.

Over the last few years there has been a growing community of people who want to have a great deal more control over the words, photos, and videos they share online. This IndieWeb movement shares a lot of the ideals that I strongly believe in. People should “own” the items they choose to put online and not be tracked every step of the way. Is this the future of blogging, though?

One of the biggest problems that bloggers face around the world is not tracking by the Silicon Valley organizations, but the active monitoring and censorship enforced by their governments. News coverage of bloggers in China, Egypt, and Iran being arrested and thrown in jail for years at a time for being critical of the people leading their countries is nothing new, and some countries like Tanzania require that bloggers first acquire an expensive license before they can publish words online. Preventing people from communicating is about as difficult as preventing people from procreating. Humans are genetically programmed to do both whenever it is feasible … and sometimes when it’s not. While modern blogging is essentially “complete” for people with a greater degree of freedom than most, it’s still very much a problem to be solved where authoritarian and paranoid governments exist. It is not enough to post under a pseudonym to a service hosted in the US or elsewhere. How can the medium evolve to better work for writers who face persecution?

Protected blogging is a problem I’ve considered from time to time for a number of years and I consistently return to the same possible solution, which is to use a mechanism similar to BitTorrent where people use their computers, phones, or — if they choose — servers to share and store posts written by anyone in the world. There would be no servers to shut down, because these posts wouldn’t necessarily be on the public web. Censorship would be more difficult and people could post without worrying about data being traced back to them, as their machine would appear as just another node.

But then I think of all the problems with this approach. The lack of any sort of trust mechanism, making misinformation easier to disseminate under other people’s name. The lack of visibility for the posts that the world might need to know about. The bandwidth and storage requirements that people would face just keeping up with a large influx of posts from people around the world. The list goes on.

None of these problems are insurmountable, but they would require some serious consideration. There would also be the problem of ownership on these items. If people are posting anonymously under a pseudonym (or handle as it were), could they “own” the posts they share? Would ownership matter to someone being persecuted for disagreeing with their government?

For many people in wealthier countries, blogging likely looks much the same today as it did five and ten years ago. Like word processing and spreadsheets, it’s a “solved” technology. For many, though, the platform is not quite ready. Will it be in five years?

Old Notes

Every so often I stumble across a cache of outdated notes that offer a peek into the problems my mind worked on at some point. These notes are dated, colour coded, and replete with check marks in tiny boxes denoting accomplishments. Items without check marks are either crossed out with a reason written to the side or copied to a new page where they are then completed. What I find interesting about these notes is the opportunity to re-examine the decisions made or work performed with a little bit of hindsight to see what worked and what needed refinement later.

Today’s notes were from May of this year when I was working on performance optimizations for 10Cv5. There were sketches outlining how the database would need to be altered and how triggers would be used to simplify the writing activities from the API. The structures that allow the system to be as performant as it is today were all written there on paper, and only one had seen any sort of tweak after deployment. All in all, the concept was solid. I wish all of my development work could be as successful.

Generally my day-to-day notes are destroyed at the end of each week to ensure that any sensitive data from the day job or clients is not lying about. However, sometimes a handful of pages are found buried in a bookshelf or in a shoulder bag I had used at the office. Having this opportunity to reflect on what worked and what didn’t allows me to see where my skills may have improved over time and where I make repeated mistakes. I should probably stop destroying 10C-related notes just so that I have a reference point to look at in the future when writing some documentation about my systems.
