The Wrong Risk

Last week I put in a formal request at the day job for a new Mac, as the one I'm using is a personal device and does not have the internals required to keep up with 20% of my tasks this year, and roughly 50% of my tasks next year. I was torn between requesting an iMac and a 15" MacBook Pro, given that I tend to do all of my work from home. Portability is unimportant and, for those rare days when I'm called up to Tokyo or sent overseas for meetings, I could bring my 13" notebook. In the end it was decided for me that a MacBook Pro would be the way to go, outfitted with 32GB RAM and a 512GB SSD. Before the request could be submitted, I needed to provide a list of software that I use on a regular basis in order to show why I was requesting an Apple device rather than a company-standard Dell. Xcode was the star of the show on that document, along with Pixelmator, Automator, and Sequel Pro1. Paperwork completed, I sat back to let management do what management does.

This is when things became interesting.

Apparently one of the managers who I have regularly and openly argued with over the years took offence to my request for a $3,400 computer when a "perfectly good" $3,800 Dell could be ordered, pending approval from the company president2. The argument was that, because the company's system management software didn't run on macOS, the company couldn't adequately manage the device remotely. This was quickly shot down as a justification thanks to the hundreds of unmanaged Windows-powered Dells that continue to circulate throughout the organization. Not willing to give up, the next argument was that the support staff in IT didn't have the requisite knowledge to configure and maintain a Mac, so one couldn't be ordered. This, too, was shot down quickly by pointing to an order for a replacement iMac that would be used by someone at HQ. But then came the argument that would apparently solidify the attempted veto of my hardware request: I have too much access to the corporate databases, making me a risk to the company. Therefore, I should not get a Mac but, instead, a very locked down Windows 10 machine that would monitor everything I did and send a detailed summary of my activities every hour.

I had to laugh.

Yes, I do have a great deal of access to company data. I have access to a lot of databases for schools around the globe3, all of which contain information that is none of my business. A leak from any of these systems would be cause for serious concern and would result in a lot of bad publicity and, potentially, lawsuits. A lot of my colleagues would find themselves out of a job, and I would likely be unemployable for the rest of my life. All of this is true. But it's also the wrong risk that any manager should prioritize for employees who have proven themselves time and again to be very aware of the responsibilities they've assumed.

A Greater Concern

Many companies succeed not because of what's in a database but because of the people who invest time and skill in the pursuit of something better. The greatest risk that I pose to my organization is not as an information thief. Selling student lists or employee passport numbers is neither interesting nor worthwhile. If people are going to worry about what damage I might cause, they need to think a little grander. They need to consider what would happen if I left to start my own business4 and took some people with me. This is the risk that I pose my employer, and it's a risk that a lot of companies have to deal with when ambitious people think "Hey, why am I playing the corporate game when four of my hard-working, intelligent colleagues and I all really want to do something else?"

Many of the successful entrepreneurs I've met over the years have the same story. They were unhappy working for someone else for whatever reason, so they left to do their own thing. Some were able to hire former co-workers after a year or two. One that I know of founded their business with two colleagues5. Given my track record, this is what people should be looking out for. Not low-brow theft6.

The manager in question, clearly having nothing better to do with her time this week, has invested quite a few of her hours, and those of her staff, trying to find all the reasons I should not receive the hardware I requested. She's even gone so far as to investigate ways of either limiting my access to systems I am 100% responsible for or tracking everything I do that is related to the day job. The last bit I can kind of understand, despite the horse leaving the barn years ago, but the rest is just pettiness. Our history of disagreements has been documented on this site and at the day job in some detail but, at the end of the day, the risks identified are nothing even remotely close to reality.

  1. These are the core Mac-only applications that I use. The others I can find decent alternatives for if running Windows or Ubuntu.

  2. All hardware purchases over 300,000円 (about $2,800USD) must be approved by two managers and the president. Fortunately, he and I have a pretty good relationship, so there shouldn't be any concerns in that regard.

  3. This access is all done through secured, monitored, remote desktop sessions where copy/paste has been disabled. It's a right pain in the ass, but this means the data never physically resides on my computer. Moving data from the servers to my machine would be spotted pretty easily unless I'm siphoning off a couple of kilobytes of data per hour.

  4. I've openly stated on this site that one of my goals for the near future is to be self-employed.

  5. Their managers must have been quite upset to lose not one, but three competent people at the same time.

  6. I've often said that if I do turn to a life of crime, it's going to be worthy of a Hollywood movie featuring Jason Statham. If my line of work runs the risk of spending years in prison, it better be worth it. Have you ever wondered what might happen if a country's largest bank suddenly lost all of its money in a well-executed digital heist? I have.

Just One More Thing ...

How do people manage to put things away when they're in the middle of the creative process? I've met some pretty interesting people over the years who are able to do a bunch of creative work, get in "the zone", start to make headway … then glance at the clock and head home, leaving the current efforts in a half-complete state. The next day they come back in, look over what they were doing, then pick right up again.

This has always amazed me, primarily because I despise putting things away just because the clock says it's time to do something else. In my mind there's always just one more thing that I'd like to finish before calling it a day. However, as one would expect, there's just one more thing after that. Then another. And another. Eventually the sun comes up on another day ….

Being 40 generally means I'm supposed to be smart enough to know the importance of having a good balance in life. Unfortunately this is something I haven't quite mastered yet.

Rolling Thunder

The weather this summer has certainly been different from the last couple of years. The area had a record rainfall for the month of June, receiving twice as much water from the sky as had ever been recorded1 for the 30-day period. Our winter was much warmer than average and we even had a weak typhoon hit rather late into the rainy season. It's been said before, but something is different.

Looking Westward

Late into the afternoon today we had some rolling thunder. The clouds coming from over the nearby mountains stretched across the sky to the east, leaving the west a nice summertime blue. Every few minutes there would be a rumble that would start low and slow, like a bowling ball gingerly making its way down the lane. Thirty seconds into the buildup the sound would either disappear or sound like a wave crashing into the side of a rocky pier. For two hours we were treated to this odd performance while the sky turned pink from the sunset.

Looking Eastward

Nozomi didn't seem to mind the noise, as the unstable weather made for a pretty decent breeze while we were out in the park. Generally the heat and humidity of the season tend to reach unbearable levels by 9 o'clock in the morning, with air so still that walking through it feels like pushing into a closet full of pillow fluff. Any amount of breeze is better than none, and the puppy certainly enjoyed having her fur cooled a little better while we made the daily trek along the evening course2.

Weather certainly changes over time and anomalies can make for some irregular patterns. What I wonder more than anything is how the farmers are being affected. Vegetables and fruit at the markets have almost doubled in price in the last three years, with apples and peaches selling for about $2 individually. Broccoli is generally sold for $3 while a pack of four tomatoes is $5. Bananas from the Philippines, however, have been stable at $2.50 a bunch for the better part of three years. Higher food prices will drive people to consume more of the processed foods, which is just a poor substitute for farm-fresh products. This isn't a good cycle.

Today's unusual atmospheric show took place at a safe distance, but the changing weather patterns are hitting very close to home.

  1. According to the city, temperature and weather records started in the mid 1800s, though the first 75 years of data is not at all accurate. Temperatures are +/- 5˚C, and rainfall was measured in boolean Yes/No terms.

  2. Nozomi and I have five "courses" that we can take in the park depending on the weather and how energetic she's feeling. Generally in the evening we walk around the baseball diamond as there's plenty to keep her nose busy without tiring her out too much before dinner.

Building Tables from Temp

This week I've been handed an almost impossible task at the day job: build a database containing a subset of information from our current SQL Server-based CMS using the table structures required by the new cloud service. On the surface, this doesn't sound too complicated. So long as a person knows the data structure of both systems, SQL scripts can be written once and used multiple times. The difficult part comes down to time as there are just three working days to get this done for 100+ data tables containing as many as 300 columns of data each, and the documentation for the Cloud objects is … incomplete1.

In an effort to build as much as possible in the least amount of time, I've decided it would be best to "cheat". The first set of SQL scripts that I am writing will collect as complete a dataset as possible for each object and write to a temporary table. As this can sometimes be an iterative process to refine the output, pre-defining the data tables does not seem like a good use of time. Instead, I'd like to simply write a SELECT query with an INTO #tmpWhatever clause to generate a temporary table. When I'm happy with the output, the data is then written to the new table where it will sit until exported.

Now here's the fun part. Because the data is already in a temporary table and because SQL Server makes it really easy to query table definitions, one can have the database pretty much write a table creation script for you.

This is how I do it:

-- One row per column of #tmpWhatever, formatted as a column definition.
-- Temporary tables live in tempdb, so the metadata is read from there.
SELECT '[' + col.[COLUMN_NAME] + '] ' + UPPER(col.[DATA_TYPE]) +
       CASE WHEN col.[DATA_TYPE] IN ('numeric', 'decimal')
            THEN '(' + CAST(col.[NUMERIC_PRECISION] AS VARCHAR(3)) + ', ' + CAST(col.[NUMERIC_SCALE] AS VARCHAR(3)) + ')'
            WHEN col.[DATA_TYPE] IN ('nvarchar', 'varchar', 'nchar', 'char')
                 -- CHARACTER_MAXIMUM_LENGTH is reported as -1 for (MAX) columns
            THEN '(' + CASE WHEN col.[CHARACTER_MAXIMUM_LENGTH] = -1 THEN 'MAX'
                            ELSE CAST(col.[CHARACTER_MAXIMUM_LENGTH] AS VARCHAR(4)) END + ')'
            ELSE '' END +
       CASE WHEN col.[IS_NULLABLE] = 'NO' THEN ' NOT' ELSE '' END + ' NULL' AS [ColumnDefinition]
  FROM tempdb.INFORMATION_SCHEMA.COLUMNS col
 WHERE col.[TABLE_NAME] LIKE '#tmpWhatever%'
 ORDER BY col.[ORDINAL_POSITION];

This query will return as many rows as there are columns in the provided temporary table, which can then be copy/pasted into a partially-written CREATE TABLE statement. This query is going to save me hours of pain this week as I rush to complete things that should have been done weeks ago.
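For anyone who would rather not do the copy/paste step by hand, here is a minimal sketch of how the returned rows could be stitched into a finished statement. It is written in Python purely for illustration; the function name build_create_table, the target table name, and the sample rows are all made up for this example and are not part of the actual migration scripts.

```python
def build_create_table(table_name, column_lines):
    """Join per-column definition lines into one CREATE TABLE statement.

    column_lines is assumed to hold one definition per row, in the shape
    the query above returns, e.g. "[name] NVARCHAR(80) NULL".
    """
    if not column_lines:
        raise ValueError("at least one column definition is required")

    # Tolerate rows that were pasted with or without a trailing comma.
    cleaned = [line.strip().rstrip(",") for line in column_lines]

    # Indent each definition and separate them with commas.
    body = ",\n".join("    " + line for line in cleaned)
    return f"CREATE TABLE {table_name} (\n{body}\n);"


if __name__ == "__main__":
    # Hypothetical rows, standing in for the query output.
    rows = ["[id] INT NOT NULL", "[name] NVARCHAR(80) NULL"]
    print(build_create_table("[dbo].[Student]", rows))
```

The sketch only formats text; constraints, keys, and defaults would still need to be added by hand.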

  1. When I use this word to describe something used in a professional setting, I mean it's untrustworthy or poorly defined. In the case of the data migration documentation, "incomplete" means both.

Why This Place?

Of all the places people can go online, why does anyone come to this place? This question rolled around in my head today while in the shower as a follow-up to other questions regarding my efforts online. Curious about how many people visit, I checked out the stats collected by Cloudflare and found the following:

Web Traffic

Traffic the last couple of weeks has been up, and about six thousand visitors have come to this site in the last 30 days, some of which are likely digital in nature. What's interesting about this is that it's about the same number of visitors I used to see between 2007 and 2009, right before Twitter really took off and "killed" blogging. Looking at the "Recent Popular Posts" down at the bottom of every page, that blasted post from 2009 continues to receive the most traffic followed closely by a long out-of-date tutorial and an editorial that seemed to raise a number of eyebrows at the day job … which was actually the trigger for the question posited at the start of this piece.

People around the world have much more interesting places they can visit, so why come here? Despite efforts to improve my writing style over the last 11 months, there doesn't seem to be much difference in anything published here since 2013 when I gave up long-form writing. The range of vocabulary might have increased as I try to become more precise in my speech, but the excessive comma and relative clause usage that has dogged these posts for years persists. Helpful articles and tutorials have long-since disappeared. Rarely is there a joke or keen insight shared. Very little that is published on this site would look out of place in the opinion section of a small-town newspaper run by volunteers.

To be clear, I'm flattered that there are readers who come here. The chart shows that a minimum of 426 people have visited every day and a little over 6,000 in the last month. Doing the math, 426 visitors a day over 30 days works out to at least 12,780 daily visits, roughly double the monthly total, which means that a good percentage of visitors are return readers. Clearly something is encouraging people to afford me a couple minutes of their day; I just wish I knew what it was.

The rational side of me generally asks the irrational, inquisitive side what value this sort of knowledge would offer. Would knowing people's motivation result in better articles? More focused writing? Encouragement to carry on? Given the patterns on display over the last dozen years of blogging, it's obvious that none of these would happen. Instead there would likely be a less-diverse range of topics and a more critical eye on the perceived value of a given piece. Published content already goes through at least two rounds of vetting before going live, so a third would just sap the fun out of this ongoing project. So the rational side of me knows that there is no long-term value in understanding why people come here more than once.

The inquisitive side, however, is irrational. Maybe it's this that people come to see. By observing my irrationalities, visitors can feel that much more confident about their own sanity.


Headaches have been a part of life for as long as I can remember. These usually start as a throbbing vein just above the left temple — a Temporomandibular Joint (TMJ) headache — before expanding to other parts of the head and becoming a migraine. In a typical week I'll have four or five clusters with one or two full migraines. Only when I reach the point of a debilitating migraine will I take some ibuprofen to reduce the pain. The doctors I've seen over the years have found nothing wrong. My glasses are fine, as are the muscles around the head and neck. The headaches will form regardless of whether I'm using a computer or not, and work does not seem to trigger a higher rate of problems1. This is just a fact of life for me.

One of the first serious TMJ headaches that I remember was in my third year of high school. I came home on a Friday with a throbbing skull. My parents told me it was because I wasn't wearing my glasses2 and insisted I put them on. I went upstairs to my room, climbed into bed, and woke up on Sunday3. This wasn't the first time that I'd lost an entire day while in a comatose state4, but it was enough to trigger me to pay attention to how these headaches formed, evolved, and dissipated.

It wasn't long before the three most common types were identified.

The TMJ Headache

This is the most common for me, where a throbbing or piercing pain starts around the temple and works its way inwards, sometimes feeling as though it's penetrating the ear canal and causing all sorts of confusion and sensitivity to sound. These headaches are likely one of the primary reasons I strongly dislike incoherent noise. Ibuprofen can relieve the pain within 15 minutes or so of taking the pills, while acetaminophen can require as much as 30 minutes before kicking in. Suffice it to say, there is always a supply of ibuprofen in this house.

The Neck Headache

This is one that a lot of people have become familiar with over the years thanks to cell phone usage. When the neck is bent for extended periods of time, it puts a lot of strain on the muscles in our shoulders, neck, and head. This can result in blood circulation issues or muscle strain, which can then evolve into an unpleasant headache. People who use their phones in low-light environments are hit twice as hard because, in addition to a neck headache, they often get to deal with a cluster headache around the eyes. I would often have neck headaches in my youth after playing with the GameBoy for hours on end, and after university when I'd use my Palm handhelds for hours and hours and hours. Over the last couple of years I've moved away from looking down at a device and instead have neck headaches as a result of poor sleeping posture. These are not at all fun to deal with and generally result in loud snoring and a headache the size of an elephant after waking.

The Migraine

Everyone's least-favourite headache. These generally involve the sort of pain that makes a person want to sit in a quiet and dark closet for the rest of the day, and they have become a lot more common since the boy joined the family, as he's yet to learn the difference between an outside voice and an inside voice. A sensitivity to sound and light is very common, as is a loss of appetite and extreme dizziness. Being a parent means that I generally can't disappear from the world for a couple of hours but, when things become really dire, I reach for the noise-isolating headphones and drown out the world with a 9-hour audio track of falling rain. The boy can continue to scream his A-B-Cs, and I can sit at a safe distance and wait for the medicine to kick in.

  1. During holidays and vacations I'm just as likely to have headaches as when I'm sitting in front of a computer for hours on end for the day job.

  2. The same pair that I think I accidentally threw away during a locker clean-out.

  3. Given that I was the family cook, and the person who did a lot of cleaning, this didn't sit well with my sisters who had to pick up the slack while I was out of commission. I never did find out why I wasn't brought to a hospital for being completely unresponsive for 36 hours.

  4. The first time was the result of sunstroke after playing about 12 hours of baseball in the sun without adequate hydration.

Five Things

Yesterday I had today’s post all planned out. The topic was set as the things that have changed in my life since this time last year, and seven items were identified1 with a couple of notes to guide the direction of the section. After writing the post this afternoon, however, I found the piece lacking. It just didn’t sound right. When this happens, the post gets archived and is generally never seen again. Unfortunately this means that another post needs to be planned and written before midnight rolls around.

Luckily this is a Five Things post, which is generally easier to write.

Rather than look at change, which would have me write about a rather sensitive topic that would likely be misunderstood, I figure this would be a good opportunity to look at five inanimate things that make my days just a bit more enjoyable.

Coffee, with a Bit of Milk

I don't drink nearly as much of this wonderful beverage as I used to, but coffee remains one of the indulgent pleasures of the day. A cup with breakfast, a cup after lunch, and — occasionally — a cup around 11:00pm. When I started this addictive habit at the foolish age of 16, I took my coffee the same way my mother did; with cream and sugar. Around 21 this changed to cream only and at 23 I went with regular milk and haven’t looked back.

Boxer Shorts

This could probably be classified under the TMI category, but four months ago I made the switch from briefs to boxer shorts. This is not the first time I’ve switched, but it will likely be the last as none of the inconveniences I had while wearing this style of underwear at 20 have resurfaced. Summers in this part of Asia are no fun at all when the heat and humidity kick in by 8:30 in the morning, and briefs are notorious for trapping heat. Since going with boxers, I have found sitting at the desk for hours on end to be much easier.

A Good Work Chair

Until a month ago, I used a kitchen chair at the work desk. There were a number of reasons behind this, such as avoiding the cost of a nicer chair so soon after moving house. Now that I have a more comfortable working chair, though, my legs don’t lose blood circulation and my back is supported much better. It has already paid for itself because of this.

A 24” 4K Monitor

Two years ago I was fortunate enough to receive a 24” Dell P2415Q monitor at work. Given how much of my day is spent staring at a screen, having a sharp image with no discernible pixelation is crucial. This monitor is generally used for image work, Remote Desktop sessions to Windows servers, and a whole bunch of web development. Without this monitor, my eyes would be a lot more tired by the end of every day.

A 13” MacBook Pro

I’ve used a number of computers over the years, but none have been quite as influential in my life as the 2015-era MacBook Pro that I use on a near-daily basis2. So much of what’s been accomplished in the last four years can be attributed to that specific tool. While it’s certainly struggling to keep up with my current workload, the machine is no slouch and can generally do what I need so long as I give it time to process.

There are certainly a bunch of other inanimate objects that make life more enjoyable, such as my home or the spring-loaded leash that gives Nozomi 5 metres of wiggle room when we go out for a walk. The five listed above are the smaller items that I tend to consciously appreciate on a daily basis. Sometimes it really is the little things that can help someone feel better despite whatever temporary trials life may be throwing their way.

  1. I generally try to come up with more than five, then whittle the options down to the target number based on the decency of the writing. This doesn’t always happen but, when it does, a more cromulent post is written.

  2. I’m technically forbidden from using my computers on the weekend, as it’s supposed to be “family time”. This makes freelance projects harder to complete, but time with the family is generally a good thing.


In just 47 days I’ll have reached my goal of writing and publishing a blog post every day for a year. At a rate of one post per day, the anniversary post will also be the 2,999th blog post published to this site. This is a ridiculous number for a personal weblog, though not without precedent. There are hundreds, if not thousands, of personal sites with far more content than I’ve managed to put out, many of which are probably better focussed than this one.

The idea of writing a post every day seems easy to a lot of people despite the obvious challenges with time, interest, and attention. Back in the mid-2000s when blogging was booming and sites like Facebook and Twitter were bootstrapped operations, there would be regular writing challenges posted to sites like Technorati and Digg encouraging people to participate. As one would expect, the first week would see a flurry of activity. The second week saw a steady stream. The third week would result in a trickle. Eventually the excitement would wear off and people would return to their erratic posting schedules1. Maintaining enthusiasm is hard work and requires a certain level of dedication.

Personally, I’ve found it to be rather difficult at times to write a post on a daily basis. Today is a perfect example of this as I’ve yet to open any of my note-taking applications to jot down ideas for the daily article. Aside from taking Nozomi out for her walks in the morning and evening, I’ve not left the house in three days. Excessive heat and humidity followed by a 30-hour rain storm precluded any sort of outdoor activities. What is there to write about? The need for software to be treated as a craft rather than a job? Ignoring a hierarchy to push change onto a group of individuals? My recipe for French toast that both Reiko and the boy seem to thoroughly enjoy?

Well … that last one might be a worthwhile venture. The others, however, are starting to feel old despite the obvious passion I have for the topics.

Fact of the matter is that I’ve been pushing myself way too hard for way too long and, as the cycle goes, I'm sliding into a state of indifference. In the short term, I don’t see the value of Activity A or B, while the long term demands that both be tended to as I’ve made the commitment to myself, and I’m not going to stop something when the finish line is in sight out of sheer laziness. Future me would be quite upset.

And so I write. I write about writing. I write about fragments of memories from the early web. I write about personal inadequacies. But I push onward — I write — because the alternative would be far more unpalatable than the publication of a repetitive post about fatigue and sloth.

Fortunately tomorrow is Sunday, which means there will be a 5 Things post to write. I have just the topics, too.

  1. A common trope in the early blogging communities would be prefixing a post with an apology for not writing more.

Knowing When to Stop

One of the hardest things to do as a developer is to throw away a large block of code because, despite all the invested time and effort, the results just aren't good enough. This is where I am with one of my work projects despite the dozens of hours invested as it's become a massive time sink with zero appreciable benefits going forward. Yes, with time, I could work through a lot of the issues one by one … but this isn't what I'm being paid to do. My responsibilities involve getting things done, not tinkering about with browser-related edge cases to paper over past decisions. So, regardless of the effort invested, I'm going to throw the code away and approach the problem from a different angle.

Discarded Paper

Back in the mid-2000s, a year or so before this blog was started, I was working on a project at a printing company that aimed to reduce the amount of paperwork pressmen needed to do during their regular workday. As with most corporate projects, the requirements were incredibly complex and seemed to change with every phase of the moon. The codebase started to become increasingly bloated with business rules that would stand in the way of getting work done and, worst of all, the application was starting to consume far too many resources while running1. The PCs at the printing presses would occasionally crash or freeze as a result of the software, which resulted in lost data. Nobody was happy about this.

So, being young, single, and stupid, I decided to invest an entire long weekend into fixing the application. Three entire days were spent at home, in front of the computer, working on solving resource problems through various means. On the first day back I went immediately to the printing plant and updated the application to the latest build. Five minutes into testing, the system showed signs of struggle. Ten minutes in, the computer locked up and blue screened. There was just too much data coming in from the printing press and too many business rules that needed to be run with each operation. Despite working the entire weekend, stopping only for coffee, food, and the bathroom, the problem persisted.

Suffice it to say, nobody was happy about this.

Later that morning I asked a senior colleague to take a look at the code. Within 20 minutes he pointed out a number of areas that could be improved with huge swaths of code being outright deleted. I'll never forget what he said:

You've got recent code reversing out past code just for the sake of holding on to the work you did a few days ago? Don't do that!

He was right, of course. I don't remember why I did what I did, but I remember how I solved the performance problems: I rewrote the program from scratch, using two working days and two nights to get it done. Thursday morning I went straight to the printing plant, updated the application, and watched.

Everything worked as expected. There were a few bugs here and there, of course, but the core application was receiving data from the printing presses, processing it, and saving the results back to the main database. The pressmen were happy for the reduced workload. The managers were happy for the reduced workload. The sales staff were happy for the process run and colour accuracy statistics they could review. All it took was a different set of eyes, being able to step back from trying to force a preconceived notion onto a problem, and recognizing that sometimes it's best to not be too attached to past efforts.

While I don't have a second set of eyes to help with this current project, I can certainly step back and recognize when time is being used in a manner that is ultimately suboptimal. It's time to drop some code and approach the problem from a different angle.

  1. This was back in the day before web applications were a thing. The project was being written in C# with Visual Studio 2005, which had just come out. The target PCs were all 500MHz Celerons with 256MB of RAM, so resources needed to be considered. This was usually enough for most business software, but manager-mandated bloat can do some pretty awful things to code.

Why That Data Sucks

This past week I've invested far more hours than I should have needed to clean up a database in preparation for an upcoming migration. When getting data ready to move from one system to another, there is often a little bit of work that's required to ensure information is not lost and that the most important details are as complete and correct as possible. What I've been doing over the last few days, however, is on a completely different scale. The question that keeps repeating in my head is both simple and absurd: How can a company that deals with long-term, face-to-face interactions operate without ever knowing the names of their paying customers?

A Little Background

At the day job we are migrating our systems from internally-developed solutions to a rather large cloud vendor. This is a bit more complex than spinning up virtual machines and migrating our databases to off-site servers, though, as we're taking our SQL Server, MySQL, PostgreSQL, and ancient FileMaker-based systems and putting them into something that — I think — operates on top of an Oracle database. The work is generally pretty straightforward, though there is a great deal of verification and validation that is necessary to ensure the information we upload is correct.

It's the penultimate step, the verification and validation bit, where I seem to invest the bulk of my time, particularly when it comes to names and contact details.

Every company has its own little quirks about how it uses its databases. Here in Japan, one of the things that has long bugged me has been the way staff at the schools will change a person's name in the database to help with quick identification or search. So, if we have two people named "John Smith", one might be changed to "John Smith (Old)" or "John Smith (Student)". This would be shown in the search results when someone looks for names matching or resembling "John Smith". While this seems like a logical solution to a problem, what this means is that in the database we'll have a last name of "Smith (Old)" and a first name of "John". If there are any reports to print out, the comment in the parentheses is included.

The schools have come up with a whole lexicon of short codes, symbols, and words to help quickly identify customers of all kinds. The first time I ran into this on a large scale was when I started importing customer names from the big CMS into the LMS I developed a few years back. These comments would appear on a teacher's schedule, on attendance lists, and in printed reports that went to the student. This was something I adamantly refused to let happen, so I wrote a little function that would strip the codes out of a name and present just the proper name. It's worked well for several years and the state of the data in the Japanese database, while not perfect, is consistent and reliable. There will not be any problem whatsoever ensuring that the names and other details that we upload to the new system will be devoid of these "meta notes".
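As a rough illustration of the idea, a stripping function along these lines could be sketched in Python. This is not the function from the LMS; the name strip_meta_notes, the regular expression, and the assumption that every meta note is wrapped in parentheses (half-width or full-width) are all hypothetical.

```python
import re

# Assumption: meta notes are always wrapped in parentheses, either the
# half-width "( )" or the full-width "（ ）" used with Japanese input.
META_NOTE = re.compile(r"\s*[(（][^)）]*[)）]")


def strip_meta_notes(name: str) -> str:
    """Remove parenthesised comments and collapse leftover whitespace."""
    without_notes = META_NOTE.sub(" ", name)
    return " ".join(without_notes.split())


if __name__ == "__main__":
    for raw in ["Smith (Old)", "John (Student) Smith", "Smith"]:
        print(repr(raw), "->", repr(strip_meta_notes(raw)))
```

Codes that are not wrapped in parentheses, such as a symbol prefixed to the name, would need their own patterns.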

Knock It Up a Notch

The database I've been working with this week, however, is not from any Japanese system. This means people from a whole different culture and background who have used the same software have created their own form of meta notes over the years … and it's terrifying.

One of the first things that I noticed when working with the database was that a person's entire name was written into the "First Name" column along with some extra details, such as the type of contract they have and maybe even the name of a colleague for when two people are taking lessons on the same contract1. In the "Last Name" column there will be other details, such as a person's family name … or their full name … or their name in the native language … or the name of the employer plus their name … or the name of the employer, the type of contract, and the full name of the student. And I need to parse this out to have given names in their own column, family names in their own column, and names written in the native language in a third.

But wait! There's more!

One of the more interesting uses of the "Last Name" column is a school's habit of writing down a student's relationship to someone else. These are some of the values that I have found in the database2:

  • Tom's sister's friend
  • The mother is taking the class
  • Afternoons at the cafe
  • John's new wife
  • The president of ABC Company

These are notes, but they're written in the "Last Name" column. The "First Name" column will hold the whole name, sometimes in the standard alphabet, sometimes in the native language, sometimes in both, and sometimes with the contract type thrown in as well. When a school is feeling particularly frugal, there might not be a name at all and instead something like "xxx" or just an empty string. As I've already said, I need to provide a proper list with family names, given names, and (when they exist) native language names in separate columns.

Most database people I've met over the years would take one look at this and send it back to the schools, telling them to "fix their crap data", otherwise nothing will be migrated. While I would love nothing more, this is not really an option. Instead I went and created a series of SQL queries that would clean the data as much as possible. After a few days of work, I'm generally confident in 95% of the data. I could go through the last 5% line by line and fix issues by hand when I find them, and I have done this with some of the most egregious issues, but it's not really the best use of my time. There are other countries and other databases that need attention as well.
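The queries themselves are not reproduced here, but two of the checks described above can be sketched in Python as a stand-in: spotting placeholder values like "xxx", and separating text in the standard alphabet from text in a native script. The placeholder list, the function names, and the rule that any letter outside the Latin script counts as "native" are assumptions made for illustration only.

```python
import unicodedata
from typing import Tuple

# Hypothetical placeholder values; a real list would come from the data.
PLACEHOLDERS = {"", "x", "xx", "xxx", "-", "n/a"}


def is_placeholder(value: str) -> bool:
    """True when the field holds no usable name."""
    return value.strip().lower() in PLACEHOLDERS


def has_non_latin_letter(token: str) -> bool:
    """True when any letter in the token is outside the Latin script."""
    return any(
        ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
        for ch in token
    )


def split_scripts(value: str) -> Tuple[str, str]:
    """Split a field into (Latin-script text, native-script text)."""
    latin, native = [], []
    for token in value.split():
        (native if has_non_latin_letter(token) else latin).append(token)
    return " ".join(latin), " ".join(native)


if __name__ == "__main__":
    print(is_placeholder(" XXX "))
    print(split_scripts("Kim Minsu 김민수"))
```

Heuristics like these only narrow the problem. Values such as "Tom's sister's friend" pass both checks and still need a person to look at them.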

Why in the … ?

Blaming the schools for their "crap data management policy" would be easy, but I really don't think the fault lies with the schools. People were compelled to do this in order to get around very real problems with the home-grown corporate software. Problems that could have been avoided had people paid attention, asked questions, and sought solutions.

The key problem is one that I've already mentioned: it's too hard to find the right person when searching by name. This is completely true, and every country has its equivalent of James Smith or Maria Garcia3. The solution is not to mess up a person's name field, though.

What are the options, though?

Having solved this problem a couple of different ways in the past, including in the soon-to-be-retired LMS, I see two relatively quick changes that could resolve the issue.

The first is to make it possible to assign tags to a person's record. The tags could be completely free-form and allow a good amount of text so that contract types, descriptions, and relationships could be easily recorded, searched, and displayed in a results list.

The second is to add a short comment field — distinct from the main comment fields — that would also be part of the search and returned for display in the results list. This option would generally require more processing power, but may be easier to implement within a database.

Either one of these options would ensure that printed reports, attendance lists, and other items showing a person's name are free of superfluous information. The company would win because printed materials would look more professional and teachers would win because they wouldn't have to try and parse the meaning behind the meta. Of course, I would win, too. With less "gunk" to filter out and process, I could more easily prepare data for migration from one system to another.
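Both options can be sketched with a small record type. The Python below is only an illustration under assumed names (Person, tags, short_note, search_label, search); it is not how the corporate software or the LMS is built. The point is that the name fields stay clean for printing while the extra text is still searched and shown in the results list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    family_name: str
    given_name: str
    tags: List[str] = field(default_factory=list)  # option one: free-form tags
    short_note: str = ""                           # option two: short comment

    def display_name(self) -> str:
        """Name as printed on reports and attendance lists."""
        return f"{self.given_name} {self.family_name}"

    def search_label(self) -> str:
        """Name plus tags and note, for the search results list only."""
        extras = [text for text in [*self.tags, self.short_note] if text]
        suffix = f" [{', '.join(extras)}]" if extras else ""
        return self.display_name() + suffix


def search(people: List[Person], query: str) -> List[Person]:
    """Case-insensitive match against the name, the tags, and the note."""
    q = query.casefold()
    return [
        p for p in people
        if q in p.display_name().casefold()
        or any(q in tag.casefold() for tag in p.tags)
        or q in p.short_note.casefold()
    ]


if __name__ == "__main__":
    people = [
        Person("Smith", "John", tags=["Student"]),
        Person("Smith", "John", short_note="Old"),
    ]
    for match in search(people, "john smith"):
        print(match.search_label())
```

Searching for "john smith" here returns both records, and the labels tell them apart without either name being altered.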

Over the years I've seen a lot of very strange things put into a database. By looking at the reasoning behind them, it becomes easier to think about how data can remain complete and valid while also solving genuine business problems. The hardest part is being vigilant and proactive when oddities in the data are discovered and reported.

  1. This is not at all required. The system can have a million people on the same contract. I have no idea why the people who used the database in question have such a fear of making new records.

  2. These are not the actual names, as that would be a giant breach of trust and would justify an immediate firing. The names have been changed, but the gist is completely accurate.

  3. These two names are among the most common in the world according to this blog post.