This morning Ars ran an article about a mobile application with a wide-open API and hard-coded passwords, which resulted in some social media fireworks and hurt feelings. The security problems are the sort of thing one might expect from a new developer or a one-person development group within a small organisation, but they are by no means unique to such teams. Quite often I have stumbled across similar problems when joining or taking over projects at the day job, and it just goes to show that creating secure applications is not at all easy but should be something we constantly work towards.
Today I put the finishing touches on a new feature for an HR-owned project at the day job that is used by just about every employee at manager-level or higher across the globe. People have been asking for a way to upload files to the application, have them appear on reports, and make them downloadable to the appropriate people. In addition to this, we need to know who downloaded each file and when. None of this is particularly difficult, so I decided to make the download mechanism a little more interesting by adding the following rules:
- each download link must be unique
- a link is valid for a maximum of 15 seconds
- links must be used by the same account that requested them
- links cannot be guessable
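To make the four rules concrete, here is a minimal sketch of how they could be enforced. It is not the actual HR system code: the `DownloadLink` record, the in-memory `LINKS` store, and the function names are all hypothetical stand-ins for what would normally be a database table and API handlers.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional

LINK_TTL_SECONDS = 15  # rule: a link is valid for a maximum of 15 seconds


@dataclass
class DownloadLink:
    token: str
    account_id: str
    file_id: str
    expires_at: float
    used: bool = False


# Hypothetical in-memory store; a real system would use a database table.
LINKS: Dict[str, DownloadLink] = {}


def issue_link(account_id: str, file_id: str, now: Optional[float] = None) -> DownloadLink:
    """Create a unique, unguessable, short-lived link bound to one account."""
    now = time.time() if now is None else now
    # 32 random bytes from the OS CSPRNG: unique and not guessable in practice.
    token = secrets.token_urlsafe(32)
    link = DownloadLink(token, account_id, file_id, now + LINK_TTL_SECONDS)
    LINKS[token] = link
    return link


def redeem_link(token: str, account_id: str, now: Optional[float] = None) -> Optional[str]:
    """Return the file id if the link is valid for this account, else None."""
    now = time.time() if now is None else now
    link = LINKS.get(token)
    if link is None or link.used:
        return None  # unknown token, or already used once
    if link.account_id != account_id:
        return None  # rule: same account that requested the link
    if now > link.expires_at:
        return None  # rule: 15-second lifetime
    link.used = True  # single use
    return link.file_id
```

The `now` parameter exists only so the expiry logic can be exercised without waiting on a real clock.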
The HR system is running on a couple of Amazon servers and files are to be stored in a locked-down S3 bucket. In order for files to be downloaded, they must first be copied from the S3 bucket to the web server, then sent on to the recipient if they're using one of our white-listed source IPs.
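The source IP check could look something like the sketch below, which uses Python's standard `ipaddress` module. The network ranges are placeholders taken from the reserved documentation blocks, not our real office ranges, and `is_allowed_source` is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), not real office networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_allowed_source(ip: str) -> bool:
    """Return True only if the address parses and falls inside an allowed network."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # malformed address: reject rather than guess
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Behind a load balancer the address seen by the web server may be the balancer's own, so where the client IP comes from is a deployment detail this sketch leaves out.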
So far so good, right? This is all basic stuff. So when I demoed the system to the HR people and a couple of senior members of IT, I was surprised by some of the questions that came back. After a couple of minutes, they asked me to step through the logic so they could understand how the whole process worked. This is what I told them:
- a list of files is presented on the screen
- a person clicks (or taps) the file they want
- the browser sends a request to the API asking for a link
- the API verifies the account has access to that file and creates a URL record, then sends the information back to the browser
- the browser opens the supplied URL in a separate tab
- the web server receives the request for the file, authenticates the request using session data, confirms the source IP is valid, and verifies the requested URL
- if everything's good, the web server copies the file from the S3 bucket to the server
- the web server records the file access in the database, preventing the URL from being used again and creating a verifiable audit trail
- the web server transmits the file to the browser over HTTPS, which the browser handles as a standard download
- the file is removed from the web server
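The web server's half of the steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the production code: `serve_download`, the shape of the `links` records, and the `fetch_from_bucket` callable (standing in for whatever copies an object out of S3) are all hypothetical, and the file is read fully into memory where a real server would stream it.

```python
import os
import tempfile
import time
from typing import Callable, Dict, List, Optional, Tuple


def serve_download(
    token: str,
    session_account: Optional[str],
    source_ip_ok: bool,
    links: Dict[str, dict],
    fetch_from_bucket: Callable[[str, str], None],
    audit_log: List[Tuple[str, str, float]],
    now: Optional[float] = None,
) -> Optional[bytes]:
    """Return the file bytes if every check passes, otherwise None."""
    now = time.time() if now is None else now
    link = links.get(token)

    # Authenticate the session, confirm the source IP, verify the URL.
    if session_account is None or not source_ip_ok or link is None:
        return None
    if link["used"] or link["account"] != session_account or now > link["expires_at"]:
        return None

    # Copy the file from the bucket to a temporary path on the web server.
    fd, local_path = tempfile.mkstemp()
    os.close(fd)
    try:
        fetch_from_bucket(link["file_id"], local_path)

        # Record the access: burns the URL and leaves an audit trail.
        link["used"] = True
        audit_log.append((session_account, link["file_id"], now))

        # "Transmit" the file; a real handler would stream this over HTTPS.
        with open(local_path, "rb") as fh:
            return fh.read()
    finally:
        # The local copy is removed whether or not the transfer succeeded.
        os.remove(local_path)
```

Putting the cleanup in a `finally` block means a failed copy or an interrupted transfer still leaves nothing behind on the web server.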
All of this happens in the blink of an eye for the most part, with the most time-consuming aspect being the actual file download. Everything else is just a handful of text characters moving between computers. After I finished going through the process not once, not twice, but thrice, someone asked a question: Don't you think this is a little over-engineered?
It would be far simpler for me to insist the S3 bucket be open to the web so that a direct link to the file could be shared, but that is incredibly risky when working with files that are associated with HR data. It would also be simpler to copy the file to the web server if it doesn't already exist, and leave it there for any subsequent download request. This would save on database queries and ensure that an interrupted download could more easily be continued. Heck, either of these options would be much easier to document and communicate to management, too!
But this is often why corporate systems are discovered to be terribly insecure. Just because something is simple does not mean that it's better. The reverse is also true, in that complexity does not necessarily result in security. That said, so long as I am putting my name next to the work, I'll do what I can to make the system as effective as possible, and the 10-step process I outlined to the managers appears to do the trick.
Later this week I'll write up some documentation that includes a visual depiction of the flow so that the mechanism is better understood by anyone at the day job who wants to know how it's done. That someone will probably be me in six months, when some feature request requires me to understand how the functions work. Do I over-engineer solutions? Most certainly. Is there a chance they'll leak data or otherwise expose the company to risk? Not so long as I do my job correctly.