Tales of updates and other AWS concerns.
• Mark Eschbach
So it’s been a while. The month of May was a mixed-up blur of new activities, pressing to get things settled, and work. Parts of that weren’t worth writing about, while others were.
Syslog Update
The issues with PaperTrail have mostly settled out. I got our iOS, Ruby, and Python applications logging through libraries I wrote. I submitted a PR for the base Python library to add standard fields, however I’m unsure if it will be accepted. Really strange there weren’t out-of-the-box libraries for these systems which logged everything. I was hopeful PaperTrail’s remote_syslog2 would be able to manage stdout streams; however, it wants to operate on files. Big problem: logging to the file system will consume disk space, and we log a lot. Honestly, half the reason we use PaperTrail is to get the logs off the host and somewhere they are persisted.
While trying to tighten up the migration workflow I ran into a big problem: the migration command will only dump to stdout. In the end I’ll probably write a sidecar utility in golang or C++ which captures stdout and dumps it into a Syslog over TLS/TCP/IP setup. Now that I’ve written several of these libraries it’s become relatively easy. Honestly, I haven’t decided on a language because my requirements are relatively simple: it speaks TLS/TCP Syslog (even if I have to write it), and it builds a static binary depending only on the core OS runtime. Admittedly C++ is a bit of a stretch for the second part since it’s got some binding issues, but for core OS functionality in Docker it’s probably okay.
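To make that concrete, here is a rough sketch of the shape I have in mind, written in Go. Nothing here is settled: the RFC 5424 message layout, the RFC 5425 octet-counting framing, and the SYSLOG_ADDR environment variable are all assumptions for illustration, not a finished design.

```go
// sidecar.go: a minimal sketch of the stdout-to-syslog relay.
package main

import (
	"bufio"
	"crypto/tls"
	"fmt"
	"log"
	"os"
	"time"
)

func main() {
	addr := os.Getenv("SYSLOG_ADDR") // e.g. "logs.example.com:6514" -- placeholder
	if addr == "" {
		log.Fatal("SYSLOG_ADDR must be set")
	}

	// TLS over TCP to the syslog endpoint, using the system root CAs.
	conn, err := tls.Dial("tcp", addr, &tls.Config{})
	if err != nil {
		log.Fatalf("dial syslog endpoint: %v", err)
	}
	defer conn.Close()

	hostname, _ := os.Hostname()
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // allow long log lines

	for scanner.Scan() {
		// Wrap each line of the wrapped process's stdout in an RFC 5424 message.
		// <134> = facility local0, severity info.
		msg := fmt.Sprintf("<134>1 %s %s migration - - - %s",
			time.Now().UTC().Format(time.RFC3339), hostname, scanner.Text())
		// RFC 5425 octet-counting framing: "LEN SP MSG".
		if _, err := fmt.Fprintf(conn, "%d %s", len(msg), msg); err != nil {
			log.Fatalf("write to syslog: %v", err)
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatalf("read stdin: %v", err)
	}
}
```

The idea is to pipe the migration command straight into it, something like `migration-command 2>&1 | ./sidecar`, so nothing ever touches the file system.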
Next up for the Syslog system is a fork of bunyan-syslog with a focus on networking and being a pure JS package. I’ve got nothing against the native linkage, but I’m not sure it makes the most sense since the focus on networking removes the need to link against local libraries. This uses the node version of Tap, so that was an interesting testing framework to use. I think I prefer Mocha. Not sure if that is because I like the drink or because I’m just more familiar with the library.
Exploration of a technique for environmental binding
As part of our migration work I migrated part of an application built by a developer without a great amount of JavaScript knowledge. The application functions as a relay between a system which accepts messages and the system which stores them. The application design was workable, however there were odd chainings of this, attaching methods which didn’t really belong. To test the relay in multiple environments there was a fun set of if statements, some of which I’m more guilty of adding than I would like to admit. In an effort to simplify the system and reduce our maintenance costs I sucked in changes from another developer dropping the ExpressJS portion of the application, and restructured the main relay to use a more sensible single concern for the primary driver.
As a result of the cleanup, the driver is now environmentally agnostic.
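The pattern is worth spelling out. The real relay is JavaScript, but the shape is language-agnostic, so here is a rough Go sketch of what I mean by a single-concern, environmentally agnostic driver; every name in it is made up for illustration.

```go
// A sketch of the binding pattern (names are hypothetical; the real app is JS).
package relay

import "context"

// Source and Sink are the only things the driver knows about its environment.
type Source interface {
	Next(ctx context.Context) ([]byte, error)
}

type Sink interface {
	Store(ctx context.Context, msg []byte) error
}

// Driver has a single concern: move messages from the Source to the Sink.
// It never inspects NODE_ENV-style flags and never reaches into globals.
type Driver struct {
	src  Source
	sink Sink
}

func NewDriver(src Source, sink Sink) *Driver {
	return &Driver{src: src, sink: sink}
}

func (d *Driver) Run(ctx context.Context) error {
	for {
		msg, err := d.src.Next(ctx)
		if err != nil {
			return err
		}
		if err := d.sink.Store(ctx, msg); err != nil {
			return err
		}
	}
}
```

The environment-specific decisions all happen once, at the edge: the test harness hands the driver in-memory fakes, production wires in the real queue and datastore, and the driver itself never branches on which environment it is in.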
Interlude of Work
I spent much of today getting another engineer caught up on the AWS stuff I’ve been up to. Very smart man, however he stated he was very much an application engineer and it will take a while for him to get up to speed. No problem, the biggest problem is usually time.
An article posted by my accomplice in the current AWS architecture work: Footgun Prevention with AWS VPC Subnetting and Addressing. This harkens back to a fairly large problem I’ve found with the general documentation: it always treats a single region as the entire system. This may be great, however if you are truly designing systems to scale then, in my current opinion, that is a horrible idea. We’ll ignore that the system should really be using IPv6 and focus on the division of IPv4. Assuming you are using the largest block of IPv4 private space (10/8), you have 24 bits of addressing to divide up, which is a lot of hosts.
Going back to the original issue at hand, we need to subdivide the available 24 bits into networks fitting the following criteria: AWS regions, AWS AZs, and service types. AWS regions are well known, and more are likely to be added. Region interoperability is probably my weakest argument, as it’s where I have the least experience. My assumption is you will want to establish secure tunnels between VPCs in a variety of regions and will want duplex traffic. I don’t expect AWS to really add more than 32 regions, however they are already at around 15. I’m happy to limit the design to 16 and call for a redesign at that scale.
We’re lucky if any region has 8 AZs. I mean, it’s cool to reap the additional reliability, however if you were going for a realistic design you would only leverage a subset of these. A more conservative view might only reserve 4 possibilities.
So far we have 6 bits if we are erring on the network-conservative side and 8 on the network-greedy side. This leaves us with a subnet mask of /14 or /16. That leaves the underlying networks 16–18 bits. We can loosely divide services into three categories: public facing, public consuming, and private. Public facing are your client-facing load balancers and hosts. Public consuming would include any set of hosts which require access to the public internet. The final group is only accessed via internal clients, the best example being a database. I’m not so sure dividing these classes provides any additional insight or clarity though.
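To make the bit accounting concrete, here is a rough sketch of carving 10/8 along the conservative lines above: 4 bits of region, 2 of AZ, and 2 of service class, which leaves a /16 for every region/AZ/class subnet. The index assignments are arbitrary examples, not a finalized allocation table.

```go
package main

import (
	"fmt"
	"net"
)

// Conservative carve-up of 10.0.0.0/8: 4 bits of region, 2 bits of AZ,
// 2 bits of service class -- all eight bits land in the second octet,
// leaving a /16 (roughly 65k addresses) per region/AZ/class subnet.
const (
	regionBits  = 4
	azBits      = 2
	serviceBits = 2
)

// subnetFor maps (region, AZ, service-class) indices onto a CIDR block.
func subnetFor(region, az, serviceClass uint8) *net.IPNet {
	second := region<<(azBits+serviceBits) | az<<serviceBits | serviceClass
	prefixLen := 8 + regionBits + azBits + serviceBits // 16
	return &net.IPNet{
		IP:   net.IPv4(10, second, 0, 0),
		Mask: net.CIDRMask(prefixLen, 32),
	}
}

func main() {
	// e.g. region 3, AZ 1, service class 2 (the "private" class)
	fmt.Println(subnetFor(3, 1, 2)) // 10.54.0.0/16
}
```

Region 3, AZ 1, service class 2 lands at 10.54.0.0/16, and the region, AZ, and class can all be read back out of the second octet.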
I have some questions for furthering the design of the subnets:
- Does a single load balancer create more than a single host in each subnet? If not, then we should be interested in tuning down the number of hosts in this particular subnet.
- If the database isn’t multi-master, exactly how many are we planning on running in each subnet? Most relational databases run in hot-standby unless you are fancy, and those are still by far the largest in use.
- For a multi-master or clustered database the tuning may be in favor of many small nodes, thus requiring a lot of hosts in each subnet. This is probably most true for massive in-memory databases, but I have little experience with them.
- How many application servers are you truly thinking about running? I know 5 is our upper bound. If you are running at Uber or Google scale this might be drastically different, however for most of us it’s going to be modest unless you’ve chosen the wrong instance size.
Being able to scale up to a size you will never reach is a waste of resources. Try thinking about what you actually use.
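A quick back-of-the-envelope check, using the conservative numbers from above, shows how lopsided the capacity is. The “realistic” count is a made-up illustration based on the questions above, not a measurement.

```go
package main

import "fmt"

func main() {
	// A /16 per region/AZ/service subnet, per the conservative carve-up above.
	hostBits := 32 - 16
	usable := (1 << hostBits) - 2 // roughly; minus network and broadcast addresses
	fmt.Printf("addresses per subnet: %d\n", usable) // 65534

	// Versus what the questions above suggest would actually live there:
	// a handful of app servers (5 is the ceiling today), a load balancer
	// interface or two, maybe a pair of database nodes.
	realistic := 5 + 2 + 2
	fmt.Printf("hosts we'd plausibly run: %d\n", realistic)
}
```

That gap, several thousand times more addresses than hosts, is the waste I’m talking about.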