A while back, I tasked one of my team members with updating the NTP servers used in one of our datacenters. We were using standard pool NTP services and decided to move away from them for various reasons. We found that stable time was more important than accurate time, and the pools definitely didn't add stability. NTP uses UDP by default, and we wanted to turn off/ACL-off UDP in certain networks. So we grabbed a few CDMA-based time servers off of eBay, fronted them with our typical Juniper SRX firewalls, and set up clients to use the SRXs as time sources.
After setting up a few devices, this employee suggested, "Hey, why don't we set this up on a loopback and anycast it?" I thought about it for a second, something else came up, we moved on, and the suggestion was forgotten (by both of us). We had not finished moving everything from flat, layer 2 networks to a true Clos L3LS setup, so the timing wasn't quite right. After finishing the L3LS migration, I looked at this again... and we're very happy with the results.
I will preface this by saying that I am not a DBA. I've run PostgreSQL databases off and on for the past 8 years, but I'm not a full-time DBA. I don't follow all of the ins and outs and daily updates with PostgreSQL. I'm a simple bit pusher who runs a few dozen DBs. And with that out of the way...
I got an alarm last week about high disk usage on one of our PostgreSQL database instances. This was really odd as we've never seen a high disk alarm on them before. Also, it was just the master that was alarming; none of the replicas were alarming even though they have identically-sized drives. If the DB data were truly growing quickly, every machine would alarm. So I checked the DB size graphs in OpenNMS and sure enough, they were OK:
These are 1.6TB drives, so nothing wild going on here. Running du in the pg_data directory showed that the WAL files in pg_xlog were responsible for all of the new disk usage. But why? None of the WAL and checkpoint settings had been changed in months. Nothing new in the output logs. After searching high and low for the actual calculations for retaining WAL files (most of what I found is wrong, including the official docs), I came to the conclusion that our WAL and checkpoint settings were not responsible for the high disk usage. But what was?
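A rough Python equivalent of that du check, as a minimal sketch: it totals the file sizes under each top-level entry of a data directory and sorts them largest first. The function name `dir_sizes` is made up for illustration, and it sums apparent file sizes rather than allocated blocks, so the numbers can differ a little from du.

```python
import os


def dir_sizes(root):
    """Return [(name, bytes), ...] for each top-level entry under root, largest first."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # os.walk does not follow symlinks by default.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    try:
                        total += os.lstat(os.path.join(dirpath, name)).st_size
                    except OSError:
                        # A file can disappear mid-scan (e.g. a recycled WAL segment).
                        pass
            sizes[entry.name] = total
        else:
            # Plain files and symlinks: report the entry's own size.
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `dir_sizes()` on the data directory and printing the first few tuples should make a runaway directory stand out immediately. If pg_xlog is a symlink to another volume, point the function at the link target instead.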
In the world of computing, there are some things which, in Dustin's words, "is not good software." For the past 6 months, I've been messing around with Ceph for object storage. Luckily for me, it is good software. However, it has bad documentation. Sometimes bordering on horrible. This is the sorry state of open source software these days (ever try to use a HashiCorp product lately?). While most things in Ceph are fairly automated, swapping an OSD's underlying storage device is not. Seems like a scale-out storage product would want to nail down the simple act of swapping a dead hard drive, right?
Sadly, the official documentation for swapping a drive is... long and not admin friendly. Seriously, Ceph team, this is your suggested process? Red Hat is now Ceph's corporate overlord and even they think that this is an arduously bad process. Look, they even filed a bug on it: https://bugzilla.redhat.com/show_bug.cgi?id=1210539
Luckily, I've figured out a much easier way. Here it is:
1. On the storage node, find the OSD # to drive letter mapping (sdq is the dying drive):
Remy started to crawl about a month ago, and every parent knows what happens when children start to crawl... they get into everything. Everything. So this meant changing up our existing entertainment area and optimizing for fewer emergency room visits. I'll admit, the old setup just had too much stuff on it in general, and especially too much stuff for him to get into. Here's what it looked like before:
So, out with the old and in with the new, right? Sort of. The old TV was a 37" Westinghouse LCD that I bought over a decade ago. I tend to buy TVs the same way that I buy computers -- I get something better than decent and hold on to it for a while. A 37" LCD might not seem great today, but it wasn't too shabby in 2004. At least it did 1080p and had HDMI (baller!). The new TV panels are all super slim for fashion's sake (more on that later), and while they look really beautiful, they're seriously lacking in the audio department. The old speaker and amp setup (Parasound HCA-1200 mkII and KEF iQ3) also had to go. You can see that a child could easily knock over the stands and get hurt in the process. So goodbye amp and speakers, and time to get a soundbar.
I recently upgraded my Sony Z3 Compact to Android Marshmallow and noticed something really odd. When not on WiFi, I wasn't getting any push notifications or any updates. I'd come home or get to work and then my phone would light up and beep like mad. Later, I discovered that there was no network connection on the mobile network. I could make calls and receive texts, but the data connections just wouldn't work. Luckily, it's an easy problem to fix.
EDIT: It looks like Red Hat pushed a new build that fixes the issue ->
If all of your MySQL SSL clients and replication just broke, I'm guessing that you're running Red Hat, CentOS, or something derived from Red Hat. In short, RH modified OpenSSL to reject Diffie-Hellman (DH) key sizes of less than 768 bits. Note that this is not the length of your private key. This is the DH key, which is used for Perfect Forward Secrecy.
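To make the distinction concrete, here is a minimal sketch of the kind of size check described above, applied to a DH prime modulus rather than to a certificate's private key. It only uses the 768-bit figure from this post; the function name and the stand-in values are made up for illustration, and this is not how OpenSSL itself implements the check.

```python
MIN_DH_BITS = 768  # threshold described above


def dh_prime_acceptable(p: int, min_bits: int = MIN_DH_BITS) -> bool:
    """Return True if the DH prime modulus p is at least min_bits bits long."""
    return p.bit_length() >= min_bits


# Stand-in values with the right bit lengths (not real primes, illustration only).
weak = (1 << 511) | 1     # a 512-bit number
strong = (1 << 1023) | 1  # a 1024-bit number

print(dh_prime_acceptable(weak))    # False
print(dh_prime_acceptable(strong))  # True
```

A server offering DH parameters below the threshold would be rejected at handshake time no matter how long its certificate key is, which is why the two sizes are worth keeping apart.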
A certified Creole coonass just trying to get by. I live in San Francisco and work as a digital plumber for the joint that runs this thing. (www.weebly.com)