Twitter Weekly Updates for 2009-05-03

  • I wonder if my leg was tasty after a random aggressive dog bit right into my calf yesterday. A&E for me! Yey!


Twitter Weekly Updates for 2009-04-26

  • I’m all e-commerced out
  • “Soft” by Lemon Jelly is just musical nirvana to my ears. Can’t believe this is the 1st time hearing LJ since Uni. What have I missed?!!


Milo and Mimi photos

So I’ve just ordered some poster photo prints from photobox.com from my gallery…

[Photos: poster_mimi, img_6063-intense]

… can’t wait to see the results :)


Twitter Weekly Updates for 2009-04-12

  • My favourite Linux command of the week: dstat. e.g: # dstat -dnyc -N eth0 -C total -f 5
  • To move, or not to move? That’s a tricky question.
  • Bear McCreary is the fraking daddy. + <3 the BSG version of “All Along the Watchtower”. Good job I’m listening to it on headphones!
  • I remember the days when hip-hop was quality, stuff like GMF and Stetasonic. Too bad the genre sold out. People today don’t know what it is.


Time Travel Cheatsheet

From http://www.topatoco.com:

[Image: time travel cheatsheet]


Twitter Weekly Updates for 2009-04-05

  • We’re now the custodian of a third rescue greyhound, we’re only fostering this one though, and we’ve named him “Buddy”
  • All I feel like I’ve done today is just piss on fires. Servers out of disk space left, right and centre!
  • @stephenpc Yeah, we use Intellipool for most of it, but the disk space was running out too fast for monitoring to be of any real use :(
  • I’m getting ready to help seed CentOS 5.3 prior to release. Batten down the hatches!
  • Nearly 100GB upped on the CentOS torrents so far. Just waiting for the mirrors to update and the release to be made public. I need a holiday
  • I’m not impressed at being woken at 3am for help with a system we don’t support. Lazy, ignorant bastards. I hate doing on-call.
  • I’m looking forward to rescuing 30 dogs on Saturday from a trainer who’s kept them all in a small garden shed. What a bastard.
  • How to force mountd to use a static port on Red Hat http://➡.ws/䆂⌯
  • Operation Seize Greyhounds is in full swing. Just waiting to get my cargo for delivery to the kennels.


How to force mountd to use a static port on Red Hat

So I’ve been working with a very strict firewall on an AIX host which is mounting an NFS share on Red Hat 5.3 hosts. NFS on Red Hat uses the RPC portmapper (port 111) and nfsd (port 2049), both of which are static. Unfortunately it also uses rpc.mountd (aka mountd), which by default doesn’t run on a static port: every time it starts up, it binds to whatever free port it gets and registers that with the RPC portmap service.

I just couldn’t have this happening on Red Hat, since the AIX firewall is locked down as tight as can be, with even anomalous outbound tcp/ack’s being disallowed. I know that the portmap service gets its free port numbers from (among other sources) /etc/services so I decided to grab the current port number that mountd was running on…

rpcinfo -p | grep mountd

… and make an entry in /etc/services, in the hope that rpc.mountd would see the mountd entry and automatically use that port number, and only that port number. An example entry:

mountd          672/tcp                         # Rob's Edit - binds mountd to a static port
mountd          672/udp                         # Rob's Edit - binds mountd to a static port

I restarted portmap and nfs, and ran rpcinfo again…

service portmap restart
service nfs restart
rpcinfo -p | grep mountd

… and lo-and-behold rpc.mountd had bound to the static port specified.
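
If you want to re-run that check after every restart without eyeballing the output, something like the following rough sketch would do it. The function names mountd_ports and mountd_is_static are made up for illustration, and it assumes the usual five-column rpcinfo -p layout (program, version, protocol, port, service name); 672 is just the port from the example entry above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): reads `rpcinfo -p` output
# on stdin and prints each distinct port that mountd is registered on.
# Assumes the usual five-column layout: program vers proto port service.
mountd_ports() {
    awk '$5 == "mountd" { print $4 }' | sort -un
}

# Succeeds only if mountd is registered on exactly one port and that port
# matches the one passed as $1. Also reads `rpcinfo -p` output on stdin.
mountd_is_static() {
    ports=$(mountd_ports)
    [ "$ports" = "$1" ]
}
```

Usage would be along the lines of: rpcinfo -p | mountd_is_static 672 && echo "mountd is pinned". If mountd shows up on more than one port, or on a different one, the check fails.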


Note to self… co-incidences in IT DO happen!

Ok, so on Friday I was working from home while recuperating after some surgery (don’t ask). I’m currently working on a large migration project which is really high priority time-scale wise, which is why I was working from home: neither I nor the company I work for can really afford for me to be away from this project for any length of time. So I’m working on a large IBM RS6000 AIX wide node where I need to create an NFS share for the client’s new Red Hat based platform. This required a minor change to the genfilt / mkfilt rules on AIX to allow the new systems to access the NFS shares. I made the one-line change and reloaded the firewall on the system. Unfortunately this made NIS/YP fault and stop responding. That wouldn’t be such a big deal, except that this node is also a NIS server, which meant that users authenticating from a frontend running on a thin node were unable to log in, and that started to cause issues quickly. Fortunately existing users weren’t affected, but newly connecting users weren’t getting on.

As soon as I’d reloaded the firewall I could see that NIS had failed (inexplicably), so I backed out the change. I had to get NIS back online, and reloading the YP services wasn’t working. With the change backed out, I reloaded the firewall again, but this time mkfilt just wasn’t having it. The syntax was fine, but the firewall was now blocking access to all services. Remember, I’m working from home, via an SSH session to a host at work with rlogin access to the wide. As soon as the firewall started blocking traffic my remote session died and I was unable to access it. FUCK!

I get on the phone straight away to the DC and ask a colleague of mine, Brian, to re-run the firewall script from the control workstation, which has a direct, non-IP connection to the wide. About 10 mins later I get a call saying he’s been able to restart the firewall ok, and I can access the server from my connection again. Phew. NIS is still down though and still refuses to start up cleanly. A reboot is in order. By this time, I’m pretty much ready to head into the DC so I can be hands-on with the kit when needed. Brian gets in touch with the client and co-ordinates a graceful shutdown of the databases before we initiate a standard reboot.

By the time I arrive at the DC (15mins away by car fortunately), we’ve managed to arrange “unscheduled maintenance” time, and we bounce both the nodes. Everything comes back up perfectly, and users can log back in just fine. We notify the client, and they can see everything’s ok, the databases have come up and everything’s back the way it was.

I get into finding out what caused NIS and the second firewall reload to spanner completely when we get another call from the company saying that LPR print queue jobs are not being passed from the thin node to a 3rd server which is running Caldera OpenLinux (yeah, I know!). This Caldera box is running Tarantella, which provides client-based printing. Essentially, users print from a terminal on the thin node, which is mapped to a remote print queue on the Caldera server, and the Tarantella server then maps the user’s printer to their print queue on the Caldera server, allowing (in a very round-about way) client-based printing from a terminal. For turn of the century stuff this was quite advanced, since there was no other way to do this dynamically, from a web-based (HTTPS) client, and without setting up static routed print-queues on the node.

So that’s the background. The printing issue had been an intermittent problem since the platform was introduced, and had normally been resolved by a simple reboot of the Caldera server. We decided that since the nodes had been down, this had likely caused a bottleneck between the servers, and that the Caldera box needed a reboot to clear the bottleneck and allow the print queues to start moving again. We bounce the box and the queues are still being held on the thin node. FRAK!

I know beyond a doubt that the issue isn’t software firewall related, since (a) my minor change wouldn’t have affected port 515 communications and (b) the firewall is running ok. My boss, John, had become involved around the time we rebooted both the nodes, as he was interested to know what was going on. After being brought up to speed he was convinced that this was a firewall related issue, since the initial cause was firewall related and I’d asked our network manager to add new rules to allow NFS between the new and old platforms. I knew it was highly unlikely that the problem was a firewall one, since the changes had been backed out and the system was in its normal, default configuration, but John felt that the timing was just too close for it to be a coincidence with anything other than a firewall issue.

It took us a while to look through the firewall rules in place, check whether any hits were being matched on the Ciscos (they weren’t), telnet to ports and so on, all of which was fruitless. It was obvious in my mind that there was something on the Caldera box which was not allowing the LPD daemon to respond properly. After looking through the Tarantella logs I checked /var/log/messages and saw that the LPD daemon faulted at start-up with the error “not enough disk space”. That old chestnut.

After a little more digging, it turned out that the / partition, with only 2GB of space, had slowly been filled up by Apache access and error log files since the early 2000s. Monitoring hadn’t been set up to check disk space usage, which beggared belief, but there it is. The Apache logs had filled the last of the available disk space at pretty much the exact same time as the AIX system had gone down. All that time wasted checking firewall rules, and all the printing problem came down to was a frakking simple thing – disk space.
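
For what it’s worth, even a crude check would have caught this. Below is a minimal sketch, not what we actually run: the function name full_filesystems and the 90% threshold are made up for illustration, and it assumes POSIX df -P output where the fifth column is the capacity and the sixth is the mount point (so mount points containing spaces would trip it up).

```shell
#!/bin/sh
# Hypothetical check: reads `df -P` output on stdin and prints every mount
# point whose usage is at or above the threshold percentage given as $1.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)          # "97%" -> "97"
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}
```

Run from cron as something like df -P | full_filesystems 90 and mail yourself whatever it prints; an empty result means nothing is over the threshold.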

So the moral of this story is, blind co-incidence DOES happen in this profession, and it’s something that I’ll definitely remember for the rest of my career!


Oracle’s Unbreakable Linux not denting Red Hat – CNET News

I just read an interesting article about Red Hat and Oracle’s rip-off (literally) clone, “Unbreakable Linux”

<3 “Red Hat is the trusted brand in Linux, and for good reason. Red Hat’s support policies demonstrate an understanding of what Linux customers require: mission-critical support for mission-critical deployments.”

Article: Oracle’s Unbreakable Linux not denting Red Hat | The Open Road – CNET News.


Exchange Square

Last weekend I was on-call and was working late on Friday, so I decided to take my camera into town to take some photos of Exchange Square, which I walk through every day to get to work from the station. These are the results (click the photo to go to the gallery), and some are my first foray into HDR imaging too…

[Photo: Exchange Square]
