So now the connection was closed, but the program still tried to read data from it. Basically, there was a deadlock. After setting the limit, the problem disappeared.
Marco and I also looked into using the Coda file system for our laptops. We have now requested a server, and hopefully we can start installing next week. This should be really cool, as Coda is a networked file system that syncs when it reconnects. So you can take your laptop home, work offline, and when you come back to work you can keep on working on your big work PC.
I also did some research into shadow-utils and userlib. Without going into too much detail, userlib is really nice. I don't really understand why so many people still use shadow-utils. I am currently lobbying for userlib to become the standard at CERN.
I started thinking about disaster recovery and disaster management. I wrote a script that runs on a server and queries the LDAP server every 15 minutes for its entries, then creates /etc/passwd, /etc/group and /etc/shadow from them. So in the unlikely event that LDAP goes down while Kerberos is still up, the files can just be copied to all the machines and users can still use them.
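To give an idea of the file-generation step, here is a minimal sketch in Python. It is not the actual script: the LDAP query itself is left out, and the entries are hypothetical dictionaries using the usual posixAccount and posixGroup attribute names (uid, uidNumber, gidNumber, gecos, homeDirectory, loginShell, cn, memberUid). Putting "*" in the shadow password field is also an assumption, based on Kerberos doing the authentication.

```python
# Sketch only: formats LDAP-style entries as passwd/group/shadow lines.
# The entry dictionaries are hypothetical stand-ins for LDAP query results.

def passwd_line(entry):
    """One /etc/passwd line: name:x:uid:gid:gecos:home:shell."""
    return ":".join([
        entry["uid"],
        "x",  # placeholder; the password field lives in /etc/shadow
        str(entry["uidNumber"]),
        str(entry["gidNumber"]),
        entry.get("gecos", ""),
        entry["homeDirectory"],
        entry.get("loginShell", "/bin/sh"),
    ])

def shadow_line(entry):
    """One /etc/shadow line with nine fields.

    Assumption: "*" (no usable local password), since Kerberos
    is expected to handle authentication.
    """
    return ":".join([entry["uid"], "*"] + [""] * 7)

def group_line(entry):
    """One /etc/group line: name:x:gid:member1,member2."""
    members = ",".join(entry.get("memberUid", []))
    return ":".join([entry["cn"], "x", str(entry["gidNumber"]), members])

def render(entries, formatter):
    """Join the formatted lines into file content with a trailing newline."""
    return "".join(formatter(e) + "\n" for e in entries)

# Hypothetical sample data
users = [{
    "uid": "alice", "uidNumber": 1001, "gidNumber": 100,
    "gecos": "Alice Example", "homeDirectory": "/home/alice",
    "loginShell": "/bin/bash",
}]
groups = [{"cn": "users", "gidNumber": 100, "memberUid": ["alice"]}]

print(render(users, passwd_line), end="")
print(render(users, shadow_line), end="")
print(render(groups, group_line), end="")
```

In a setup like this, the rendered text would be written to a staging directory rather than directly to /etc, so the files are ready to be copied out when needed.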
I started to have a look at the Quattor sendmail component, which automatically configures the sendmail program. The syntax of the sendmail config file is really horrible. But more to come about this. While writing this, I am waiting for my sendmail patches to be committed to the test cluster. Through some minor changes I reduced the run time from a bit over a minute (real 1m12.017s) to under a second (real 0m0.875s).
I also attended quite a few meetings and a talk about the new CASTOR scheduler.
I was quite happy to hear that the average uptime is 99.73% for the machines my department manages.
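To put that number in perspective, the small Python sketch below converts an uptime percentage into hours of downtime. The one-year period is an assumption; the figure above does not say over what time span the average was taken.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year as the period

def downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime implied by an uptime percentage over a period."""
    return (100.0 - uptime_percent) / 100.0 * period_hours

print(f"{downtime_hours(99.73):.1f} hours per year")  # 23.7 hours per year
```

So 99.73% works out to roughly one day of downtime per machine per year.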