Category Archive: Dcache
Dcache & Work Derek on 13 Jan 2006
Friday 13th
Sat in on 2 Tier 2 Deployment meeting sessions: SC4 preparations and What the Tier 1 can do for Tier 2s
csfnfs62’s pools were showing “too many open files” errors on the pool usage pages – restarted the dcache-pool service
esr, t2k and ilc directories on dCache were not set up correctly; deleted and recreated them properly – prompted by usage from the esr testers
Thursday 12th
Attended GridPP 15
SC3 rerun started. The CERN network at the end of the OPN had expanded and we hadn’t been told/realised, so had to update routing on the dCache boxes
Initial rate of 30MB/s – fairly poor
Gridftp doors all fell over when concurrent transfers were raised from 12 to 30; rebooted by GP.
Took the decision to add all possible pools & servers to the SC3 activity; on doing so the rate leapt to over 100MB/s
Wednesday 11th
Attended GridPP 15
Built slony rpm
Documented slony building procedure on GridPP wiki
Dcache & Work Derek on 10 Jan 2006
Cleared out some stuck stores where the file appeared not to have been stored when it actually had been
Downloaded Slony and began building it
Asked for GGUS access to see ticket
Had missed a new queue out of the Grid/Non-Grid stats, so had to regenerate them.
Dcache & RT & Work Derek on 09 Jan 2006
Re-enabled some pools on nfs60 that had gone funny and were holding login slots open on the gridftp doors
Cleaned up after our helpdesk and the CA helpdesk decided to spam each other
Ran 2005 Grid vs Non-Grid CPU usage stats
Attended talk on SPEC benchmarks
Did most of the work on the laptop as the PSU in the desktop was playing up – now fixed
Dcache & Work Derek on 08 Dec 2005
(Seem to have got out of the habit of these :-/)
Thursday
TOAST meeting
CMS report transfers going better now
CB reported moving 1 TB overnight at 200Mb/s RAL-RALPP, using FTS configured for 1 transfer.
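A quick sanity check of the figure CB reported, as a minimal sketch. It assumes 1 TB means 10^12 bytes and Mb/s means 10^6 bits per second; the function name is made up for illustration.

```python
def transfer_hours(size_bytes: float, rate_bits_per_s: float) -> float:
    """Hours needed to move size_bytes at a sustained rate given in bits per second."""
    seconds = size_bytes * 8 / rate_bits_per_s
    return seconds / 3600


# Assumption: decimal units (1 TB = 1e12 bytes, 200 Mb/s = 200e6 bits/s).
hours = transfer_hours(1e12, 200e6)
print(f"{hours:.1f} hours")  # prints "11.1 hours"
```

Roughly 11 hours at a sustained 200Mb/s, which fits "overnight".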
Wednesday
Attended GridPP-Storage phone conference
gridftp doors went funny again; posted to the dCache user forum. Possibly allowing too many logins – box getting swamped. Will examine the situation if it recurs
Had a look at CMS’ pools’ disk usage – disparity between the space dCache reports as free and the space actually free – likely cause of recent failures. Asked CMS if they could delete some files to see if it alleviates the problem
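The kind of comparison made for the CMS pools can be sketched as below. This is only an illustration: how the dCache-reported figure is obtained is not shown here, so it is passed in as a plain number, and the function names and the 1 GiB tolerance are assumptions rather than anything dCache provides.

```python
import shutil


def free_space_gap(reported_free: int, actual_free: int) -> int:
    """Bytes by which the reported free space exceeds what the disk really has."""
    return reported_free - actual_free


def pool_is_consistent(path: str, reported_free: int, tolerance: int = 1 << 30) -> bool:
    """True if the reported figure is within `tolerance` bytes of the filesystem's own."""
    actual_free = shutil.disk_usage(path).free  # what the filesystem says is free
    return abs(free_space_gap(reported_free, actual_free)) <= tolerance
```

A large positive gap means transfers may be directed at a pool that cannot actually hold them.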
Tuesday 6th December
dCache gridftp doors got into odd state but restart fixed them.
Checked ops consoles could raise alarms for correct systems – need to get fezzywig enabled
Dcache & Work Derek on 05 Dec 2005
Monday 5th December
Added 4 new VOs
Tried installing ops console with SL4
Added restricted-access yum repository for installing the poweroff rpm
Friday 2nd December
Further tidying up
Thursday 1st December
Tidying up after dCache upgrade
Dcache & Work Derek on 29 Nov 2005
Friday 25th
Holiday
Monday 28th
Holiday
Tuesday 29th
Preparation for downtime tomorrow
Dcache & Work Derek on 24 Nov 2005
Thursday 24 November
Met with CB, BS and MH about RAL-LCG2 – RALPP transfers after egging on from JC to get started. Getting transfers working should be trivial; getting transfers working at 300-500Mb/s may be harder, but this should be the easiest T2 site to do.
Changed dcache to use a fresh pg database for srm requests; either this or the necessary restart of SRM appeared to reduce the load on dcache.gridpp from 4~5 to 1~2, though it has since increased.
Wednesday 23rd November
Cleared out failed CMS transfers for files apparently already deleted
Attended GridPP-Storage phone conference
Attended SC phone conference
Deleted remaining files from SC3 tape test
Dcache & Work Derek on 22 Nov 2005
pathtape server fell over at the weekend – lots of stores piled up. Also some restores, for not entirely understood reasons. My current thinking is that the file got written to a pool which tried to store it and hung (or it got stuck behind other stuck stores); the user accessed the file to see if it was there; this caused another pool to try to supply the file, and that pool tried to restore it from tape (because it thought it could, which hit some code in our restore script that copes with some backwards compatibility and caused it to access the pathtape server, which it wouldn’t do normally). But I’m sure dCache shouldn’t allow that: each file has a boolean to say whether it’s on tape or not, which is not set until the tape operation is complete, so it shouldn’t try to restore it, I think.
Copied contents of dcache_1 to another dteam pool and shut it down – I want to use the /pool partition as /var/lib/pgsql/data when the great reorg takes place.
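The behaviour expected in the pathtape entry above (no restore attempt until the on-tape flag is set) can be sketched roughly as follows. This is not dCache code: the record type, field and function names are all hypothetical and only model the reasoning.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file's metadata entry."""
    file_id: str
    on_tape: bool = False  # only set once the tape store has completed


def mark_store_complete(record: FileRecord) -> None:
    """Called when the tape store has finished successfully."""
    record.on_tape = True


def should_restore(record: FileRecord) -> bool:
    """A pool should only try a tape restore for files flagged as on tape."""
    return record.on_tape


record = FileRecord("example-file")
pending = should_restore(record)   # store still hung: no restore should be attempted
mark_store_complete(record)
finished = should_restore(record)  # flag now set: restore is allowed
```

Under this model a store that hangs leaves the flag unset, so a second pool would refuse to restore rather than go to the pathtape server.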