
Category Archive: Work



Dcache &Work Derek on 13 Jan 2006

Friday 13th

Sat in on 2 Tier 2 Deployment meeting sessions: SC4 preparations and What the Tier 1 can do for Tier 2s
csfnfs62’s pools were showing “too many files open” errors on pool usage pages – restarted dcache-pool service
esr, t2k and ilc directories on dCache were not correctly set up, so deleted and recreated them properly – this was sparked by usage from the esr testing people
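
For the “too many files open” errors above, a quick way to see how close a process is to its descriptor limit is sketched below. This is only an illustration, not what was run on csfnfs62: it inspects the current Python process and assumes a Linux box with /proc mounted. Checking another process, such as a pool, would mean reading that process's own /proc entry instead, with suitable permissions.

```python
import os
import resource


def open_fd_report():
    """Return (open_count, soft_limit, hard_limit) for the current process."""
    # RLIMIT_NOFILE is the per-process cap on open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Assumption: Linux, where /proc/self/fd has one entry per open descriptor.
    open_count = len(os.listdir("/proc/self/fd"))
    return open_count, soft, hard


count, soft, hard = open_fd_report()
print(f"open descriptors: {count}, soft limit: {soft}, hard limit: {hard}")
```

A count that sits near the soft limit would be consistent with that kind of error; a restart clears the descriptors but not whatever was leaking them.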

Thursday 12th

Attended GridPP 15
SC3 rerun started; CERN network on end of OPN had expanded and we hadn’t been told/realised, so had to update routing on dCache boxes
Initial rate of 30MB/s – fairly poor
Gridftp doors all fell over when concurrent transfers were raised from 12 to 30, rebooted by GP.
Took decision to add in all possible pools & servers to SC3 activity, on doing so rate leapt up to over 100MB/s

Wednesday 11th

Attended GridPP 15
Built slony rpm
Documented slony building procedure on Gridpp wiki

Dcache &Work Derek on 10 Jan 2006

Cleared out some stores which had got stuck where the file appeared to not have been stored when it actually had
Downloaded Slony and began building it
Asked for GGUS access to see ticket
Had missed a new queue from Grid/Non-Grid stats so had to regenerate.

Dcache &RT &Work Derek on 09 Jan 2006

Reenabled some pools on nfs60 that had gone funny and were holding login slots on gridftp doors open
Cleaned up after our helpdesk and the CA helpdesk decided to spam each other
Ran 2005 Grid vs Non-Grid CPU usage stats
Attended talk on SPEC benchmarks
Did most of work on Laptop as PSU in Desktop was playing up – but is now fixed

Dcache &Work Derek on 08 Dec 2005

(Seem to have got out of the habit of these :-/)

Thursday

TOAST meeting
CMS report transfers going better now
CB reported moving 1 TB overnight at 200Mb/s RAL-RALPP, using FTS configured for 1 transfer.
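
As a sanity check on the figures CB reported, the arithmetic can be sketched as below. It assumes decimal units (1 TB = 10^12 bytes) and that “Mb/s” means megabits per second; neither is stated in the report.

```python
def transfer_hours(size_bytes, rate_bits_per_s):
    """Hours needed to move size_bytes at a sustained rate in bits per second."""
    return size_bytes * 8 / rate_bits_per_s / 3600


TB = 10**12          # assumption: decimal terabyte
MBIT = 10**6         # assumption: Mb/s = megabits per second

hours = transfer_hours(1 * TB, 200 * MBIT)
print(f"{hours:.1f} hours")  # 11.1 hours
```

Roughly eleven hours at a sustained 200Mb/s, which fits “overnight”.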

Wednesday

Attended Gridpp-Storage phone-conference
gridftp doors went funny again, posted to dCache user forum; possibly allowing too many logins and the box is getting swamped, will examine the situation if it reoccurs
Had a look at CMS’ pools’ disk usage – disparity between what dCache reports as free and what is actually free – likely cause of recent failures, asked CMS if they could delete some files to see if it alleviates the problem
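
The free-space disparity on the CMS pools could be checked along these lines. This is a hypothetical sketch: `reported_free_bytes` stands for whatever figure dCache shows for the pool (copied by hand here, since no dCache API is assumed), and the path is a placeholder for the pool's partition.

```python
import shutil


def free_space_gap(path, reported_free_bytes):
    """Reported free space minus what the filesystem says is free.

    A positive result means the reported figure is more optimistic
    than the filesystem, so writes may fail sooner than expected.
    """
    actual_free = shutil.disk_usage(path).free
    return reported_free_bytes - actual_free


# Hypothetical values: "." stands in for the pool partition,
# 50 GB for the figure read off the pool usage page.
gap = free_space_gap(".", 50 * 10**9)
print(f"reported minus actual free: {gap / 10**9:.1f} GB")
```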

Tuesday 6th December

dCache gridftp doors got into odd state but restart fixed them.
Checked ops consoles could raise alarms for correct systems – need to get fezzywig enabled

Dcache &Work Derek on 05 Dec 2005

Monday 5th December

Added 4 new VOs
Tried installing ops console with SL4
Added restricted access yum repository for install poweroff rpm

Friday 2nd December

Further tidying up

Thursday 1st December

Tidying up after dCache upgrade

Dcache &Work Derek on 30 Nov 2005

Upgraded dCache to 1.6.6-1

Dcache &Work Derek on 29 Nov 2005

Friday 25th

Holiday

Monday 28th

Holiday

Tuesday 29th

Preparation for downtime tomorrow

Dcache &Work Derek on 24 Nov 2005

Thursday 24 November

Met with CB, BS and MH about RAL-LCG2 – RALPP transfers after egging on from JC to get started. Getting transfers working should be trivial, getting transfers working at 300-500Mb/s may be harder, but this should be the easiest T2 site to do.
Changed dCache to use a fresh pg database for srm requests; either this or the necessary restart of SRM appeared to reduce load on dcache.gridpp to 1~2 from 4~5, though it has since increased.

Wednesday 23rd November

Cleared out failed CMS transfers for files apparently already deleted
Attended GridPP-Storage phone conference
Attended SC phone conference
Deleted remaining files from SC3 tape test

Dcache &Work Derek on 22 Nov 2005

pathtape server fell over at the weekend – lots of stores piled up. Also some restores, for not entirely understood reasons. My current thinking is that the file got written to a pool which tried to store it and hung (or it got stuck behind other stuck stores); the user accesses the file to see if it’s there; this causes another pool to try to supply the file, and that pool tries to restore it from tape because it thinks it can. That hits some code in our restore script which copes with some backwards compatibility, and caused it to access the pathtape server, which it wouldn’t do normally. But I’m sure dCache shouldn’t allow that: each file has a boolean to say whether it’s on tape or not, which is not set until the tape operation is complete, so it shouldn’t try to restore it, I think.
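
The behaviour I would expect from that boolean can be written down as a toy model. This is not dCache code; `FileRecord`, `on_tape` and `should_restore_from_tape` are names made up for the illustration, and the real decision logic may well differ.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    name: str
    # Hypothetical flag: set only once the tape store has completed.
    on_tape: bool = False


def should_restore_from_tape(record, cached_on_requesting_pool):
    """Restore only if the pool lacks the file and the tape copy is confirmed."""
    return (not cached_on_requesting_pool) and record.on_tape


pending = FileRecord("example.dat")  # store still stuck, flag not set
print(should_restore_from_tape(pending, cached_on_requesting_pool=False))  # False

pending.on_tape = True  # tape operation finished
print(should_restore_from_tape(pending, cached_on_requesting_pool=False))  # True
```

Under this model a file whose store is still stuck would never trigger a restore, which is why the restores seen after the pathtape outage look wrong to me.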

Copied contents of dcache_1 to another dteam pool and shut it down – I want to use the /pool partition as /var/lib/pgsql/data when the great reorg takes place.

Dcache &Work Derek on 21 Nov 2005

Monday morning operations meeting
Upgraded jra1dch01.gridpp.rl.ac.uk to latest dCache release including pnfs-postgresql and postgresql companion.
