Gridpp Operations
Alessandra Forti
Feed updated 2024-03-14 05:46 UTC

OpenSSL vulnerability
Published 2009-01-07 21:03 UTC (updated 2009-01-07 21:06 UTC)
There is a new vulnerability in OpenSSL in all versions prior to 0.9.8j, discovered by Google's security team. You will be happy to learn that the Grid PKI is not affected by the vulnerability since it uses RSA signatures throughout - only DSA signatures and ECDSA (DSA but with Elliptic Curves) are affected. (Of course you should still upgrade!)
Posted by Jens Jensen

New MD5 vulnerability announced
Published 2009-01-05 10:52 UTC (updated 2009-01-05 11:12 UTC)
In 2006 two different MD5-signed certificates were created. A new, stronger attack, announced last Wednesday (yes, 30 Dec), allows the attacker to change more parts of the certificate, including the subject name. To use this "for fun and profit" one gets an MD5 end entity certificate from a CA (ideally one in the browser's keystore), and hacks it to create an intermediate CA which can then issue ...
Posted by Jens Jensen

IGTF release 1.26 annotated
Published 2008-12-18 11:00 UTC (updated 2008-12-18 11:04 UTC)
Folks, you probably saw the announcement: the new distribution came out of IGTF on Sunday.
Overall the release is not urgent. I meant to send out an evaluation earlier but ran out of time.
--jens
Annotated changes:
* Added accredited classic Indian Grid CA (IGCA) (hash da75f6a8) (IN)
Good for those folks over in Cambridge working on EUIndiaGrid.
INFN no longer has to be the catch all.
* Updated IUCC root ...
Posted by Jens Jensen

Winning hearts and minds?
Published 2008-12-10 13:51 UTC (updated 2008-12-10 14:23 UTC)
(Not really operations but this is my least-inappropriate gridpp blog for this...)
I was invited by Lambeth City Learning Centre (www.lambethclc.org.uk) to give a talk about "computing and science" to schoolchildren. They were 14 year olds; it's the first time I've given talks to children.
My first main point was that increases in computing performance enable researchers in every single field of ...
Posted by Jens Jensen

September CPU efficiencies
Published 2008-10-09 12:32 UTC (updated 2008-10-09 12:35 UTC)
APEL data provides a view of CPU efficiencies (CPU time/Wall time) across UKI sites. The table here shows the results for September.
Posted by Jeremy

New IGTF distribution 1.25
Published 2008-09-29 08:41 UTC (updated 2008-09-29 08:56 UTC)
What's new? NCHC (Taiwan) is back in, with new keys not vulnerable to the Debian incident. A number of metadata files were updated.
There is a new group, the IGTF RAT (Risk Assessment Team), which covers the whole world, timezone-wise (or close enough). The idea is that when a vulnerability is announced via the IGTF, the RAT assesses the risk and alerts the CAs concerned. The idea, of course, comes from the ...
Posted by Jens Jensen

All green
Published 2008-09-09 19:23 UTC (updated 2008-09-09 19:51 UTC)
GridMap revealed a brief period of all green on 8th September for GridPP sites.
This reflects the hard work many sysadmins have put into getting their sites ready for the LHC first beam on 10th September.
Posted by Jeremy

Play it again SAM
Published 2008-08-21 15:49 UTC (updated 2008-08-21 15:58 UTC)
LHCb have recently made a change to their software install SAM tests - as noted at today's WLCG Daily meeting:
Some LHCB specific SAM CE tests were systematically failing everywhere because of a bug in one of the modules for installing software. These tests have been temporary set "not critical" until a fix will be in place and fully tested.
However, for once Glasgow knew about these failures at 23: ...
Posted by Unknown

A view on Steve's new network tests
Published 2008-07-02 11:08 UTC (updated 2008-07-02 12:20 UTC)
Earlier today Steve released results from his new network tests (see his blog entry). As a first step in looking at the data, the plot above shows the scatter of the 'from site b SE - to site a CE' results. For this view rates are capped at about 100 Mb/s (which impacts the UCLH-UCLC 164.5, RH1-RH1 764.9 and RH2-RH2 243.1 Mb/s results). As Steve mentions, it is too early to draw conclusions but there are ...
Posted by Jeremy

CCRC post-mortem workshop - talk summaries
Published 2008-06-20 10:35 UTC (updated 2008-06-20 10:49 UTC)
WLCG CCRC post-mortem workshop – summary
Agenda: http://indico.cern.ch/conferenceTimeTable.py?confId=23563
First part of workshop was on storage. Generally observed that there was insufficient time for middleware testing.
CASTOR: Saw SRM ‘lock-ups’ – CASTOR network related. The move to SL4 is becoming urgent. There needs to be more pre-release testing. They want to prioritise tape recalls.
[UK: RAL ...
Posted by Jeremy

CCRC Post Mortem workshop (day1 - PM)
Published 2008-06-12 11:37 UTC (updated 2008-06-12 15:36 UTC)
Monitoring
Talk about how everything feeds to gridview, but conveniently skips the lack of detail you normally find in gridview. Mentioned ongoing effort for sites to get an integrated view of VO activity at their site. Too many pages to look at.
Once again 'nice' names for sites were discussed - we want the 'real' name that a site is known as. Network monitoring tools were discussed (well, lack ...
Posted by Unknown

CCRC Post Mortem workshop (day1 - PM)
Published 2008-06-12 10:17 UTC (updated 2008-06-12 10:43 UTC)
More from the CCRC post mortem meeting
Middleware
1) No change from normal release procedures. FTM didn't seem to be picked up as a requirement by Tier-1s (however RAL had their own already in place). Problem was that there wasn't a definitive software list, just the changes needed. CREAM is coming (slowly) and SL5 WN perhaps by Sept 08.
SL5 plans - ATLAS would like to get software ready for winter ...
Posted by Unknown

CCRC Post Mortem workshop (day1 - AM)
Published 2008-06-12 07:41 UTC (updated 2008-06-12 10:16 UTC)
Some crib notes from the CCRC postmortem workshop - these should be read together with the Agenda. Where I've commented, it's parts that I consider particularly useful.
Storage
1) Castor / Storm
GPFS vs GridFTP block sizes - GPFS 1MB, SL3 GridFTP 64K, SL4 GridFTP 256K
Reduced gridftp timeouts from 3600s to 3000s
CMS had problems with GPFS s/w area - latency issue [same as ECDF?]
- partly due to switch ...
Posted by Unknown

Oracle sick at RAL - GOCDB affected
Published 2008-05-07 16:32 UTC (updated 2008-05-07 16:35 UTC)
As has been broadcast several times, RAL suffered a power glitch which has had a bad effect on some services. One of those still outstanding is the GOCDB. See https://cic.gridops.org/index.php?section=rc&page=broadcastretrieval&step=2&typeb=C&idbroadcast=32223 for details.
Posted here to raise awareness as people may use the RSS feed.
Posted by Unknown

Impact of a SPOF
Published 2008-04-28 11:05 UTC (updated 2008-04-29 08:28 UTC)
Last week connectivity to the RAL Tier-1 was affected for many hours due to a large number of firewall connections being left open. The ops SAM tests quickly showed how the Tier-1 can act as a Single Point of Failure with the current setup. The evidence is in this gridpp grid status snapshot in the 24hrs column. Note that Dublin and ScotGrid sites were unaffected as they now run their own RB/WMS ...
Posted by Jeremy

Tier1 Joins the Blogosphere
Published 2008-02-14 11:49 UTC (updated 2008-02-14 11:54 UTC)
Nice to see the RAL Tier-1 site has a blog over at http://www.gridpp.rl.ac.uk/blog/. I've duly added it to the GridPP Planet site and picked up this rather useful snippet about the Areca RAID Cards.
Posted by Unknown

CERN VOMS woes
Published 2008-02-08 11:21 UTC (updated 2008-02-08 11:36 UTC)
People's v-p-inits stopped working - from the CERN VOMS. Turns out the UK CA certs had been taken out, in the mistaken belief they had been revoked... Without consulting me, of course.
Apart from a flurry of email, I called Remi this morning asking him to put them back and he said he'd do that today.
It was the root cert that was "suspected compromised" (but not "compromised", which is why we're ...
Posted by Jens Jensen

evo on osX
Published 2008-02-06 11:41 UTC (updated 2008-02-06 11:50 UTC)
GridPP uses the evo software for most of its videoconference meetings. It normally works pretty well (especially when compared to VRVS / Access Grid) and the audio on my mac laptop is surprisingly good without the need of a headset. However, today's java web start did an upgrade to 1.0.8 and broke the video client - it just refused to start.
Problem is caused by an incorrect shell script test in ~/.
Posted by Unknown

Out in the big world
Published 2008-02-04 17:00 UTC (updated 2008-02-04 17:49 UTC)
Thought I'd start blogging some CA stuff because a lot of things happen which you may not otherwise hear about (or indeed care about, but at least you have the option). CAs are kind of operational so I thought it fits OK here. It ain't storage, that I know (except see below).
Starting in the big world, we now have Ukraine and Morocco on board. I was a reviewer for both. Much of my review work (...
Posted by Jens Jensen

Running Reliable Grid Services?
Published 2007-12-22 15:19 UTC (updated 2007-12-22 15:26 UTC)
Well, since it was Friday night, just before Xmas, I thought I'd do what every sad geek would do, and fire off a batch of grid jobs to last over the shutdown.
However, the voms servers at CERN had other plans:
Contacting lcg-voms.cern.ch:15004 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "dteam" Failed
Error: dteam: User unknown to this VO.
Trying next server for dteam.
Creating temporary ...
Posted by Unknown

-x
Published 2007-11-28 23:29 UTC (updated 2007-11-28 23:37 UTC)
Been debugging apparently slow transfers using FTS (now a 2.0 application....) and discovered a slight change in the UK setup.
Previously I just set the total no. of files per channel, but I noticed that even with that set high, my dteam transfers were still only popping one file at a time off the list.
Turns out that the VO shares now have hard-coded limits, which are only visible with the -x ...
Posted by Unknown

Virtualisation hassles
Published 2007-11-07 10:54 UTC (updated 2007-11-07 11:01 UTC)
Got a shiny new Dell Optiplex 740 (AMD 64 X2 goodness) that I wanted to run request-tracker in a VM on. Installed ubuntu gutsy fine, installed Xen fine, but it just wouldn't correctly boot any domUs to completion. I even tried out KVM, seeing as it's new hardware that has the 'svm' flags in /proc/cpuinfo. That failed miserably at the modprobe kvm_amd stage with kvm: disabled by bios. Most odd as there ...
Posted by Unknown

cfengine gotcha
Published 2007-10-22 20:24 UTC (updated 2007-10-22 20:32 UTC)
Not strictly gridpp, but posting this here may save an administrator some sanity...
I've just spent the day battling against ubuntu on a lab cluster. The server had successfully upgraded from Feisty (7.04) to Gutsy (7.10) and cfengine refused to work. At all. It was only a minor upgrade in cfengine too - 2.1.20 to 2.1.22.
cfagent -qv -d2 hung - even with debugging on cfservd, nothing obvious.
Posted by Unknown

Feed Me!
Published 2007-10-09 19:37 UTC (updated 2007-10-09 19:43 UTC)
At today's dteam meeting, there was a discussion about informing the PMB of any significant changes to the infrastructure. As the SOP is to use the EGEE Broadcast tool on the CIC portal, it'd be nice if there was, say, an RSS feed that we could parse and extract UKI significant info into, say, the planet feed. Anyone care to add comments?
Posted by Unknown

RAL Network performance
Published 2007-09-30 07:28 UTC (updated 2007-09-30 07:38 UTC)
Prior to running any transfer tests, I normally check the Gridmon plots for that site, the relevant T2 and RAL T1. Using the (non-default) options of "Metric data on same graph" and "New graph on new test dest" (and "new test src"), you generally get a good feel for the site's capacity, and any 'slow' links associated with it. Sadly, recently the Tier1 seems to be performing dreadfully WRT the ...
Posted by Unknown
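
A rough sketch of the idea floated in the "Feed Me!" post (parse an RSS feed and pull out the UKI-significant items) might look like the following. This is only an illustration under stated assumptions: the keyword list, the sample feed and the function name uki_items are all made up here, and nothing in the posts says the CIC portal actually offers such a feed or what its items would contain.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical keywords marking a broadcast as UKI-relevant (an assumption).
UKI_KEYWORDS = ("RAL", "GridPP", "UKI", "ScotGrid")


def uki_items(rss_xml, keywords=UKI_KEYWORDS):
    """Return (title, link) pairs for RSS items that mention any keyword."""
    # Whole-word, case-insensitive match, so "RAL" does not hit "several".
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    root = ET.fromstring(rss_xml)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        link = item.findtext("link", default="")
        if pattern.search(title + " " + description):
            matches.append((title, link))
    return matches


# Made-up sample feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example broadcasts</title>
<item><title>Power glitch at RAL</title><link>http://example.org/1</link>
<description>GOCDB affected</description></item>
<item><title>Several sites in scheduled downtime</title>
<link>http://example.org/2</link>
<description>Routine maintenance</description></item>
</channel></rss>"""

if __name__ == "__main__":
    for title, link in uki_items(SAMPLE_FEED):
        print(title, link)
```

The matched items could then be fed into whatever aggregator builds the planet feed; matching on whole words rather than substrings is a deliberate choice here, since short site names like RAL turn up inside ordinary words.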