16-Apr-2005 - 12:00pm - 8:00pm www4 Upgrade > Dual Xeon

porcupine
Site Admin
Posts: 704
Joined: Wed Jun 12, 2002 5:57 pm
Location: Toronto, Ontario

Post by porcupine »

Hi guys,

Well, it looks like the only way to nip the current cPanel/CentOS 3.4 issues in the bud (/scripts/upcp randomly causing kernel panics) is to not use CentOS 3.4. As such, the new replacement for www4 has been loaded with Fedora Core 3, all of the relevant data from the current www4 has been rsynced to the new server, and thus far it appears to be ready to go.

I've scheduled this maintenance for Saturday, April 16th, 2005, to be completed between 12:00pm (noon) and 8:00pm EST. While we do one final file synchronization to the new server, we will be powering down services on the old www4 in the following order: POP3/SMTP, MySQL, HTTP.

Once all data has been synchronized, the old www4 will be unplugged, the new www4 will be rebooted, and the routing tables on the gateway router will be cleared to ensure traffic goes to the new server.

Customers should expect between 10 and 20 minutes of downtime for their email, between 5 and 15 minutes of downtime for mysqld, and between 4 and 8 minutes of downtime for httpd.
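
To illustrate why email is expected to be down the longest, here is a rough sketch of the timeline in Python. The shutdown order and the announced ranges come from this post; the stop times and the cutover time are purely hypothetical numbers picked for illustration, not a schedule we are committing to. The idea is simply that whatever is stopped first stays down until the new www4 is serving.

```python
# Order in which services are stopped on the old www4 (from the announcement).
STOP_ORDER = ["POP3/SMTP", "MySQL", "HTTP"]

# HYPOTHETICAL stop times, in minutes from the start of the cutover.
STOP_AT = {"POP3/SMTP": 0, "MySQL": 6, "HTTP": 10}

# HYPOTHETICAL time (minutes) at which the new www4 is serving everything.
BACK_UP_AT = 15

# Downtime ranges quoted in the announcement, in minutes.
EXPECTED = {"POP3/SMTP": (10, 20), "MySQL": (5, 15), "HTTP": (4, 8)}


def downtime_per_service(stop_at, back_up_at):
    """Return minutes of downtime per service: cutover time minus stop time."""
    return {name: back_up_at - stopped for name, stopped in stop_at.items()}


downtimes = downtime_per_service(STOP_AT, BACK_UP_AT)
for name in STOP_ORDER:
    low, high = EXPECTED[name]
    print(f"{name}: {downtimes[name]} min (announced {low}-{high} min)")
```

With these made-up numbers each service lands inside its announced range; the real figures depend on how long the final sync actually takes.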

If this goes flawlessly on the first attempt, subsequent attempts will not be necessary. Overall, we expect the impact of this maintenance to be minimal relative to what's being done, and we hope to have everyone happy on the new Dual Xeon before the day is done. Next week we will be performing the same type of maintenance on the new www6 server (replacing the Dual Opteron with a Dual Xeon), with www5 being upgraded sometime next month (likely near the end of the month).

Regards,
Myles Loosley-Millman
Priority Colo Inc.
myles@prioritycolo.com
http://www.prioritycolo.com

Post by porcupine »

It's 2:05PM EST now, and I'm just about to begin the first attempt at the final synchronization of the www4 content to the new Dual Xeon server.

Customers will shortly notice their POP3, SMTP, and cPanel access being disabled. Then, for a few short minutes, HTTP and MySQL will go down as the server transitions.

Please bear with us on this, as it will take a few minutes :).
Myles Loosley-Millman
Priority Colo Inc.
myles@prioritycolo.com
http://www.prioritycolo.com

Post by porcupine »

Well that went reasonably smoothly:

64 bytes from 66.199.181.4: icmp_seq=104 ttl=63 time=4.836 ms
64 bytes from 66.199.181.4: icmp_seq=208 ttl=63 time=1.660 ms

As you can see, the gap between replies (icmp_seq 104 to 208) works out to only 104 seconds, or just under 2 minutes, of actual downtime during the migration.
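
For anyone who wants to check the arithmetic, here is a small Python sketch that pulls the sequence numbers out of the two ping lines above and computes the gap. It assumes ping was sending one packet per second (the usual default, but an assumption here), and the function name is just something made up for this example.

```python
import re

# The two reply lines quoted above.
PING_LINES = [
    "64 bytes from 66.199.181.4: icmp_seq=104 ttl=63 time=4.836 ms",
    "64 bytes from 66.199.181.4: icmp_seq=208 ttl=63 time=1.660 ms",
]


def largest_gap_seconds(lines, interval=1.0):
    """Return the largest gap between consecutive replies, in seconds.

    ASSUMPTION: ping sends one packet every `interval` seconds, so a jump
    in icmp_seq translates directly into elapsed time.
    """
    seqs = []
    for line in lines:
        match = re.search(r"icmp_seq=(\d+)", line)
        if match:
            seqs.append(int(match.group(1)))
    gaps = [later - earlier for earlier, later in zip(seqs, seqs[1:])]
    return max(gaps, default=0) * interval


print(largest_gap_seconds(PING_LINES))  # prints 104.0
```

That is 1 minute 44 seconds between the last reply from the old server and the first reply from the new one.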

Everything should be back up and running except SMTP (exim), which hit a problem during startup that is currently being investigated and should be fixed shortly.
Myles Loosley-Millman
Priority Colo Inc.
myles@prioritycolo.com
http://www.prioritycolo.com

Post by porcupine »

Unfortunately, the new install did not take. cPanel's latest release of exim clearly had some problems with libgcc compatibility, and/or was corrupt on their servers.

We've reverted back to the old www4.pcdc.net server.
Myles Loosley-Millman
Priority Colo Inc.
myles@prioritycolo.com
http://www.prioritycolo.com

Post by porcupine »

Well, it would appear that a manual fix for exim has repaired the problem.

While I realise that we're now just over 1.5 hours outside our maintenance window, the last switch-over caused a mere 1 minute 44 seconds of downtime (the 104 seconds measured earlier), so I feel it would be best to simply get it over with and let people start enjoying the benefits of the new Dual Xeon server.

Having said this, we will be making another attempt at the final move within the next 30 minutes. The impact of this should be minimal, and I'd like to thank everyone for their patience regarding this :).
Myles Loosley-Millman
Priority Colo Inc.
myles@prioritycolo.com
http://www.prioritycolo.com