
Possible Downtime - New News

Announcements and discussion of issues, processes, and policies related to this web site.

Moderator: DzM

156 posts • Page 9 of 11 • 1 ... 6, 7, 8, 9, 10, 11

Post Sun Dec 16, 2007 10:42 pm

More teething problems...

As many of you may have noticed over the last few days, there's been some odd outages where all you see are mysterious Stub errors.

Well - I'm working on figuring out exactly what's causing these (shockingly this problem seems to actually involve my Day Job, so I have actual software engineer professionals looking into it). Until that's resolved, I've slapped together a sort of dead-man's switch to see when the error starts and bluntly recover from it. Hopefully the impact to the chatting public will now be minimized.
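
For anyone wondering what a dead-man's switch of this sort might look like, here is a minimal sketch. It is not the actual script running on the server: the `probe` and `recover` callables, the "Stub" marker text, and the check counts are all assumptions made for illustration.

```python
import time
from typing import Callable


def watchdog(
    probe: Callable[[], str],
    recover: Callable[[], None],
    error_marker: str = "Stub",  # assumption: text that appears on the error page
    max_checks: int = 3,
    interval_s: float = 0.0,
) -> int:
    """Poll the site and run the recovery action whenever the error shows up.

    probe   -- hypothetical callable that fetches a page and returns its body
    recover -- hypothetical callable that restarts whatever has wedged
    Returns the number of recoveries triggered.
    """
    recoveries = 0
    for _ in range(max_checks):
        try:
            body = probe()
        except Exception:
            # A probe that fails outright is treated the same as an error page.
            body = error_marker
        if error_marker in body:
            recover()
            recoveries += 1
        time.sleep(interval_s)
    return recoveries
```

In practice `probe` would be an HTTP fetch and `recover` a service restart, and the loop would run indefinitely from cron or a daemon rather than for a fixed number of checks.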

Whee!
“I know all those people that were in the film [...] But that’s when they were young and strong and full of life, you know?”
DzM
Site Janitor
 
Posts: 10530
Joined: Sat Nov 29, 2003 2:11 am
Location: Bay Area, California, USA, North America, Western Hemisphere, Terra, Sol, etc etc

Post Thu Jan 10, 2008 9:19 am

So - Where are we with these moves and teething problems? It's been, after all, roughly three weeks since there was any update.

Before I go into the details, it may interest you to know that Gir (this server) actually hosts several web pages other than Pogues.com (including my own, dzm.com). It may also interest you to know that Pogues.com accounts for ~99.9% of the traffic Gir deals with. In a particular one hour period all the other web sites hosted here serviced approximately 500 requests. In that same hour Pogues.com serviced more than 16,000 requests. Some virtual server is what I like to call "resource intensive."
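
As a quick worked check of that one-hour sample, the sketch below computes the share from the two request counts quoted above. The function name is made up for illustration, and since the counts are given as "approximately 500" and "more than 16,000", the result is a rough lower bound for that particular hour only.

```python
def traffic_share(site_requests: int, other_requests: int) -> float:
    """Return site_requests as a percentage of all requests served."""
    total = site_requests + other_requests
    return 100.0 * site_requests / total


# Figures from the one-hour sample: 16,000 for Pogues.com, 500 for everything else.
share = traffic_share(16_000, 500)
print(f"{share:.1f}%")  # prints 97.0%
```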

First the move: Generally this has gone well, though Gir (the server) now lives behind a smaller pipe (or in technical terms "a smaller t00b from the internwebnets"). This has caused a little concern for me and ServerMan, but I think I've got that wrestled down. Unfortunately this has meant removing teh sound samples from the various lyrics pages, officially throttling the bandwidth from Gir under peak loads (meaning basically that a small amount of the t00b will always be reserved for other traffic, and under some very, very rare circumstances data coming from Gir may be delayed within the realm of about 10%. This should nearly ALWAYS be very short-lived though, so you likely don't need to worry about it.), and banning a Russian search engine from spidering the site (those of you in Russia - please go yell at Yandex.ru for writing a really fricken rude and misbehaving spider. Please also note that that is a small "s" there, not a large "S". Blocking Yandex.ru's fucked up spider has reduced Gir's bandwidth use by ~15-20%. Die die die.) So for now things are looking relatively good on that front. The move was also used as an excuse to offload email onto another service and to upgrade Gir to a fully 64-bit and modern version of Linux (it had been running Red Hat 9 for about five years - long story and stop laughing, dammit).
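
The post doesn't say how the spider was blocked, so the following is only one plausible approach: refusing requests based on the User-Agent header. The token list, the function names, and the sample User-Agent strings in the usage note are all hypothetical.

```python
from typing import Iterable

# Assumption: the misbehaving spider identifies itself with this token
# somewhere in its User-Agent string.
BLOCKED_AGENT_TOKENS = ("yandex",)


def is_blocked_agent(
    user_agent: str, tokens: Iterable[str] = BLOCKED_AGENT_TOKENS
) -> bool:
    """Return True when the User-Agent contains any blocked token (case-insensitive)."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


def status_for(user_agent: str) -> int:
    """Pick an HTTP status: 403 for blocked spiders, 200 for everyone else."""
    return 403 if is_blocked_agent(user_agent) else 200
```

For example, `status_for("ExampleYandexSpider/0.0")` gives 403 while `status_for("Mozilla/5.0 (ExampleBrowser)")` gives 200. A spider that lies about its User-Agent would sail straight past this, which is why blocking by address range is the usual fallback.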

Second the teething problems: Due to the amount of traffic that Pogues.com handles, it exposed a bug (or possibly two) in the software that, by day, I'm paid to kinda be in charge of. So naturally I've had one of my developers looking into the problem in the context of "hey - this part of the software is teh buggy! Fix!" A preliminary fix was delivered on Dec 20 and has vastly improved stability of Gir (turns out a lot of the stability problems from the last two years are very likely the result of this bug). In fact there have been zero (that's right - ZERO) stability problems from Dec 20, 2007 through today. Yay! But the b00gs were not gone - oh no! Tonight I've installed the proposed final patch for the problem, and this should make things even more stable than ZERO PROBLEMS SINCE DEC 20, 2007! That's really stable and stuff.

So there we are. The move has largely been a heaping helping of success. We've learned some things about TCP/IP management in the Linux 2.6.x kernel space, we've learned things about FastCGI file descriptor and memory leaks, and we've entered the exciting world of 64-bit OS. And, really, haven't we all learned to love just a little bit more? And isn't that what it's really all about? The love?
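
For anyone chasing a similar file descriptor leak, here is a rough sketch of how one might watch for it on Linux. It assumes the standard procfs layout (`/proc/<pid>/fd`); the function names and the growth threshold are invented for illustration and have nothing to do with the actual patch.

```python
import os
from pathlib import Path
from typing import Sequence


def count_open_fds(pid: int, proc_root: str = "/proc") -> int:
    """Count the entries under <proc_root>/<pid>/fd, one per open descriptor."""
    return len(os.listdir(Path(proc_root) / str(pid) / "fd"))


def looks_leaky(samples: Sequence[int], min_growth: int = 50) -> bool:
    """Flag a series of fd counts that never drops and grows by at least min_growth.

    min_growth is an arbitrary threshold chosen for this sketch.
    """
    if len(samples) < 2:
        return False
    never_drops = all(later >= earlier for earlier, later in zip(samples, samples[1:]))
    return never_drops and samples[-1] - samples[0] >= min_growth
```

Sampling `count_open_fds` for a FastCGI worker every few minutes and feeding the readings to `looks_leaky` is enough to tell a steady climb from normal churn.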

Happy New Year everyone.
DzM
Site Janitor
 
Posts: 10530
Joined: Sat Nov 29, 2003 2:11 am
Location: Bay Area, California, USA, North America, Western Hemisphere, Terra, Sol, etc etc

More newness

Post Sat Jan 26, 2008 8:45 am

In addition to the things we do here that make you grumpy, your site janitors also do some relatively invisible things (or invisible when we're doing them right at least). One of these obnoxious tasks is the removal and banning of spam posts, user accounts, and IP addresses. This ongoing battle sometimes seems to be promising, and sometimes seems to be going poorly.

I am constantly trying to further remove the manual aspects of this task. I much prefer that the janitors concentrate on annoying you good folk (and I'm also determined to help Skynet come one small step closer to world domination). To that end Gir is now automagically pulling a list of 9000+ known Bad Guy IP addresses and banning them before their nasty little requests make it anywhere near our Medusa.

What's this mean to you? Probably nothing. Unless you just so happen to be a user of one of these 9000+ IP addresses. If that's the case then the site will appear to ... well ... just not exist. At all. No "you been banned" message, no "Hey, the server doesn't want to talk to you!" message. Just nothing, and eventually your browser will give up.

So if you're unable to see the site at all and you suspect this is a mistake, please let me know. Automagic processes don't always work as smoothly as we would like. I'm certain there are kinks to be worked out.
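
To give a feel for the moving parts, here is a minimal sketch of parsing a downloaded ban list and carving out exemptions for addresses caught by mistake. The list format (one address per line, `#` comments), the function names, and the addresses in the usage note are assumptions; the real list and the mechanism that actually drops the traffic are not shown.

```python
import ipaddress
from typing import Iterable, Set


def parse_blocklist(text: str) -> Set[str]:
    """Parse one IP address per line, skipping blanks, '#' comments, and junk."""
    banned = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        try:
            banned.add(str(ipaddress.ip_address(entry)))
        except ValueError:
            # Skip malformed lines rather than abort the whole update.
            continue
    return banned


def apply_allowlist(banned: Iterable[str], allowed_networks: Iterable[str]) -> Set[str]:
    """Drop any banned address that falls inside an explicitly allowed network."""
    nets = [ipaddress.ip_network(net) for net in allowed_networks]
    return {
        ip
        for ip in banned
        if not any(ipaddress.ip_address(ip) in net for net in nets)
    }
```

Given a list containing `192.0.2.10`, `198.51.100.7`, and `203.0.113.5` (documentation-only addresses), allowing `198.51.100.0/24` leaves just the first and last banned. The silent "site doesn't exist" behaviour would then come from handing the surviving addresses to a firewall that drops their packets without replying.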

And yes I do appreciate that IF a real person has been caught in this net they will not see this post and will not know to contact me. I figure that's a feature. Voila! It works perfectly! Nobody is complaining! YAY!
DzM
Site Janitor
 
Posts: 10530
Joined: Sat Nov 29, 2003 2:11 am
Location: Bay Area, California, USA, North America, Western Hemisphere, Terra, Sol, etc etc

Re: More newness

Post Sat Jan 26, 2008 1:48 pm

DzM wrote:To that end Gir is now automagically pulling a list of 9000+ known Bad Guy IP addresses and banning them before their nasty little ....


That makes you "Good Guys". 8)
What kind of fuckery is this?
A. Winehouse
Eric V
Yeoman Rand
 
Posts: 3396
Joined: Fri Dec 02, 2005 9:31 pm
Location: Washington, D.C.

Re: More newness

Post Sat Jan 26, 2008 7:16 pm

Eric V wrote:
DzM wrote:To that end Gir is now automagically pulling a list of 9000+ known Bad Guy IP addresses and banning them before their nasty little ....


That makes you "Good Guys". 8)

suddenly it sounds so much more like the matrix...
thanks DzM
Insert Witty Username Her
Il Capitano
 
Posts: 172
Joined: Tue Jan 08, 2008 10:31 am

Post Sat Jan 26, 2008 7:21 pm

"Good Guys"? Not so much. I've already discovered that contained in the Bad Guy list are two Yahoo! spiders (need those to get through so that people can find us), and at least two AOL proxy addresses (need those to get through so that the many dozens of you that use AOL can still be part of our happy squabbling family).
DzM
Site Janitor
 
Posts: 10530
Joined: Sat Nov 29, 2003 2:11 am
Location: Bay Area, California, USA, North America, Western Hemisphere, Terra, Sol, etc etc

More more downtime

Post Sat May 17, 2008 4:45 am

Today (Friday, May 16) there were two unexpected outages that required ServerMan to physically cycle the power of Gir. We're not positive what's causing the outages, but we suspect bad and/or overheating hardware (the temperatures in the San Francisco area have been hitting 100F the last few days). Until we figure out what the problem is, there's a good chance that more outages will occur and that we won't be around to be able to cycle the power. In these cases Medusa may be unavailable for hours or days. Sorry about that. We're working on it.
DzM
Site Janitor
 
Posts: 10530
Joined: Sat Nov 29, 2003 2:11 am
Location: Bay Area, California, USA, North America, Western Hemisphere, Terra, Sol, etc etc

Re: More more downtime

Post Sat May 17, 2008 2:30 pm

DzM wrote:Today (Friday, May 16) there were two unexpected outages that required ServerMan to physically cycle the power of Gir. We're not positive what's causing the outages, but we suspect bad and/or overheating hardware (the temperatures in the San Francisco area have been hitting 100F the last few days). Until we figure out what the problem is, there's a good chance that more outages will occur and that we won't be around to be able to cycle the power. In these cases Medusa may be unavailable for hours or days. Sorry about that. We're working on it.


Well if you had not spent all your money on boxed wine and acid in high school maybe you could have built a climate controlled room by now. :)
And I don't want no grave
Just throw my ashes in the field
And hope there's some soul left to save

W. E. Whitmore
Clash Cadillac
Yeoman Rand
 
Posts: 3029
Joined: Tue Mar 06, 2007 4:37 pm
Location: Dakota

Re: More more downtime

Post Mon May 19, 2008 6:36 am

DzM wrote:Today (Friday, May 16) there were two unexpected outages that required ServerMan to physically cycle the power of Gir.


Oh ServerMan, you're my hero. And Gir, it will only hurt for a moment.
Allow not nature more than nature needs, man's life is cheap as beast's.
LittleCupcakes
Brighella
 
Posts: 950
Joined: Mon Nov 07, 2005 3:34 am
Location: Orange County, California

Post Wed May 21, 2008 8:20 pm

See Sandy?? I told you you hadn't been banned...
“But I being poor, have only my dreams. I lay them at your feet...Tread softly; for you tread on my dreams.”
― John Keats
http://www.traceybookish.wordpress.com
Irishbookish
Scaramuccia
 
Posts: 1273
Joined: Wed Dec 19, 2007 11:09 pm
Location: Wales

More more downtime

Post Fri Jun 20, 2008 12:07 am

FYI - Yesterday (June 18) Gir really started to crap out hard. And then the network connection crapped out.

Now that the network connection is back, we're working on figuring out the hardware problem. Either the MoBo is going bad, or the RAM has gone to shit. Or maybe one of the CPUs. Hard to know for sure. We're replacing parts one at a time. Hopefully we'll get this sussed out shortly.
DzM
Site Janitor
 
Posts: 10530
Joined: Sat Nov 29, 2003 2:11 am
Location: Bay Area, California, USA, North America, Western Hemisphere, Terra, Sol, etc etc

Re: Possible Downtime - New News

Post Fri Jun 20, 2008 1:17 am

:?
I guess it's time to start shopping. What a great excuse to roam around Amazon.com. :mrgreen:
Canta, no llore.
territa
Scaramuccia
 
Posts: 1441
Joined: Thu Feb 05, 2004 2:03 am
Location: San Antonio

Re: More more downtime

Post Fri Jun 20, 2008 10:10 am

DzM wrote:FYI - Yesterday (June 18) Gir really started to crap out hard. And then the network connection crapped out.

Now that the network connection is back, we're working on figuring out the hardware problem. Either the MoBo is going bad, or the RAM has gone to shit. Or maybe one of the CPUs. Hard to know for sure. We're replacing parts one at a time. Hopefully we'll get this sussed out shortly.

Probably more a case of Grrrgh than Gir?
I wish I'd done biology for an urge within me wanted to do it then
Jon
Brighella
 
Posts: 880
Joined: Mon Aug 14, 2006 9:47 pm

Re: More more downtime

Post Fri Jun 20, 2008 11:01 am

DzM wrote:FYI - Yesterday (June 18) Gir really started to crap out hard. And then the network connection crapped out.
Now that the network connection is back, we're working on figuring out the hardware problem. Either the MoBo is going bad, or the RAM has gone to shit. Or maybe one of the CPUs. Hard to know for sure. We're replacing parts one at a time. Hopefully we'll get this sussed out shortly.

Hmm...crapped out Gir, network connection down, (typical) bad MoBo, shyte RAM, malfunctioning CPU...or could be the rascally rastorizing streaming shareware vectors, moody ASCII's, juicy applet bytes, dastardly DHTML's, flashy FTP, HTTP, HTML, XML's, or maybe muddy maya's. But then...who knows, it could be something as simple as DzM's cool new avatar.
Irishbookish
Scaramuccia
 
Posts: 1273
Joined: Wed Dec 19, 2007 11:09 pm
Location: Wales

Re: Possible Downtime - New News

Post Mon Jun 30, 2008 10:01 pm

For those following along at home - There was another five-hour outage Friday morning (California time; mid-afternoon for you EU-types). We're (that's "two of us", not "royal") pretty certain we've narrowed the problem down to a 1GB stick of crapped out RAM.

Over the weekend we reduced Gir to a single CPU and 50% of the RAM. Gir remained solid throughout. Now we've restored the second CPU and are letting Gir burn for a week or so (assuming no hangs) with the restricted RAM, but dual CPU. With luck this will prove that it's crap RAM. Then we can install the brand-spanking-shiny-new 2GB of replacement RAM.

Fingers are crossed.
DzM
Site Janitor
 
Posts: 10530
Joined: Sat Nov 29, 2003 2:11 am
Location: Bay Area, California, USA, North America, Western Hemisphere, Terra, Sol, etc etc


All times are UTC

Content © copyright the original authors unless otherwise indicated