Off to do battle

Discussion of all things technological and/or gadgety
BobbyK
Posts: 361
Joined: Sat Aug 16, 2008 12:56 pm

Off to do battle

Post by BobbyK »

So, I've had a fun weekend here.

Saturday morning, $CABLEMONOPOLY replaces the cable modem at Customer Site with a new unit, to allow them to upgrade from 18/2 to 100/10 service.

Our remote management and monitoring system uses HTTP or HTTPS requests to check in and communicate. Servers are set to check in every 30 seconds, workstations every 5 minutes.

Starting about 3 hours after the installer leaves, we start getting server down notifications, each followed by a server up notification between 4 and 6 minutes later. The servers impacted are all at this site, but there's no rhyme, reason, or pattern to which ones. It's a mix of physical and virtual machines. There are 12 servers at the site, and no more than 2 report down at the same time during all of this.
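For anyone who wants to pull this kind of pattern out of their own monitoring logs, here is a rough sketch in Python. It is not how any particular RMM product works. It assumes you can export check-in timestamps per server, and the names `find_outages` and `max_concurrent`, along with the "three missed check-ins" threshold, are made up for illustration.

```python
from datetime import datetime, timedelta


def find_outages(checkins, interval=timedelta(seconds=30), missed=3):
    """Return (last_seen, next_seen) pairs where consecutive check-ins
    are more than `missed` intervals apart. `missed` is an assumed threshold."""
    threshold = interval * missed
    ordered = sorted(checkins)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]


def max_concurrent(outages_by_host):
    """Largest number of hosts that were silent at the same moment."""
    events = []
    for windows in outages_by_host.values():
        for start, end in windows:
            events.append((start, 1))   # host goes quiet
            events.append((end, -1))    # host checks in again
    # At equal timestamps, process recoveries (-1) before new outages (+1)
    # so that windows which merely touch are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

Feed `find_outages` the check-in times for each server, collect the results in a dict keyed by hostname, and pass that dict to `max_concurrent`. With a 30 second check-in interval, a 4 to 6 minute hole shows up as a single window per flap.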

End users call to bitch about websites timing out. For testing purposes, I temporarily disable the web proxy on the perimeter UTM, with no change. So it's not that.

I finally manage to get a Wireshark capture from both our RMM server and an impacted machine, capturing normal check-ins, badness, and the resumption of normal check-ins. Guess what I see?

On our server, I see several minutes of no packets, followed by what Wireshark flags as a TCP Retransmission and a bunch of TCP Duplicate ACKs.

On the customer side, I see ACKs suddenly stop, and then a bunch of retransmits.

Pretty clear cut, right?

:evil: Nope. MOUTHBREATHER#1 at $CABLEMONOPOLY says "Signal is good, and I can ping the modem. It's not our problem."

I would like to point out that this is the same $CABLEMONOPOLY where we had to get the state's Public Service Commission involved before they would admit that they fucked up an ACL on a CMTS somewhere, blocking port 5060 TCP/UDP for about 1000 of their customers, and correct it.

I'm either going to need a new liver later, or bail money. I know where their support call center is, and I've now been on hold for 15 minutes waiting for someone at the NOC, assuming that the aforementioned mouthbreather is in fact escalating me as requested.

FML
Greg
Posts: 8486
Joined: Tue Aug 19, 2008 2:15 pm

Re: Off to do battle

Post by Greg »

Flap on, flap off, the flapper!
Maybe we're just jaded, but your villainy is not particularly impressive. -Ennesby

If you know what you're doing, you're not learning anything. -Unknown
Sanity is the process by which you continually adjust your beliefs so they are predictively sound. -esr
BobbyK
Posts: 361
Joined: Sat Aug 16, 2008 12:56 pm

Re: Off to do battle

Post by BobbyK »

I did in fact use the phrase "flapping like a flock of geese" with the dude at the NOC. Who then proceeded to demand that a tech be sent onsite in the morning.

<sigh>
308Mike
Posts: 16537
Joined: Wed Aug 13, 2008 3:47 pm

Re: Off to do battle

Post by 308Mike »

I feel for ya' brother!!! I don't know how many times I did battle with our NOC in the middle between us and corporate. Our NOC would pass my data and testing only to have corporate say it wasn't their problem, and we'd argue it wasn't us - it was THEM causing it. I spent more than an entire weekend trouble-shooting a Microsoft Policy issue someone from upstream had created and was screwing up EVERY NEW MACHINE attached to the AD tree, but had no effect on stand-alone machines.

We were FINALLY able to get them to change the screwed up policy, but not before it cost us and the company many HUNDREDS OF THOUSANDS of dollars in lost productivity and salaries - and only AFTER I was able to PROVE that one of THEIR F'ED up policies, pushed down to us, was causing the problem.

Talk about a head-scratcher!! I kept coming up with the same problems, the NOC kept saying they had no idea why it was happening, and Corp was saying they had nothing to do with it.

When they FINALLY got someone to review the MS policies modified by corporate (of course, they NEVER messed up our Linux/Unix machines used by our engineers), they FINALLY found the problem and meekly admitted they'd created it but never issued an apology for all the wasted time and resources tracking down (and PROVING) THEIR ERROR.

YES, I still get hot thinking about it.

I UNDERSTAND!!!
POLITICIANS & DIAPERS NEED TO BE CHANGED OFTEN AND FOR THE SAME REASON

A person properly schooled in right and wrong is safe with any weapon. A person with no idea of good and evil is unsafe with a knitting needle, or the cap from a ballpoint pen.

I remain pessimistic given the way BATF and the anti gun crowd have become tape worms in the guts of the Republic. - toad
randy
Posts: 8334
Joined: Wed Aug 13, 2008 11:33 pm
Location: EM79VQ

Re: Off to do battle

Post by randy »

If you need bail money after (allegedly) trashing said company's customer non-support center, I'm in.
...even before I read MHI, my response to seeing a poster for the stars of the latest Twilight movies was "I see 2 targets and a collaborator".
First Shirt
Posts: 4378
Joined: Mon Aug 18, 2008 11:32 pm

Re: Off to do battle

Post by First Shirt »

I'm always willing to contribute to a worthy cause. I'm in!
"But there ain't many troubles that a man caint fix, with seven hundred dollars and a thirty ought six."
Lindy Cooper Wisdom
BobbyK
Posts: 361
Joined: Sat Aug 16, 2008 12:56 pm

Re: Off to do battle

Post by BobbyK »

And to add insult to injury, three additional, widely dispersed sites on the same ISP have started flapping, as well.