Re: [WISPA] Systems Management - Process

Tom DeReggi Fri, 05 Nov 2010 18:15:57 -0700

> This automagically happens when your script to automagically update
> Nagios removes accounts which are marked as inactive.
>
Be careful with that idea. Automating that almost killed us. The reason is
that sometimes you want to disable monitoring on an account that is live,
because it is temporarily down or temporarily throwing false alarms. There
were times when we had 10-15 alarms disabled manually. The problem is that
when you automate a global cross-reference between billing and monitoring,
it re-enables all the accounts you wanted disabled temporarily. Then you
spend 30 minutes re-disabling the accounts, if you can remember which they
are, and you get reminders all night long when you get it wrong.


I'm for automation, but no automation should check all the monitors and
change them automatically. The automation should work on an
account-by-account basis only. You don't want it to mess with accounts
other than the one you are specifically working on.
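
A rough sketch of what I mean, in Python. Everything here is made up for
illustration: the Monitor record, the manual_hold flag, and sync_account are
hypothetical names, not part of Nagios or any billing package. The idea is
only that the sync touches one account at a time and leaves alone any
monitor a tech has silenced on purpose.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    """Hypothetical monitoring record for one customer account."""
    enabled: bool = True
    manual_hold: bool = False  # set by a tech who silenced the alarm on purpose


def sync_account(account_id, billing_active, monitors):
    """Reconcile monitoring for ONE account against its billing status.

    Only the entry for account_id is ever read or changed. Returns a short
    string describing what happened.
    """
    monitor = monitors.get(account_id)
    if not billing_active:
        # Account is inactive in billing: stop monitoring it.
        if monitor is not None:
            del monitors[account_id]
            return "removed"
        return "unchanged"
    if monitor is None:
        # Active account with no monitor yet: create one.
        monitors[account_id] = Monitor()
        return "added"
    if monitor.manual_hold:
        # Live account that someone disabled deliberately: leave it alone.
        return "held"
    if not monitor.enabled:
        monitor.enabled = True
        return "enabled"
    return "unchanged"


monitors = {
    "acct-1": Monitor(enabled=False, manual_hold=True),  # silenced by hand
    "acct-2": Monitor(),
}
print(sync_account("acct-2", False, monitors))  # acct-2 cancelled in billing
print(sync_account("acct-1", True, monitors))   # acct-1 live but on hold
```

The hold flag is the piece a global cross-reference usually lacks: without
it, the script has no way to tell "disabled on purpose" from "out of sync".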

Tom DeReggi
RapidDSL & Wireless, Inc
IntAirNet- Fixed Wireless Broadband


----- Original Message ----- 
From: "Cameron Crum" <[email protected]>
To: "WISPA General List" <[email protected]>
Sent: Friday, November 05, 2010 8:22 PM
Subject: Re: [WISPA] Systems Management - Process


This is why we wrote wispmon. Handles virtually all this in a single 
platform.

Cameron

On Friday, November 5, 2010, Scott Lambert <[email protected]> wrote:
> On Fri, Nov 05, 2010 at 02:34:01PM -0700, Mark Nash wrote:
>> This is lengthy, but worth discussion, I think...
>>
>
>> Unless there is a good process in place to ensure that these
>> systems get updated when components on our networks are
>> added/removed/replaced/changed.
>
> That place is the billing system....
>
>> For instance... A new customer is added to our network... Information 
>> about that new customer goes into:
>>
>
>> - billing (several things here...email address verified, pro-rate
>> amount added for first month, valid billing address, name spelled
>> correctly, correct price, contract signed & stored, etc)
>>
>> - nagios (to monitor)
>>
>
> With the right information in billing, a bit of scripting will
> automagically keep your Nagios configuration up to date.
>
>>
>> - IP documentation (so we don't duplicate IPs)
>>
>
> Keep this in the billing system.
>
>>
>> - equipment documentation (so we know what we're dealing with if we
>> have to go out there again)
>
> Keep this in the billing system where your techs can update as needed.
>
>> - name the association on the AP so it's easily identifiable
>
> You can probably script this from the billing system if it tracks
> the MAC address of the customer's equipment. Depends on your APs
> and such.
>
>> Then if that customer cancels...
>>
>> - remove from billing
>
> Mark the account inactive in billing. Keep the data. Database
> storage is cheap these days. That's probably what you meant...
>
>> - remove from Nagios (so we stop monitoring)
>
> This automagically happens when your script to automagically update
> Nagios removes accounts which are marked as inactive.
>
>> - remove from IP documentation (so we can re-use that IP)
>
> Let the billing system mark the IP as inactive when the account is
> marked inactive.
>
>> - remove equipment documentation
>
> Keep the documentation on the account notes. They may come back.
> Database storage is cheap these days.
>
>> Or if that customer has to change towers on our network...
>>
>
> Update the billing system.
>
>> - change monitored IP address
>
> Automagically happens on the next Nagios configuration generation
> run.
>
>> - change IP documentation (so we can re-use the old IP)
>
> Do this through the billing system.
>
>> - change equipment documentation (if necessary)
>
> This is part of updating the billing system, which the on-site tech
> should do before leaving the customer's site. Updating the billing
> system while on-site ensures the Tech actually tested the connection
> by using it.
>
>> - name the association on the new AP so it's easily identifiable
>
> Hopefully you can script this from the billing system.
>
>> Now let's consider when a backhaul goes down...
>>
>> - change the routing to go to use a backup backhaul (we're using
>> manual re-routing, not automatic)
>
> Dynamic routing. Manual, ick.
>
>> - change the hierarchy in our monitoring system (we use Nagios
>> "Parents" so that devices behind a "Down" device are not marked
>> "Down" themselves, just "Unreachable"; this saves the inbox from
>> getting blasted if a backhaul goes down)
>
> If there are multiple paths, you can use multiple parents in Nagios.
> Nagios should do the right thing. We don't use the multiple parents
> option because it screws up the Map. But if the primary path goes
> down, the hosts which are still reachable stay up in Nagios.
>
>> - change the monitored IP address for the router at that site so we're
>> monitoring an IP address that is going over the backup backhaul
>
> You can create hosts in Nagios for each interface on a router if
> you want. Then you know when your backup path goes down before the
> primary dies.
>
>> Then you get it back up and you have to change these things back.
>>
>> My point of all of this is that there are a TON of details to take
>> care of, and if you try to grow fast you need systems and protocol
>> in place to deal with all of this information. Things get forgotten
>> about, and your system can be a mess before you know it.
>
> If it's not automatic, it won't get done accurately. If there are
> multiple locations for storing information about your customers and
> the configuration, they will get out of sync.
>
> I've worked at three ISPs. The one which invested in the billing
> system which could track everything about a customer and provisioned
> everything from the billing system had the fewest customer service
> complaints.
>
> The other two spent many, many extra man-hours tracking down
> documentation inconsistencies each year.
>
> I'm a firm believer that what the billing system says is how the
> hardware is configured. It also ensures that there is less left
> over cruft when a customer leaves. You're not accidentally still
> hosting their DNS two years after they stop paying you.
>
>> We have used the method of using checklists for client changes (new
>> customer, repair order, disconnect).
>>
>> We're just now getting into cleaning up our systems & documentation
>> on infrastructure components (routers & backhauls & APs - OH
>> MY!!!). We have a lot of information about the initial deployment of
>> infrastructure equipment, but as changes have happened, we have not
>> kept up with it.
>>
>> So we're looking at expanding upon our checklists for when
>> infrastructure components are deployed/changed/removed. We think this
>> will help the chaos.
>
> Codify it all in the billing system.
>
> --
> Scott Lambert KC5MLE Unix SysAdmin
> [email protected]
>
>
>
> --------------------------------------------------------------------------------
> WISPA Wants You! Join today!
> http://signup.wispa.org/
> --------------------------------------------------------------------------------
>
> WISPA Wireless List: [email protected]
>
> Subscribe/Unsubscribe:
> http://lists.wispa.org/mailman/listinfo/wireless
>
> Archives: http://lists.wispa.org/pipermail/wireless/
>


