Re: Witango-Talk: Select 1 From xxx where 1=0

Robert Garcia Wed, 25 Jan 2006 12:17:19 -0800

I don't know if this helps you, but I came across a situation that caused a problem, and I resolved it.

We have a master db server, and a slave that is replicated from the master. Master is DB1, and slave is DB3.

We have a load balancing system using vars for the datasource. I described this in a recent thread.

We were seeing with our alarms that occasionally witango was going down during busy parts of the day. Now, my alarm hits the same db on each db server sequentially and reports a failure. The alarm reported a general failure, not which of the dsns was not working. After investigating, we found that it was not that witango was going down, but that only one of the dsns was reporting a tcp/ip communication failure.

So to track this down better, we split our 6 alarms (one check per witango server, each checking both dsns at once) into 12 alarms: 1 alarm for each dsn on each of the 6 witango servers. We did this to help FIND the problem.

The weird thing is that this helped ELIMINATE the problem. For some reason, witango did not like hitting these 2 dsns sequentially in the same taf during heavy traffic periods. Just separating the alarms, so that one dsn is hit on one request and the other on another, almost eliminated the problem.
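
As an illustration of the split described above (not the actual alarm code from this setup), here is a minimal Python sketch. The server names and the `check` callable are hypothetical placeholders; a real check would issue its own request that touches only the one dsn it is given.

```python
from typing import Callable, Dict, List, Tuple

Alarm = Tuple[str, str]  # (witango server, dsn)


def build_alarms(servers: List[str], dsns: List[str]) -> List[Alarm]:
    """One alarm per (server, dsn) pair instead of one alarm per server."""
    return [(server, dsn) for server in servers for dsn in dsns]


def run_alarms(alarms: List[Alarm],
               check: Callable[[str, str], bool]) -> Dict[Alarm, bool]:
    """Run every alarm as its own check and record the result per pair.

    `check` is a hypothetical callable; each call should be a separate
    request that hits exactly one dsn on one server.
    """
    return {(server, dsn): check(server, dsn) for server, dsn in alarms}


def failing(results: Dict[Alarm, bool]) -> List[Alarm]:
    """List exactly which (server, dsn) pairs failed."""
    return sorted(pair for pair, ok in results.items() if not ok)
```

With 6 servers and 2 dsns this yields 12 alarms, and a failure report names the specific dsn rather than a general failure.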


A few more notes about this.

The db3 dsn was the only one that ever failed. I am not sure why, but it should also be noted that in these alarms, it was always the second one done in the sequence.


DB1 and DB3 are identical in every way, hardware and software.

The last note is a bit more complicated, but I am including it to possibly give you more things to think about (as if you need any more).

When we moved up to witango 5.5, we did tons of testing first, and found it was much more reliable than 5.0 and eliminated many instabilities. I did notice that witango would still get into a situation during extremely heavy loads where it would stop serving requests, but instead of crashing, it would hang for a few seconds and recover. You can search in the archives where I made these observations and included pictures of the task manager showing this occurring.

In order to further minimize this, we worked heavily with a PrimeBase engineer to tune the PrimeBase ODBC driver and connection dlls. We found the only thing to completely eliminate it was to add debug code that slowed down the time between simultaneous requests. We used this for a while, and it didn't affect witango performance, except on tafs that would loop and do sequential inserts or updates.
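
The general idea of spacing out near-simultaneous requests can be sketched in Python as below. This is not the PrimeBase debug code, whose details are not given here; `MinGapThrottle` and its parameters are made-up names, and any gap value is arbitrary.

```python
import threading
import time


class MinGapThrottle:
    """Enforce a minimum gap between consecutive requests (hypothetical helper)."""

    def __init__(self, min_gap_seconds, clock=time.monotonic, sleep=time.sleep):
        self._min_gap = min_gap_seconds
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = float("-inf")  # first caller never waits

    def wait(self):
        """Block until this caller may proceed; return the seconds waited."""
        # The lock is held while sleeping on purpose: callers that arrive
        # together are released one at a time, min_gap_seconds apart.
        with self._lock:
            now = self._clock()
            wait_for = max(0.0, self._next_allowed - now)
            if wait_for > 0:
                self._sleep(wait_for)
            self._next_allowed = now + wait_for + self._min_gap
            return wait_for
```

A caller would invoke `wait()` right before each db request. A loop doing many sequential inserts or updates pays the gap on every iteration, which matches the slowdown noted above for that kind of taf.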


With this setup, there was virtually ZERO downtime due to witango.

Later, we started the replication system, and did some major hardware upgrades. Performance was great, but we experienced these tcp/ip disconnects, where all 6 servers would go down at once. We replaced and reconfigured switches, and tried many things; in the end, it was the NICs in the witango machines. We were using server class nics, but ones that used the Marvell Yukon chipset. We replaced them with Intel nics, and the problem went away.

Now, on to a month or so ago: primebase came out with a major update, and it was supposed to really increase performance. When we made the update, we started having the tcp/ip disconnects again, just on db3, which is what I started this email with. It happened only about once a day, and affected 3-5 witango servers simultaneously. Just restarting the witango service to reinitiate the connection would restore functionality.

Once we fixed the alarms to not hit one db and then the other sequentially, the problem almost went away.

The last thing: the motherboards in DB1 and DB3 are socket 939 Gigabyte motherboards with dual gigabit ethernet. The two ethernet chips are not the same, and the one being used primarily was a Marvell Yukon chip. These servers are running linux fedora core 4, with the ethernet driver compiled into the kernel, for serious performance.

We switched the primary ethernet comm to go through the NON yukon chip, and now we are back to ZERO problems. The servers have not burped on a single request since.

I don't know how much of that is going to help you, if at all. I hope it does, I know how painful it is chasing these things down, but as in my case, it was a combination of several things.

One thing I would try is, instead of having a single taf that hits the 2 dsns in sequence, make 2 tafs and call them in sequence separately, and see if you get the same errors.


Anyway, hope that helps.

--

Robert Garcia
President - BigHead Technology
VP Application Development - eventpix.com
13653 West Park Dr
Magalia, Ca 95954
ph: 530.645.4040 x222 fax: 530.645.4040
[EMAIL PROTECTED] - [EMAIL PROTECTED]
http://bighead.net/ - http://eventpix.com/

On Jan 25, 2006, at 10:46 AM, Dave Machin wrote:

Thanks for the feedback.

We're still stumped. We've confirmed that the ODBC variable is set correctly before and after our two test queries, and yet one of the two (or sometimes both) executes against the wrong data source. Queries that happen later or earlier in the request execute against the correct data source.

I've compiled some notes of one example from this morning; if anyone has some time they could take a look and see what we're seeing.

You can download the document here:
www.benchmarkportal.com/witango_error_notes.zip

In short, we're showing that queries execute correctly at first; the user variable is correct before the query in question; the first test query then executes against the wrong data source; the second test query executes against the correct data source; and then the next query executes correctly as well.

This happens in 5.5.009 on three different servers. The current production server is a new, clean build with the latest updates and the latest MDAC drivers. This .taf does not use the user reference argument, and relies on the cookie. What's confusing is that one query in the middle of a sequence of ten goes wrong and the others work fine - all in the same request.

We're going to downgrade from 5.5 back to 5.0 to see if the problem goes away, because we can't find anything else to change.

----- Original Message -----
From: "Customer Support" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Tuesday, January 24, 2006 5:32 PM
Subject: Re: Witango-Talk: Select 1 From xxx where 1=0

Is it unusual that our production server doesn't issue the
heartbeat query before each DB action?


Yes.  It may be going to another DB or DB Server as it is based on
the datasource and tables in the db.

On our development machine, I wrote a .taf application that issued
an identical DB action 20 times (just copied and pasted the same
DBMS action over and over).  In SQL profiler, when that application
is executed, I see the heartbeat and then the query repeated 20
times in pairs as expected.

But on our production server, we often see cases where there is no
heartbeat query before a DBMS action.  Sometimes we see two or
three DB actions execute before we see another heartbeat query.


If you are running the same version of the Witango Server, OS, ODBC
and DB on all servers and they are exhibiting different behaviors
then you probably have a difference in the configuration of one or
more of Witango Server, OS, ODBC and/or DB processes.

Your problem sounds like tafs interacting with each other, hard coded
user references, lost user reference cookies or lost user reference
arguments.  If you do not use user reference arguments and rely on the
user reference cookie, make sure that it is working and has a value, or
the server will fall back to issuing a new user reference cookie.

If you are using iframes or AJAX, check that you do not have one taf
interacting with another under the same user reference and resetting a
variable when you are not expecting it.


Witango Support

________________________________________________________________________

TO UNSUBSCRIBE: Go to http://www.witango.com/developer/maillist.taf
