1

Topic: TAO errors

On a small installation (1 project with 6 NIONs) I got some errors that are worrying me:

control thread: Nion #2 VLAN2: corba exception : SystemException (VMCID != TAO_DEFAULT_MINOR_CODE) : system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0' OMG minor code (2), described as '*unknown description*', completed = NO during discover_and_connect()

This doesn't happen in our test environment, but it happens regularly in production.

I have attached an image showing these errors.

Any suggestions?

Post's attachments

nion_error.png 128.21 kb, 450 downloads since 2012-09-06

2

Re: TAO errors

I need to give some background first.

Most traffic that moves over Ethernet uses port numbers so that data arriving at the Network Interface Card (NIC) can be handed off to the right software running on your computer.

Port 23 = Telnet
Port 80 = Web browsing (HTTP)
Port 25 = Email (SMTP)

NWare uses port 4000 for the control connection, port 1234 for Pandad discovery, port 4321 for the health packet, and port 1632 for RATC control.
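If you want to see whether something between your computer and a NION is blocking these ports, a quick check like the one below can help. It is only a rough sketch: the port numbers come from the list above, but I am assuming they accept TCP connections, which may not be true for all of them (discovery and health packets could well be UDP, in which case "closed" here tells you nothing). The address in the usage note is made up.

```python
import socket

# Port numbers as listed in the post. Whether each one is TCP or UDP is an
# assumption; this sketch only tests TCP.
NWARE_PORTS = {
    "control": 4000,
    "pandad discovery": 1234,
    "health packet": 4321,
    "RATC control": 1632,
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def report(host, ports=NWARE_PORTS):
    """Map each service name to whether its port accepted a TCP connection."""
    return {name: tcp_port_open(host, port) for name, port in ports.items()}
```

For example, `report("192.168.1.50")` (a hypothetical NION address) returns a dictionary such as `{"control": True, ...}`. Running it from both the test network and the production network is one way to spot a difference between the two.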

But deploying a project or upgrading firmware is far more critical, and it must be precisely managed and monitored.  You may have noticed the write, read, and verify steps showing up in the status window during the deploy process.

This requires a much firmer handshake between the client and the server, so a common middleware standard called CORBA is used.  (TAO, the name in your error message, is a CORBA implementation.)

A couple of things will cause a CORBA exception.  One is disconnecting from a network while you are connected to a NION.  For example, you switch away from the AV network to check email, and then connect back to the AV network to continue working.  Everything looks fine, and you can control levels, change routers, etc.  But when you deploy, the CORBA exception or timeout happens.  Very frustrating!

The proper process is to DISCONNECT from the NWare project before leaving the AV network.

To recover from a failed deploy or firmware upgrade, the fix is frequently to just restart NWare.  But because CORBA is a common tool, you may have other software running that also uses CORBA, so CORBA will not be shut down.  In that case, rebooting your computer may be required.  If all I’m running is NWare and Internet Explorer, I will just restart NWare.  If I’m running much else, I will go for a reboot (it takes longer, but it is a sure bet to fix the problem).

Here is something else that will cause problems with CORBA: latency.  This is why deploying through routers and firewalls will cause problems, especially when a large project is being deployed.  There is no recovery from this problem, but there is a workaround.  Use a remote desktop computer on the other side of the router/firewall, transfer the project to that remote computer, and then deploy from that computer.
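To get a feel for how much latency sits between your computer and a NION, you could time a few connection attempts. The sketch below assumes the control port (4000, from the list earlier) accepts TCP connections, and it only measures TCP connect time, which is a rough stand-in for what CORBA actually experiences during a deploy. The function names and the address in the usage note are made up for illustration.

```python
import socket
import time


def connect_latency_ms(host, port=4000, samples=5, timeout=2.0):
    """Time several TCP connects to host:port.

    Returns a list with one entry per attempt: the connect time in
    milliseconds, or None if that attempt failed or timed out.
    """
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                elapsed = (time.perf_counter() - start) * 1000.0
                results.append(elapsed)
        except OSError:
            results.append(None)
    return results


def summarize(results):
    """Reduce a list of timings to failure count and min/avg/max in ms."""
    good = [r for r in results if r is not None]
    return {
        "failed": len(results) - len(good),
        "min_ms": min(good) if good else None,
        "avg_ms": sum(good) / len(good) if good else None,
        "max_ms": max(good) if good else None,
    }
```

For example, `summarize(connect_latency_ms("192.168.1.50"))` (a hypothetical NION address). If the numbers from the production network are much higher or more erratic than from the test network, or some attempts fail outright, that points toward the router/firewall path rather than NWare itself.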

One last thing.  Rarely, I have needed to restart a NION to solve this problem, because CORBA has crashed on the NION.  This has happened fewer than 5 or 6 times in 7 years, so it is a very rare event, but it is worth trying if you are connected directly through a switch and have rebooted your computer but still have the problem.

Hopefully this explains and solves your problem.
Fergy

BTW, a 6 NION project is not a small project. The largest single project I programmed was 7 NIONs, though this was in a system that had over 55 NIONs when it was finished.  Small projects that form part of a big system are easier to program and maintain.

Last edited by Fergy (2012-09-07 05:14:56)

Make it intuitive, never leave them guessing.