1

Topic: Trying to really understand the remote log messages in Nware

After reading many remote logs from failing NIONs at installations, I have been wondering whether there is any
guide or compilation that explains, in layman's terms, exactly what the problem is.
Some examples:

error  mcp/processes       process 306 'piond' being terminated due to lack of contract
error  mcp/processes       process 304 'psud' has exited of its own accord
fault  piond/pacman        real-time thread error
error  piond/dsp           30 sequential frames with dma errors - aborting
fault  piond/sound_engine  exception during poll: /dev/pion/sharc0: software timeout on command operation
error  piond/dsp           5 sequential frames with dma errors
fault  piond/dsp_manager   watchdog tripped on DSP(s): A
error  piond/dsp           6 dma errors: last DMA transaction did not complete...
fault  piond/sound_engine  exception during poll: dsp A health check failure: unknown error 330738: _error_data

So what exactly are the issues here, and what can be done to remedy them?
And why would the following messages be logged:

piond/xdab/manager  sample clock PLL not locked (peak phase error: -67)
mcp                 potential reboot event logged from power : power fail
mcp/monitor         missing shutdown state in environment


It would be an excellent troubleshooting tool if a comprehensive technical manual could be created, if one does not
already exist, to expedite fault finding on project sites.

Such a manual should list the messages, explain why they were logged, and describe the measures to take to attempt
rectification where possible, or else to proceed with an RMA if nothing can be done.

Perhaps it could even include information on site connections or equipment that could be causing these messages to be logged.

Additionally, a document on the processes that precede these messages could prove helpful for site diagnostics.

Comments appreciated !!

Regards.

2

Re: Trying to really understand the remote log messages in Nware

100+ views and no one has a comment to make?
So I guess all log readers understand it perfectly!

3

Re: Trying to really understand the remote log messages in Nware

What you're asking for is so complex.

If you're still having issues, your best bet is to get in touch with Peavey support directly and see if they can assist you with your problem. The guys there have always been very helpful to me.

It already sounds better

4

Re: Trying to really understand the remote log messages in Nware

Hi Chandru,

The short version is that if it were properly documented by the guys who wrote it, that would be great.

Unfortunately, they didn't bother with that kind of thing.

However, I will share with you what I have figured out.

Essentially, in the first block of messages, your DSP A has its memory oversubscribed. The compile report probably shows usage of something like 99.89%. The important thing to remember is that the compiler is making estimates. If there are lots and lots of little devices, the margin of error on those estimates adds up. So it is possible for the compiler to build an executable for a DSP chip that requires more memory than the chip can access. When this happens, we get memory errors. These memory errors can crash the DSP chip and trip the watchdog.
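To make the point about estimates adding up concrete, here is a minimal arithmetic sketch. All of the numbers are made up for illustration (the capacity, the per-device estimates, and the 1% error are assumptions, not real NION or compiler figures), and `worst_case_usage` is a hypothetical helper, not part of NWare.

```python
def worst_case_usage(estimates, relative_error):
    """Return (estimated_total, worst_case_total) for per-device memory estimates.

    relative_error is the assumed fraction by which each estimate may be too low.
    """
    estimated = sum(estimates)
    worst = sum(e * (1 + relative_error) for e in estimates)
    return estimated, worst


# Hypothetical figures: 100 small devices, each estimated at 999 memory words,
# on a chip assumed to hold 100,000 words.
CAPACITY = 100_000
estimates = [999] * 100

if __name__ == "__main__":
    est, worst = worst_case_usage(estimates, relative_error=0.01)
    print(f"estimated usage:  {est / CAPACITY:.2%}")
    print(f"worst-case usage: {worst / CAPACITY:.2%}")
    print("fits on paper:", est <= CAPACITY, "| fits in worst case:", worst <= CAPACITY)
```

With these invented numbers the estimate fits on paper, but a 1% underestimate on every device is enough to push the real requirement past the capacity.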

If you are regularly getting this kind of message on a DSP chip with a particular configuration, do something to radically change what is assigned to that chip. I have fixed this kind of thing before by finding some medium to large DSP device and hard-assigning it to that chip. It doesn't have to be anything special, just something that is currently assigned to a different DSP chip. When you move it to the DSP that is having issues and lock it down, the compiler is forced to make very different decisions about where to put everything else. This may just move the problem somewhere else, or it may resolve the issue completely.

I have always handled this on a case-by-case basis, but the important thing to remember is that DMA errors mean oversubscribed memory, and you should move things around to resolve the situation.
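If you want to spot this pattern quickly in a long log, a small script can help. The following is only a sketch: it assumes each line looks like the ones quoted in this thread (an optional `error`/`fault` level, then a source, then the message) with any time stamp already stripped. `parse_line` and `dsp_trouble` are hypothetical names, not anything provided by NWare.

```python
from collections import Counter

# Only the levels that appear in the quoted log; other levels are an unknown.
LEVELS = {"error", "fault"}


def parse_line(line):
    """Split a log line into (level, source, message); level may be None."""
    parts = line.split()
    if not parts:
        return None
    level = parts.pop(0) if parts[0] in LEVELS else None
    if not parts:
        return None
    source = parts.pop(0)
    return level, source, " ".join(parts)


def dsp_trouble(lines):
    """Count lines per source that mention DMA errors or a tripped watchdog."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _level, source, message = parsed
        text = message.lower()
        if "dma errors" in text or "watchdog tripped" in text:
            counts[source] += 1
    return counts
```

Feeding it the lines from the first post gives a per-source count of DMA and watchdog messages, which is a quick hint that a DSP is worth re-balancing.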

piond/xdab/manager  sample clock PLL not locked (peak phase error: -67)
This can be an indicator of questionable XDAB cabling or connectors. It is simply informing you that the clock is not locked and it is telling you how far out of sync it is.
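If you want to track how far out of sync the clock gets over time, the number can be pulled out of the log with a regular expression. This is a sketch that assumes the message text is exactly as quoted above. The units of the phase error are not stated in the log, so the value is treated as a plain integer, and `peak_phase_errors` is a hypothetical helper name.

```python
import re

# Matches the message text as quoted in this thread; the wording is assumed stable.
PLL_RE = re.compile(r"sample clock PLL not locked \(peak phase error: (-?\d+)\)")


def peak_phase_errors(lines):
    """Return the peak phase error from every 'PLL not locked' line, in order."""
    values = []
    for line in lines:
        match = PLL_RE.search(line)
        if match:
            values.append(int(match.group(1)))
    return values
```

A run of these values that grows or never goes away after re-seating the XDAB cables would be worth mentioning to support.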

mcp  potential reboot event logged from power : power fail
The time stamp associated with this message tells you when the power went out. It can be useful for troubleshooting why the system rebooted.

mcp/monitor  missing shutdown state in environment
This is not a good message to see. When the NION shuts down, it should write the shutdown state into the non-volatile (NV) environment. If the state is not there, the unit may or may not remain shut down when power is applied. I believe the latest versions of firmware ensure that there is an appropriate state message in the NV environment, so you should not see this message with the latest versions.

There is so much more that could be covered. In short, the Remote Log in the NION is a tremendous tool for troubleshooting. I believe it is unparalleled. I often wish everyone's equipment had such a thorough log to review.

If you made it this far, thanks for reading and good luck!

Thanks!
Josh

Josh Millward
Burnt Orange Studios