Tech Paper - MSMQ Communication Issues

Version 2

    Responder Communication Framework White Paper

     

    Diagnosing MSMQ communication problems can be challenging. Below are some tips to help troubleshoot. In many cases, these tips have come directly from Microsoft Technical Support.

     

    Verify MSMQ Installation

    The client and server machines must have identical Message Queuing configurations: each machine must have exactly the same Message Queuing sub-components installed. If the server has only "Common" installed, then the client must also have only "Common." Mismatched components are a common mistake, so always double-check that the two machines match.

    • Windows XP and Windows Server 2003: Responder uses only private queues and therefore requires only the "Common" MSMQ components. Public queues are not used, so the Active Directory sub-components are not necessary.
    • Windows 7 and Windows Server 2008 R2: When configuring MSMQ on these operating systems, enable Message Queuing Server Core and Multicasting Support. The Common option is no longer necessary or available.
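
    One way to make the comparison less error-prone is to write down the sub-components shown on each machine and compare the two lists mechanically. The following is a minimal sketch only: compare_msmq_components is a hypothetical helper (not part of Responder or MSMQ), and the component names are example values typed in by hand from each machine's Windows components dialog.

```python
def compare_msmq_components(client, server):
    """Return the sub-components installed on only one of the two machines."""
    # Normalize case and whitespace so "Common " and "common" compare equal.
    client_set = {name.strip().lower() for name in client}
    server_set = {name.strip().lower() for name in server}
    return {
        "client_only": sorted(client_set - server_set),
        "server_only": sorted(server_set - client_set),
    }


# Example values are assumptions; substitute what each machine actually shows.
client_components = ["Common", "Triggers"]
server_components = ["Common"]
diff = compare_msmq_components(client_components, server_components)
print(diff)
```

    Both lists in the result should be empty when the two machines match.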

     

    Verify Communication

    Verify that the client machine can reach the destination (Responder Server) machine. A quick way to do this is to open a command prompt and start a telnet session to TCP port 1801, the port on which MSMQ listens:

     

    telnet <server> 1801
    

     

    If telnet fails to connect, there is a network-level communication issue between the two machines. In a clustered environment, make sure you are addressing either the IP Address or the Network Name cluster resource in the appropriate cluster group, rather than an individual node.
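
    Where a telnet client is not available, the same reachability check can be scripted. The sketch below only tests whether a TCP connection to port 1801 can be opened; it does not speak the MSMQ protocol. The function name can_reach_msmq and the default timeout are illustrative choices, not anything defined by Responder.

```python
import socket


def can_reach_msmq(host, port=1801, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

    Call it from the client machine with the server's name or, in a cluster, the cluster resource's name or IP address, for example can_reach_msmq("responder-server") where "responder-server" is a placeholder. A result of False points to the same kind of communication issue as a failed telnet session.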

    If the machine is multi-homed (which is common with server-class and clustered machines) then it is possible that MSMQ has bound to the wrong interface. Refer to Microsoft article KB329492: A cluster node with two network cards does not receive messages.

     

    Check Outgoing Queues

    The communication between client and server passes through a number of queues. It is often useful to look at the outgoing queues to determine where the communication is breaking down. Because two-way communication is subject to timeouts, check the queues immediately after opening Responder Explorer, before the timeout period expires.

    • On the client machine: Ensure there are no messages in the outgoing queues destined for the rxserver.request queue on the message router.
    • On the message router machine (cluster group): Ensure there are no messages in the outgoing queues destined for the rxserver.service queue on the server machine.
    • On the server machine (cluster group): Ensure there are no messages in the outgoing queues destined for the [%DOMAIN%.%USER%].rxexplorer.response queue on the client machine.
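
    The three checks above follow the path of a request, so the first machine with messages waiting in an outgoing queue marks the hop where communication is breaking down. The sketch below captures that reasoning under stated assumptions: the message counts are read by hand from each machine's Outgoing Queues folder, and first_stuck_hop and the machine labels are hypothetical names used only for illustration.

```python
# Hops in the order a request travels, taken from the checklist above.
HOPS = [
    ("client", "rxserver.request"),
    ("message router", "rxserver.service"),
    ("server", "rxexplorer.response"),
]


def first_stuck_hop(pending):
    """Return (machine, queue) for the first hop with waiting messages, else None.

    `pending` maps a machine label to {destination queue name: message count}.
    """
    for machine, suffix in HOPS:
        for queue, count in pending.get(machine, {}).items():
            # endswith() lets the per-user response queue match its suffix.
            if queue.endswith(suffix) and count > 0:
                return machine, queue
    return None


# Example observations (made-up numbers).
observed = {
    "client": {"rxserver.request": 0},
    "message router": {"rxserver.service": 3},
    "server": {},
}
print(first_stuck_hop(observed))
```

    In this example the messages are stuck on the message router, which suggests the router cannot deliver to the rxserver.service queue on the server machine.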

    Here is a (somewhat simplified) diagram of the message queues that participate in our custom MSMQ remoting channel implementation. It is helpful to understand this when diagnosing problems, particularly when looking at the outgoing queues.

    [Diagram: MSMQCommunication.png]

    A. Each client sends requests to the message router. The message router holds the messages in its request queues until a Data Services instance is ready to process them. This allows the message router to feed more messages to faster services and fewer to slower ones.
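
    The effect described in note A can be illustrated with a small simulation. This is not the Responder message router; it is a minimal sketch, assuming a router that hands the next held message to whichever service becomes ready first. The function name simulate_router and the per-message service times are invented for the example.

```python
import heapq
from collections import deque


def simulate_router(num_messages, service_times):
    """Simulate a router that gives the next message to the first free service.

    `service_times` maps a service name to the time it needs per message.
    Returns how many messages each service processed.
    """
    pending = deque(range(num_messages))
    processed = {name: 0 for name in service_times}
    # Heap of (time the service becomes ready, service name); all start ready.
    ready = [(0.0, name) for name in sorted(service_times)]
    heapq.heapify(ready)
    while pending:
        now, name = heapq.heappop(ready)  # earliest-ready service
        pending.popleft()                 # hand it the oldest held message
        processed[name] += 1
        heapq.heappush(ready, (now + service_times[name], name))
    return processed


counts = simulate_router(30, {"fast": 1.0, "slow": 2.0})
print(counts)
```

    Because messages wait in the router until a service asks for more work, the faster service ends up processing more of them without any explicit weighting.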

    B. Each application has its own .response and .acknowledge queues. They are omitted from the diagram simply to keep it readable. Note that the .response and .acknowledge queues are per-application and per-user (that is, each queue name includes both the application and the user).
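
    To make the naming in note B concrete, the sketch below builds the pair of reply queue names for one user and one application. It assumes the pattern domain.user.application.suffix, inferred from the [%DOMAIN%.%USER%].rxexplorer.response name mentioned earlier, with the brackets treated as notation rather than literal characters. The exact format Responder uses is not confirmed here, and reply_queue_names is a hypothetical helper.

```python
def reply_queue_names(domain, user, application):
    """Build the per-user, per-application reply queue names.

    Assumes the pattern <domain>.<user>.<application>.<suffix>.
    """
    prefix = f"{domain}.{user}.{application}"
    return {
        "response": f"{prefix}.response",
        "acknowledge": f"{prefix}.acknowledge",
    }


# "CORP" and "jdoe" are made-up example values.
names = reply_queue_names("CORP", "jdoe", "rxexplorer")
print(names["response"])
```

    Two users running the same application, or one user running two applications, therefore get distinct reply queues, which is worth keeping in mind when scanning the outgoing queues on the server machine.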