4.0 Session
re: Recovering from gaps early in list orders etc.
Scott Atwell / American Century 5 Feb 1998 3:21PM ET

We use Method #1 with a twist: we log out of the FIX session if we issue a certain number of Resend Requests in a row. Upon reconnecting, we should detect the gap, issue a single Resend Request, and then be back in sync. The only problems we've had with this have had to do with how the system that accepts the connection works. It's important that both sides refrain from sending queued or new messages for some reasonable period of time (e.g. 2-10 seconds) upon connecting, to allow both sides to identify sequence number gaps and issue Resend Requests. I believe language indicating the importance of this was part of the 4.0 spec.
Our experience would indicate that if you're using TCP, it's rather uncommon to miss a specific message and continue with additional messages within a stream of messages (ruling out software error). Missed messages and gaps tend to be more associated with no response/disconnect and reconnect scenarios. Your mileage may vary....
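
As a rough illustration of the "Method #1 with a twist" idea above, here is a minimal Python sketch. The threshold of three consecutive Resend Requests and the 5-second quiet period are assumptions (the post only says "a certain number" and "2-10 seconds"), and `GapGuard`, `on_message`, and `may_send` are hypothetical names, not part of any FIX engine.

```python
from dataclasses import dataclass


@dataclass
class GapGuard:
    """Counts consecutive Resend Requests and enforces a post-connect quiet period."""
    expected: int = 1                  # next inbound sequence number we expect
    max_consecutive_resends: int = 3   # assumed threshold ("a certain number")
    quiet_seconds: float = 5.0         # assumed value inside the 2-10 second range
    consecutive_resends: int = 0

    def on_message(self, seq: int) -> str:
        """Return the action to take for an inbound message with this sequence number."""
        if seq == self.expected:
            self.expected += 1
            self.consecutive_resends = 0   # back in sync, reset the counter
            return "process"
        if seq < self.expected:
            # Already seen; a real engine would also check PossDupFlag here.
            return "ignore"
        # Gap detected: Method 1 discards the message and asks for a resend.
        self.consecutive_resends += 1
        if self.consecutive_resends >= self.max_consecutive_resends:
            return "logout"                # the "twist": give up and reconnect
        return f"resend {self.expected}-{seq}"

    def may_send(self, seconds_since_connect: float) -> bool:
        """Hold back queued or new messages until the quiet period has passed."""
        return seconds_since_connect >= self.quiet_seconds
```

With `GapGuard(expected=2)`, messages 2, 4, 5, 6 yield `process`, `resend 3-4`, `resend 3-5`, and then `logout`, which caps the number of Resend Requests a burst can generate.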
> We've been testing our FIX engine with large list orders. These messages by their nature are sent in large packs. We've been testing with lists of about 150 orders. (These lists are based on real programme trades; indeed, we are replaying programmes done historically with counterparties.) There are two ways of handling message processing in the FIX protocol, and both have problems as I understand them.
>
> Method 1
>
> If you miss a message, ignore subsequent messages and request resends. In a situation where the missed message is one of the first in a large list order, this generates huge numbers of resend requests, as can be seen from considering the following list of 5:
>
> Msg 2 - List item 1
> Msg 3 - Lost for whatever reason
> Msg 4 - List item 3, discarded, triggers resend request for 3 - 4
> Msg 5 - List item 4, discarded, triggers resend request for 3 - 5
> Msg 6 - List item 5, discarded, triggers resend request for 3 - 6
> Msg 3 - List item 2, poss dup set, in response to 4
> Msg 4 - List item 3, poss dup set, in response to 4
> Msg 3 - List item 2, poss dup set, in response to 5
> Msg 4 - List item 3, poss dup set, in response to 5
> Msg 5 - List item 4, poss dup set, in response to 5
> Msg 3 - List item 2, poss dup set, in response to 6
> Msg 4 - List item 3, poss dup set, in response to 6
> Msg 5 - List item 4, poss dup set, in response to 6
> Msg 6 - List item 5, poss dup set, in response to 6
>
> The message traffic grows like the sum of an arithmetic series, and this is clearly unacceptable. E.g. missing the first message in a list of 150 items causes the 150th message to trigger a retransmission of 150 messages, the 149th to trigger 149, etc. Total about 150 * 75 messages.
>
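
The "about 150 * 75" estimate can be checked with a short Python sketch. It assumes, as in the quoted trace, that every message after the lost one triggers a resend of everything from the lost message up to itself; `method1_retransmissions` is a hypothetical helper name.

```python
def method1_retransmissions(list_size: int, lost_index: int = 1) -> int:
    """Count retransmitted messages under Method 1.

    Positions run from 1 to list_size. The message at lost_index is lost,
    and each later message k triggers a resend of lost_index..k.
    """
    total = 0
    for k in range(lost_index + 1, list_size + 1):
        total += k - lost_index + 1   # size of the range lost_index..k
    return total


# Losing the first of 150 messages: 2 + 3 + ... + 150
print(method1_retransmissions(150))                 # 11324
# The five-message trace (second message lost): 2 + 3 + 4
print(method1_retransmissions(5, lost_index=2))     # 9
```

The first figure is close to 150 * 75 = 11250, and the second matches the nine retransmissions listed in the trace.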
> Some implementations use the latency of their back ends to group responses to the resends together, and this damps down the number of messages.
>
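
One way such grouping might look on the sending side is to merge the Resend Requests that have queued up before answering any of them. This is only a sketch under that assumption; the post does not say how those implementations actually do it, and `coalesce_resend_requests` is a hypothetical name.

```python
def coalesce_resend_requests(requests):
    """Merge overlapping or adjacent (begin, end) sequence ranges into a minimal list."""
    merged = []
    for begin, end in sorted(requests):
        if merged and begin <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))
    return merged


# The three requests from the Method 1 trace collapse into one resend of 3-6.
print(coalesce_resend_requests([(3, 4), (3, 5), (3, 6)]))   # [(3, 6)]
```

Answering the merged range once would send four messages instead of nine in that trace.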
> Method 2
>
> Once you've missed your first message, store but don't process all further messages
>
> Msg 2 - List item 1
> Msg 3 - Lost for whatever reason
> Msg 4 - List item 3, stored, triggers resend request for 3 - 3
> Msg 5 - List item 4, stored
> Msg 6 - List item 5, stored
> Msg 7 - List item 6, stored
> Msg 3 - List item 2, poss dup set, in response to 4; processes 3 - 7
>
> Fine: least volume and probably quickest, but the stored messages may not be ones that the counterparty would have resent (as allowed by the standard), thus causing a business error, as two different FIX implementations would give two different results.
>
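
A minimal Python sketch of the store-and-hold behaviour of Method 2, with hypothetical names (`BufferingReceiver`, `on_message`). It requests only the missing range and drains the stored messages once the gap is filled. It does not address the caveat raised above: the stored messages are processed even if the counterparty would have chosen not to resend them.

```python
class BufferingReceiver:
    """Method 2 sketch: store out-of-sequence messages, request only the gap."""

    def __init__(self, expected=1):
        self.expected = expected      # next sequence number we can process
        self.pending = {}             # seq -> body, held until the gap is filled
        self.processed = []           # (seq, body) in processing order
        self.resend_requests = []     # (begin, end) ranges we asked for

    def on_message(self, seq, body):
        if seq < self.expected:
            return                    # already processed, drop it
        if seq > self.expected:
            if not self.pending:
                # First sign of the gap: ask only for the missing messages.
                self.resend_requests.append((self.expected, seq - 1))
            self.pending.setdefault(seq, body)
            return
        self._process(seq, body)
        # Drain any stored messages that are now in sequence.
        while self.expected in self.pending:
            self._process(self.expected, self.pending.pop(self.expected))

    def _process(self, seq, body):
        self.processed.append((seq, body))
        self.expected = seq + 1
```

Feeding it messages 2, 4, 5, 6, 7 and then the resent 3 produces a single Resend Request for 3 - 3 and processes 2 through 7 in order.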
> What we're currently trying
>
> Once you've missed a message and issued your first resend request, ignore all further out-of-sequence messages until you've resynced.
>
> Msg 2 - List item 1
> Msg 3 - Lost for whatever reason
> Msg 4 - List item 3, discarded, triggers resend request for 3 - 4
> Msg 5 - List item 4, discarded
> Msg 6 - List item 5, discarded
> Msg 3 - List item 2, poss dup set, in response to 4
> Msg 4 - List item 3, poss dup set, in response to 4
> Msg 7 - Heartbeat, triggers resend request for 5 - 7
> Msg 5 - List item 4
> Msg 6 - List item 5
> Msg 7 - GapFill
>
> Fine, but again the retransmit may be a whole list of 150 programs, and it relies on the heartbeat to trigger the resend. This is quite slow as the size of the programmes increases. Total a little over 300 messages. We also put a test message on the end, after a delay, to flush out the situation where both sides are stalled.
>
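
The approach in this trace can be sketched as a small state machine in Python: issue one Resend Request when a gap first appears, then silently discard out-of-sequence messages until that request has been satisfied. The names are hypothetical, and the sketch treats a resent message and a gap fill alike (each simply advances the expected sequence number by one).

```python
class SingleRequestReceiver:
    """Issue one Resend Request per gap; discard other out-of-sequence messages."""

    def __init__(self, expected=1):
        self.expected = expected
        self.resend_upto = None       # end of the outstanding request, if any
        self.processed = []
        self.resend_requests = []

    def on_message(self, seq):
        if seq == self.expected:
            self.processed.append(seq)
            self.expected += 1
            # The outstanding request is satisfied once we pass its end.
            if self.resend_upto is not None and self.expected > self.resend_upto:
                self.resend_upto = None
            return
        if seq < self.expected:
            return                    # already processed
        if self.resend_upto is None:
            # First out-of-sequence message: request everything up to it.
            self.resend_upto = seq
            self.resend_requests.append((self.expected, seq))
        # Otherwise a request is already outstanding: discard silently.
```

Replaying the trace (2, 4, 5, 6, resent 3 and 4, heartbeat 7, then 5, 6, 7) yields two Resend Requests, 3 - 4 and 5 - 7, which shows why the second recovery has to wait for the heartbeat.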
> What have other people done about this? Have I misunderstood something? Any thoughts?
>
> Kevin J Houstoun
> 106333.2651@compuserve.com
>