C-Gate log clock

Discussion in 'C-Bus Toolkit and C-Gate Software' started by more-solutions, Aug 28, 2013.

  1. more-solutions

    more-solutions

    Joined:
    Apr 23, 2006
    Messages:
    283
    Likes Received:
    4
    Location:
    Peterborough, UK
    Looking at some event logs, I note that the clock jumps backwards an hour after it's been running "a while", where "a while" means anything from a minute or two to 15 minutes or so. I assume this is daylight-saving related, as it only seems to happen in the logs that I took while BST is 1 hr ahead of GMT/UTC.

    Is this a known issue? It seems hard to believe that it hasn't been noticed, unless it's something weird about my setup.

    (C-Gate 2.9.5)

    Eg:
    Code:
    20130828-083106 800 cgate - C-Gate started.
    20130828-083107 300 sys dump cgate: ComputerName=LCSSVR
    20130828-083107 300 sys dump cgate: DatabaseVersion=2.3
    20130828-083107 300 sys dump cgate: EventLevel=7
    [...]
    20130828-083108 300 sys dump cgate: Version=v2.9.5 (build 2460)
    [...]
    20130828-083116 803 cmd5 - Host:/0:0:0:0:0:0:0:1 opened command interface from port: 49229
    20130828-083116 804 cmd5 - Host:/0:0:0:0:0:0:0:1 closed command interface from port: 49229
    20130828-073225 803 cmd7 - Host:/0:0:0:0:0:0:0:1 opened command interface from port: 49234
    20130828-073225 804 cmd7 - Host:/0:0:0:0:0:0:0:1 closed command interface from port: 49234
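
    For anyone wanting to check their own logs for the same symptom, here is a minimal sketch in Python. It assumes only that each event line starts with a `YYYYMMDD-HHMMSS` timestamp, as in the excerpt above; the function name `find_backward_jumps` is made up for illustration and is not part of C-Gate.

```python
from datetime import datetime

def find_backward_jumps(lines):
    """Return (previous, current) timestamp pairs where the log clock went backwards."""
    jumps = []
    previous = None
    for line in lines:
        stamp = line.strip().split(" ", 1)[0]
        try:
            current = datetime.strptime(stamp, "%Y%m%d-%H%M%S")
        except ValueError:
            continue  # skip elided or non-timestamped lines such as "[...]"
        if previous is not None and current < previous:
            jumps.append((previous, current))
        previous = current
    return jumps

# Lines taken from the log excerpt in this post.
sample = [
    "20130828-083116 803 cmd5 - Host:/0:0:0:0:0:0:0:1 opened command interface from port: 49229",
    "20130828-083116 804 cmd5 - Host:/0:0:0:0:0:0:0:1 closed command interface from port: 49229",
    "20130828-073225 803 cmd7 - Host:/0:0:0:0:0:0:0:1 opened command interface from port: 49234",
    "20130828-073225 804 cmd7 - Host:/0:0:0:0:0:0:0:1 closed command interface from port: 49234",
]

for before, after in find_backward_jumps(sample):
    print(f"clock went back {before - after} ({before:%H:%M:%S} -> {after:%H:%M:%S})")
```

    On the excerpt above this reports a single jump of just under an hour, between the 083116 and 073225 entries.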
    
     
    more-solutions, Aug 28, 2013
    #1
  2. more-solutions

    more-solutions

    C-gate fails to handle clock changes

    It turns out my BIOS clock was an hour out for some reason, and the time service on the server was correcting it after C-gate had started (as a service).

    HOWEVER: This has thrown up an issue in C-Gate: when the clock goes backwards, it stops the network sync process from completing successfully (which was what we were investigating when this came to light). Our site has 35 networks; 25 start to sync initially, with the remaining 10 coming in as the first ones complete. However, when the clock went back an hour, the other 10 were not brought into sync. I guess that there is a timer process running that has its scheduling mixed up (and it's likely that if we had waited an hour it would have "caught up")? In effect, that means we couldn't control roughly 1/3 of the site's lighting after rebooting the server without firing up Toolkit to sync the remaining networks.

    The fact that the clock discrepancy was big enough to spot helped us out here, but I would guess that a smaller backwards clock change would have a similar effect. Now that C-Gate can run as a service, a clock change after it has started is also more likely.

    I have level 9 logs if required.
     
    more-solutions, Aug 28, 2013
    #2
  3. daniel

    daniel C-Busser Moderator

    Joined:
    Jul 26, 2004
    Messages:
    766
    Likes Received:
    20
    Location:
    Adelaide
    Hmmm yes, clock issues are always fun when there's the BIOS clock, the OS clock, and the Java/C-Gate clock.

    The reason the clock jump has an effect on the sync is that each network has the property:

    get 254 NextSyncTime
    300 //PRED24/254: NextSyncTime=20111219-174810

    You can also change it yourself.

    set 254 NextSyncTime 20111219-171105
    200 OK: //PRED24/254

    So when the clock jumped backwards an hour, those absolute timestamps just got an hour further away.
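
    To put numbers on that, here is a rough sketch in Python. It is not C-Gate's scheduler; it only assumes that the due time is an absolute `YYYYMMDD-HHMMSS` timestamp compared against the wall clock, which is what the property above suggests. The helper `seconds_until` is hypothetical.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d-%H%M%S"  # timestamp layout used by NextSyncTime in this thread

def seconds_until(next_sync_time, wall_clock_now):
    """Seconds a scheduler comparing against the wall clock would still wait."""
    due = datetime.strptime(next_sync_time, FMT)
    return (due - wall_clock_now).total_seconds()

next_sync = "20111219-174810"
now = datetime(2011, 12, 19, 17, 48, 0)   # 10 s before the sync is due
stepped_back = now - timedelta(hours=1)   # time service corrects the clock by an hour

print(seconds_until(next_sync, now))           # 10.0
print(seconds_until(next_sync, stepped_back))  # 3610.0
```

    The stored timestamp never changes; only the "now" it is compared against moves, so a 10 second wait turns into one of an hour and 10 seconds.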

    I'm not sure yet how we can address this in the software; if the BIOS and OS clocks are fighting, we could apply some sort of clever patch in one or more apps but other apps might still fail. At some point we need to assert an assumption about the robustness of the underlying system.
     
    Last edited by a moderator: Sep 2, 2013
    daniel, Sep 2, 2013
    #3
  4. more-solutions

    more-solutions

    Thanks for that. Can you tell me a little more about the mechanism that's used when there are 35 networks to sync? I can do some experiments when I get back to the office, but I'd rather know what to expect!

    I'm guessing that initially all networks get NextSyncTime set to "now" but after the first 25 networks are started the 26th pushes back its sync time because there's no capacity left. Presumably it only pushes it back a few seconds though, and repeatedly "fails" and resets its time until one of the other networks completes? I'm making that assumption because I can see that there's little delay between a network finishing and the next one starting.

    This only appears to be a major problem during the initial sync (if a subsequent sync gets delayed an hour nobody will notice) but are there any other processes that work a similar way we should be aware of?


    Starting from the assumption that networks should always be in sync (or, perhaps safer, should not be "new" for any length of time), this could be fixed by a process that runs periodically (every few seconds), checks whether any networks are still new, and resets their NextSyncTime to "now" if it is in the future. If that were a big overhead, the process could exit once all networks reach sync/error.

    Once I understand the mechanism we can put something in place to monitor it and correct the NextSyncTime values when needed.
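
    As a starting point, the decision logic of such a monitor might look like the sketch below (Python). It relies only on the `get`/`set` forms Daniel showed above. The function name `correction_command` and the 30 second tolerance are my own assumptions, and both talking to the command interface and deciding which networks are still "new" are deliberately left out.

```python
import re
from datetime import datetime, timedelta

FMT = "%Y%m%d-%H%M%S"
# Matches a reply such as: 300 //PRED24/254: NextSyncTime=20111219-174810
REPLY = re.compile(
    r"^300 //(?P<project>[^/]+)/(?P<network>\d+): NextSyncTime=(?P<stamp>\d{8}-\d{6})$"
)

def correction_command(reply_line, now, tolerance=timedelta(seconds=30)):
    """Return a 'set' command pulling NextSyncTime back to now, or None if not needed.

    tolerance is a hypothetical threshold: sync times further in the future
    than this are treated as stranded by a backwards clock change.
    """
    match = REPLY.match(reply_line.strip())
    if match is None:
        return None  # not a NextSyncTime reply; leave it alone
    due = datetime.strptime(match["stamp"], FMT)
    if due - now <= tolerance:
        return None  # due soon enough; nothing to correct
    return f"set {match['network']} NextSyncTime {now.strftime(FMT)}"

reply = "300 //PRED24/254: NextSyncTime=20111219-174810"
print(correction_command(reply, datetime(2011, 12, 19, 16, 48, 0)))
print(correction_command(reply, datetime(2011, 12, 19, 17, 48, 0)))
```

    The first call returns a `set 254 NextSyncTime ...` command because the sync time is over an hour away; the second returns None because it is only 10 seconds away. This should only be applied to networks that have not yet synced, since a routine re-sync scheduled for later is legitimately in the future.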

    Alternatively, is there a way to get C-Gate to start all 35 networks together? That would solve this without tackling the clock issues at all.
     
    more-solutions, Sep 2, 2013
    #4
  5. more-solutions

    more-solutions

    The more I look at this, the more I think that getting C-Gate to start all 35 networks together is the best solution.

    All my networks are accessed via CNI and network bandwidth shouldn't be an issue. The server is over-specced and should have enough CPU/RAM to handle 35 networks simultaneously.

    As far as I can tell the 25 limit is hard coded and not configurable? It seems a fairly arbitrary figure, is there a reason for it being set at 25?

    If this could be made configurable it ought to resolve our issue without having to find a more complex solution to the clock problem. I would like to agree that having a stable base system is a pre-requisite for having a stable C-Gate system on top, but I don't think I have ever seen a PC with a clock so reliable as to not need to sync to a time server with the obvious consequence that the clock may go forward or back whilst C-Gate is running. But if the only effect of that is to delay the start of sync for networks not yet in progress then having a way to ensure they all start to sync immediately would be a pretty good workaround. It also means that the system gets to a point where it's operational more quickly so is a benefit regardless of clock issues, and the time it takes to bring the PC up to operational state from a reboot is quite important to us as well.

    (Aside: Because the head end is Citect based, it really needs to be able to see the status of all group addresses across the site when it starts up in order to be able to show current lighting status and not generate floods of alarms. If all we were doing was sending scenes to the networks without caring what their current states were this probably wouldn't affect us. That probably means there are few people who would ever suffer from this problem.)
     
    more-solutions, Sep 4, 2013
    #5
  6. daniel

    daniel C-Busser Moderator

    Hi Mark, the good news is we agree with you. We'll be making this configurable in a future release.

    Ref: CG-518
     
    daniel, Sep 6, 2013
    #6
  7. more-solutions

    more-solutions

    That's brilliant, thanks Daniel.

    You know my next question: do you have any ETA? Only because I'll be asked, and depending on the answer we might need to look at workarounds.
     
    more-solutions, Sep 6, 2013
    #7
  8. daniel

    daniel C-Busser Moderator

    No ETA but we're most likely talking months, not days.
     
    daniel, Sep 9, 2013
    #8