bugFreeciv - Bugs: bug #20940, freeciv-gtk2.exe causes a Stack...

 
 
Show feedback again

bug #20940: freeciv-gtk2.exe causes a Stack Overflow

Submitted by:  None
Submitted on:  Tue 02 Jul 2013 04:20:55 PM UTC  
 
Category: client-gtk-2.0Severity: 3 - Normal
Priority: 5 - NormalStatus: Fixed
Assigned to: Jacob Nevins <jtn>Originator Email: -unavailable-
Open/Closed: ClosedRelease: 2.3.4
Operating System: Microsoft WindowsPlanned Release: 2.3.5,2.4.2,2.5.0,2.6.0

Add a New Comment (Rich MarkupRich Markup):
   

You are not logged in

Please log in, so followups can be emailed to you.

 

(Jump to the original submission Jump to the original submission)

Sat 01 Feb 2014 11:04:00 AM UTC, comment #7:

> Close or retarget to 2.4.3?

Close, I think.

While we don't have positive confirmation that it's definitely fixed (not yet having a Windows build + not having found a way to reproduce massively long lists in the first place), the original backtrace posted here should now not be able to occur (due to patch #4327). So we've fixed something.

Jacob Nevins <jtn>
Project AdministratorIn charge of this item.
Sat 01 Feb 2014 10:27:09 AM UTC, comment #6:

Close or retarget to 2.4.3?

Marko Lindqvist <cazfi>
Project Administrator
Tue 10 Dec 2013 05:17:29 AM UTC, comment #5:

I too was unable to reproduce this crash in a test game. But I think that is because we have been looking in the wrong direction.

I do not think this crash was caused by 100 unit stacks moving, because the same crash occurs during bombard animation where no movement occurs.

If the client isn't keeping up with the server due to spending all its time on animation then it may be important that the test game is run on two machines, one for client and one for server.

Anonymous
Sat 07 Dec 2013 02:02:24 PM UTC, comment #4:

> The obvious quick fix would be to malloc in genlist_sort()

Done this in patch #4326, on general principles.

> Doing a full list sort every time we look for the highest
> priority thing to dequeue is pretty gross, when we could just
> keep the list sorted by inserting the call at the right place
> in enqueue_call(), especially as that already does a linear
> search over the list to look for duplicates.

Done in patch #4327. I hope that that change will make this stack overflow symptom go away (and also improve client performance a bit).

However, I was unable to reproduce huge call queue sizes in my limited testing. I only ever got a queue size of the order of the number of my cities with governors, even with 100-unit stacks of my own and my enemies' wandering the landscape in the vicinity of said cities.
I doubt that even the biggest games could have enough cities for cities*(void *) to blow the stack. I haven't looked at how the CMA works in detail to see what other worse cases there may be.

Jacob Nevins <jtn>
Project AdministratorIn charge of this item.
Sun 01 Dec 2013 12:04:33 PM UTC, comment #3:

Now I know how to do a guesswork decode of Windows stack traces, here's one:

It's a bit of a leap from input_from_server() to call_handle_methods() in agents.c, but not completely implausible -- the agents do respond to packets coming from the network, and there's plenty of scope for call_handle_methods() at least to be tail called (don't know about its callers).

Looking at assembler for 471779, I think we just called genlist_sort(). That allocates a temporary array of the size of its list on the stack, so certainly has capacity to blow the stack if passed a huge list (or invalid, although NULL should be trapped).

Given that this is a huge game, I'm going to guess that the list is huge rather than an invalid pointer -- it seems plausible that a large number of unit movements could cause a large number of calls to agents (CMA?) to be queued, especially if the client isn't keeping up with the server due to spending all its time on animation. And Windows' stack limit is usually smaller than Unix.

The obvious quick fix would be to malloc in genlist_sort(), but there are wider questions:

  • Doing a full list sort every time we look for the highest priority thing to dequeue is pretty gross, when we could just keep the list sorted by inserting the call at the right place in enqueue_call(), especially as that already does a linear search over the list to look for duplicates.
  • Maybe we also need some way to stop the agents queue growing huge in this situation, if that's what's happening?
Jacob Nevins <jtn>
Project AdministratorIn charge of this item.
Wed 10 Jul 2013 07:42:52 AM UTC, comment #2:

A couple of further hints about this bug.
It was made less likely by reducing the movement animation time.
I suspect this was because showing the gotos hadn't been finished at TC.

Also it is not just a client goto issue.

The same bug occurred during bombard combat.
With a dozen bombers attacking a city containing 100 spies you get to see the bombard combat animation 1200 times.
Unless you lower the combat animation the client will crash at TC.

Anonymous
Tue 09 Jul 2013 10:40:18 PM UTC, comment #1:

Have we got some big recursion or stack allocations in client goto?
We know Windows tends to run out of stack before Linux (see bug #17962 and friends), so we might be chewing massive amounts of stack without most developers noticing.

Jacob Nevins <jtn>
Project AdministratorIn charge of this item.
Tue 02 Jul 2013 04:20:55 PM UTC, original submission:

Am using vanilla 2.3.4 client on windows.
Am playing a longturn testgame with 2 minute timeout.
Currently T2700

This crash occurs while moving a large number of units as a group. I have 3 or 4 repeats of this crash.
I selected 100 howitzers and moved them as a group using a goto.
The client crashed - see below.
The server was fine.

Error occured on Tuesday, July 2, 2013 at 22:52:57.

C:\MyGames\Freeciv\freeciv-gtk2.exe caused a Stack Overflow at location 0056c0a6 in module C:\MyGames\Freeciv\freeciv-gtk2.exe.

Registers:
eax=166cc850 ebx=05a32210 ecx=00032910 edx=455100fc esi=03320a80 edi=00000000
eip=0056c0a6 esp=0022f908 ebp=0022f948 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202

Call stack:
0056C0A6 C:\MyGames\Freeciv\freeciv-gtk2.exe:0056C0A6 WinMain /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:402
00471779 C:\MyGames\Freeciv\freeciv-gtk2.exe:00471779 WinMain /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:402
004717EE C:\MyGames\Freeciv\freeciv-gtk2.exe:004717EE WinMain /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:402
0042D92E C:\MyGames\Freeciv\freeciv-gtk2.exe:0042D92E WinMain /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:402
6C3463D2 C:\MyGames\Freeciv\libgdk-win32-2.0-0.dll:6C3463D2 gdk_drawable_get_visible_region
685ED10B C:\MyGames\Freeciv\libglib-2.0-0.dll:685ED10B g_main_context_dispatch
685ED925 C:\MyGames\Freeciv\libglib-2.0-0.dll:685ED925 g_main_context_dispatch
685EDE61 C:\MyGames\Freeciv\libglib-2.0-0.dll:685EDE61 g_main_loop_run
00E94A80 C:\MyGames\Freeciv\libgtk-win32-2.0-0.dll:00E94A80 gtk_main
00405DD7 C:\MyGames\Freeciv\freeciv-gtk2.exe:00405DD7 WinMain /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:402
004072ED C:\MyGames\Freeciv\freeciv-gtk2.exe:004072ED WinMain /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:402
0040235B C:\MyGames\Freeciv\freeciv-gtk2.exe:0040235B console_main /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:318
00402418 C:\MyGames\Freeciv\freeciv-gtk2.exe:00402418 WinMain /Users/hercules/trunk/SDL-1.2/./src/main/win32/SDL_win32_main.c:402
00401AD6 C:\MyGames\Freeciv\freeciv-gtk2.exe:00401AD6
004010DB C:\MyGames\Freeciv\freeciv-gtk2.exe:004010DB
00401158 C:\MyGames\Freeciv\freeciv-gtk2.exe:00401158
77E6F1EB C:\WINDOWS\system32\kernel32.dll:77E6F1EB ProcessIdToSessionId

Anonymous

 

(Note: upload size limit is set to 1024 kB, after insertion of the required escape characters.)

Attach File(s):
   
   
Comment:
   

No files currently attached

 

Depends on the following items: None found

Items that depend on this one: None found

 

Carbon-Copy List
  • -unavailable- added by cazfi (Posted a comment)
  • -unavailable- added by jtn (Posted a comment)
  •  

    Do you think this task is very important?
    If so, you can click here to add your encouragement to it.
    This task has 0 encouragements so far.

    Only logged-in users can vote.

     

    Please enter the title of George Orwell's famous dystopian book (it's a date):

     

     

    Follow 6 latest changes.

    Date Changed By Updated Field Previous Value => Replaced By
    Sat 01 Feb 2014 11:04:00 AM UTCjtnStatusIn Progress=>Fixed
      Open/ClosedOpen=>Closed
    Sat 07 Dec 2013 02:02:24 PM UTCjtnStatusNone=>In Progress
      Assigned toNone=>jtn
    Sun 01 Dec 2013 12:06:08 PM UTCjtnPlanned Release=>2.3.5,2.4.2,2.5.0,2.6.0
    Tue 09 Jul 2013 10:40:26 PM UTCjtnCategoryNone=>client-gtk-2.0
    Show feedback again

    Back to the top


    Powered by Savane 3.1-cleanup