bugFreeciv - Bugs: bug #21083, verbose logging ("-d 3")...

 
 
Show feedback again

bug #21083: verbose logging ("-d 3") + log_error() => server hang due to recursive logging

Submitted by:  Jacob Nevins <jtn>
Submitted on:  Sat 31 Aug 2013 11:48:37 AM UTC  
 
Category: NoneSeverity: 3 - Normal
Priority: 5 - NormalStatus: None
Assigned to: NoneOpen/Closed: Open
Release: S2_4 r23235Operating System: None
Planned Release: 

Add a New Comment (Rich MarkupRich Markup):
   

You are not logged in

Please log in, so followups can be emailed to you.

 

(Jump to the original submission Jump to the original submission)

Mon 15 Sep 2014 07:13:50 AM UTC, comment #6:

You are right, my patch wouldn't work.

Also, if we don't want to lose messages on recursion case. Maybe should the whole log should be a thread?

pepeto <pepeto>
Project Member
Mon 15 Sep 2014 12:03:45 AM UTC, comment #5:

I think the only way to fix this properly is by storing id of the thread that has the mutex locked and detecting the recursive call case based on that.

Marko Lindqvist <cazfi>
Project Administrator
Mon 15 Sep 2014 12:01:01 AM UTC, comment #4:

> I have been unable to reproduce, but maybe this patch would help.


How does it make a difference between recursive calls and calls from different threads? There's no point in having the mutex at all if calls from multiple threads is broken anyway. It's presumably also quite common case with threaded AI, so fix like this is probably worse than original bug - imagine the debugging hours spent on investigating something when some log messages just aren't there as expected.

Marko Lindqvist <cazfi>
Project Administrator
Sun 14 Sep 2014 07:46:07 PM UTC, comment #3:

This hang seems to be related to mutex recursion. I have been unable to reproduce, but maybe this patch would help.

(file #22207)

pepeto <pepeto>
Project Member
Tue 18 Feb 2014 10:34:28 PM UTC, comment #2:

See also bug #21683.

> Maybe the lock is being kept longer than necessary?
> Why the logfile would need to be locked when one is doing the
> network part?

The lock is in log_real() around the call to log_write(). log_write() does two things: writes to the file, and calls log_callback, which in this case is con_handle_log(), which sends stuff to the network, causing recursive hilarity to ensue.

It feels like some of the things in con_handle_log() don't want to be called by multiple threads simultaneously, though.

Jacob Nevins <jtn>
Project Administrator
Tue 29 Oct 2013 12:23:30 AM UTC, comment #1:

Maybe the lock is being kept longer than necessary? Why the logfile would need to be locked when one is doing the network part?

Marko Lindqvist <cazfi>
Project Administrator
Sat 31 Aug 2013 11:48:37 AM UTC, original submission:

Server hung consuming no CPU when I was testing an error case, but only when I had both "-d 3" and "-l foo.log" on the server command line.

I think the problem is that when log_error() is called on the server, it provokes a chat message to send to clients in the depths of log_real(); with "-d 3" the resulting network traffic is itself logged, leading to a deadlock waiting for a lock on the logfile. (I think.)

This is similar to bug #20721 but the fix will be more involved.

Jacob Nevins <jtn>
Project Administrator

 

(Note: upload size limit is set to 1024 kB, after insertion of the required escape characters.)

Attach File(s):
   
   
Comment:
   

Attached Files
file #22207:  log_mutex.patch added by pepeto (2kB - text/x-patch)

 

Depends on the following items: None found

Items that depend on this one: None found

 

Carbon-Copy List
  • -unavailable- added by pepeto (Updated the item)
  • -unavailable- added by cazfi (Posted a comment)
  • -unavailable- added by jtn (Submitted the item)
  •  

    Do you think this task is very important?
    If so, you can click here to add your encouragement to it.
    This task has 0 encouragements so far.

    Only logged-in users can vote.

     

    Please enter the title of George Orwell's famous dystopian book (it's a date):

     

     

    Follow 3 latest changes.

    Date Changed By Updated Field Previous Value => Replaced By
    Mon 15 Sep 2014 07:13:50 AM UTCpepetoStatusReady For Test=>None
    Sun 14 Sep 2014 07:46:07 PM UTCpepetoAttached File-=>Added log_mutex.patch, #22207
      StatusNone=>Ready For Test
    Show feedback again

    Back to the top


    Powered by Savane 3.1-cleanup