iOS Random Crash #1

I’m desperately trying to track down a crash that is occurring in the wild but that I cannot reproduce. I don’t think it’s memory related (i.e. running out of memory); I think it’s a sequence issue. The users who report it all describe the same game process.

Exception Type:  EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note:  EXC_CORPSE_NOTIFY
Triggered by Thread:  0

Thread 0 name:
Thread 0 Crashed:
0   libsystem_kernel.dylib        	0x00000001c19ea414 __pthread_kill + 8
1   libsystem_pthread.dylib       	0x00000001df544b50 pthread_kill + 272 (pthread.c:1392)
2   libsystem_c.dylib             	0x000000019cec3b74 abort + 104 (abort.c:110)
3   libsystem_malloc.dylib        	0x00000001a38a149c malloc_vreport + 560 (malloc_printf.c:183)
4   libsystem_malloc.dylib        	0x00000001a38a1740 malloc_zone_error + 104 (malloc_printf.c:219)
5   libsystem_malloc.dylib        	0x00000001a38a0f9c free_list_checksum_botch + 40 (magazine_inline.h:194)
6   libsystem_malloc.dylib        	0x00000001a389c310 tiny_free_list_remove_ptr + 652 (magazine_tiny.c:0)
7   libsystem_malloc.dylib        	0x00000001a389d208 tiny_free_no_lock + 1000 (magazine_tiny.c:1268)
8   libsystem_malloc.dylib        	0x00000001a389e5e8 free_tiny + 428 (magazine_tiny.c:2449)
9   midnight                      	0x00000001002917b0 0x1001f8000 + 628656
10  midnight                      	0x00000001002a552c 0x1001f8000 + 709932
11  midnight                      	0x00000001002918d4 0x1001f8000 + 628948
12  midnight                      	0x00000001002029c4 LandscapeNode::~LandscapeNode() + 4 (LandscapeNode.h:18)
13  midnight                      	0x00000001002029c4 ILandscape::~ILandscape() + 4 (ILandscape.h:60)
14  midnight                      	0x00000001002029c4 ILandscape::~ILandscape() + 4 (ILandscape.h:60)
15  midnight                      	0x00000001002029c4 ILandscape::~ILandscape() + 12 (ILandscape.h:60)
16  midnight                      	0x00000001002918d4 0x1001f8000 + 628948
17  midnight                      	0x00000001002029c4 LandscapeNode::~LandscapeNode() + 4 (LandscapeNode.h:18)
18  midnight                      	0x00000001002029c4 ILandscape::~ILandscape() + 4 (ILandscape.h:60)
19  midnight                      	0x00000001002029c4 ILandscape::~ILandscape() + 4 (ILandscape.h:60)
20  midnight                      	0x00000001002029c4 ILandscape::~ILandscape() + 12 (ILandscape.h:60)
21  midnight                      	0x00000001002bb14c 0x1001f8000 + 799052
22  QuartzCore                    	0x0000000196d676fc CA::Display::DisplayLink::dispatch_items(unsigned long long, unsigned long long, unsigned long long) + 664 (CADisplay.mm:2635)
23  QuartzCore                    	0x0000000196e40a80 display_timer_callback(__CFMachPort*, void*, long, void*) + 280 (CADisplayTimer.cpp:166)
24  CoreFoundation                	0x00000001939ecdd0 __CFMachPortPerform + 176 (CFMachPort.c:537)
25  CoreFoundation                	0x0000000193a11fe8 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 60 (CFRunLoop.c:1991)
26  CoreFoundation                	0x0000000193a11378 __CFRunLoopDoSource1 + 596 (CFRunLoop.c:2131)
27  CoreFoundation                	0x0000000193a0b08c __CFRunLoopRun + 2360 (CFRunLoop.c:3146)
28  CoreFoundation                	0x0000000193a0a21c CFRunLoopRunSpecific + 600 (CFRunLoop.c:3242)
29  GraphicsServices              	0x00000001ab5d6784 GSEventRunModal + 164 (GSEvent.c:2259)
30  UIKitCore                     	0x000000019644aee8 -[UIApplication _run] + 1072 (UIApplication.m:3253)
31  UIKitCore                     	0x000000019645075c UIApplicationMain + 168 (UIApplication.m:4707)
32  midnight                      	0x0000000100201a8c main + 56 (main.m:5)
33  libdyld.dylib                 	0x00000001936ca6b0 start + 4

Thread 0 crashed with ARM Thread State (64-bit):
    x0: 0x0000000000000000   x1: 0x0000000000000000   x2: 0x0000000000000000   x3: 0x0000000000000000
    x4: 0x0000000000000000   x5: 0x0000000000000000   x6: 0x0000000000000001   x7: 0x0000000000000403
    x8: 0x00000000000005b9   x9: 0xfd763ba5b6cd452e  x10: 0xcccccccccccccccd  x11: 0x000000000000000a
   x12: 0x0000000000000000  x13: 0x0000000000000032  x14: 0x00000000a820826b  x15: 0x00000000000041aa
   x16: 0x0000000000000148  x17: 0x00000001006ff8c0  x18: 0x0000000000000000  x19: 0x0000000000000006
   x20: 0x0000000000000407  x21: 0x00000001006ff9a0  x22: 0x000000016fc056c0  x23: 0x0000000100650000
   x24: 0x0000000000000000  x25: 0x0000000000000000  x26: 0x000000016fc07c7d  x27: 0x00000001006ff8c0
   x28: 0x0000000000000011   fp: 0x000000016fc055d0   lr: 0x00000001df544b50
    sp: 0x000000016fc055b0   pc: 0x00000001c19ea414 cpsr: 0x40000000
   esr: 0x56000080  Address size fault

Those addresses that for some reason don’t resolve correctly - resolve like this in Xcode.

[Image: Screenshot 2021-02-02 at 22.08.53]

Does anyone have any tips that might point me in the right direction?

So I wonder if I’m doing something wrong with actions and the render loop…

Here is an example…

As a character moves forward, the scene is rendered x number of steps. This action would have been triggered as a result of a UI interaction, e.g. touching a ‘move forward gadget’.

For each step a LandscapeView will be generated and the previous one destroyed.

        auto actionfloat = ActionFloat::create(TRANSITION_DURATION, 0, target, [=](float value) {
            
            options.here.x = options.moveFrom.x*(LANDSCAPE_DIR_STEPS - value) + options.moveTo.x*value;
            options.here.y = options.moveFrom.y*(LANDSCAPE_DIR_STEPS - value) + options.moveTo.y*value;
            
            if ( value >= target ) {
                stopMoving();
                return;
            }
        
            f32 result = value / target ;
            options.movementAmount = result;
            
            options.generator->Build(&options);
            UpdateLandscape();
            
            
        });
        
        this->runAction(EaseSineInOut::create(actionfloat));

Cut down version of UpdateLandscape

void panel_look::UpdateLandscape()
{
    if ( current_view ) {
        removeChild(current_view);
    }
    
    current_view = LandscapeView::create(&options);

    current_view->setAnchorPoint(Vec2::ZERO);
    current_view->setPosition(Vec2::ZERO);
    current_view->setLocalZOrder(ZORDER_FAR);
    addChild(current_view);
}

Now I can’t see any reason why current_view, which is the only LandscapeView object, would get released by the AutoReleasePool (which I presume is the one in the main loop after DrawScene), because the only time a LandscapeView could be released is when it’s removed from the children of the Scene at the top of UpdateLandscape.

So that suggests I’m missing something with how Actions are running and how the MainLoop is running. If they were two different threads I would absolutely get it… but I don’t think that’s the case…

Try this to test if it’s a re-entry issue:

void panel_look::UpdateLandscape()
{
    static auto creatingView = false;
    
    if (creatingView) {
        return;
    }
    
    creatingView = true;

    if ( current_view ) {
        removeChild(current_view);
        current_view->release();
        current_view = nullptr;
    }
    
    auto* newView = LandscapeView::create(&options);

    newView->setAnchorPoint(Vec2::ZERO);
    newView->setPosition(Vec2::ZERO);
    newView->setLocalZOrder(ZORDER_FAR);
    addChild(newView);
    
    newView->retain(); // If you're keeping a reference to a cocos2d::Ref object, then you should always explicitly retain it
    
    current_view = newView; 
    creatingView = false;
}

I’m not certain if this is the cause of your issue, but you’re keeping a reference to a cocos2d::Ref object in current_view, and it’s just good practice to explicitly retain that object in this case, then release it when you’re no longer using it. If you don’t do this, you may end up with a dangling reference in current_view if the object it is pointing to has been removed from the parent and auto-released elsewhere.
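To make the dangling-reference scenario concrete, here is a minimal sketch that does not use cocos2d-x at all. MockRef, MockPool and MockParent are hypothetical stand-ins that only imitate the retain/release/autorelease bookkeeping (create() hands the object to a pool, addChild retains, removeChild releases); the real classes do much more, so treat this as an illustration of the counting, not of the engine's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for cocos2d::Ref: just a reference count.
class MockRef {
public:
    virtual ~MockRef() { ++destroyedCount(); }
    void retain() { ++refCount_; }
    void release() {
        assert(refCount_ > 0);
        if (--refCount_ == 0) delete this;
    }
    // Counts destructions so the demo can tell whether an object died.
    static int& destroyedCount() { static int n = 0; return n; }
private:
    int refCount_ = 1; // a freshly created object starts at 1
};

// Hypothetical stand-in for the autorelease pool drained once per frame.
class MockPool {
public:
    MockRef* create() {
        MockRef* r = new MockRef();
        pending_.push_back(r); // the pool owns the initial reference
        return r;
    }
    void drain() {
        for (MockRef* r : pending_) r->release();
        pending_.clear();
    }
private:
    std::vector<MockRef*> pending_;
};

// Hypothetical stand-in for a parent node's child list.
class MockParent {
public:
    void addChild(MockRef* c) { c->retain(); children_.push_back(c); }
    void removeChild(MockRef* c) {
        auto it = std::find(children_.begin(), children_.end(), c);
        if (it != children_.end()) { children_.erase(it); c->release(); }
    }
private:
    std::vector<MockRef*> children_;
};

// Returns true if a cached pointer is still valid after the frame's pool
// drain and a later removeChild.
bool cachedPointerSurvives(bool holderRetains) {
    const int before = MockRef::destroyedCount();
    MockPool pool;
    MockParent parent;
    MockRef* view = pool.create();     // count 1 (pool)
    parent.addChild(view);             // count 2 (pool + parent)
    if (holderRetains) view->retain(); // count 3 (pool + parent + holder)
    pool.drain();                      // end of frame: pool lets go
    parent.removeChild(view);          // parent lets go
    const bool alive = (MockRef::destroyedCount() == before);
    if (alive) view->release();        // balance the holder's retain
    return alive;
}
```

Without the explicit retain the object is destroyed inside removeChild, so any pointer still held at that moment is dangling; with the retain it stays valid until the holder releases it.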

Ok. Thanks for that. I don’t think that is my issue, but I will check through my code and make sure I always retain. I’ve been working on the principle that the reference is just a quick reference while it’s part of the scene’s children, rather than me looking it up again from the child list.

The re-entrant thing is interesting, but not sure I can test as I can’t replicate the issue.

I now understand how I get releases in the auto release pool. It’s effectively because I’ve made two calls to UpdateLandscape during the same event. It’s a by-product of not having an isDirty type check in UpdateLandscape, i.e. nothing has really changed, but two triggers to UpdateLandscape have happened because of two different conditions during the same processing loop. None of them are re-entrant though.
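For what it's worth, the isDirty idea could look roughly like the sketch below. LandscapeRefresher and its members are made-up names for illustration; the callback passed to the constructor is where the real UpdateLandscape() call would go, and flush() is assumed to be called once per frame (for example from a scheduled update).

```cpp
#include <functional>
#include <utility>

// Hypothetical helper: coalesces any number of update requests made during
// one frame into a single rebuild.
class LandscapeRefresher {
public:
    explicit LandscapeRefresher(std::function<void()> rebuild)
        : rebuild_(std::move(rebuild)) {}

    // Call from anywhere that changes what the landscape should show.
    void requestUpdate() { dirty_ = true; }

    // Call once per frame. Returns true if a rebuild actually ran.
    bool flush() {
        if (!dirty_) return false;
        dirty_ = false;            // clear first, so a request made during
        if (rebuild_) rebuild_();  // the rebuild is kept for the next frame
        return true;
    }

private:
    std::function<void()> rebuild_;
    bool dirty_ = false;
};
```

With this shape, two triggers in the same processing loop only cost one view rebuild, and the create/remove churn in the autorelease pool goes away.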

BTW: Anyone know why Xcode can’t resolve the cocos2d functions properly… I built everything with debug symbols on.

Yeah that’s exactly how I handle it in my code too, especially for parts that would require too many look-ups of the specific object, so it’s much quicker to hold a reference to it, as long as you ensure the reference is always valid (via the retain/release).

Have you considered adding some kind of logging to your application? It would enable you to better track down issues on end-user devices, assuming you add the required checks and log them. You can have a button in the app that the user can click to upload the logs, or you can just use some kind of service that you send the logs to while your app is running (best to batch them, not send one by one).
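A rough sketch of the batching idea, in plain C++ with no cocos2d-x or networking code. BatchedLog and its Sink callback are hypothetical names; the sink is where you would write to a file or post to whatever service you choose.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical logger: collects lines and hands them to a sink in batches
// instead of one by one.
class BatchedLog {
public:
    using Sink = std::function<void(const std::vector<std::string>&)>;

    BatchedLog(std::size_t batchSize, Sink sink)
        : batchSize_(batchSize), sink_(std::move(sink)) {}

    // Queue one line; send the batch once it reaches the configured size.
    void log(std::string line) {
        pending_.push_back(std::move(line));
        if (pending_.size() >= batchSize_) flush();
    }

    // Send whatever is queued, e.g. when the app goes to the background.
    void flush() {
        if (pending_.empty()) return;
        if (sink_) sink_(pending_);
        pending_.clear();
    }

private:
    std::size_t batchSize_;
    Sink sink_;
    std::vector<std::string> pending_;
};
```

Note that lines still sitting in the queue are lost if the app aborts before a flush, so for a crash like this one you would probably want a small batch size or a sink that writes to disk.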

Yes, I’m thinking of the logging, but it needs to be in cocos too… this crash is when an object is released, and the two calls that are of interest are tiny_free_list_remove_ptr and free_list_checksum_botch, so it looks like something is being released twice or something… So I think it’s going to take a lot of logging.

BTW: Is it possible to attach the cocos console running on a Mac/Windows to the app running on an iPhone? I do have a friendly tester who can reproduce the issue with my TestFlight builds…

So I think this is the hint that I need…

free_list_checksum_botch(rack_t *rack, void *ptr)
{
    szone_error(rack->debug_flags, 1,
                "incorrect checksum for freed object "
                "- object was probably modified after being freed.",
                ptr, NULL);
}

I think the device needs to be registered to your developer account, but I’m not 100% certain about this.

That would point to a dangling reference somewhere in your code, or an extra release() somewhere when the object is already flagged as autorelease.

Looking at your code sample, you should retain actionfloat immediately after creating it, because it could be autoreleased before runAction.