Bug 97 - Experiencing intermittent crash in rt_init() when loading DLL
Status: RESOLVED UPSTREAM
Alias: None
Product: GDC
Classification: Unclassified
Component: libgdruntime
Version: development
Hardware: x86 Other
Importance: --- critical
Assignee: Iain Buclaw
URL:
Depends on:
Blocks:
 
Reported: 2014-02-01 10:23 CET by Mike
Modified: 2023-01-07 21:08 CET

See Also:


Attachments
rt_init-crash.png (185.06 KB, image/png)
2014-02-01 10:23 CET, Mike

Description Mike 2014-02-01 10:23:24 CET
Created attachment 57
rt_init-crash.png

This was migrated from https://bitbucket.org/goshawk/gdc/issue/351/experiencing-intermittent-crash-in-rt_init

Manu Evans created an issue 2012-06-18
****************************************
I have a crash that only happens occasionally when loading a GDC-64 DLL. The same DLL may work or not work depending on the direction of the wind, though it crashes far less often than it loads successfully.

The callstack, and various other details are visible in the image attached...

It appears to crash fetching __blkcache_storage, a TLS variable. The code that loads it looks odd to me. A few points of interest:

* How can the final mov refer to rbx when only ebx was loaded? Who's to say the top bits will be zero?
* The magic address doesn't appear to be a valid offset to me...
* rsi is a good pointer, but it points to a bunch of string data, including source code snippets. Not what I expected... moduleinfo of some sort? debuginfo?
* The same pattern of loading ebx and using rbx is repeated above with eax->rax, except the wild absolute magic number is dereferenced this time... (how does that even work?) 

I don't follow the code GDC is generating here :/ .. Does it look okay to anyone else?

This is affecting our whole team daily... any input or ideas what might be going on would be much appreciated!

See attachment <rt_init-crash.png>

Manu Evans - 2012-06-18
****************************************
edited description

    
Iain Buclaw - 2012-06-19
****************************************
* changed status to open
* assigned issue to Daniel Green 

I'm not sure rt.lifetime is well suited for shared libraries on windows yet.

Daniel, could you look into this?


Manu Evans - 2012-06-19
****************************************
I can probably supply a binary... but I think the precise context when loading the dll is critical, because it usually loads fine without problems, so a binary may not be of any use.

  
Daniel Green - 2012-06-20
****************************************
Looking at that, it's definitely crashing in loading a TLS variable.

* The use of EBX/RBX is acceptable as 32-bit operations are implicitly zero extended to 64-bit.
* The issue looks to be with the value being loaded: 0xACEE47F8 (roughly 2.9 billion).
* RSI should be ok, as it doesn't require TLS relative location to function. 

This value loaded into EBX should be relative to the TLS section which means it should be significantly smaller.
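The access sequence discussed here can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the slot addresses, the block size, the small "good" offset, and the helper names `mov_r32` and `tls_address` are made up for illustration. Only the 8-byte slot stride and the value 0xACEE47F8 come from this thread.

```python
MASK32 = 0xFFFFFFFF

def mov_r32(value):
    """Model `mov ebx, imm32`: a 32-bit register write zero-extends into rbx."""
    return value & MASK32

def tls_address(tls_slots, tls_index, secrel_offset):
    """Model: rsi = gs:[0x58]; rsi = [rsi + index*8]; address = rsi + rbx."""
    block_base = tls_slots[tls_index]  # this module's TLS block for the current thread
    rbx = mov_r32(secrel_offset)       # upper 32 bits are guaranteed to be zero
    return block_base + rbx

# Hypothetical values: two TLS slots and a 0x200-byte TLS block for the module.
tls_slots = [0x250000, 0x260000]
TLS_BLOCK_SIZE = 0x200

good = tls_address(tls_slots, 1, 0x10)        # a small, section-relative offset
bad = tls_address(tls_slots, 1, 0xACEE47F8)   # the value seen in the crashing dump

print(hex(good - tls_slots[1]))               # 0x10
print(bad - tls_slots[1] < TLS_BLOCK_SIZE)    # False: far outside the TLS block
```

With a section-relative offset the access stays inside the thread's TLS block; with 0xACEE47F8 it lands almost 3 GB past it, whatever the block's base address happens to be.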

Did you custom build this? The value loaded into EBX is determined at link time; without a TLS-aware assembler/linker you wouldn't get the relative offset.

Can you output a map file for the DLL with -Wl,-Map=output.map, along with the output of the objdump command below run on lifetime.o? To extract lifetime.o, run:

ar x libgphobos.a lifetime.o

Then compare it with the listing below. Line 2ea is where the magic happens that generates the relative offset.

I'll work on getting the assembly output from a Dll dump as well to ensure it's linking properly.  

$ /c/MinGW64/bin/objdump.exe -d -r -M intel lifetime.o 
00000000000002d0 <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo>:
     2d0:	56                   	push   rsi
     2d1:	53                   	push   rbx
     2d2:	48 83 ec 28          	sub    rsp,0x28
     2d6:	8b 04 25 00 00 00 00 	mov    eax,DWORD PTR ds:0x0
			2d9: R_X86_64_32S	_tls_index
     2dd:	65 48 8b 34 25 58 00 	mov    rsi,QWORD PTR gs:0x58
     2e4:	00 00 
     2e6:	48 8b 34 c6          	mov    rsi,QWORD PTR [rsi+rax*8]
     2ea:	bb 08 00 00 00       	mov    ebx,0x8
			2eb: secrel32	.tls$GCC
     2ef:	48 8b 04 1e          	mov    rax,QWORD PTR [rsi+rbx*1]
     2f3:	48 85 c0             	test   rax,rax
     2f6:	74 08                	je     300 <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo+0x30>
     2f8:	48 83 c4 28          	add    rsp,0x28
     2fc:	5b                   	pop    rbx
     2fd:	5e                   	pop    rsi
     2fe:	c3                   	ret    
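To make the role of line 2ea concrete, the instruction bytes can be decoded by hand. The following Python sketch assumes only the bytes and addresses shown in the listing above; the function name `decode_mov_ebx_imm32` is hypothetical. It shows that the secrel32 relocation reported at 2eb covers exactly the 32-bit immediate of the `mov ebx` instruction, which is the field the linker has to patch.

```python
import struct

INSN_ADDR = 0x2EA                    # address of the instruction in lifetime.o
INSN_BYTES = bytes.fromhex("bb08000000")
RELOC_ADDR = 0x2EB                   # where objdump -r reports the secrel32 relocation

def decode_mov_ebx_imm32(code):
    """Decode `mov ebx, imm32`: opcode 0xBB followed by a little-endian imm32."""
    if len(code) != 5 or code[0] != 0xBB:
        raise ValueError("not a mov ebx, imm32 encoding")
    (imm,) = struct.unpack_from("<I", code, 1)
    return imm

imm_field_addr = INSN_ADDR + 1       # the immediate starts right after the opcode byte
value_in_object = decode_mov_ebx_imm32(INSN_BYTES)

print(hex(value_in_object))          # 0x8
print(RELOC_ADDR == imm_field_addr)  # True
```

The 0x8 stored in the object file is only the pre-link value of that field. A linker that does not handle the relocation as section-relative would be expected to leave or write something other than a small offset there.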


Daniel Green - 2012-06-24
****************************************
Here's the Map information from a DLL that was built.

.tls            0x000000006fa62000      0x200
                0x000000006fa62010                _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo

Subtracting the .tls base from _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo gives a secrel32 offset of 0x10.

This is the same value as shown in the assembly dump from the Dll and is in contrast with the value 0xACEE47F8 as shown in your assembly dump.

000000006fa0972b <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo>:
        int __nextRndNum = 0;
    }
    int __nextBlkIdx;
}

@property BlkInfo *__blkcache()
    6fa0972b:	55                   	push   rbp
    6fa0972c:	48 89 e5             	mov    rbp,rsp
    6fa0972f:	48 83 ec 30          	sub    rsp,0x30
{
    if(!__blkcache_storage)
    6fa09733:	8b 04 25 4c c4 a5 6f 	mov    eax,DWORD PTR ds:0x6fa5c44c
    6fa0973a:	65 48 8b 14 25 58 00 	mov    rdx,QWORD PTR gs:0x58
    6fa09741:	00 00 
    6fa09743:	48 8b 14 c2          	mov    rdx,QWORD PTR [rdx+rax*8]
    6fa09747:	b8 10 00 00 00       	mov    eax,0x10
    6fa0974c:	48 8b 04 02          	mov    rax,QWORD PTR [rdx+rax*1]
    6fa09750:	48 85 c0             	test   rax,rax
    6fa09753:	0f 95 c0             	setne  al
    6fa09756:	83 f0 01             	xor    eax,0x1
    6fa09759:	84 c0                	test   al,al
    6fa0975b:	74 5f                	je     6fa097bc <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo+0x91>
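The map-file subtraction in this comment can be checked mechanically. The sketch below parses only the two map lines quoted above; `parse_map` is a hypothetical helper written for this excerpt, not a general map-file parser.

```python
MAP_EXCERPT = """\
.tls            0x000000006fa62000      0x200
                0x000000006fa62010                _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo
"""

def parse_map(text):
    """Return (section_base, section_size, {symbol: address}) for a one-section excerpt."""
    base = size = None
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0].startswith("."):
            # Section line: name, base address, size
            base, size = int(fields[1], 16), int(fields[2], 16)
        elif len(fields) == 2 and fields[0].startswith("0x"):
            # Symbol line: address, mangled name
            symbols[fields[1]] = int(fields[0], 16)
    return base, size, symbols

base, size, symbols = parse_map(MAP_EXCERPT)
symbol = "_D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo"
offset = symbols[symbol] - base

print(hex(offset))             # 0x10
print(0 <= offset < size)      # True: inside the 0x200-byte .tls section
print(0 <= 0xACEE47F8 < size)  # False: the crashing value cannot be a .tls offset
```

Running the same check against the map file of the failing DLL would show whether its linker produced a section-relative value at all.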

Manu Evans - 2012-06-24
****************************************
The objdump gave us the same thing you pasted.

So you think it's just a bad toolchain? Any chance of a new 2.059 toolchain with that patch applied? Will that fix the problem?

It's very strange that this only occurs occasionally. You'd think this would cause the DLL to fail to load every time, but we only see it fail occasionally. Other times it loads just fine.

Does LoadLibrary actually patch the offsets in the loaded binary with the absolute addresses as it loads or something?


Daniel Green - 2012-06-24
****************************************
Can you generate a Map file for the library? I'd like to compare the offsets with what's in your assembly code.

If the objdump produced the secrel32 output, then it's probably something else. With the data I had, that was the most likely scenario. The TLS patch fixes a bug in the linker as well as giving access to the secrel32 relocation in assembly. If for some reason the compile/link phase was using a binutils (gas or ld) not included with GDC, this type of issue would occur. It's still possible a different linker is being used. ld bug

In order to figure out what else it could be, it's necessary to see the map file and raw assembly for your dll.

Compile or link with -Wl,-Map=output.map.

objdump.exe -S -M intel mydll.dll > mydll.asm

This generates Intel-formatted assembly for the DLL.

Random failures when accessing invalid memory are not as strange as you might think. That's actually the first clue: you're accessing an invalid memory location.
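One plausible way such a failure becomes intermittent, sketched in Python: whether a wild address faults depends on what happens to be mapped there in a given run. Everything in this model is an assumption for illustration (the mapped regions, the TLS block bases, and the helper names); only the offset 0xACEE47F8 comes from the crashing dump, and the thread does not establish that this is the actual cause.

```python
BAD_OFFSET = 0xACEE47F8  # value seen in the crashing dump

def is_mapped(address, regions):
    """True if address falls inside any (start, size) region."""
    return any(start <= address < start + size for start, size in regions)

def access_outcome(tls_block_base, regions):
    """Classify a read at tls_block_base + BAD_OFFSET for a given memory layout."""
    target = tls_block_base + BAD_OFFSET
    return "no fault" if is_mapped(target, regions) else "access violation"

# Hypothetical process layout: a small mapping and a larger one higher up.
regions = [(0x250000, 0x1000), (0xAD200000, 0x400000)]

# Hypothetical TLS block bases from two different runs.
for base in (0x250000, 0x400000):
    print(hex(base), access_outcome(base, regions))
# 0x250000 access violation
# 0x400000 no fault
```

In the "no fault" case the read returns unrelated data rather than crashing, which would match a DLL that usually appears to load fine.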

I'll look into the runtime behavior of LoadLibrary after I've checked the map and assembly output of the Dll.