gdb debugging tips
Displaying Julia variables
Within gdb, any jl_value_t* object obj can be displayed using
(gdb) call jl_(obj)
The object will be displayed in the julia session, not in the gdb session. This is a useful way to discover the types and values of objects being manipulated by Julia’s C code.
Similarly, if you’re debugging some of julia’s internals (e.g., inference.jl), you can print obj using
ccall(:jl_, Void, (Any,), obj)
This is a good way to circumvent problems that arise from the order in which julia’s output streams are initialized.
Julia’s flisp interpreter uses value_t* objects; these can be displayed with call fl_print(ios_stdout, obj).
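If jl_ is called often, it can be wrapped in a user-defined gdb command. The following is only a sketch for a ~/.gdbinit or a command file; the command name pj is a hypothetical choice, not something provided by Julia.

```gdb
# Hypothetical shortcut: "pj obj" expands to "call jl_(obj)".
define pj
  call jl_($arg0)
end
document pj
Display a jl_value_t* with jl_ (output appears in the julia session).
end
```

With this loaded, typing pj args[0] at the (gdb) prompt should behave like call jl_(args[0]).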
Inserting breakpoints for inspection from gdb
In your gdb session, set a breakpoint in jl_breakpoint like so:
(gdb) break jl_breakpoint
Then within your Julia code, insert a call to jl_breakpoint by adding
ccall(:jl_breakpoint, Void, ())
or alternatively
ccall(:jl_breakpoint, Void, (Any,), obj)
if you want to inspect obj from within jl_breakpoint.
It’s particularly helpful to back up to the jl_apply frame, from which you can display the arguments to a function using, e.g.,
(gdb) call jl_(args[0])
Another useful frame is to_function(jl_lambda_info_t *li, bool cstyle). The jl_lambda_info_t* argument is a struct with a reference to the final AST sent into the compiler. However, the AST at this point will usually be compressed; to view the AST, call jl_uncompress_ast and then pass the result to jl_:
#2 0x00007ffff7928bf7 in to_function (li=0x2812060, cstyle=false) at codegen.cpp:584
584 abort();
(gdb) p jl_(jl_uncompress_ast(li,li.ast))
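To avoid retyping the same first step at every stop, gdb's commands feature can attach actions to the breakpoint. This is a minimal sketch; the choice of showing five frames is arbitrary.

```gdb
break jl_breakpoint
commands
  # Print the innermost frames so the number of the jl_apply frame is visible.
  bt 5
end
```

After the stop, select the jl_apply frame with frame N (using the number shown in the backtrace) and display its arguments with call jl_(args[0]) as described above.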
Inserting breakpoints upon certain conditions
Loading a particular file
Let’s say the file is sysimg.jl:
(gdb) break jl_load if strcmp(fname, "sysimg.jl")==0
Calling a particular method
(gdb) break jl_apply_generic if strcmp(F->name->name, "method_to_break")==0
Because jl_apply_generic is used for every call, gdb must evaluate the condition on each one, which makes everything roughly 1000x slower.
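One way to limit the slowdown is to keep the conditional breakpoint disabled until execution is close to the region of interest. The sketch below assumes that the Julia code calls ccall(:jl_breakpoint, Void, ()) shortly before the call being investigated; the convenience variable name $slow is a hypothetical choice.

```gdb
# Create the expensive conditional breakpoint, remember its number, and disable it.
break jl_apply_generic if strcmp(F->name->name, "method_to_break")==0
set $slow = $bpnum
disable $slow

# When the cheap marker breakpoint is hit, arm the expensive one and keep going.
break jl_breakpoint
commands
  enable $slow
  continue
end
```

Everything before the jl_breakpoint call then runs at normal speed, and only the remainder pays the cost of the condition check.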
Debugging during julia’s build process (bootstrap)
Errors that occur during make need special handling. Julia is built in two stages, constructing sys0 and sys.ji. To see what commands are running at the time of failure, use make VERBOSE=1.
At the time of this writing, you can debug build errors during the sys0 phase from the base directory using:
julia/base$ gdb --args ../usr/bin/julia-debug -C native --build ../usr/lib/julia/sys0 sysimg.jl
You might need to delete all the files in usr/lib/julia/ to get this to work.
You can debug the sys.ji phase using:
julia/base$ gdb --args ../usr/bin/julia-debug -C native --build ../usr/lib/julia/sys -J ../usr/lib/julia/sys0.ji sysimg.jl
By default, any errors will cause Julia to exit, even under gdb. To catch an error “in the act”, set a breakpoint in jl_error (there are several other useful spots, for specific kinds of failures, including: jl_too_few_args, jl_too_many_args, and jl_throw).
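These breakpoints can be collected in a gdb command file so they are set before the build starts. This is a sketch only; the file name used in the usage note is hypothetical.

```gdb
# Stop at the error entry points named above, then start the program.
break jl_error
break jl_too_few_args
break jl_too_many_args
break jl_throw
run
```

Saved as, say, errors.gdb, the file can be passed with gdb's -x option, for example by adding -x errors.gdb before --args in the commands shown earlier. Note that jl_error ends up in jl_throw (as the backtrace in the example that follows shows), so a single error may stop twice.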
Once an error is caught, a useful technique is to walk up the stack and examine the function by inspecting the related call to jl_apply. To take a real-world example:
Breakpoint 1, jl_throw (e=0x7ffdf42de400) at task.c:802
802 {
(gdb) p jl_(e)
ErrorException("auto_unbox: unable to determine argument type")
$2 = void
(gdb) bt 10
#0 jl_throw (e=0x7ffdf42de400) at task.c:802
#1 0x00007ffff65412fe in jl_error (str=0x7ffde56be000 <_j_str267> "auto_unbox: unable to determine argument type") at builtins.c:39
#2 0x00007ffde56bd01a in julia_convert_16886 ()
#3 0x00007ffff6541154 in jl_apply (f=0x7ffdf367f630, args=0x7fffffffc2b0, nargs=2) at julia.h:1281
...
The most recent jl_apply is at frame #3, so we can go back there and look at the AST for the function julia_convert_16886. This is the uniqued name for some method of convert. f in this frame is a jl_function_t*, so we can look at the type signature, if any, from the specTypes field:
(gdb) f 3
#3 0x00007ffff6541154 in jl_apply (f=0x7ffdf367f630, args=0x7fffffffc2b0, nargs=2) at julia.h:1281
1281 return f->fptr((jl_value_t*)f, args, nargs);
(gdb) p f->linfo->specTypes
$4 = (jl_tupletype_t *) 0x7ffdf39b1030
(gdb) p jl_( f->linfo->specTypes )
Tuple{Type{Float32}, Float64} # <-- type signature for julia_convert_16886
Then, we can look at the AST for this function:
(gdb) p jl_( jl_uncompress_ast(f->linfo, f->linfo->ast) )
Expr(:lambda, Array{Any, 1}[:#s29, :x], Array{Any, 1}[Array{Any, 1}[], Array{Any, 1}[Array{Any, 1}[:#s29, :Any, 0], Array{Any, 1}[:x, :Any, 0]], Array{Any, 1}[], 0], Expr(:body,
Expr(:line, 90, :float.jl)::Any,
Expr(:return, Expr(:call, :box, :Float32, Expr(:call, :fptrunc, :Float32, :x)::Any)::Any)::Any)::Any)::Any
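The two inspection steps above can be bundled into one user-defined command. This is a sketch assuming the argument is a jl_function_t* with a non-null linfo, as in the jl_apply frame shown here; the command name jlshow is hypothetical.

```gdb
# Hypothetical helper: print the type signature and the uncompressed AST of a function.
define jlshow
  call jl_($arg0->linfo->specTypes)
  call jl_(jl_uncompress_ast($arg0->linfo, $arg0->linfo->ast))
end
```

In frame #3 of the example, jlshow f would then print both the signature and the AST in the julia session.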
Finally, and perhaps most usefully, we can force the function to be recompiled in order to step through the codegen process. To do this, clear the cached functionObject from the jl_lambda_info_t*:
(gdb) p f->linfo->functionObject
$8 = (void *) 0x1289d070
(gdb) set f->linfo->functionObject = NULL
Then, set a breakpoint somewhere useful (e.g. emit_function, emit_expr, emit_call, etc.), and run codegen:
(gdb) p jl_compile(f)
... # your breakpoint here
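The clear-and-recompile sequence can likewise be wrapped in a user-defined command. This is a sketch under the same assumption that the argument is a jl_function_t*; the command name jlrecompile is hypothetical, and set var with 0 is used in place of NULL so the command does not depend on macro debug information.

```gdb
# Hypothetical helper: drop the cached compiled code, then rerun codegen.
define jlrecompile
  set var $arg0->linfo->functionObject = 0
  call jl_compile($arg0)
end
```

Set the codegen breakpoint first (for example break emit_function), then run jlrecompile f from the jl_apply frame.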
Mozilla’s Record and Replay Framework (rr)
Julia now works out of the box with rr, the lightweight recording and deterministic debugging framework from Mozilla. This allows you to replay the trace of an execution deterministically: the replayed execution’s address spaces, register contents, syscall data, and so on are exactly the same in every run.
A recent version of rr (3.1.0 or higher) is required.