Python got me with circular imports!
I have been programming in Python for the last seven years. Apart from the new features added in each release, I didn’t think Python could still surprise me, until it did.
I will demonstrate what happened with trimmed-down code. I have two files, main.py and notmain.py. main.py has a global variable called var. notmain.py has some code that processes user input and adds an entry to main.var. Pretty simple, right?
# main.py
import notmain

var = {}

def work():
    print("Before:", var)
    notmain.populate()
    print("After: ", var)

if __name__ == '__main__':
    work()
# notmain.py
import main

def populate():
    main.var['answer'] = 42
What do you think will happen when I run python main.py? I don’t know about you, but I expected that after the call to notmain.populate, main.var would contain one key-value pair.
Shell> python main.py
Before: {}
After: {}
But boy, was I wrong. To explain what just happened, I first need to explain how Python imports work.
How does Python import work? (A simple version)
In what follows, I only explain what happens when you use import x. The from x import y form is not covered here.
When you import a module, two steps happen.
- Search (done by finder)
- Load (done by loader)
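If you want to watch the two steps separately, here is a minimal sketch using importlib. I picked the standard-library module colorsys purely as an example. Keep in mind that doing the steps by hand like this bypasses the sys.modules cache, which a normal import statement consults first.

```python
import importlib.util

# Step 1, search: ask the finders for a spec describing where the module lives.
spec = importlib.util.find_spec("colorsys")

# Create an empty module object from the spec; the module's code has not run yet.
module = importlib.util.module_from_spec(spec)
defined_before_load = hasattr(module, "rgb_to_hsv")

# Step 2, load: the loader executes the module's code inside that object.
spec.loader.exec_module(module)
defined_after_load = hasattr(module, "rgb_to_hsv")

print("found by search:", spec.name)
print("defined before load:", defined_before_load)
print("defined after load:", defined_after_load)
```

The function only shows up on the module object after the loader has run, which is the point of the second step.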
In the search step, sys.modules is the first place checked. If the module is not in sys.modules, Python searches for it in other places, the directories on sys.path (which normally start with the directory of the script being run) being among them. Once a module that was not in sys.modules is found, it is added to sys.modules before step 2 is executed. Thus if a imports b, and b imports c, and c imports b, b will not be executed again.
If the module was not found in sys.modules, the loading step executes the module's code, and the names it defines become available to the importing module as attributes of the module object.
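Both claims can be checked with a throwaway module. The sketch below is only an illustration: the module name demo_cached_module and the helper run_demo are made up for this post, and the module file is written to a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
MODULE_NAME = "demo_cached_module"

MODULE_SOURCE = (
    "import sys\n"
    "# True if this module was registered before its body started running.\n"
    "registered_before_exec = __name__ in sys.modules\n"
    "# A fresh object is created every time this body runs.\n"
    "token = object()\n"
)


def run_demo():
    """Import the throwaway module twice and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, MODULE_NAME + ".py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            first = importlib.import_module(MODULE_NAME)
            token_after_first_import = first.token
            second = importlib.import_module(MODULE_NAME)
            return {
                "registered_before_exec": first.registered_before_exec,
                "same_object": first is second,
                # If the body had run again, token would be a new object.
                "executed_once": second.token is token_after_first_import,
            }
        finally:
            # Undo the changes made to interpreter state.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)


if __name__ == "__main__":
    print(run_demo())
```

If the description above is right, all three values come back as True: the module sees itself in sys.modules while its body is still running, and the second import hands back the cached object without running the body again.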
One thing you should note is that everything in Python is an object. Even an imported module is one. Try this:
import math, types
assert isinstance(math, types.ModuleType)
Objects are stored somewhere in memory, and in CPython id() returns that address. Try this:
print(hex(id(math)))
# printed '0x7f7e21029c20' on my machine
So everything defined in the module is available as an attribute of the module object.
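A small sketch of what that means in practice, again using math. The variable names are mine; the calls are plain built-ins.

```python
import math
import sys

# The name `math` is bound to the very object cached in sys.modules.
same_object = sys.modules["math"] is math

# A name defined by the module can be read as an attribute...
pi_as_attribute = math.pi
# ...or looked up in the module object's namespace dictionary.
pi_from_namespace = vars(math)["pi"]

# dir() lists the attributes; the public ones are the module's visible names.
public_names = [name for name in dir(math) if not name.startswith("_")]

print(same_object, pi_as_attribute == pi_from_namespace, "sqrt" in public_names)
```

This is why two files that import the same module see the same data: they both hold a reference to one module object.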
Now that we know how import works, let’s try to see what happened with my code.
What happened with my code?
Let’s augment the files to print the location of the var object. That should give us some ideas.
# main.py
import notmain

var = {}
print("In Main: ", hex(id(var)))

def work():
    print("Before:", var, 'at', hex(id(var)))
    notmain.populate()
    print("After: ", var, 'at', hex(id(var)))

if __name__ == '__main__':
    work()
# notmain.py
import main

def populate():
    main.var['answer'] = 42
    print('In not main: ', main.var, 'at', hex(id(main.var)))
Shell> python main.py
In Main: 0x7f64e4778cc0
In Main: 0x7f64e490c6c0
Before: {} at 0x7f64e490c6c0
In not main: {'answer': 42} at 0x7f64e4778cc0
After: {} at 0x7f64e490c6c0
First of all, notice that the In Main: ... line is printed twice. In the previous section I said that if a module is found in sys.modules, it will not be executed again. The catch is that main.py is imported only once, but executed twice: once as the script being run and once as the imported module.
Here is what happens.
- python main.py starts executing.
- The first line is import notmain. Since notmain is not in sys.modules yet, it will be executed next. Notice that main was never imported: the script itself runs under the name __main__, so sys.modules does not have an entry for main.
- Control goes to notmain.py. Its first line is import main, and as main is not in the cache, the finder adds it to sys.modules and then the loader starts executing main.py (again!).
- Control goes to main.py. Since the first line is import notmain and notmain is in sys.modules now, there is no effect.
- The next line in main.py creates a variable var at 0x7f64e4778cc0.
- The next few lines define a function work.
- The __name__ == '__main__' condition is false, as at the moment __name__ == 'main'.
- Control goes back to notmain.py. Attributes from the main module will now be accessible.
- populate is defined in notmain.
- Control comes back to main.py, which is still running as __main__.
- Variable var is created (again!) and stored at 0x7f64e490c6c0. Notice that there are two var variables. notmain has no idea that there is another var at 0x7f64e490c6c0; it still thinks that main.var is at 0x7f64e4778cc0.
- The work function is defined.
- The __name__ == '__main__' condition evaluates to true this time. So work is called, which calls notmain.populate.
- notmain.populate adds the answer key to main.var and not __main__.var.
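The walkthrough above boils down to one claim: after startup, sys.modules holds two separate module objects built from the same file. Here is a rough way to check that without address printing. It is a sketch, not my original code: the two files are stripped-down stand-ins written to a temporary directory, and run_diagnostic is a helper name I made up.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Stand-in for main.py: same import structure, plus a check on sys.modules.
MAIN_SOURCE = textwrap.dedent(
    """
    import sys
    import notmain

    var = {}

    if __name__ == '__main__':
        as_script = sys.modules['__main__']
        as_module = sys.modules['main']
        print('distinct modules:', as_script is not as_module)
        print('distinct var objects:', as_script.var is not as_module.var)
    """
)

# Stand-in for notmain.py: only the circular import matters here.
NOTMAIN_SOURCE = "import main\n"


def run_diagnostic():
    """Run the stand-in main.py as a script and return its output lines."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "main.py").write_text(MAIN_SOURCE)
        Path(tmp, "notmain.py").write_text(NOTMAIN_SOURCE)
        completed = subprocess.run(
            [sys.executable, "main.py"],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
    return completed.stdout.splitlines()


if __name__ == "__main__":
    for line in run_diagnostic():
        print(line)
```

Both lines should report True: the script lives in sys.modules under __main__, the imported copy lives under main, and each one has its own var.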
The issue is self-evident now. We need to distinguish between main and __main__. A simple solution is to create another file that imports main and then calls main.work. This way, when notmain runs import main, sys.modules already has an entry for main, so main is not executed again, and both main and notmain have the same view of var. So let’s try this.
# really_main.py
import main

main.work()
# no change in other files
Shell> python really_main.py
Before: {}
After: {'answer': 42}
Whooo! That worked.
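If you would like to reproduce both runs in one go, the sketch below writes the three files from this post to a temporary directory and runs each entry point in a fresh interpreter. The helper name run_entry_point is made up for this illustration; the file contents are the ones shown earlier.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# The three files from this post, keyed by file name.
FILES = {
    "main.py": textwrap.dedent(
        """
        import notmain

        var = {}

        def work():
            print("Before:", var)
            notmain.populate()
            print("After: ", var)

        if __name__ == '__main__':
            work()
        """
    ),
    "notmain.py": textwrap.dedent(
        """
        import main

        def populate():
            main.var['answer'] = 42
        """
    ),
    "really_main.py": "import main\nmain.work()\n",
}


def run_entry_point(script_name):
    """Write the files to a temp directory, run one script, return its output lines."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in FILES.items():
            Path(tmp, name).write_text(source)
        completed = subprocess.run(
            [sys.executable, script_name],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
    return completed.stdout.splitlines()


if __name__ == "__main__":
    print("python main.py        ->", run_entry_point("main.py"))
    print("python really_main.py ->", run_entry_point("really_main.py"))
```

Running main.py directly should end with an empty dictionary, while going through really_main.py should end with the answer key in place.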
Comments?
Any doubts or feedback? Please reach out to me at hello at domain with the subject line Comment: Post title. I’ll get back to you as soon as possible.