Causal discovery via directed acyclic graph (DAG) structure learning identifies causal relationships among the features of sampled data and is recognized as one of the most important problems in causal inference. Recent work shows that DAGs can be learned by solving a continuous optimization problem with a functional equality constraint. However, popular causal models generally assume that the data are i.i.d., in the sense that all data samples are generated by a single underlying causal graph. In this work, we propose a general causal learning model inspired by meta-learning, which aims to find an invariant DAG across multiple domains and to improve the generalization performance of DAG structure discovery. Mathematically, this model is formulated as a functional constrained bilevel optimization problem, which can be solved by our proposed bilevel primal-dual (BPD) algorithm with provable convergence rate guarantees. Extensive numerical experiments demonstrate that the proposed meta-DAG model and BPD algorithm outperform the benchmarks in terms of reconstruction error and graph Hamming distance.
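To make the "functional equality constraint" and the "graph Hamming distance" concrete, the following is a minimal sketch under stated assumptions, not the implementation used in this work. It assumes the widely used trace-exponential characterization of acyclicity, h(W) = tr(exp(W∘W)) − d = 0, for a weighted adjacency matrix W with d nodes; the abstract does not specify which constraint is used. The series is truncated after d terms, which gives a lower bound on the full value but still vanishes exactly when the graph is acyclic. The Hamming distance here simply counts differing directed-edge indicators, which is one of several conventions. The names `acyclicity`, `hamming_distance`, `w_dag`, and `w_cyc` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(w):
    """Assumed constraint: sum_{k=1..d} tr((W*W)^k) / k!, with W*W elementwise.

    This is the power series of tr(exp(W*W)) - d truncated after d terms.
    It is nonnegative and equals zero exactly when W has no directed cycle.
    """
    d = len(w)
    a = [[x * x for x in row] for row in w]   # elementwise square, entries >= 0
    power = [row[:] for row in a]             # holds (W*W)^k, starting at k = 1
    h = 0.0
    for k in range(1, d + 1):
        h += sum(power[i][i] for i in range(d)) / math.factorial(k)
        power = matmul(power, a)
    return h


def hamming_distance(w1, w2, tol=1e-8):
    """Count positions where the two graphs disagree on edge presence.

    Assumption: an entry with |weight| > tol is an edge; a reversed edge
    therefore counts as two differences under this convention.
    """
    d = len(w1)
    return sum((abs(w1[i][j]) > tol) != (abs(w2[i][j]) > tol)
               for i in range(d) for j in range(d))


# A chain 0 -> 1 -> 2 (acyclic) and the same chain closed by 2 -> 0 (cyclic).
w_dag = [[0.0, 1.5, 0.0],
         [0.0, 0.0, -2.0],
         [0.0, 0.0, 0.0]]
w_cyc = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [1.0, 0.0, 0.0]]

print(acyclicity(w_dag))               # 0.0: constraint satisfied
print(acyclicity(w_cyc))               # 0.5: the 3-cycle contributes tr(I)/3!
print(hamming_distance(w_dag, w_cyc))  # 1: only the edge 2 -> 0 differs
```

Because the function is smooth in the entries of W, it can serve as an equality constraint in a continuous solver, which is what allows gradient-based methods to replace combinatorial search over graphs.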